Hugging Face Model & Dataset Trend Tracker

Pricing

from $1.00 / 1,000 results

Track trending Hugging Face models and datasets by downloads, likes, and velocity. Filter by task, library, or tag. Monitor mode alerts you to newly trending entries. Built for ML engineers, DevRel, and AI researchers.

Developer

DSH

Maintained by Community

Track trending Hugging Face models and datasets by downloads, likes, and velocity, straight from the official public Hugging Face Hub API. No browser, no scraping tricks, no API key required. Filter by task (text-generation, text-to-image, automatic-speech-recognition…), library (transformers, diffusers, gguf…), or tag, and switch on monitor mode to get alerted when a new LLM, model, or dataset starts trending.

This is AI-model trend intelligence, not a static catalog dump: instead of "list every model," it answers "what's gaining adoption on Hugging Face right now." Built for ML engineers, DevRel and LLM-ops teams, AI researchers, and data teams who need to see what's rising across the open-model ecosystem. (Pairs with our Civitai Model & Prompt Trend Tracker for the creative/AI-art side.)

What you get

Both result types land in one dataset, each row tagged with resultType ("model" or "dataset") so you can split them downstream.

Model rows (resultType: "model"):

| Field | Description |
| --- | --- |
| modelId, author, modelName | Full repo id (owner/name), owner, and short name |
| pipelineTag | Canonical task, e.g. text-generation, text-to-image (may be null if untagged) |
| libraryName | Library/framework, e.g. transformers, diffusers, gguf |
| tags | All Hub tags on the repo |
| downloads, likes | Adoption metrics |
| trendingScore | Hugging Face's own trending signal (null if not exposed for the chosen sort) |
| lastModified, createdAt | ISO 8601 timestamps |
| isPrivate, gated | Visibility and gating (false / "manual" / "auto") |
| cardData | Model-card front-matter: license, language, datasets, baseModel |
| siblings | Count of files in the repo (proxy for model size/complexity) |
| config | config.json architecture metadata (only when includeConfig is on) |
| modelUrl | Direct Hugging Face link |
| trendRank | Position in the sorted results (1 = top) |
| isNew | true if new to trending since the last monitor run (null in snapshot mode) |
| scrapedAt | ISO 8601 timestamp of this run |

Dataset rows (resultType: "dataset"):

| Field | Description |
| --- | --- |
| datasetId, author, datasetName | Full id (owner/name), owner, and short name |
| tags | All Hub tags |
| downloads, likes | Adoption metrics |
| trendingScore | Hugging Face trending signal (null if not exposed) |
| lastModified, createdAt | ISO 8601 timestamps |
| isPrivate, gated | Visibility and gating |
| cardData | Dataset-card front-matter: license, language, size, taskCategories |
| datasetUrl | Direct Hugging Face link |
| trendRank, isNew, scrapedAt | Trend + monitor metadata |
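
Because model and dataset rows share one dataset, a downstream consumer usually needs to separate them first. The following is a minimal Python sketch of that split, assuming the rows have already been loaded as a list of dictionaries. The sample rows and the split_by_result_type helper are hypothetical and only illustrate the resultType tag.

```python
from collections import defaultdict


def split_by_result_type(rows):
    """Group rows by their resultType value ("model" or "dataset")."""
    groups = defaultdict(list)
    for row in rows:
        # Rows without a resultType are kept under "unknown" rather than dropped.
        groups[row.get("resultType", "unknown")].append(row)
    return dict(groups)


# Hypothetical rows using the documented field names (values are made up).
sample_rows = [
    {"resultType": "model", "modelId": "owner/model-a", "trendRank": 1},
    {"resultType": "dataset", "datasetId": "owner/dataset-b", "trendRank": 1},
    {"resultType": "model", "modelId": "owner/model-c", "trendRank": 2},
]

groups = split_by_result_type(sample_rows)
print(len(groups["model"]), len(groups["dataset"]))
```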

Two data types — which to use

  • Trending models (dataType: "models", the default) — what models and LoRAs/checkpoints are gaining adoption. Track model releases, discover emerging architectures, and watch a task or library heat up.
  • Trending datasets (dataType: "datasets") — which training/eval datasets are rising. Discover new corpora and benchmarks before they're everywhere.
  • Both (dataType: "both") — run both in one pass; rows are tagged with resultType.

Monitor mode

Set mode: "monitor" and attach an Apify Schedule (daily or weekly):

  • First run stores the current trending IDs and returns everything.
  • Later runs return only entries new to the trending list since the previous run, each flagged isNew: true.
  • If nothing new is trending, the run finishes cleanly with an empty dataset (not an error).

State is keyed per data type and sort (state-models-trending, state-datasets-downloads, …) and persisted in a named Key-Value Store, so it survives across scheduled runs. This is the "alert me when a new model starts trending" workflow — feed it into Slack, email, or a webhook via Apify integrations. Keep your other filters (task, library, tags) stable between scheduled runs so the delta stays meaningful.
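
The delta behaviour described here can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the Actor's actual code: a plain dict stands in for the named Key-Value Store, the monitor_delta and state_key helpers are hypothetical, and leaving first-run rows unflagged is a guess, since the source only says the first run returns everything.

```python
def state_key(data_type, sort):
    """Build a key in the documented pattern, e.g. state-models-trending."""
    return f"state-{data_type}-{sort}"


def monitor_delta(store, data_type, sort, current_rows, id_field):
    """Return rows new since the previous run, then save the current IDs."""
    key = state_key(data_type, sort)
    previous_ids = store.get(key)
    if previous_ids is None:
        # First run: nothing to compare against, so return everything.
        new_rows = list(current_rows)
    else:
        seen = set(previous_ids)
        new_rows = [
            dict(row, isNew=True)
            for row in current_rows
            if row[id_field] not in seen
        ]
    # Persist the current trending IDs for the next scheduled run.
    store[key] = [row[id_field] for row in current_rows]
    return new_rows


store = {}  # stand-in for the persistent Key-Value Store (assumption)
run1 = monitor_delta(store, "models", "trending",
                     [{"modelId": "a/x"}, {"modelId": "b/y"}], "modelId")
run2 = monitor_delta(store, "models", "trending",
                     [{"modelId": "b/y"}, {"modelId": "c/z"}], "modelId")
run3 = monitor_delta(store, "models", "trending",
                     [{"modelId": "b/y"}, {"modelId": "c/z"}], "modelId")
```

In this sketch the second run returns only c/z, and the third run returns an empty list, which mirrors the "empty dataset, not an error" case.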

Input

The simplest input is no input at all — you get the top 100 trending models:

{}

Trending text-generation models built with transformers, top 50:

{
  "dataType": "models",
  "sort": "trending",
  "pipelineTag": "text-generation",
  "library": "transformers",
  "limit": 50
}

Monitor new trending datasets every day:

{
  "dataType": "datasets",
  "mode": "monitor",
  "sort": "trending",
  "limit": 100
}

Most-downloaded text-to-image models plus most-downloaded datasets in one run (pipelineTag is a model filter):

{
  "dataType": "both",
  "sort": "downloads",
  "pipelineTag": "text-to-image",
  "limit": 100
}

| Input | Description |
| --- | --- |
| dataType | models (default), datasets, or both |
| mode | snapshot (default) or monitor (delta: new entries only) |
| sort | trending (default), downloads, likes, lastModified, createdAt |
| direction | -1 descending (default, top first) or 1 ascending |
| pipelineTag | Task filter for models, e.g. text-generation (optional) |
| library | Library filter for models, e.g. diffusers (optional) |
| search | Free-text search over names, e.g. "llama" (optional) |
| tags | Restrict to Hub tags, e.g. ["multilingual"] (optional) |
| limit | Max results per data type (default 100, up to 1000; paged automatically) |
| includeConfig | Attach each model's config.json (slower: one extra request per model) |
| hfToken | Optional Hugging Face token for higher rate limits / gated repos |
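
When inputs are assembled programmatically, it can help to check them against the documented values before starting a run. The sketch below is one way to do that in Python. The build_input helper is hypothetical, and the lower bound of 1 for limit is an assumption, since the source only documents the default of 100 and the maximum of 1000.

```python
# Allowed values copied from the input reference in this section.
ALLOWED = {
    "dataType": {"models", "datasets", "both"},
    "mode": {"snapshot", "monitor"},
    "sort": {"trending", "downloads", "likes", "lastModified", "createdAt"},
    "direction": {-1, 1},
}


def build_input(**options):
    """Validate options against the documented values and return a run input."""
    for name, value in options.items():
        if name in ALLOWED and value not in ALLOWED[name]:
            raise ValueError(
                f"{name}={value!r} is not one of {sorted(ALLOWED[name])}"
            )
    limit = options.get("limit", 100)
    if not 1 <= limit <= 1000:  # lower bound of 1 is an assumption
        raise ValueError("limit must be between 1 and 1000")
    return dict(options)


run_input = build_input(
    dataType="models",
    sort="trending",
    pipelineTag="text-generation",
    library="transformers",
    limit=50,
)
```

With the apify-client Python package, a dictionary like run_input could then be passed as the run_input argument of ApifyClient(token).actor(actor_id).call(...). The Actor ID is not stated in this document, so it would need to be taken from the Store listing.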

Output examples

Trending model:

{
  "resultType": "model",
  "modelId": "meta-llama/Llama-3.3-70B-Instruct",
  "author": "meta-llama",
  "modelName": "Llama-3.3-70B-Instruct",
  "pipelineTag": "text-generation",
  "libraryName": "transformers",
  "tags": ["text-generation", "transformers", "conversational", "llama"],
  "downloads": 1840320,
  "likes": 9421,
  "trendingScore": 152.4,
  "lastModified": "2026-06-15T09:31:00.000Z",
  "createdAt": "2026-05-30T09:12:00.000Z",
  "isPrivate": false,
  "gated": "manual",
  "cardData": {
    "license": "llama3.3",
    "language": ["en"],
    "datasets": null,
    "baseModel": ["meta-llama/Llama-3.1-70B"]
  },
  "siblings": 34,
  "modelUrl": "https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct",
  "trendRank": 1,
  "isNew": null,
  "scrapedAt": "2026-06-19T10:00:00.000Z"
}

Trending dataset:

{
  "resultType": "dataset",
  "datasetId": "HuggingFaceFW/fineweb",
  "author": "HuggingFaceFW",
  "datasetName": "fineweb",
  "tags": ["task_categories:text-generation", "size_categories:10B<n<100B", "language:en"],
  "downloads": 512094,
  "likes": 2103,
  "trendingScore": 88.1,
  "lastModified": "2026-06-01T00:00:00.000Z",
  "createdAt": "2026-04-01T00:00:00.000Z",
  "isPrivate": false,
  "gated": false,
  "cardData": {
    "license": "odc-by",
    "language": ["en"],
    "size": "10B<n<100B",
    "taskCategories": ["text-generation"]
  },
  "datasetUrl": "https://huggingface.co/datasets/HuggingFaceFW/fineweb",
  "trendRank": 1,
  "isNew": null,
  "scrapedAt": "2026-06-19T10:00:00.000Z"
}
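
Rows shaped like these examples can be post-processed into simple derived metrics. As a rough illustration, the Python sketch below divides downloads by the repo's age in days at scrape time. This is a hypothetical downstream calculation, not the Actor's trendingScore, and it assumes createdAt and scrapedAt are always present in the ISO 8601 form shown above.

```python
from datetime import datetime


def parse_iso(timestamp):
    """Parse an ISO 8601 timestamp; a trailing Z is mapped to +00:00."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def downloads_per_day(row):
    """Rough velocity: downloads divided by days from createdAt to scrapedAt."""
    age = parse_iso(row["scrapedAt"]) - parse_iso(row["createdAt"])
    # Clamp to one day so very new repos do not produce huge ratios.
    days = max(age.total_seconds() / 86400, 1.0)
    return row["downloads"] / days


# Hypothetical row: created exactly ten days before it was scraped.
sample = {
    "downloads": 5000,
    "createdAt": "2026-06-09T10:00:00.000Z",
    "scrapedAt": "2026-06-19T10:00:00.000Z",
}
rate = downloads_per_day(sample)
print(rate)
```

The ratio is only as meaningful as the window the downloads count covers, so treat it as a coarse ranking aid rather than a precise growth rate.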

Use cases

  • ML engineers tracking which models are gaining real adoption before committing to one.
  • DevRel teams monitoring ecosystem trends for their framework or library.
  • AI researchers spotting emerging architectures, techniques, and base models.
  • LLM-ops teams tracking model popularity to inform deployment and support decisions.
  • Data teams discovering trending training and evaluation datasets.
  • AI newsletters, educators, and content creators covering what's hot on Hugging Face each week.
  • RAG pipeline builders tracking models and datasets worth ingesting.

Limitations

  • trending is Hugging Face's own signal, not just raw downloads — it blends recency and velocity. Use sort: "downloads" or "likes" if you want a pure popularity ranking.
  • Rate limits. The public Hub API is generous but throttles heavy use. The Actor paces its requests and backs off on HTTP 429; if you request large limits, increase requestDelayMs or add an hfToken.
  • Sparse metadata. Not every repo declares a pipelineTag, library, or full card front-matter — those fields come back null rather than failing the run.
  • Gated/private repos appear in listings with gated/isPrivate set, but their files aren't accessible without an authorized hfToken.
  • includeConfig is slower — it adds one request per model to fetch config.json. Leave it off unless you need architecture details.
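
The back-off on HTTP 429 mentioned in the rate-limit note follows a common pattern that can be sketched in Python. The Actor's real retry policy is not documented here, so the exponential schedule, the RateLimited exception, and the fetch_with_backoff helper are all assumptions made for illustration.

```python
import time


class RateLimited(Exception):
    """Raised by the hypothetical fetch callable on an HTTP 429 response."""


def fetch_with_backoff(fetch, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on RateLimited wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_retries:
                raise  # give up after the final retry
            sleep(base_delay * 2 ** attempt)


# Demo with a fake fetch that is throttled twice before succeeding.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise RateLimited()
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = fetch_with_backoff(flaky, sleep=delays.append)
```

Injecting the sleep function keeps the example instant to run and makes the delay schedule easy to inspect.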

Pricing (Pay-Per-Event)

This Actor uses Apify's Pay-Per-Event model — you pay only for what you pull, from ~$4 per 1,000 results:

| Event | When charged |
| --- | --- |
| Actor run start | Once per run |
| Trending model tracked | Per trending model returned (in monitor mode, only new-to-trending models) |
| Trending dataset tracked | Per trending dataset returned (in monitor mode, only new-to-trending datasets) |

No subscription, no rental. In monitor mode you're charged only for genuinely new trending entries — checking state is free.
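
The effect of per-result charging can be estimated with simple arithmetic. The Python sketch below uses placeholder rates passed in as parameters. The estimate_cost helper, the per-1,000 price, and the run-start fee are hypothetical values for illustration, not the Actor's published event prices.

```python
def estimate_cost(results, price_per_1000, run_start_fee=0.0):
    """Estimate one run's cost: a start fee plus a per-result charge."""
    return run_start_fee + results / 1000 * price_per_1000


# Placeholder rate; substitute the actual event prices for real estimates.
snapshot_cost = estimate_cost(250, price_per_1000=4.0)
# In monitor mode only new-to-trending entries are charged, so a quiet
# day with no new entries costs just the run-start fee.
quiet_monitor_cost = estimate_cost(0, price_per_1000=4.0, run_start_fee=0.5)
print(snapshot_cost, quiet_monitor_cost)
```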

Use it from an AI agent (MCP)

This Actor is MCP-ready: run it as a tool from Claude, Cursor, or ChatGPT via Apify's MCP integration to give your agent live "what's trending on Hugging Face" data — trending models, datasets, and their metadata — on demand.

FAQ

Do I need a Hugging Face API key? No. The public Hub API works without one. Add an optional hfToken only to raise rate limits or read gated/private repos you have access to.

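What does "trending" mean vs "most downloaded"? trending uses Hugging Face's own trend signal (recent momentum), while downloads and likes rank by raw popularity counts. Note that the Hub API's downloads figure covers a recent window (about the last 30 days) rather than all time, whereas likes are cumulative. Pick the sort that matches your question.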

Can I schedule it? Yes — set mode: "monitor" and attach an Apify Schedule (daily/weekly) to get only newly trending entries each run.

Can I filter by task or library? Yes — pipelineTag (e.g. text-generation) and library (e.g. diffusers) for models, plus free-text search and Hub tags for both models and datasets.

How are gated models handled? They're included in the output with gated set ("manual"/"auto") so you can see them trending; downloading their files requires an authorized token.

Does it use a browser? No — it calls the official Hugging Face Hub REST API over HTTP only, which keeps runs fast and cheap.
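
To make the HTTP-only approach concrete, here is a minimal Python sketch that builds a Hub listing URL with the standard library. It uses the public /api/models and /api/datasets endpoints with the sort, direction, limit, and search query parameters. How the Actor actually maps its inputs onto Hub parameters is not shown in this document, so the hub_list_url helper and that mapping are assumptions.

```python
from urllib.parse import urlencode

HUB_API = "https://huggingface.co/api"


def hub_list_url(kind="models", sort="downloads", direction=-1, limit=50,
                 search=None):
    """Build a listing URL for the public Hub API (kind: models or datasets)."""
    params = {"sort": sort, "direction": direction, "limit": limit}
    if search:
        params["search"] = search
    return f"{HUB_API}/{kind}?{urlencode(params)}"


models_url = hub_list_url()
datasets_url = hub_list_url(kind="datasets", sort="likes", limit=10,
                            search="llama")
print(models_url)
print(datasets_url)
```

The sketch only constructs URLs. Fetching them would be a plain GET request that returns a JSON array, with an Authorization header carrying the token when an hfToken is used.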