HuggingFace Models Scraper
Pricing
from $2.00 / 1,000 models scraped
Scrapes AI/ML models from HuggingFace (huggingface.co/models) via the official API. Extracts model ID, downloads, likes, task type, library, tags, and more. Supports search, author/org filter, pipeline tag filter, and sort order.
Developer
tzmyk
Actor stats
- Bookmarked: 0
- Total users: 2
- Monthly active users: 1
- Last modified: 2 days ago
Scrape AI/ML models from HuggingFace — the world's largest repository of open-source machine learning models.
Extracts structured data including model ID, download counts, likes, task type (pipeline tag), ML library, tags, gated status, and timestamps. Powered by the official HuggingFace API — no web scraping, no rate-limit surprises.
What it does
- Fetches models from the HuggingFace public API with full metadata
- Supports filtering by keyword search, author/organization, task type, and library
- Supports sorting by downloads, likes, creation date, or last-modified date
- Paginates automatically up to your specified limit (up to 10,000 models)
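As a rough illustration of how the filters above could translate into a request, here is a minimal sketch that builds a URL for the public `https://huggingface.co/api/models` endpoint. The function name `build_models_url` and the field-to-parameter mapping are assumptions made for this example (in particular, passing the library name through the `filter` tag parameter); the actor's real implementation is not shown in this document and may differ.

```python
from urllib.parse import urlencode

API_URL = "https://huggingface.co/api/models"

# Assumed mapping from the actor's input fields to API query parameters.
# Library names are sent through the generic "filter" (tag) parameter here.
PARAM_NAMES = {
    "search": "search",
    "author": "author",
    "pipelineTag": "pipeline_tag",
    "libraryName": "filter",
    "sort": "sort",
}


def build_models_url(actor_input: dict, page_size: int = 100) -> str:
    """Build the URL for the first page of results (hypothetical helper)."""
    params = {}
    for field, param in PARAM_NAMES.items():
        value = actor_input.get(field)
        if value:  # skip unset or empty filters
            params[param] = value
    params.setdefault("sort", "downloads")  # documented default sort
    params["direction"] = -1  # descending order
    # Never request more per page than the caller wants in total.
    params["limit"] = min(page_size, actor_input.get("maxModels", 100))
    return f"{API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_models_url({"search": "llama", "pipelineTag": "text-generation",
                            "sort": "downloads", "maxModels": 50}))
```

Keeping URL construction separate from the HTTP call makes the mapping easy to check without touching the network.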
Use cases
- AI research: Track which models are trending by downloads or likes
- Competitive intelligence: Monitor what models a specific organization has published
- Dataset building: Collect model metadata for ML benchmarks or surveys
- Lead generation: Find organizations actively publishing models in your domain
- Content & newsletters: Curate the most popular or newest models by task type
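For the newsletter-style use case, one possible post-processing step is to group the output records by task type and keep the most downloaded models in each group. The sketch below assumes records shaped like the output described in this document; `top_models_by_task` is a hypothetical helper, not part of the actor.

```python
from collections import defaultdict


def top_models_by_task(records: list, per_task: int = 3) -> dict:
    """Map each pipelineTag to its most downloaded model IDs."""
    by_task = defaultdict(list)
    for rec in records:
        # Records without a task type are grouped under "unknown".
        by_task[rec.get("pipelineTag") or "unknown"].append(rec)
    result = {}
    for task, recs in by_task.items():
        # Treat a missing or null download count as zero when ranking.
        ranked = sorted(recs, key=lambda r: r.get("downloads") or 0, reverse=True)
        result[task] = [r["modelId"] for r in ranked[:per_task]]
    return result


if __name__ == "__main__":
    sample = [
        {"modelId": "org/a", "pipelineTag": "text-generation", "downloads": 10},
        {"modelId": "org/b", "pipelineTag": "text-generation", "downloads": 50},
        {"modelId": "org/c", "pipelineTag": None, "downloads": None},
    ]
    print(top_models_by_task(sample, per_task=1))
```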
Input
| Field | Type | Default | Description |
|---|---|---|---|
| search | string | none | Keyword search to filter models |
| author | string | none | Filter by author or organization (e.g. meta-llama) |
| pipelineTag | string | none | Filter by task type (e.g. text-generation, image-classification) |
| libraryName | string | none | Filter by ML library (e.g. transformers, diffusers) |
| sort | select | downloads | Sort by: downloads, likes, createdAt, lastModified |
| maxModels | integer | 100 | Max models to return (1–10,000) |
Example input
{"search": "llama","pipelineTag": "text-generation","sort": "downloads","maxModels": 50}
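The input rules in the table can be expressed as a small validator. This is only a sketch under the constraints stated in this document (allowed sort values, the 1 to 10,000 range, the listed defaults); `validate_input` and its error messages are hypothetical and do not reproduce the actor's actual validation code.

```python
ALLOWED_SORTS = {"downloads", "likes", "createdAt", "lastModified"}
STRING_FIELDS = ("search", "author", "pipelineTag", "libraryName")


def validate_input(raw: dict) -> dict:
    """Return a cleaned copy of the input with defaults, or raise ValueError."""
    cleaned = {}
    for field in STRING_FIELDS:
        value = raw.get(field)
        if value is None:
            continue  # optional filter not provided
        if not isinstance(value, str):
            raise ValueError(f"{field} must be a string, got {type(value).__name__}")
        if value.strip():
            cleaned[field] = value.strip()

    sort = raw.get("sort", "downloads")
    if not isinstance(sort, str) or sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}, got {sort!r}")
    cleaned["sort"] = sort

    max_models = raw.get("maxModels", 100)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_models, bool) or not isinstance(max_models, int):
        raise ValueError("maxModels must be an integer")
    if not 1 <= max_models <= 10_000:
        raise ValueError("maxModels must be between 1 and 10,000")
    cleaned["maxModels"] = max_models
    return cleaned


if __name__ == "__main__":
    print(validate_input({"search": "llama", "pipelineTag": "text-generation",
                          "sort": "downloads", "maxModels": 50}))
```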
Output
One record per model saved to the default dataset.
| Field | Type | Description |
|---|---|---|
| modelId | string | Full model ID (e.g. meta-llama/Llama-3.1-8B-Instruct) |
| author | string \| null | Author or organization name |
| downloads | number \| null | Download count (30-day rolling total as reported by HuggingFace) |
| likes | number \| null | Like count |
| pipelineTag | string \| null | Task type (e.g. text-generation) |
| libraryName | string \| null | ML library (e.g. transformers) |
| tags | string[] | All tags including datasets, licenses, frameworks |
| gated | boolean \| null | Whether model access requires approval |
| createdAt | string \| null | Creation date (ISO 8601) |
| lastModified | string \| null | Last modified date (ISO 8601) |
| url | string | Direct URL to the model page |
| scrapedAt | string | Timestamp when this record was scraped |
Example output
{"modelId": "sentence-transformers/all-MiniLM-L6-v2","author": "sentence-transformers","downloads": 208493944,"likes": 4598,"pipelineTag": "sentence-similarity","libraryName": "sentence-transformers","tags": ["sentence-transformers", "pytorch", "onnx", "license:apache-2.0"],"gated": false,"createdAt": "2022-03-02T23:29:05.000Z","lastModified": "2025-03-06T13:37:44.000Z","url": "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2","scrapedAt": "2026-03-22T03:46:43.767Z"}
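To make the output shape concrete, the sketch below maps one raw API item to a record with the fields listed above. The raw key names (`id`, `pipeline_tag`, `library_name`, and so on) are assumptions about the API response, and `to_record` is a hypothetical helper; deriving `author` from the model ID when it is absent is likewise an illustrative choice, not confirmed behavior of the actor.

```python
from datetime import datetime, timezone
from typing import Optional


def to_record(raw: dict, scraped_at: Optional[datetime] = None) -> dict:
    """Convert one raw API model dict into an output-style record."""
    model_id = raw.get("id") or raw.get("modelId")
    if not model_id:
        raise ValueError("raw model has no id")
    scraped_at = scraped_at or datetime.now(timezone.utc)

    # Assumption: fall back to the namespace before the slash if no author is given.
    author = raw.get("author") or (model_id.split("/")[0] if "/" in model_id else None)

    # The raw gated value may be a boolean or a string mode; collapse it to a boolean.
    gated = raw.get("gated")

    stamp = scraped_at.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "modelId": model_id,
        "author": author,
        "downloads": raw.get("downloads"),
        "likes": raw.get("likes"),
        "pipelineTag": raw.get("pipeline_tag"),
        "libraryName": raw.get("library_name"),
        "tags": list(raw.get("tags") or []),
        "gated": None if gated is None else bool(gated),
        "createdAt": raw.get("createdAt"),
        "lastModified": raw.get("lastModified"),
        "url": f"https://huggingface.co/{model_id}",
        "scrapedAt": stamp.replace("+00:00", "Z"),
    }


if __name__ == "__main__":
    print(to_record({"id": "sentence-transformers/all-MiniLM-L6-v2", "likes": 4598}))
```

Missing numeric or string fields come through as `None`, which matches the nullable types in the output table.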
Features
- Official API — Uses the HuggingFace REST API directly; no fragile HTML parsing
- Automatic pagination — Fetches all pages until your limit is reached
- Polite rate limiting — 500ms delay between API calls
- Robust input validation — Clear error messages for invalid inputs
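The pagination and rate-limiting behavior listed above could look roughly like the following sketch. It assumes the API signals further pages through an HTTP `Link` header with `rel="next"`, and it injects the fetch and sleep functions so the loop can be exercised without network access. The names `iter_models`, `next_page_url`, and `http_fetch` are hypothetical and are not taken from the actor's source.

```python
import json
import re
import time
import urllib.request
from typing import Callable, Iterator, Optional, Tuple

# A fetcher takes a URL and returns (list of models, Link header or None).
Fetcher = Callable[[str], Tuple[list, Optional[str]]]


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Extract the rel="next" URL from an HTTP Link header, if present."""
    if not link_header:
        return None
    match = re.search(r'<([^>]+)>\s*;\s*rel="next"', link_header)
    return match.group(1) if match else None


def http_fetch(url: str) -> Tuple[list, Optional[str]]:
    """Fetch one page over HTTP using only the standard library."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp), resp.headers.get("Link")


def iter_models(first_url: str, fetch: Fetcher, max_models: int,
                delay_s: float = 0.5,
                sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield models page by page until max_models is reached or pages run out."""
    url: Optional[str] = first_url
    yielded = 0
    while url and yielded < max_models:
        models, link_header = fetch(url)
        if not models:
            return  # empty page: nothing more to read
        for model in models:
            yield model
            yielded += 1
            if yielded >= max_models:
                return
        url = next_page_url(link_header)
        if url:
            sleep(delay_s)  # polite pause before requesting the next page
```

In this sketch the delay is applied only between requests, so a run that fits in a single page never sleeps.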
Notes
- Results are limited to public models only; private models are not accessible
- The `gated` field indicates whether a model requires access approval from the author
- HuggingFace API does not support combining `search` with all sort orders equally; `downloads` sort works best for broad searches
- Download counts are 30-day rolling totals as reported by HuggingFace
Support
Found a bug or have a feature request? Please open an issue or contact the author through the Apify platform.