Hugging Face Model Scraper
Pricing: Pay per event
Collect models from Hugging Face Hub via public API endpoints. Get metadata including author, downloads, likes, lastModified, task, library, license, tags and filenames.
Developer: ParseForge
🤖 Hugging Face Model Scraper
🚀 Collect AI model data from Hugging Face Hub in minutes. Search by keyword, task, library, license, or language. Export model names, downloads, likes, tags, and metadata. No coding, no Hugging Face account required.
🕒 Last updated: 2026-04-23 · 📊 20+ fields per model · 🔍 5 search filters · 📊 Download + like counts · 🚫 No auth required
The Hugging Face Model Scraper collects model metadata from the Hugging Face Hub, returning 20+ fields per model: model ID, author, task type, library (PyTorch, TensorFlow, JAX), license, downloads, likes, tags, languages, pipeline tag, and model card URL. It supports keyword search combined with task, library, license, and language filters.
Hugging Face hosts over 800,000 AI models. This Actor queries the Hub and returns structured data ready for model scouting, benchmarking, or research dashboards.
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| ML engineers, AI researchers, data scientists, MLOps teams, AI product managers, VC analysts | Model scouting, benchmarking, competitive analysis, library adoption tracking, license auditing |
📋 What the Hugging Face Model Scraper does
Five search filters:
- 🔍 Keyword search. Free-text search across model names and cards.
- 🎯 Task filter. Text generation, image classification, translation, summarization, and more.
- 📚 Library filter. PyTorch, TensorFlow, JAX, ONNX, Flax, etc.
- 📜 License filter. MIT, Apache 2.0, proprietary, custom licenses.
- 🌐 Language filter. English, Chinese, multilingual, etc.
Each model record includes model ID, author, task, library, license, downloads, likes, tags, languages, pipeline tag, last modified date, and model card URL.
💡 Why it matters: browsing Hugging Face for model comparisons means clicking through hundreds of model cards. This Actor exports structured model metadata at scale for ML benchmarking, scouting, and ecosystem analysis.
🎬 Full Demo
🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.
⚙️ Input
| Input | Type | Default | Behavior |
|---|---|---|---|
| query | string | "" | Free-text search across model names and cards. |
| task | string | "" | Task type: text-generation, image-classification, etc. |
| library | string | "" | ML library: transformers, diffusers, onnx, etc. |
| license | string | "" | License identifier: mit, apache-2.0, etc. |
| language | string | "" | Model language: en, zh, multilingual. |
Example: text generation models with Apache 2.0 license.
{"query": "llama", "task": "text-generation", "license": "apache-2.0"}
Example: image classification models that use the transformers library.
{"task": "image-classification", "library": "transformers"}
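The input examples above can also be assembled in code. The following is a minimal sketch, assuming only the five field names from the input table; the helper name `build_input` is hypothetical and not part of the Actor. It drops empty filters and rejects misspelled field names so that a typo does not silently produce an unfiltered run.

```python
import json

# The five input fields listed in the input table; all default to "".
FILTER_FIELDS = ("query", "task", "library", "license", "language")


def build_input(**filters: str) -> dict:
    """Return an input dict containing only the non-empty filters.

    Hypothetical helper for illustration, not part of the Actor.
    """
    unknown = set(filters) - set(FILTER_FIELDS)
    if unknown:
        raise ValueError(f"Unknown input field(s): {sorted(unknown)}")
    return {
        name: filters[name].strip()
        for name in FILTER_FIELDS
        if filters.get(name, "").strip()
    }


run_input = build_input(query="llama", task="text-generation", license="apache-2.0")
print(json.dumps(run_input))
# → {"query": "llama", "task": "text-generation", "license": "apache-2.0"}
```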
⚠️ Good to Know: Hugging Face model downloads can change rapidly. Each run captures a point-in-time snapshot of the Hub's metadata.
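Because each run is a snapshot, comparing two runs is one way to see how download counts moved. This is a sketch under the assumption that both snapshots are lists of records carrying the `modelId` and `downloads` fields from the schema; the function name and the sample numbers are made up for illustration.

```python
from __future__ import annotations


def download_deltas(previous: list[dict], current: list[dict]) -> dict[str, int]:
    """Map modelId -> change in downloads between two snapshots.

    Models missing from the earlier snapshot are skipped, since there is
    no baseline to compare against.
    """
    before = {row["modelId"]: row["downloads"] for row in previous}
    return {
        row["modelId"]: row["downloads"] - before[row["modelId"]]
        for row in current
        if row["modelId"] in before
    }


# Made-up numbers for illustration only.
earlier = [{"modelId": "org/model-a", "downloads": 1000}]
later = [
    {"modelId": "org/model-a", "downloads": 1250},
    {"modelId": "org/model-b", "downloads": 40},
]
print(download_deltas(earlier, later))
# → {'org/model-a': 250}
```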
📊 Output
Each model record contains 20+ fields. Download the dataset as CSV, Excel, JSON, or XML.
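If you work with the CSV export, numeric values arrive as text and need converting before sorting. The sketch below uses only the Python standard library and assumes the CSV column headers match the field names `modelId`, `downloads`, and `likes`; check the header row of your own export, since that naming is an assumption here.

```python
import csv
import io


def read_models_csv(text: str) -> list:
    """Parse CSV export text and convert the numeric columns to int."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["downloads"] = int(row["downloads"])
        row["likes"] = int(row["likes"])
        rows.append(row)
    return rows


# Illustrative two-row export with made-up values; real files carry more columns.
sample = "modelId,downloads,likes\norg/model-a,5000,12\norg/model-b,90000,340\n"
top = sorted(read_models_csv(sample), key=lambda r: r["downloads"], reverse=True)
print([r["modelId"] for r in top])
# → ['org/model-b', 'org/model-a']
```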
🧾 Schema
| Field | Type | Example |
|---|---|---|
| 🤖 modelId | string | "meta-llama/Llama-3-8B" |
| 👤 author | string | "meta-llama" |
| 🎯 task | string | "text-generation" |
| 📚 library | string | "transformers" |
| 📜 license | string | "llama3" |
| 📊 downloads | number | 5000000 |
| 👍 likes | number | 12000 |
| 🏷️ tags | array | ["llm", "text-generation"] |
| 🌐 languages | array | ["en"] |
| 🏷️ pipelineTag | string | "text-generation" |
| 📅 lastModified | string | "2026-03-15T10:30:00Z" |
| 📂 modelCardUrl | string | "https://huggingface.co/meta-llama/..." |
| 🔗 url | string | "https://huggingface.co/meta-llama/Llama-3-8B" |
| 🕒 scrapedAt | ISO 8601 | "2026-04-16T00:00:00.000Z" |
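For downstream code, it can help to map a record onto a typed structure. The following is a sketch only: the `ModelRecord` class is hypothetical, covers a subset of the fields in the schema table, and uses the values from the Example column as test data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ModelRecord:
    """Hypothetical typed view over a subset of the schema fields."""

    model_id: str
    author: str
    downloads: int
    likes: int
    tags: list
    last_modified: datetime

    @classmethod
    def from_item(cls, item: dict) -> "ModelRecord":
        # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
        # so normalise it to an explicit UTC offset first.
        stamp = item["lastModified"].replace("Z", "+00:00")
        return cls(
            model_id=item["modelId"],
            author=item["author"],
            downloads=int(item["downloads"]),
            likes=int(item["likes"]),
            tags=list(item.get("tags", [])),
            last_modified=datetime.fromisoformat(stamp),
        )


# Values copied from the Example column of the schema table.
record = ModelRecord.from_item({
    "modelId": "meta-llama/Llama-3-8B",
    "author": "meta-llama",
    "downloads": 5000000,
    "likes": 12000,
    "tags": ["llm", "text-generation"],
    "lastModified": "2026-03-15T10:30:00Z",
})
print(record.model_id, record.last_modified.year)
# → meta-llama/Llama-3-8B 2026
```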
✨ Why choose this Actor
| | Capability |
|---|---|
| 🤖 | 800,000+ models. Full Hugging Face Hub coverage. |
| 🔍 | 5 search filters. Keyword, task, library, license, language. |
| 📊 | Popularity metrics. Downloads and likes per model. |
| 📜 | License data. License type per model for compliance auditing. |
| 📚 | Library tracking. PyTorch, TensorFlow, JAX adoption data. |
| ⚡ | Scalable. From single lookups to full task-type sweeps. |
| 🚫 | No authentication. Public Hub API. |
📊 Hugging Face hosts over 800,000 AI models. Structured access to this metadata supports ML benchmarking, model scouting, and ecosystem analysis workflows.
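As an illustration of the license auditing use case, the sketch below counts records per license and flags models whose license is outside an allow-list. The function name, the allow-list, and the sample records are assumptions for this example; your own compliance policy determines what belongs in `allowed`.

```python
from collections import Counter


def license_summary(records: list, allowed: set) -> tuple:
    """Return (count per license, modelIds whose license is not allowed).

    Records with a missing or empty license are grouped under "unknown".
    """
    counts = Counter(r.get("license") or "unknown" for r in records)
    flagged = [
        r["modelId"]
        for r in records
        if (r.get("license") or "unknown") not in allowed
    ]
    return counts, flagged


# Made-up records for illustration only.
records = [
    {"modelId": "org/model-a", "license": "mit"},
    {"modelId": "org/model-b", "license": "apache-2.0"},
    {"modelId": "org/model-c", "license": "llama3"},
    {"modelId": "org/model-d", "license": None},
]
counts, flagged = license_summary(records, allowed={"mit", "apache-2.0"})
print(flagged)
# → ['org/model-c', 'org/model-d']
```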
📈 How it compares to alternatives
| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ Hugging Face Model Scraper (this Actor) | $5 free credit, then pay-per-use | Full Hub | Live per run | keyword, task, library, license, language | ⚡ 2 min |
| Hugging Face Hub API (direct) | Free with rate limits | Full | Real-time | Many | ⏳ Hours (API client setup) |
| Manual Hub browsing | Free | Manual | Manual | UI only | 🕒 Hours per category |
| Paid AI model databases | $100-1,000/month | Multi-source | Varies | Many | 🐢 Days |
Pick this Actor when you want Hugging Face model data on demand, with task and library filters, without writing API client code.
🚀 How to use
- 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
- 🌐 Open the Actor. Go to the Hugging Face Model Scraper page on the Apify Store.
- 🎯 Set input. Enter keywords, pick a task and library.
- 🚀 Run it. Click Start and let the Actor collect your data.
- 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.
⏱️ Total time from sign-up to downloaded dataset: 3-5 minutes. No coding required.
🌟 Beyond business use cases
Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.
🔌 Automating Hugging Face Model Scraper
Control the scraper programmatically for scheduled runs and pipeline integrations:
- 🟢 Node.js. Install the apify-client NPM package.
- 🐍 Python. Use the apify-client PyPI package.
- 📚 See the Apify API documentation for full details.
The Apify Schedules feature lets you trigger this Actor on any cron interval. Weekly pulls keep your model intelligence database fresh.
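A programmatic run with the Python apify-client package might look like the sketch below. The Actor ID and API token are placeholders you must supply; this page does not state the Actor ID, so none is filled in here. The client is passed in as a parameter so the function can be exercised with a stand-in object during testing.

```python
def collect_models(client, actor_id: str, run_input: dict) -> list:
    """Run the Actor, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def make_client(token: str):
    """Create an ApifyClient; requires `pip install apify-client`."""
    # Imported lazily so this sketch loads even where the package is absent.
    from apify_client import ApifyClient

    return ApifyClient(token)


# Usage (placeholders, not real values):
#   client = make_client("<YOUR_APIFY_TOKEN>")
#   items = collect_models(client, "<ACTOR_ID>", {"task": "text-generation"})
```

Keeping the network-facing client separate from the collection logic also makes it straightforward to call the same function from a scheduled job or a larger pipeline.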
🔌 Integrate with any app
Hugging Face Model Scraper connects to any cloud service via Apify integrations:
- Make - Automate multi-step workflows
- Zapier - Connect with 5,000+ apps
- Slack - Get run notifications in your channels
- Airbyte - Pipe model data into your warehouse
- GitHub - Trigger runs from commits and releases
- Google Drive - Export datasets straight to Sheets
You can also use webhooks to trigger downstream actions when a run finishes. Push fresh model data into your ML platform, or alert your team in Slack.
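On the receiving side, a webhook handler typically needs the dataset ID of the finished run. The sketch below assumes the webhook uses Apify's default payload template, with an `eventType` field and a `resource` object describing the run; a custom payload template would change this shape. The function name and the sample IDs are placeholders.

```python
import json


def dataset_id_from_webhook(body: str):
    """Return the dataset ID for a succeeded run, or None for other events."""
    payload = json.loads(body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    return payload.get("resource", {}).get("defaultDatasetId")


# Trimmed, made-up payload for illustration only.
sample_body = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"id": "run-placeholder", "defaultDatasetId": "dataset-placeholder"},
})
print(dataset_id_from_webhook(sample_body))
# → dataset-placeholder
```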
🔗 Recommended Actors
- 📚 Semantic Scholar Scraper - Academic paper data
- 📝 HTML to JSON Smart Parser - AI data extraction
- 🎤 Audio Transcriber - Speech to text
- 🎨 Modern Manga Colorizer - AI manga colorization
- 💼 HubSpot Marketplace Scraper - Marketplace data
💡 Pro Tip: browse the complete ParseForge collection for more AI and data scrapers.
🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.
⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Hugging Face, Inc. All trademarks mentioned are the property of their respective owners. Only publicly available Hub metadata is collected.