Hugging Face Model Scraper

Pricing

Pay per event

Collect models from Hugging Face Hub via public API endpoints. Get metadata including author, downloads, likes, lastModified, task, library, license, tags and filenames.

Rating: 5.0 (3)

Developer: ParseForge (Maintained by Community)

Actor stats

  • Bookmarked: 2
  • Total users: 14
  • Monthly active users: 1
  • Last modified: 2 days ago

🤖 Hugging Face Model Scraper

🚀 Collect AI model data from Hugging Face Hub in minutes. Search by keyword, task, library, license, or language. Export model names, downloads, likes, tags, and metadata. No coding, no Hugging Face account required.

🕒 Last updated: 2026-04-23 · 📊 20+ fields per model · 🔍 5 search filters · 📊 Download + like counts · 🚫 No auth required

The Hugging Face Model Scraper collects model metadata from the Hugging Face Hub, returning 20+ fields per model: model ID, author, task type, library (PyTorch, TensorFlow, JAX), license, downloads, likes, tags, languages, pipeline tag, and model card URL. Supports keyword search with task, library, license, and language filters.

Hugging Face hosts over 800,000 AI models. This Actor queries the Hub and returns structured data ready for model scouting, benchmarking, or research dashboards.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| ML engineers, AI researchers, data scientists, MLOps teams, AI product managers, VC analysts | Model scouting, benchmarking, competitive analysis, library adoption tracking, license auditing |

📋 What the Hugging Face Model Scraper does

Five search filters:

  • 🔍 Keyword search. Free-text search across model names and cards.
  • 🎯 Task filter. Text generation, image classification, translation, summarization, and more.
  • 📚 Library filter. PyTorch, TensorFlow, JAX, ONNX, Flax, etc.
  • 📜 License filter. MIT, Apache 2.0, proprietary, custom licenses.
  • 🌐 Language filter. English, Chinese, multilingual, etc.

Each model record includes model ID, author, task, library, license, downloads, likes, tags, languages, pipeline tag, last modified date, and model card URL.

💡 Why it matters: browsing Hugging Face for model comparisons means clicking through hundreds of model cards. This Actor exports structured model metadata at scale for ML benchmarking, scouting, and ecosystem analysis.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


⚙️ Input

| Input | Type | Default | Behavior |
| --- | --- | --- | --- |
| query | string | "" | Free-text search across model names and cards. |
| task | string | "" | Task type: text-generation, image-classification, etc. |
| library | string | "" | ML library: transformers, diffusers, ONNX, etc. |
| license | string | "" | License identifier: mit, apache-2.0, etc. |
| language | string | "" | Model language: en, zh, multilingual. |

Example: text generation models matching "llama" with an Apache 2.0 license.

{
  "query": "llama",
  "task": "text-generation",
  "license": "apache-2.0"
}

Example: image classification models that use the transformers library.

{
  "task": "image-classification",
  "library": "transformers"
}

⚠️ Good to Know: Hugging Face model downloads can change rapidly. Each run captures a point-in-time snapshot of the Hub's metadata.
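Because each run is a point-in-time snapshot, comparing two runs shows how download counts moved between them. The following is a minimal sketch, assuming both snapshots are lists of records carrying the `modelId` and `downloads` fields from the output schema. The function name `download_deltas` and the sample numbers are illustrative, not part of the Actor.

```python
from typing import Dict, List


def download_deltas(old: List[dict], new: List[dict]) -> Dict[str, int]:
    """Return the change in downloads for each modelId present in both snapshots."""
    old_by_id = {rec["modelId"]: rec["downloads"] for rec in old}
    return {
        rec["modelId"]: rec["downloads"] - old_by_id[rec["modelId"]]
        for rec in new
        if rec["modelId"] in old_by_id
    }


# Illustrative values only, not real Hub numbers.
earlier = [{"modelId": "org/model-a", "downloads": 1000}]
later = [
    {"modelId": "org/model-a", "downloads": 1250},
    {"modelId": "org/model-b", "downloads": 40},  # first seen in the later run
]
deltas = download_deltas(earlier, later)
```

Models that appear in only one snapshot are skipped here; whether to report them as new or removed is a design choice left to the caller.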


📊 Output

Each model record contains 20+ fields. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
| --- | --- | --- |
| 🤖 modelId | string | "meta-llama/Llama-3-8B" |
| 👤 author | string | "meta-llama" |
| 🎯 task | string | "text-generation" |
| 📚 library | string | "transformers" |
| 📜 license | string | "llama3" |
| 📊 downloads | number | 5000000 |
| 👍 likes | number | 12000 |
| 🏷️ tags | array | ["llm", "text-generation"] |
| 🌐 languages | array | ["en"] |
| 🏷️ pipelineTag | string | "text-generation" |
| 📅 lastModified | string | "2026-03-15T10:30:00Z" |
| 📂 modelCardUrl | string | "https://huggingface.co/meta-llama/..." |
| 🔗 url | string | "https://huggingface.co/meta-llama/Llama-3-8B" |
| 🕒 scrapedAt | ISO 8601 | "2026-04-16T00:00:00.000Z" |
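When working with a JSON export in code, it can help to load records into a typed structure. This sketch assumes the export is a JSON array whose keys match the schema table, and that every record carries the scalar fields modelled below. The `ModelRecord` class and `parse_records` helper are hypothetical names, and the sample record is assembled from the example values in the table.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelRecord:
    """Subset of the schema table; field names follow the table exactly."""
    modelId: str
    author: str
    task: str
    library: str
    license: str
    downloads: int
    likes: int
    tags: List[str] = field(default_factory=list)
    languages: List[str] = field(default_factory=list)


def parse_records(raw_json: str) -> List[ModelRecord]:
    """Parse a JSON array export, ignoring keys that are not modelled here."""
    known = set(ModelRecord.__dataclass_fields__)
    return [
        ModelRecord(**{k: v for k, v in item.items() if k in known})
        for item in json.loads(raw_json)
    ]


# Sample record built from the example column of the schema table.
sample = json.dumps([{
    "modelId": "meta-llama/Llama-3-8B",
    "author": "meta-llama",
    "task": "text-generation",
    "library": "transformers",
    "license": "llama3",
    "downloads": 5000000,
    "likes": 12000,
    "tags": ["llm", "text-generation"],
    "languages": ["en"],
    "url": "https://huggingface.co/meta-llama/Llama-3-8B",
}])
records = parse_records(sample)
```

A record missing one of the required fields raises a `TypeError` in this sketch; give those fields defaults if your exports turn out to contain gaps.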


✨ Why choose this Actor

Capability

  • 🤖 800,000+ models. Full Hugging Face Hub coverage.
  • 🔍 5 search filters. Keyword, task, library, license, language.
  • 📊 Popularity metrics. Downloads and likes per model.
  • 📜 License data. License type per model for compliance auditing.
  • 📚 Library tracking. PyTorch, TensorFlow, JAX adoption data.
  • Scalable. From single lookups to full task-type sweeps.
  • 🚫 No authentication. Public Hub API.

📊 Hugging Face hosts over 800,000 AI models. Structured access to this metadata supports ML benchmarking, model scouting, and ecosystem analysis workflows.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
| --- | --- | --- | --- | --- | --- |
| ⭐ Hugging Face Model Scraper (this Actor) | $5 free credit, then pay-per-use | Full Hub | Live per run | keyword, task, library, license, language | ⚡ 2 min |
| Hugging Face Hub API (direct) | Free with rate limits | Full | Real-time | Many | ⏳ Hours (API client setup) |
| Manual Hub browsing | Free | Manual | Manual | UI only | 🕒 Hours per category |
| Paid AI model databases | $100-1,000/month | Multi-source | Varies | Many | 🐢 Days |

Pick this Actor when you want Hugging Face model data on demand, with task and library filters, without writing API client code.


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the Hugging Face Model Scraper page on the Apify Store.
  3. 🎯 Set input. Enter keywords, pick a task and library.
  4. 🚀 Run it. Click Start and let the Actor collect your data.
  5. 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.


💼 Business use cases

🤖 ML Engineering & Research

  • Scout models for specific tasks
  • Compare download trends across architectures
  • Track new model releases by library
  • Audit license compliance for production use
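The license audit mentioned above can be sketched as a simple allowlist check. This assumes records carry the `modelId` and `license` fields from the output schema. The allowlist contents and the `flag_disallowed` helper are hypothetical, and which licenses are acceptable is a policy decision rather than something this code decides.

```python
from typing import Iterable, List, Set

# Hypothetical allowlist; replace with whatever your own policy permits.
ALLOWED_LICENSES: Set[str] = {"apache-2.0", "mit"}


def flag_disallowed(records: Iterable[dict], allowed: Set[str]) -> List[str]:
    """Return modelIds whose license is missing or not in the allowlist."""
    flagged = []
    for rec in records:
        license_id = (rec.get("license") or "").lower()
        if license_id not in allowed:
            flagged.append(rec["modelId"])
    return flagged


# Illustrative records; only the two fields used by the check are shown.
records = [
    {"modelId": "org/model-a", "license": "apache-2.0"},
    {"modelId": "meta-llama/Llama-3-8B", "license": "llama3"},
    {"modelId": "org/model-c", "license": None},
]
needs_review = flag_disallowed(records, ALLOWED_LICENSES)
```

Records with no license value are flagged rather than passed, on the assumption that an unknown license should get a manual look before production use.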

📊 AI Market Intelligence

  • Track AI model ecosystem growth
  • Analyze library adoption rates (PyTorch vs. TensorFlow)
  • Monitor competitor model releases
  • Build AI investment thesis datasets
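As one way to approach the library adoption analysis above, the share of each `library` value in a result set can be computed with a counter. This is a rough sketch over illustrative records; the `library_shares` name is hypothetical, and the shares describe only the models returned by a given run, not the whole Hub.

```python
from collections import Counter
from typing import Dict, Iterable


def library_shares(records: Iterable[dict]) -> Dict[str, float]:
    """Fraction of records per library value; missing values count as 'unknown'."""
    counts = Counter(rec.get("library") or "unknown" for rec in records)
    total = sum(counts.values())
    return {lib: n / total for lib, n in counts.items()} if total else {}


# Illustrative records, not real Hub data.
records = [
    {"modelId": "org/a", "library": "transformers"},
    {"modelId": "org/b", "library": "transformers"},
    {"modelId": "org/c", "library": "diffusers"},
    {"modelId": "org/d", "library": None},
]
shares = library_shares(records)
```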

🏢 MLOps & Platform Teams

  • Build internal model catalogs
  • Track community model updates
  • Monitor license changes across dependencies
  • Benchmark model sizes and performance

💼 VC & Strategy Teams

  • Map the AI model landscape by task
  • Track emerging architectures and frameworks
  • Analyze open-source AI momentum
  • Build competitive maps of AI model builders


🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Empirical datasets for papers, thesis work, and coursework
  • Longitudinal studies tracking changes across snapshots
  • Reproducible research with cited, versioned data pulls
  • Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

  • Side projects, portfolio demos, and indie app launches
  • Data visualizations, dashboards, and infographics
  • Content research for bloggers, YouTubers, and podcasters
  • Hobbyist collections and personal trackers

🤝 Non-profit and civic

  • Transparency reporting and accountability projects
  • Advocacy campaigns backed by public-interest data
  • Community-run databases for local issues
  • Investigative journalism on public records

🧪 Experimentation

  • Prototype AI and machine-learning pipelines with real data
  • Validate product-market hypotheses before engineering spend
  • Train small domain-specific models on niche corpora
  • Test dashboard concepts with live input



🔌 Automating Hugging Face Model Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval. Weekly pulls keep your model intelligence database fresh.
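A programmatic run with the Python client might look like the sketch below. The Actor ID shown is a guess and should be copied from the Actor's page, the API token is assumed to be in an `APIFY_TOKEN` environment variable, and `build_run_input` is a hypothetical helper that keeps only the five documented inputs. The client calls (`ApifyClient`, `actor(...).call`, `dataset(...).iterate_items`) come from the `apify-client` package; check them against the version you install.

```python
import os
from typing import Dict, List

# Hypothetical Actor ID; copy the real one from the Actor's Apify Store page.
ACTOR_ID = "parseforge/hugging-face-model-scraper"


def build_run_input(**filters: str) -> Dict[str, str]:
    """Keep only non-empty filters from the five documented inputs."""
    allowed = {"query", "task", "library", "license", "language"}
    unknown = set(filters) - allowed
    if unknown:
        raise ValueError(f"Unsupported input fields: {sorted(unknown)}")
    return {key: value for key, value in filters.items() if value}


def run_scraper(run_input: Dict[str, str]) -> List[dict]:
    """Start the Actor, wait for it to finish, and return the dataset items."""
    # Imported here so the helper above works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_run_input(
    query="llama", task="text-generation", license="apache-2.0", library=""
)

# Only contacts Apify when run as a script with a token configured.
if __name__ == "__main__" and os.environ.get("APIFY_TOKEN"):
    for model in run_scraper(run_input):
        print(model["modelId"], model["downloads"])
```

Dropping empty filters mirrors the documented defaults, where every input is an empty string unless set.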

🔌 Integrate with any app

Hugging Face Model Scraper connects to other cloud services via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications in your channels
  • Airbyte - Pipe model data into your warehouse
  • GitHub - Trigger runs from commits and releases
  • Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push fresh model data into your ML platform, or alert your team in Slack.
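On the receiving side, a webhook handler mostly needs the dataset ID of the finished run. The sketch below assumes Apify's default webhook payload, in which `eventType` names the event and `resource` holds the run object. The function name is hypothetical and the IDs in the example payload are placeholders.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the run's default dataset ID for successful runs, else None."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    return (payload.get("resource") or {}).get("defaultDatasetId")


# Illustrative payload with placeholder IDs, trimmed to the fields used here.
example = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "eventData": {"actorId": "ACTOR_ID", "actorRunId": "RUN_ID"},
    "resource": {
        "id": "RUN_ID",
        "status": "SUCCEEDED",
        "defaultDatasetId": "DATASET_ID",
    },
}
dataset_id = dataset_id_from_webhook(example)
```

If the webhook is configured with a custom payload template, the field names may differ and the lookup should be adjusted to match.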


💡 Pro Tip: browse the complete ParseForge collection for more AI and data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Hugging Face, Inc. All trademarks mentioned are the property of their respective owners. Only publicly available Hub metadata is collected.