
Hugging Face Model Scraper
Pricing
Pay per event

Last modified
5 days ago
Hugging Face Intelligence Scraper (Models)
Collect model intelligence from Hugging Face Hub via public API endpoints. Get metadata including author, downloads, likes, lastModified, task, library, license, tags and filenames. Built for analysts, researchers, and developers who need fast insights with no browser automation.
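For orientation, a request against the Hub's public model listing endpoint could be built roughly as shown below. This is a minimal sketch, not the Actor's actual implementation: `build_models_url` is a hypothetical helper, and the way the Actor translates its own `sort` and `direction` inputs into Hub query parameters is an assumption.

```python
from urllib.parse import urlencode

HUB_MODELS_ENDPOINT = "https://huggingface.co/api/models"


def build_models_url(query, sort="downloads", descending=True, limit=20):
    """Build a listing URL for the public Hub models endpoint (hypothetical helper)."""
    params = {"search": query, "sort": sort, "limit": limit}
    if descending:
        # The Hub API expects direction=-1 for descending order.
        params["direction"] = -1
    return f"{HUB_MODELS_ENDPOINT}?{urlencode(params)}"


url = build_models_url("bert", limit=20)
print(url)
```

Because the endpoint returns JSON directly, no browser automation is needed to read the results.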
What does it collect?
- Model id, name, URL
- Author
- Downloads, likes
- Last modified, createdAt
- Task (pipeline tag), library
- License, tags
How to use
Example run: query "bert", 20 items, sorted by downloads.
Input
Fields supported:
- query: string, free text search
- task: string, e.g., text-classification, image-classification, text-generation
- library: string, e.g., transformers, diffusers, timm
- license: string, e.g., apache-2.0, mit, cc-by-4.0
- language: string, e.g., en, zh, multi
- sort: enum, one of downloads | likes | lastModified | trending
- direction: enum, one of asc | desc
- maxItems: integer, max models to return
Here is an example of a filled-out input, written in JSON:
{"query": "bert","sort": "downloads","direction": "desc","maxItems": 100}
Pro Tip: Combine multiple filters to narrow down results. For example, search for "bert" models with task "text-classification" and library "transformers" for highly targeted results.
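Combining filters as described in the tip can be sketched with a small input builder. This is illustrative only: `build_input` is a hypothetical helper, and the allowed field names and values are copied from the field list in this README rather than from the Actor's published schema.

```python
import json

FILTER_FIELDS = {"query", "task", "library", "license", "language"}
ALLOWED_SORT = {"downloads", "likes", "lastModified", "trending"}
ALLOWED_DIRECTION = {"asc", "desc"}


def build_input(max_items, sort="downloads", direction="desc", **filters):
    """Assemble an Actor input dict, dropping filters that were left empty."""
    unknown = set(filters) - FILTER_FIELDS
    if unknown:
        raise ValueError(f"unknown filter fields: {sorted(unknown)}")
    if sort not in ALLOWED_SORT:
        raise ValueError(f"unsupported sort: {sort}")
    if direction not in ALLOWED_DIRECTION:
        raise ValueError(f"unsupported direction: {direction}")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    run_input = {name: value for name, value in filters.items() if value}
    run_input.update({"sort": sort, "direction": direction, "maxItems": max_items})
    return run_input


run_input = build_input(
    100, query="bert", task="text-classification", library="transformers"
)
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the Actor's input editor or sent through the API.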
Output
After the Actor finishes its run, you'll get a dataset with the output. The size of the dataset depends on the number of results you requested with maxItems. You can download the results as CSV, Excel, or JSON.
Here's an example of scraped Hugging Face model data:
{"id": "google-bert/bert-base-uncased","name": "google-bert/bert-base-uncased","url": "https://huggingface.co/google-bert/bert-base-uncased","author": "google-bert","downloads": 54018364,"likes": 2423,"private": false,"gated": false,"disabled": false,"sha": "86b5e0934494bd15c9632b12f734a8a67f723594","lastModified": "2024-02-19T11:06:12.000Z","createdAt": "2022-03-02T23:29:04.000Z","task": "fill-mask","library": "transformers","license": "apache-2.0","language": ["en"],"datasets": ["bookcorpus", "wikipedia"],"tags": ["exbert"],"files": [".gitattributes","LICENSE","README.md","config.json","model.safetensors","pytorch_model.bin","tokenizer.json","tokenizer_config.json","vocab.txt"]}
What You Get: Complete model metadata including popularity metrics (downloads, likes), technical details (task, library, license), training information (datasets, language), and available model files.
Download Options: CSV, Excel, or JSON formats for easy analysis in your business tools
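Once exported as JSON, the records can be analyzed with a few lines of Python. The sketch below assumes the export is a JSON array of records shaped like the example above; the embedded record is trimmed from that example, and `summarize` is a hypothetical helper.

```python
import json
from collections import Counter

# One record trimmed from the example output above; a real export holds many.
exported = """[
  {"id": "google-bert/bert-base-uncased", "downloads": 54018364, "likes": 2423,
   "task": "fill-mask", "library": "transformers", "license": "apache-2.0",
   "files": ["config.json", "model.safetensors", "pytorch_model.bin"]}
]"""


def summarize(items):
    """Count models per license and list the ones that ship safetensors weights."""
    licenses = Counter(item.get("license") or "unknown" for item in items)
    with_safetensors = [
        item["id"]
        for item in items
        if any(name.endswith(".safetensors") for name in item.get("files", []))
    ]
    return licenses, with_safetensors


items = json.loads(exported)
licenses, with_safetensors = summarize(items)
print(dict(licenses))
print(with_safetensors)
```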
Why choose this scraper?
- API-first, fast: Uses Hugging Face public API endpoints (no browser)
- Flexible filtering: query, task, library, license, language, sorting
- Comprehensive data: Get downloads, likes, tasks, licenses, files, and more
- User-Friendly: No coding needed; just set filters and go
- Time Savings: Save hours compared to manual model research and tracking
- Cost Efficiency: A fraction of the cost of maintaining custom tracking infrastructure
How to use
- Sign Up: Create a free Apify account (takes 2 minutes)
- Find the Scraper: Visit the Hugging Face Intelligence Scraper page
- Set Input: Add your filters and max items
- Run It: Click "Start" and let it collect your data
- Download Data: Get your results in the "Dataset" tab as CSV, Excel, or JSON
Total Time: 5 minutes setup, 10-30 minutes for data collection
No Technical Skills Required: Everything is point-and-click
Business Use Cases
AI/ML Researchers:
- Track trending models in your research area
- Monitor model popularity metrics (downloads, likes)
- Identify popular architectures and libraries
- Discover datasets used for training
ML Engineers:
- Find production-ready models for specific tasks
- Compare models by popularity and recency
- Identify licensing requirements before deployment
- Track model updates and new releases
Data Scientists:
- Build comprehensive model catalogs
- Analyze AI/ML trends and adoption patterns
- Identify suitable pre-trained models for projects
- Monitor emerging techniques and libraries
Product Managers:
- Track competitive AI/ML landscape
- Monitor adoption of different model types
- Identify popular solutions for product features
- Support AI strategy with market intelligence
Integrate with any app and automate your workflow
Hugging Face Intelligence Scraper can be connected with almost any cloud service or web app thanks to integrations on the Apify platform.
You can also use webhooks to carry out an action whenever an event occurs, e.g. get a notification whenever a run successfully finishes.
Using with the Apify API
For advanced users who want to automate this process, you can control the scraper programmatically with the Apify API. This allows you to schedule regular data collection and integrate with your existing business tools.
- Node.js: Install the apify-client NPM package
- Python: Use the apify-client PyPI package
- See the Apify API reference for full details
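A run started from Python might look like the sketch below. It assumes the `apify-client` package is installed and an API token is available in the `APIFY_TOKEN` environment variable; `ACTOR_ID` is a placeholder, because this page does not state the Actor's ID, and `run_scraper` is a hypothetical helper.

```python
import os

# Placeholder: copy the real Actor ID from the Actor's page in Apify Console.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(run_input, token):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so this file loads even where apify-client is missing.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = {"query": "bert", "sort": "downloads", "direction": "desc", "maxItems": 100}

if __name__ == "__main__":
    token = os.environ.get("APIFY_TOKEN")
    if token:
        for item in run_scraper(run_input, token):
            print(item["id"], item["downloads"])
    else:
        print("Set APIFY_TOKEN to run the Actor.")
```

The same input dictionary works for scheduled runs, so a recurring job only needs the token and the Actor ID.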
Pricing
- Start price: $0.005 per run
- Price per 1,000 results: $5.00 (i.e., $0.005 per result)
Non-paying users must set maxItems (max 100). Paying users can set maxItems up to 1,000,000; if it is not defined, maxItems is unlimited.
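The prices and limits above can be turned into a quick estimate. This sketch assumes the start price is charged once per run and every returned result is charged at the per-result rate; `estimate_cost` and `check_max_items` are hypothetical helpers, and actual billing may differ.

```python
from decimal import Decimal

START_PRICE_USD = Decimal("0.005")       # per run
PRICE_PER_RESULT_USD = Decimal("0.005")  # $5.00 per 1,000 results
FREE_MAX_ITEMS = 100
PAID_MAX_ITEMS = 1_000_000


def estimate_cost(results, runs=1):
    """Estimate the total in USD for a number of results spread over some runs."""
    return runs * START_PRICE_USD + results * PRICE_PER_RESULT_USD


def check_max_items(max_items, paying):
    """Return True if max_items is allowed under the limits quoted above."""
    if paying:
        # Paying users may leave maxItems undefined (None means unlimited).
        return max_items is None or 1 <= max_items <= PAID_MAX_ITEMS
    return max_items is not None and 1 <= max_items <= FREE_MAX_ITEMS


print(estimate_cost(100))    # one run returning 100 results
print(estimate_cost(1000))   # one run returning 1,000 results
```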
Frequently Asked Questions
Q: How accurate is the data? A: We collect data directly from Hugging Face's public API in real-time, ensuring the most up-to-date and accurate information available.
Q: Can I schedule regular runs? A: Yes! Use the Apify scheduler or API to schedule daily, weekly, or monthly runs automatically. Perfect for tracking model trends over time.
Q: What's the rate limit? A: We respect Hugging Face's API limits. The scraper handles rate limiting automatically.
Q: Can I get model descriptions and READMEs? A: Currently, the scraper focuses on metadata. For full READMEs, you can use the model URLs provided in the output.
Q: What if I need help? A: Our support team is available. Contact us through the Apify platform.
Q: Is my data secure? A: Absolutely. All data is encrypted in transit and at rest. We never share your data with third parties.
Need Help? Our support team is here to help you get the most out of this tool.