Hugging Face Model Scraper
Pricing: Pay per usage
Developer: Donny Nguyen
What does it do?
Hugging Face Model Scraper extracts model information from the Hugging Face Model Hub through its public API. It collects model names, authors, download counts, likes, pipeline tasks, and tags for any task category you specify. Sort results by downloads, likes, or trending to find the most popular models for text generation, image classification, and other tasks.
The actor runs on the Apify platform and delivers clean, structured data ready for export as JSON, CSV, or Excel. It handles retries and proxy rotation automatically. Each run stores results in an Apify Dataset that you can download or connect to your workflow via API.
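As a rough illustration of what an API-based lookup involves, the sketch below builds a request URL for the Hub's model listing endpoint. The endpoint and the `pipeline_tag`, `sort`, `direction`, and `limit` query parameters reflect my understanding of the public Hub API; this page does not confirm how the actor itself forms its requests, and `build_models_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed public endpoint for listing models on the Hugging Face Hub.
HF_MODELS_ENDPOINT = "https://huggingface.co/api/models"


def build_models_url(task: str, sort_by: str = "downloads", limit: int = 100) -> str:
    """Build a listing URL for one pipeline task (hypothetical helper)."""
    params = {
        "pipeline_tag": task,  # assumed parameter name for the task filter
        "sort": sort_by,       # e.g. "downloads" or "likes"
        "direction": -1,       # -1 requests descending order
        "limit": limit,        # maximum number of models to return
    }
    return f"{HF_MODELS_ENDPOINT}?{urlencode(params)}"


print(build_models_url("text-generation", sort_by="likes", limit=5))
```

Fetching the URL with any HTTP client should return a JSON array of model entries, assuming the parameters above are accepted as written.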
Why use this actor?
- API-based extraction: Uses Hugging Face's public API for reliable, structured data
- Multi-task search: Search across text-generation, text-classification, image-classification, and dozens more tasks
- Flexible sorting: Sort by downloads, likes, or trending to find the most relevant models
- Rich metadata: Get download counts, likes, tags, and pipeline information for each model
- Scalable: Run on a schedule with Apify Schedules for regular model hub monitoring
- Export ready: Download as JSON, CSV, or Excel, or send results to Google Sheets, Slack, or webhooks
How to use it
- Navigate to the Hugging Face Model Scraper on the Apify Store
- Configure tasks, sort order, and maximum models to extract
- Click Start and wait for results
- Export the dataset in your preferred format
You can also run this actor via the Apify API or the Apify JavaScript client.
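For readers who prefer plain HTTP over a client library, here is a minimal Python sketch of a call to the Apify API's synchronous "run and get dataset items" endpoint using only the standard library. The actor ID and token are placeholders you must supply, the input keys in the usage comment are guesses rather than the actor's documented schema, and `build_run_request` and `run_actor` are hypothetical helper names.

```python
import json
import urllib.request


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST request that runs an actor and returns its dataset items."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the JSON array of results."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (placeholders and assumed input keys, not taken from this page):
# items = run_actor("YOUR_USERNAME~YOUR_ACTOR_NAME", "YOUR_APIFY_TOKEN", {"maxModels": 50})
```

The synchronous endpoint suits short runs. For longer runs, starting the actor asynchronously and reading the dataset afterwards is usually the safer pattern.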
Input configuration
| Field | Type | Description | Default |
|---|---|---|---|
| Tasks | array | Hugging Face pipeline tasks | ["text-generation", "text-classification", "image-classification"] |
| Sort By | string | How to sort results | "downloads" |
| Max Models | integer | Maximum models to extract | 500 |
| Proxy Configuration | object | Proxy settings | Apify Proxy |
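The table above lists display names, not the JSON keys the actor expects. The sketch below assembles an input object with the documented defaults under assumed keys (`tasks`, `sortBy`, `maxModels`); check the actor's input schema for the real key names before relying on it. Proxy settings are left out so that the platform default applies.

```python
# Sort options named on this page; the exact accepted strings are an assumption.
ALLOWED_SORTS = {"downloads", "likes", "trending"}

# Default task list from the input table.
DEFAULT_TASKS = ["text-generation", "text-classification", "image-classification"]


def make_input(tasks=None, sort_by: str = "downloads", max_models: int = 500) -> dict:
    """Build a run input with the documented defaults (key names are hypothetical)."""
    if sort_by not in ALLOWED_SORTS:
        raise ValueError(f"sort_by must be one of {sorted(ALLOWED_SORTS)}")
    if max_models < 1:
        raise ValueError("max_models must be a positive integer")
    return {
        "tasks": list(tasks) if tasks else list(DEFAULT_TASKS),
        "sortBy": sort_by,
        "maxModels": max_models,
    }


print(make_input(tasks=["summarization"], sort_by="likes", max_models=50))
```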
Output data
Each result contains these fields:
| Field | Type | Description |
|---|---|---|
| task | String | Pipeline task category |
| name | String | Full model name (author/model) |
| author | String | Model author or organization |
| downloads | Number | Total download count |
| likes | Number | Number of likes |
| pipeline | String | Pipeline tag |
| tags | String | Model tags (comma-separated) |
| url | String | Link to model page on Hugging Face |
| scrapedAt | String | Timestamp of extraction |
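To make the output shape concrete, the following sketch maps one Hub API model entry onto the fields in the table. The API-side field names (`id`, `downloads`, `likes`, `pipeline_tag`, `tags`) are my assumption about the Hub response, the sample entry is invented for illustration, and `to_record` is a hypothetical helper, not the actor's actual code.

```python
from datetime import datetime, timezone


def to_record(task: str, item: dict) -> dict:
    """Convert one Hub API model entry into the output format described above."""
    model_id = item.get("id") or item.get("modelId", "")
    # Models without an "author/" prefix have no separate author segment.
    author = model_id.split("/", 1)[0] if "/" in model_id else ""
    return {
        "task": task,
        "name": model_id,
        "author": author,
        "downloads": item.get("downloads", 0),
        "likes": item.get("likes", 0),
        "pipeline": item.get("pipeline_tag", ""),
        "tags": ", ".join(item.get("tags", [])),  # comma-separated string
        "url": f"https://huggingface.co/{model_id}",
        "scrapedAt": datetime.now(timezone.utc).isoformat(),
    }


# Invented sample entry, for illustration only.
sample = {
    "id": "example-org/example-model",
    "downloads": 1200,
    "likes": 34,
    "pipeline_tag": "text-generation",
    "tags": ["transformers", "pytorch"],
}
record = to_record("text-generation", sample)
print(record["name"], record["author"], record["tags"])
```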
Cost of usage
This actor uses pay-per-event pricing: you are charged $0.30 per 1,000 results extracted, which is $0.0003 per result. At that rate, a run that returns 500 models (the default Max Models value) costs about $0.15, and a run stays under $0.05 only if it returns fewer than about 170 results. Lower Max Models to reduce the number of results and the cost.
The actor uses Apify Proxy, which is included in your Apify subscription. Memory usage typically ranges from 256 MB to 1,024 MB.
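The per-result charge can be estimated with simple arithmetic. This sketch applies the stated rate of $0.30 per 1,000 results and assumes that no other events are billed, which this page does not spell out. `estimate_cost_usd` is a hypothetical helper name.

```python
# Stated rate: $0.30 per 1,000 results extracted.
PRICE_PER_1000_RESULTS_USD = 0.30


def estimate_cost_usd(result_count: int) -> float:
    """Estimate the result-based charge for a run, in US dollars."""
    if result_count < 0:
        raise ValueError("result_count must not be negative")
    return round(result_count * PRICE_PER_1000_RESULTS_USD / 1000, 4)


for count in (100, 1000):
    print(f"{count} results: ${estimate_cost_usd(count):.2f}")
```

Multiplying the planned Max Models value by $0.0003 gives the same upper bound by hand.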
Tips and tricks
- Run on a schedule using Apify Schedules to track model popularity trends
- Use Apify integrations to send results to Google Sheets, Slack, Zapier, or your database
- Combine with other actors from the Apify Store for richer data pipelines
- Use "trending" sort to discover newly popular models in any task category
- Common tasks include: text-generation, text-classification, image-classification, object-detection, question-answering, summarization, translation, fill-mask, token-classification
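One way to turn scheduled runs into a popularity trend is to compare the `downloads` field between two exported datasets. The sketch below assumes each snapshot is a list of records with the `name` and `downloads` fields described under Output data. The sample values are invented, and `download_deltas` is a hypothetical helper.

```python
def download_deltas(previous: list[dict], current: list[dict]) -> list[tuple[str, int]]:
    """Return (model name, change in downloads) pairs, largest gain first."""
    before = {row["name"]: row["downloads"] for row in previous}
    deltas = [
        (row["name"], row["downloads"] - before[row["name"]])
        for row in current
        if row["name"] in before  # skip models missing from the earlier snapshot
    ]
    return sorted(deltas, key=lambda pair: pair[1], reverse=True)


# Invented snapshots, for illustration only.
last_week = [
    {"name": "org-a/model-x", "downloads": 1000},
    {"name": "org-b/model-y", "downloads": 500},
]
this_week = [
    {"name": "org-a/model-x", "downloads": 1100},
    {"name": "org-b/model-y", "downloads": 900},
    {"name": "org-c/model-z", "downloads": 50},
]
print(download_deltas(last_week, this_week))
```

Models that appear only in the newer snapshot are skipped here. Reporting them separately as new entries would be a reasonable extension.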
Useful Links
- Apify Platform
- Crawlee Documentation
- Apify SDK
- Hugging Face Model Hub
- More actors by consummate_mandala
- GitHub - donnywin85