Hugging Face Model Scraper

Pricing

Pay per usage

Rating

0.0 (0)

Developer

Donny Nguyen

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 2 days ago

What does it do?

Hugging Face Model Scraper extracts model information from the Hugging Face Model Hub through the Hub's public API. It collects model names, authors, download counts, likes, pipeline tasks, and tags for every task category you specify. Sort results by downloads, likes, or trending to find the most popular models for text generation, image classification, and other tasks.

The actor runs on the Apify platform and delivers clean, structured data ready for export as JSON, CSV, or Excel. It handles retries and proxy rotation automatically. Each run stores results in an Apify Dataset that you can download or connect to your workflow via API.
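
As a rough illustration of the API-based approach, the sketch below builds a listing URL for the Hub's public `/api/models` endpoint. The `filter`, `sort`, `direction`, and `limit` query parameters belong to that endpoint; the function name is hypothetical, and the requests this actor actually sends, including how it maps the "trending" option to an API sort key, are not documented here.

```python
from urllib.parse import urlencode

# Public model listing endpoint of the Hugging Face Hub API.
HF_MODELS_ENDPOINT = "https://huggingface.co/api/models"


def build_models_url(task: str, sort: str = "downloads", limit: int = 100) -> str:
    """Build a listing URL for one pipeline task, sorted in descending order.

    Hypothetical helper: it only illustrates the kind of request an
    API-based scraper could make, not this actor's real implementation.
    """
    params = {
        "filter": task,    # tag filter, e.g. "text-generation"
        "sort": sort,      # property to sort by, e.g. "downloads" or "likes"
        "direction": -1,   # -1 requests descending order
        "limit": limit,    # maximum number of models in the response
    }
    return f"{HF_MODELS_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_models_url("text-generation", sort="likes", limit=50))
```

Fetching that URL returns a JSON array of model entries, which a scraper would then normalize into the output fields described later on this page.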

Why use this actor?

  • API-based extraction: Uses Hugging Face's public API for reliable, structured data
  • Multi-task search: Search across text-generation, text-classification, image-classification, and dozens more tasks
  • Flexible sorting: Sort by downloads, likes, or trending to find the most relevant models
  • Rich metadata: Get download counts, likes, tags, and pipeline information for each model
  • Scalable: Run on a schedule with Apify Schedules for regular model hub monitoring
  • Export ready: Download as JSON, CSV, or Excel, or send results to Google Sheets, Slack, or webhooks

How to use it

  1. Navigate to the Hugging Face Model Scraper on Apify Store
  2. Configure tasks, sort order, and maximum models to extract
  3. Click Start and wait for results
  4. Export the dataset in your preferred format

You can also run this actor via the Apify API or the Apify JavaScript client.
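
For the API route, a minimal sketch using only the Python standard library might look like the following. It targets the Apify API's `run-sync-get-dataset-items` endpoint, which starts a run and returns the dataset items in the response. The actor ID, the token, and the input keys in the usage comment are placeholders and assumptions, not values confirmed by this page.

```python
import json
import urllib.request

APIFY_API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs an actor synchronously and returns its dataset items."""
    url = f"{APIFY_API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict) -> list:
    """Send the request and decode the JSON array of results (performs network I/O)."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage, with placeholder values that must be replaced:
# items = run_actor("<username>~<actor-name>", "<YOUR_APIFY_TOKEN>", {"maxModels": 50})
```

The synchronous endpoint is convenient for short runs. For long runs, starting the run and reading the dataset afterwards, or using an official Apify client library, is likely the safer choice.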

Input configuration

| Field | Type | Description | Default |
|---|---|---|---|
| Tasks | array | Hugging Face pipeline tasks | ["text-generation", "text-classification", "image-classification"] |
| Sort By | string | How to sort results | "downloads" |
| Max Models | integer | Maximum models to extract | 500 |
| Proxy Configuration | object | Proxy settings | Apify Proxy |
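
The table above lists display labels rather than JSON keys, so the sketch below assumes the keys `tasks`, `sortBy`, `maxModels`, and `proxyConfiguration`. Treat them as hypothetical and check the actor's input schema before relying on them. The helper applies the documented defaults and rejects values the page does not mention.

```python
# Sort options named on this page.
ALLOWED_SORTS = {"downloads", "likes", "trending"}

# Default task list from the input table.
DEFAULT_TASKS = ["text-generation", "text-classification", "image-classification"]


def build_input(tasks=None, sort_by="downloads", max_models=500) -> dict:
    """Return an input object that mirrors the documented defaults.

    All key names in the returned dict are assumptions, not confirmed
    by the actor's published schema.
    """
    if tasks is None:
        tasks = DEFAULT_TASKS
    if not tasks:
        raise ValueError("at least one task is required")
    if sort_by not in ALLOWED_SORTS:
        raise ValueError(f"sort_by must be one of {sorted(ALLOWED_SORTS)}")
    if max_models < 1:
        raise ValueError("max_models must be a positive integer")
    return {
        "tasks": list(tasks),
        "sortBy": sort_by,
        "maxModels": max_models,
        # Common Apify convention for enabling Apify Proxy (assumed here).
        "proxyConfiguration": {"useApifyProxy": True},
    }


if __name__ == "__main__":
    print(build_input(tasks=["summarization"], sort_by="likes", max_models=100))
```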

Output data

Each result contains these fields:

| Field | Type | Description |
|---|---|---|
| task | String | Pipeline task category |
| name | String | Full model name (author/model) |
| author | String | Model author or organization |
| downloads | Number | Total download count |
| likes | Number | Number of likes |
| pipeline | String | Pipeline tag |
| tags | String | Model tags (comma-separated) |
| url | String | Link to model page on Hugging Face |
| scrapedAt | String | Timestamp of extraction |
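
Once the dataset is downloaded as JSON, records with these fields can be processed with a few lines of code. The sketch below splits the comma-separated `tags` string back into a list and picks the most-downloaded models for one task. The sample records are made-up placeholders that only mimic the documented field names; they are not real scraper output.

```python
# Placeholder records shaped like the documented output (not real data).
SAMPLE_ITEMS = [
    {"task": "text-generation", "name": "example-org/model-a", "downloads": 1200, "tags": "pytorch, en"},
    {"task": "text-generation", "name": "example-org/model-b", "downloads": 5400, "tags": "safetensors"},
    {"task": "summarization", "name": "example-org/model-c", "downloads": 300, "tags": ""},
]


def parse_tags(tags: str) -> list[str]:
    """Split the comma-separated tags field, dropping blanks and padding."""
    return [tag.strip() for tag in tags.split(",") if tag.strip()]


def top_models(items: list[dict], task: str, n: int = 3) -> list[dict]:
    """Return up to n records for the given task, highest download count first."""
    matching = [item for item in items if item["task"] == task]
    return sorted(matching, key=lambda item: item["downloads"], reverse=True)[:n]


if __name__ == "__main__":
    for model in top_models(SAMPLE_ITEMS, "text-generation"):
        print(model["name"], model["downloads"], parse_tags(model["tags"]))
```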

Cost of usage

This actor uses pay-per-event pricing. You are charged $0.30 per 1,000 results extracted. Running with default settings typically costs under $0.05. Adjust input parameters to control the number of models and manage costs.
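
The per-result charge can be estimated ahead of a run. This sketch assumes billing is strictly linear at the stated $0.30 per 1,000 results and that no other charged events apply, which this page does not confirm.

```python
from decimal import Decimal

# Stated rate: $0.30 per 1,000 results.
PRICE_PER_1000_RESULTS = Decimal("0.30")


def estimate_cost(result_count: int) -> Decimal:
    """Estimate the result charge in USD for a run, assuming linear pricing."""
    if result_count < 0:
        raise ValueError("result_count must not be negative")
    cost = PRICE_PER_1000_RESULTS * result_count / 1000
    # Round to a hundredth of a cent for display.
    return cost.quantize(Decimal("0.0001"))


if __name__ == "__main__":
    for count in (100, 1000):
        print(count, "results:", estimate_cost(count), "USD")
```

Under that assumption, a run returning 100 results comes to $0.03 and a run returning 1,000 results to $0.30, so lowering the maximum number of models is the most direct way to cap spend.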

The actor uses Apify Proxy, which is included in your Apify subscription. Memory usage typically ranges from 256 MB to 1,024 MB.

Tips and tricks

  • Run on a schedule using Apify Schedules to track model popularity trends
  • Use Apify integrations to send results to Google Sheets, Slack, Zapier, or your database
  • Combine with other actors from the Apify Store for richer data pipelines
  • Use "trending" sort to discover newly popular models in any task category
  • Common tasks include: text-generation, text-classification, image-classification, object-detection, question-answering, summarization, translation, fill-mask, token-classification
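
To act on the scheduling tip, results from two scheduled runs can be compared to see which models gained the most downloads in between. The sketch below assumes each run's dataset was exported as a list of records with the documented `name` and `downloads` fields; the function name is hypothetical.

```python
def download_deltas(previous: list[dict], current: list[dict]) -> list[tuple[str, int]]:
    """Return (name, downloads gained) pairs for models present in both runs.

    Pairs are ordered from the largest gain to the smallest. Models that
    appear in only one of the two runs are skipped.
    """
    before = {item["name"]: item["downloads"] for item in previous}
    deltas = [
        (item["name"], item["downloads"] - before[item["name"]])
        for item in current
        if item["name"] in before
    ]
    return sorted(deltas, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder snapshots, not real scraper output.
    last_week = [{"name": "example-org/model-a", "downloads": 100}]
    this_week = [{"name": "example-org/model-a", "downloads": 180}]
    print(download_deltas(last_week, this_week))
```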

Built with Crawlee and the Apify SDK.