Hugging Face Spaces Scraper
Pricing
from $7.50 / 1,000 results
Query the Hugging Face Spaces catalog by keyword, author, SDK, and sort order. Records include id, author, SDK, likes, trending score, runtime, hardware, license, tags, created date, and Space URL. Handy for AI model discovery, demo curation, and trend reporting.
Developer
ParseForge
Maintained by Community

🤗 Hugging Face Spaces Public API Scraper
🚀 Export Hugging Face Spaces metadata in seconds. ID, author, SDK, likes, trending score, runtime, hardware, license, and tags, straight from the public huggingface.co/api/spaces endpoint.
🕒 Last updated: 2026-06-05 · 📊 14 fields per record · Full Hugging Face Spaces catalogue · Public API, no key required
The Hugging Face Spaces Public API Scraper turns the huggingface.co/api/spaces public REST endpoint into a clean dataset. It calls the API with whatever sort, filter, and search parameters you supply, then flattens each Space into one tidy row.
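As a rough illustration of the request side, the sketch below builds a query URL for the public endpoint from optional filters. The query parameter names (`search`, `author`, `sort`, `direction`, `limit`) are assumptions about the endpoint's query string, and `build_spaces_url` is a hypothetical helper, not the actor's code. How the actor passes its `sdk` filter upstream is not documented here, so it is left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://huggingface.co/api/spaces"


def build_spaces_url(search=None, author=None, sort=None, direction=None, limit=None):
    """Build a query URL for the public Spaces endpoint, skipping unset filters.

    Parameter names are assumed to match the endpoint's query string.
    """
    params = {
        "search": search,
        "author": author,
        "sort": sort,
        "direction": direction,
        "limit": limit,
    }
    # Drop parameters the caller did not set so the URL stays minimal.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}?{query}" if query else BASE_URL


print(build_spaces_url(sort="likes", direction="-1", limit=50))
# → https://huggingface.co/api/spaces?sort=likes&direction=-1&limit=50
```

Leaving unset filters out of the query string keeps the request equivalent to the endpoint's defaults instead of sending empty values.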
Coverage spans the full Hugging Face Spaces catalogue: every public Space and every SDK (Gradio, Streamlit, Docker, static). Each row carries id, author, SDK, like count, trending score, runtime stage, hardware tier, creation and last-modified timestamps, tags, and license.
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| 🤖 ML engineers | Discover top demo apps for inspiration |
| 📊 Researchers | Track Spaces growth over time |
| 🏢 Platform teams | Build an internal Spaces directory |
| 🧑‍🎓 Educators | Curate teaching demos |
| 📰 AI journalists | Source trending Spaces stories |
| 👩‍💻 Developers | Mirror Spaces data into your own DB |
📋 What the Hugging Face Spaces Public API Scraper does
- Calls `/api/spaces` with your chosen sort, search, author, and SDK filters.
- Flattens nested runtime and card data into top-level fields.
- Casts numbers to real numbers so they import cleanly into spreadsheets.
- Surfaces upstream errors as a single clean error record.
- Exports as CSV, Excel, JSON, JSONL, XML, RSS, or HTML.
💡 Why it matters: The Hugging Face API is open, but its responses nest runtime and card data several levels deep. This actor flattens everything into a single row per Space, ready for analytics.
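The flattening and number-casting steps can be sketched roughly as follows. The nested layout assumed here (`runtime.stage`, `runtime.hardware.current`, `cardData.license`) is an assumption about the raw API response, and `flatten_space` and `to_number` are hypothetical helpers for illustration, not the actor's actual implementation.

```python
def to_number(value):
    """Cast numeric strings to int or float; return None for missing or non-numeric values."""
    if value is None or isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return value
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return int(number) if number.is_integer() else number


def flatten_space(raw):
    """Flatten one raw Space object into a single row (assumed nested layout)."""
    runtime = raw.get("runtime") or {}
    hardware = runtime.get("hardware") or {}
    card = raw.get("cardData") or {}
    space_id = raw.get("id")
    return {
        "id": space_id,
        "author": raw.get("author"),
        "sdk": raw.get("sdk"),
        "likes": to_number(raw.get("likes")),
        "trendingScore": to_number(raw.get("trendingScore")),
        "runtime": runtime.get("stage"),       # assumed path: runtime.stage
        "hardware": hardware.get("current"),   # assumed path: runtime.hardware.current
        "license": card.get("license"),        # assumed path: cardData.license
        "url": f"https://huggingface.co/spaces/{space_id}" if space_id else None,
    }


raw = {
    "id": "stabilityai/stable-diffusion",
    "author": "stabilityai",
    "sdk": "gradio",
    "likes": "12340",
    "trendingScore": 87.4,
    "runtime": {"stage": "RUNNING", "hardware": {"current": "a10g-small"}},
    "cardData": {"license": "creativeml-openrail-m"},
}
row = flatten_space(raw)
print(row["runtime"], row["hardware"], row["likes"])
```

Using `or {}` at each level means a Space with no runtime or card data still yields a complete row with `None` values instead of raising.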
🎬 Full Demo
🚧 Coming soon.
⚙️ Input
| Field | Type | Required | Description |
|---|---|---|---|
| search | string | No | Free-text search across Spaces. |
| maxItems | integer | No | Maximum number of Spaces to return. Free users are capped at 10; paid users can request up to 1,000,000. Prefilled to 10. |
| sort | enum | No | likes, trending, createdAt, or lastModified. |
| direction | enum | No | Descending or ascending. |
| author | string | No | Hugging Face author or organization. |
| sdk | enum | No | gradio, streamlit, docker, or static. |
Example 1, top liked Spaces:

```json
{ "sort": "likes", "direction": "-1", "maxItems": 50 }
```

Example 2, trending Gradio demos:

```json
{ "sort": "trending", "sdk": "gradio", "maxItems": 100 }
```
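Before submitting a run, an input payload can be checked locally against the enums in the input table. This is a minimal sketch: `validate_input` is a hypothetical helper, the lower bound of 1 for `maxItems` is an assumption, and `direction` is not checked because the table does not list its literal values.

```python
ALLOWED_SORT = {"likes", "trending", "createdAt", "lastModified"}
ALLOWED_SDK = {"gradio", "streamlit", "docker", "static"}
MAX_ITEMS_CEILING = 1_000_000  # paid ceiling from the input table


def validate_input(payload):
    """Return a list of problems found in an input payload; an empty list means none."""
    problems = []

    sort = payload.get("sort")
    if sort is not None and sort not in ALLOWED_SORT:
        problems.append(f"sort must be one of {sorted(ALLOWED_SORT)}")

    sdk = payload.get("sdk")
    if sdk is not None and sdk not in ALLOWED_SDK:
        problems.append(f"sdk must be one of {sorted(ALLOWED_SDK)}")

    max_items = payload.get("maxItems")
    if max_items is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(max_items, bool) or not isinstance(max_items, int):
            problems.append("maxItems must be an integer")
        elif not 1 <= max_items <= MAX_ITEMS_CEILING:  # lower bound of 1 is assumed
            problems.append(f"maxItems must be between 1 and {MAX_ITEMS_CEILING}")

    return problems


print(validate_input({"sort": "likes", "direction": "-1", "maxItems": 50}))  # → []
print(len(validate_input({"sort": "stars", "sdk": "flask", "maxItems": 0})))  # → 3
```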
⚠️ Good to Know: The Hugging Face API is public and free. No API key is needed. Rate limits are generous but exist, so prefer larger sort-based pulls over many small ones.
📊 Output
| Field | Type | Description |
|---|---|---|
| 🆔 id | string | Owner/space slug. |
| 👤 author | string | Owner of the Space. |
| 🧰 sdk | string | gradio, streamlit, docker, or static. |
| ❤️ likes | number | Like count. |
| 🔥 trendingScore | number | Hugging Face trending score. |
| 📅 createdAt | string | Creation timestamp. |
| 🕒 lastModified | string | Last modification timestamp. |
| 🏷️ tags | array | Tags attached to the Space. |
| 🚦 runtime | string | Runtime stage. |
| 🖥️ hardware | string | Hardware tier. |
| 🔒 private | boolean | Private flag. |
| 📜 license | string | License string from the Space card. |
| 🔗 url | string | Direct link to the Space. |
| 🕒 scrapedAt | string | Timestamp of when the record was fetched. |
| ❌ error | string | Set if the upstream response was an error. |
Sample record:
```json
{
  "id": "stabilityai/stable-diffusion",
  "author": "stabilityai",
  "sdk": "gradio",
  "likes": 12340,
  "trendingScore": 87.4,
  "createdAt": "2022-08-22T12:00:00.000Z",
  "lastModified": "2026-05-20T08:14:55.000Z",
  "tags": ["text-to-image", "diffusers"],
  "runtime": "RUNNING",
  "hardware": "a10g-small",
  "private": false,
  "license": "creativeml-openrail-m",
  "url": "https://huggingface.co/spaces/stabilityai/stable-diffusion",
  "scrapedAt": "2026-06-05T12:00:00.000Z",
  "error": null
}
```
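Once records are exported, a first pass over them might look like the sketch below: set aside error rows, count Spaces per SDK, and rank by likes. The records here are made-up placeholders that follow the output schema, `summarize` is a hypothetical helper, and the assumption that an error row carries only the `error` field is not confirmed by this README.

```python
from collections import Counter


def summarize(records, top_n=3):
    """Tally Spaces per SDK and list the most-liked ids, ignoring error rows."""
    ok = [r for r in records if not r.get("error")]
    by_sdk = Counter(r.get("sdk") for r in ok)
    # Treat a missing like count as 0 so sorting never compares None.
    ranked = sorted(ok, key=lambda r: r.get("likes") or 0, reverse=True)
    return {
        "by_sdk": dict(by_sdk),
        "top_ids": [r.get("id") for r in ranked[:top_n]],
        "errors": len(records) - len(ok),
    }


# Placeholder records for illustration only; not real Spaces.
records = [
    {"id": "demo/alpha", "sdk": "gradio", "likes": 120, "error": None},
    {"id": "demo/beta", "sdk": "streamlit", "likes": 45, "error": None},
    {"id": "demo/gamma", "sdk": "gradio", "likes": 300, "error": None},
    {"error": "upstream returned an error"},  # assumed shape of an error row
]
summary = summarize(records, top_n=2)
print(summary)
# → {'by_sdk': {'gradio': 2, 'streamlit': 1}, 'top_ids': ['demo/gamma', 'demo/alpha'], 'errors': 1}
```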
✨ Why choose this Actor
| | |
|---|---|
| 🆓 | Public Hugging Face API, no key needed. |
| 🧹 | Flattens nested runtime and card data into one row. |
| 🔢 | Casts numbers for clean Excel and pandas imports. |
| 🛟 | Surfaces upstream errors as clean rows. |
| 🔌 | Sort, search, author, and SDK filters exposed. |
| 💾 | Push to dataset for CSV, Excel, JSON, XML, or RSS export. |
📈 How it compares to alternatives
| Approach | Setup | Pagination | Flattening | Export formats |
|---|---|---|---|---|
| Raw curl | 5 min | manual | none | manual |
| huggingface_hub (Python) | 15 min install | yes | partial | code |
| This Actor | 5 seconds | yes | yes | 7 formats |
🚀 How to use
- Click Try for free.
- Pick a sort and optional filters.
- Click Start. Your dataset is ready in seconds.
💼 Business use cases
🤖 Demo discovery. Find top Gradio demos for your category to benchmark UX.
📊 Catalogue analytics. Track Spaces growth, license distribution, hardware usage.
🏢 Internal directories. Mirror Spaces data into your team wiki for shared discovery.
📰 AI market journalism. Build trending Spaces datasets for a feature.
🔌 Automating Hugging Face Spaces Public API Scraper
- Make / Zapier: schedule a daily run.
- Cron schedule: native Apify scheduler.
- Webhooks: POST on completion.
- Warehouse pipe: native integrations move datasets straight into BigQuery, Snowflake, or Postgres.
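For the webhook option, a receiving service typically needs to turn the completion payload into a dataset download URL. The sketch below assumes the default Apify webhook payload shape (`eventType` plus a `resource` object carrying `defaultDatasetId`) and the dataset items endpoint of the Apify API; verify both against the Apify documentation before relying on them. `dataset_items_url` is a hypothetical helper and `abc123` is a placeholder id.

```python
def dataset_items_url(payload, fmt="json"):
    """Return the dataset items URL for a succeeded run, or None otherwise.

    Assumes the default webhook payload: eventType and resource.defaultDatasetId.
    """
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}"


# Placeholder payload for illustration; a real one carries more fields.
payload = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "abc123"},
}
print(dataset_items_url(payload, fmt="csv"))
# → https://api.apify.com/v2/datasets/abc123/items?format=csv
```

Returning `None` for failed or malformed events lets the caller skip the download step without special-casing each event type.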
🌟 Beyond business use cases
🎓 Education. Curate teaching demos.
🧪 Personal research. Discover what people are building.
🤝 Non-profit and open data. Track open-source AI activity.
🧰 Tinkering and prototyping. Seed a leaderboard or directory site.
🤖 Ask an AI assistant about this scraper
Drop this README into ChatGPT, Claude, or any AI assistant and ask it to design a Spaces analytics pipeline. The input fields, schema, and examples above contain everything an LLM needs.
❓ Frequently Asked Questions
❓ API key needed? No.
❓ How many Spaces? Hundreds of thousands, growing daily.
❓ Filter by SDK? Yes, gradio, streamlit, docker, static.
❓ Filter by author? Yes.
❓ Sort options? Likes, trending, createdAt, lastModified.
❓ Rate limits? Generous public limits.
❓ Excel export? Yes, via the Apify dataset UI.
❓ Schema stability? Core fields are stable.
❓ Scheduling? Yes, via Apify scheduler.
❓ Public data only? Yes.
🔌 Integrate with any app
Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST API or webhook endpoint. Trigger runs from a calendar event, a form submission, or a cron job, and pipe results straight into BigQuery, Snowflake, or a Postgres warehouse.
🔗 Recommended Actors
| Actor | What it does |
|---|---|
| ParseForge Hugging Face Collections Scraper | Public Hugging Face collections metadata. |
| ParseForge Hugging Face Discussions Scraper | Discussion threads and PRs on Hugging Face repos. |
| ParseForge ModelScope Models Scraper | ModelScope public models. |
| ParseForge Civitai Models Scraper | Civitai public models. |
💡 Pro Tip: browse the complete ParseForge collection for 900+ production-grade scrapers across business intelligence, real estate, e-commerce, sports, finance, and public records.
Disclaimer. This actor scrapes only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any of the third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law.
Create a free account with $5 credit.