Deprecated

Pricing

Pay per event

Hugging Face Spaces Scraper

Scrape Hugging Face Spaces: get space IDs, SDKs, likes, tags, authors, descriptions, and live URLs. Filter by SDK, author, or tag. Sort by likes or trending.

Rating

0.0

(0)

Developer

Stas Persiianenko

Last modified

5 days ago

What does it do?

This actor scrapes the public Hugging Face Spaces catalog using the official Hugging Face REST API. It extracts metadata about AI demo spaces: who built them, what SDK they use, how many likes they have, their tags, descriptions, and two direct URLs, one to the space's page on huggingface.co and one to the live running space.

You can filter results by:

🔍 Search query — keyword matching across space names and descriptions
🛠 SDK — Gradio, Streamlit, Docker, or Static
👤 Author — spaces from a specific user or organization (e.g. stabilityai, google)
🏷 Tag — any HuggingFace tag (e.g. language:en, task:text-generation, license:mit)
📊 Sort — by likes, trending score, creation date, or last modified date

Who is it for?

🧑‍🔬 AI researchers tracking the ecosystem

Scan the Spaces catalog for specific task types or SDKs. Monitor which new demos are trending in your research area — computer vision, NLP, audio, or multimodal AI.

🏢 Product teams and competitive analysts

Track AI tool releases from competitor organizations. Watch what spaces a specific company (e.g. Meta, Google, Stability AI) is publishing and when they go live.

📊 Data scientists building ML registries

Populate databases of publicly available AI demos. Build search engines, comparison tools, or leaderboards on top of the extracted space metadata.

🤖 Developers building AI-powered pipelines

Feed space metadata into LLM workflows, vector databases, or recommendation systems that surface relevant AI tools to end users.

🎓 Educators and AI course creators

Discover teaching demos and Gradio apps for specific ML tasks. Find spaces with the most likes for a given topic to recommend to students.

Why use it?

✅ No API key needed — the Hugging Face public API is open
✅ Full metadata — ID, author, SDK, likes, all tags, description, live space URL, dates
✅ Flexible filtering — combine search, SDK, author, and tag filters simultaneously
✅ Fast and cheap — HTTP-only, no browser, low compute cost
✅ Pagination handled — fetch hundreds or thousands of spaces in one run
✅ Follows rate limits — exponential backoff and retry logic built in
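The retry behaviour mentioned in the last bullet can be pictured with a minimal sketch. This is not the actor's actual code: the function names, the 1-second base delay, the 30-second cap, and the 10% jitter are all assumptions chosen for illustration; only the default of 3 retries comes from the `maxRetries` parameter.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delay before each retry: base * 2**attempt, capped at `cap` seconds."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_retries(
    fetch: Callable[[], T],
    max_retries: int = 3,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying up to `max_retries` times on any exception."""
    delays = backoff_delays(max_retries, base)
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # Up to 10% random jitter (an assumption) so clients do not retry in lockstep.
            sleep(delays[attempt] * (1 + random.uniform(0, 0.1)))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.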

Data extracted

Field	Type	Description
`id`	string	Full space ID in `owner/space-name` format
`author`	string	Author or organization username
`spaceName`	string	Space name without owner prefix
`sdk`	string	Framework: `gradio`, `streamlit`, `docker`, or `static`
`likes`	number	Number of likes on this space
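`tags`	array	All tags (SDK, region, tasks, languages, etc.)
`private`	boolean	Whether the space is private (`false` for every result, since only public spaces are returned)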
`description`	string	Short description from the space README card
`cardTitle`	string	Display title from the README card
`license`	string	License identifier (e.g. `apache-2.0`, `mit`)
`createdAt`	string	ISO 8601 creation timestamp
`lastModified`	string	ISO 8601 last modified timestamp
`spaceUrl`	string	Live running space URL (e.g. `https://xxx.hf.space`)
`url`	string	URL of the space's page on huggingface.co (e.g. `https://huggingface.co/spaces/owner/space-name`)
`scrapedAt`	string	ISO 8601 timestamp of when data was scraped

How to use it

Step 1 — Open the actor

Go to Hugging Face Spaces Scraper on Apify Store and click Try for free.

Step 2 — Configure your input

Set your filters in the input form:

Search query — enter a keyword (e.g. "text to speech") or leave empty to browse all spaces
SDK — choose a framework or leave as "All SDKs"
Author — enter an organization name to scrape their spaces (e.g. facebook)
Tag — enter a HuggingFace tag filter (e.g. language:en)
Sort by — choose likes, trending, or date
Max results — set how many spaces to extract

Step 3 — Run and export

Click Start. When the run completes, download results as JSON, CSV, or Excel from the Dataset tab.

Input parameters

Parameter	Type	Default	Description
`searchQuery`	string	`""`	Keyword to search across space names and descriptions
`sdk`	string	`""`	Filter by SDK: `gradio`, `streamlit`, `docker`, `static`, or empty for all
`author`	string	`""`	Filter by author/organization username
`tag`	string	`""`	Filter by tag (e.g. `license:mit`, `language:fr`)
`sortBy`	string	`likes`	Sort by: `likes`, `createdAt`, `lastModified`, `trendingScore`
`maxResults`	integer	`100`	Maximum spaces to extract (use `0` for unlimited)
`batchSize`	integer	`100`	Items per API request (max 100)
`maxRetries`	integer	`3`	Retries on failed requests
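When the input is built in code rather than in the form, it can help to check values against the table above before starting a run. The helper below is a hypothetical sketch: `build_run_input` is not part of the actor or the Apify client, and the lower bounds it enforces (for example `batchSize` of at least 1) are assumptions; the allowed values and defaults are taken from the table.

```python
ALLOWED_SDKS = {"", "gradio", "streamlit", "docker", "static"}
ALLOWED_SORTS = {"likes", "createdAt", "lastModified", "trendingScore"}


def build_run_input(
    search_query: str = "",
    sdk: str = "",
    author: str = "",
    tag: str = "",
    sort_by: str = "likes",
    max_results: int = 100,
    batch_size: int = 100,
    max_retries: int = 3,
) -> dict:
    """Validate values against the parameter table and return the actor input."""
    if sdk not in ALLOWED_SDKS:
        raise ValueError(f"sdk must be one of {sorted(ALLOWED_SDKS)}, got {sdk!r}")
    if sort_by not in ALLOWED_SORTS:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORTS)}, got {sort_by!r}")
    if max_results < 0:
        raise ValueError("maxResults must be 0 (unlimited) or a positive integer")
    if not 1 <= batch_size <= 100:
        raise ValueError("batchSize must be between 1 and 100")
    if max_retries < 0:
        raise ValueError("maxRetries must not be negative")
    return {
        "searchQuery": search_query,
        "sdk": sdk,
        "author": author,
        "tag": tag,
        "sortBy": sort_by,
        "maxResults": max_results,
        "batchSize": batch_size,
        "maxRetries": max_retries,
    }
```

The returned dictionary can be passed as `run_input` to the Python client call shown under API usage.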

Output example

{
  "id": "stabilityai/stable-diffusion",
  "author": "stabilityai",
  "spaceName": "stable-diffusion",
  "sdk": "gradio",
  "likes": 8432,
  "tags": ["gradio", "region:us", "license:creativeml-openrail-m"],
  "private": false,
  "description": "Stable Diffusion is a state-of-the-art text-to-image model",
  "cardTitle": "Stable Diffusion",
  "license": "creativeml-openrail-m",
  "createdAt": "2022-08-22T13:00:00.000Z",
  "lastModified": "2024-01-15T10:32:00.000Z",
  "spaceUrl": "https://stabilityai-stable-diffusion.hf.space",
  "url": "https://huggingface.co/spaces/stabilityai/stable-diffusion",
  "scrapedAt": "2026-04-28T09:00:00.000Z"
}
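In the example above, `author`, `spaceName`, and `url` all follow from `id`. A short sketch of that relationship, useful as a sanity check on exported records, is shown here. The helper names are hypothetical, and the `url` pattern is inferred from the single example record rather than from a documented rule.

```python
def split_space_id(space_id: str) -> tuple[str, str]:
    """Split 'owner/space-name' into (owner, space_name)."""
    owner, sep, name = space_id.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/space-name', got {space_id!r}")
    return owner, name


def record_problems(record: dict) -> list[str]:
    """Return inconsistencies between `id` and the fields derived from it."""
    owner, name = split_space_id(record["id"])
    expected = {
        "author": owner,
        "spaceName": name,
        # Pattern inferred from the example record (an assumption).
        "url": f"https://huggingface.co/spaces/{record['id']}",
    }
    return [
        f"{field}: expected {value!r}, got {record.get(field)!r}"
        for field, value in expected.items()
        if record.get(field) != value
    ]
```

An empty list means the record is internally consistent.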

Tips and tricks

💡 Trending spaces — use sortBy: trendingScore to discover what the community is excited about right now
💡 Combine filters — all filters are applied simultaneously (AND logic), so sdk=gradio + author=google returns only Google's Gradio spaces
💡 Scrape an org's whole portfolio — set author=huggingface (or any org) and maxResults=0 to get all their public spaces
💡 Tag syntax — HuggingFace tags use colon notation: language:en, license:apache-2.0, task:image-classification
💡 Monitor new releases — sort by createdAt to see the newest spaces first
💡 Find live demos — the spaceUrl field gives you the running app URL you can embed or test directly

How much does it cost to scrape Hugging Face Spaces?

The actor uses pay-per-event (PPE) pricing — you are charged per space extracted, not per run minute.

Plan	Price per space
Free	$0.00115
Bronze	$0.001
Silver	$0.00078
Gold	$0.00060
Platinum	$0.00040
Diamond	$0.00028

Estimate:

100 spaces → ~$0.10 (Bronze)
1,000 spaces → ~$1.00 (Bronze)
10,000 spaces → ~$10 (Bronze) or ~$2.80 (Diamond)

There is also a small start fee of $0.005, charged once per run.

HuggingFace's public Spaces catalog has 500,000+ spaces. A full catalog scrape at Diamond tier costs approximately $140.

Free plan: Apify's free plan includes $5 in monthly credits, enough to scrape ~4,300 spaces at Free-tier pricing ($0.00115 per space).
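The arithmetic behind these estimates can be reproduced with a small calculator. The prices and the start fee are copied from this section; the function names are hypothetical, and the sketch assumes the start fee is charged once per run and that nothing else is billed.

```python
from decimal import Decimal

# Price per extracted space, by plan (from the pricing table).
PRICE_PER_SPACE = {
    "Free": Decimal("0.00115"),
    "Bronze": Decimal("0.001"),
    "Silver": Decimal("0.00078"),
    "Gold": Decimal("0.00060"),
    "Platinum": Decimal("0.00040"),
    "Diamond": Decimal("0.00028"),
}
START_FEE = Decimal("0.005")  # charged once per run


def estimate_cost(spaces: int, plan: str, runs: int = 1) -> Decimal:
    """Total cost in USD for extracting `spaces` results across `runs` runs."""
    return PRICE_PER_SPACE[plan] * spaces + START_FEE * runs


def spaces_within_budget(budget: str, plan: str, runs: int = 1) -> int:
    """How many spaces fit in `budget` USD once the start fees are paid."""
    remaining = Decimal(budget) - START_FEE * runs
    if remaining <= 0:
        return 0
    return int(remaining // PRICE_PER_SPACE[plan])


print(estimate_cost(10_000, "Diamond"))   # 2.80 for results plus the 0.005 start fee
print(spaces_within_budget("5", "Free"))  # 4343
```

`Decimal` is used instead of `float` so that sub-cent prices add up exactly.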

Integrations

Export to Google Sheets

Use the Export to Google Sheets integration in Apify Console to automatically write extracted spaces to a spreadsheet. Perfect for tracking an organization's space portfolio or building a weekly trending report.

Airtable database of AI demos

Connect to Airtable using Apify's built-in webhooks. Each run can append new spaces to a base, enabling you to build a curated database of AI tools with custom fields and views.

Slack notifications on new trending spaces

Schedule a daily run sorted with sortBy: trendingScore and pipe results to Slack using an Apify webhook → Zapier → Slack flow. Get alerted to new viral demos every morning.
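To alert only on spaces that were not in the previous day's results, the two runs can be compared by `id` before anything is sent. The sketch below is one possible approach, not part of the actor: both function names and the message format are hypothetical, and it assumes each item carries the `id`, `likes`, and `url` fields from the output schema.

```python
def new_spaces(previous_ids: set[str], current_items: list[dict]) -> list[dict]:
    """Return items whose id was not seen in the previous run, keeping order."""
    seen = set(previous_ids)
    fresh = []
    for item in current_items:
        if item["id"] not in seen:
            seen.add(item["id"])  # also guards against duplicates within one run
            fresh.append(item)
    return fresh


def format_alert(items: list[dict], limit: int = 5) -> str:
    """Build a plain-text message listing at most `limit` new spaces."""
    lines = [f"{len(items)} new trending space(s):"]
    for item in items[:limit]:
        lines.append(f"- {item['id']} ({item['likes']} likes) {item['url']}")
    return "\n".join(lines)
```

The resulting text can be used as the message body in whichever webhook or automation step posts to Slack.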

LLM-powered space categorization

Export JSON results and feed to an LLM (Claude, GPT-4) to auto-categorize spaces by use case, assign difficulty ratings, or summarize what each space does — then import back to your own database.

Vector search over space descriptions

Embed the description and cardTitle fields using an embedding model and store in Pinecone or Weaviate. Enable semantic search over the full HuggingFace Spaces catalog.
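Because some spaces have an empty `description`, it is worth deciding what text to embed before calling any embedding model. A minimal sketch of that preparation step follows; `embedding_text` is a hypothetical helper, and falling back to `spaceName` when both fields are empty is an assumption, not something the actor does.

```python
def embedding_text(item: dict) -> str:
    """Combine cardTitle and description into one string for embedding."""
    parts = []
    for field in ("cardTitle", "description"):
        value = (item.get(field) or "").strip()
        # Skip empty fields and avoid repeating identical title/description text.
        if value and value not in parts:
            parts.append(value)
    if not parts:
        # Assumed fallback: use the space name with hyphens turned into spaces.
        return (item.get("spaceName") or "").replace("-", " ").strip()
    return ". ".join(parts)
```

The embedding call and the vector store client are left out on purpose, since they depend on the model and database chosen.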

API usage

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

const run = await client.actor('automation-lab/huggingface-spaces-scraper').call({
    searchQuery: 'text to image',
    sdk: 'gradio',
    sortBy: 'likes',
    maxResults: 100,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Extracted ${items.length} spaces`);
console.log(items[0]);

Python

from apify_client import ApifyClient

client = ApifyClient(token='YOUR_APIFY_TOKEN')

run = client.actor('automation-lab/huggingface-spaces-scraper').call(run_input={
    'searchQuery': 'text to image',
    'sdk': 'gradio',
    'sortBy': 'likes',
    'maxResults': 100,
})

# list_items() returns a ListPage object; the records are in its .items attribute
dataset = client.dataset(run['defaultDatasetId']).list_items()
for item in dataset.items:
    print(item['id'], item['likes'])
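When items are fetched through the API instead of downloaded from the Dataset tab, the `tags` array needs flattening before it fits a CSV cell. The following is a small sketch using only the standard library; the column selection and the `;` separator are arbitrary choices for illustration, and `items_to_csv` is a hypothetical helper.

```python
import csv
import io

# Columns to export, in order (a subset of the output fields).
COLUMNS = ["id", "author", "sdk", "likes", "license", "tags", "spaceUrl", "url"]


def items_to_csv(items: list[dict]) -> str:
    """Serialize dataset items to CSV text, joining tags with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for item in items:
        row = dict(item)
        row["tags"] = ";".join(item.get("tags") or [])
        writer.writerow(row)  # missing fields are written as empty cells
    return buffer.getvalue()
```

Fields not listed in `COLUMNS` are ignored, so the function accepts full records as returned by the dataset.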

cURL

# Start a run
curl -X POST "https://api.apify.com/v2/acts/automation-lab~huggingface-spaces-scraper/runs" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchQuery":"text to image","sdk":"gradio","maxResults":50}'

# Get results (replace DATASET_ID with run's defaultDatasetId)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?limit=100" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN"

Use with Claude and MCP

You can query this scraper directly from Claude Code, Claude Desktop, Cursor, or VS Code using the Apify MCP server.

Claude Code (terminal)

$ claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/huggingface-spaces-scraper"

Claude Desktop / Cursor / VS Code

Add to your MCP config:

{
  "mcpServers": {
    "apify": {
      "command": "npx",
      "args": ["-y", "@apify/mcp-server"],
      "env": {
        "APIFY_TOKEN": "YOUR_APIFY_TOKEN",
        "ACTORS": "automation-lab/huggingface-spaces-scraper"
      }
    }
  }
}

Example prompts

"Find the top 50 most-liked Gradio spaces for image generation"
"Get all spaces published by stabilityai and export as CSV"
"Show me spaces that were created in the last 7 days, sorted by trending score"
"List all Docker-based spaces with more than 1000 likes"

Legality — Is it legal to scrape Hugging Face Spaces?

This actor uses the official, publicly documented Hugging Face REST API (https://huggingface.co/api/spaces) — the same API that powers the HuggingFace website itself. There is no scraping of HTML, no bypassing of rate limits, and no collection of private or authenticated data.

Only public spaces are accessible (private spaces require authentication which this actor does not use). The extracted metadata is publicly visible to any visitor on huggingface.co.

Always review HuggingFace's Terms of Service and their API usage policies before using this data commercially.

FAQ

Q: Does this require a HuggingFace API key? A: No. The Spaces API is publicly accessible without authentication. No API key is needed.

Q: Can I scrape private spaces? A: No. This actor only accesses public spaces. Private spaces require authentication which is not supported.

Q: How many spaces can I scrape? A: HuggingFace has 500,000+ public spaces. Set maxResults: 0 for an unlimited run that will fetch all available spaces matching your filters.

Q: Why are some descriptions empty? A: Not all spaces have a short_description in their README card. Many spaces (especially older ones) were created without a card description. The cardTitle field is more consistently populated.

Q: I'm getting fewer results than expected — what's wrong? A: The HuggingFace API may return fewer results when combining strict filters. Try relaxing one filter at a time. Also check that your tag syntax uses HuggingFace's colon format (e.g. license:mit not just mit).

Q: The actor timed out — what should I do? A: Increase the timeout in Advanced Settings, or reduce maxResults to scrape in smaller batches. For very large runs (10,000+ spaces), consider splitting by SDK type and merging results.

Q: How do I get spaces from a specific task category? A: Use the tag filter with a task tag (e.g. tag: task:text-generation). You can find valid tag values by browsing the HuggingFace Spaces UI and checking the filter panel.
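Two of the answers above lend themselves to a quick sketch: merging the results of several smaller runs (for example one per SDK) and checking tag syntax before a run. Both helpers are hypothetical. The tag check is only a heuristic for the `key:value` form described here and does not confirm that a tag exists on Hugging Face.

```python
def merge_runs(*batches: list[dict]) -> list[dict]:
    """Merge items from several runs, de-duplicated by id, most liked first."""
    merged: dict[str, dict] = {}
    for batch in batches:
        for item in batch:
            merged[item["id"]] = item  # a later batch overwrites an earlier copy
    return sorted(merged.values(), key=lambda item: item.get("likes", 0), reverse=True)


def looks_like_tag(tag: str) -> bool:
    """True if `tag` has the colon form, e.g. 'license:mit'."""
    key, sep, value = tag.partition(":")
    return bool(sep and key and value)
```

For instance, `looks_like_tag("mit")` is false while `looks_like_tag("license:mit")` is true, which matches the advice in the tag-syntax answer.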

Related scrapers

Looking for more HuggingFace data? Check out our other automation-lab scrapers:

🤖 Hugging Face Scraper — models, datasets, spaces, and papers in one actor
📄 Hugging Face Papers Scraper — research papers with abstracts, authors, and upvotes
📦 Hugging Face Datasets Scraper — ML datasets with downloads, licenses, and metadata
