# Hugging Face Spaces Scraper (`parseforge/huggingface-spaces-scraper`) Actor

Query the Hugging Face Spaces catalog by keyword, author, SDK, and sort order. Records include id, author, SDK, likes, trending score, runtime, hardware, license, tags, created date, and Space URL. Handy for AI model discovery, demo curation, and trend reporting.

- **URL**: https://apify.com/parseforge/huggingface-spaces-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** AI, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $7.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
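
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates the result charge for a run from the listed starting rate of $7.50 per 1,000 results. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes that results are the only billed event and that no plan discount applies; the actual charge depends on your subscription plan and on the Actor's event list.

```python
from decimal import Decimal

# Listed starting rate from the Pricing section: $7.50 per 1,000 results.
PRICE_PER_1000_RESULTS = Decimal("7.50")


def estimate_cost_usd(result_count: int,
                      price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Estimate the result charge in USD (assumption: results are the only billed event)."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return (price_per_1000 * result_count / 1000).quantize(Decimal("0.0001"))


if __name__ == "__main__":
    for count in (10, 1_000, 50_000):
        print(f"{count:>6} results -> ${estimate_cost_usd(count)}")
```

`Decimal` is used instead of `float` so that small per-result amounts do not pick up binary rounding noise.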

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🤗 Hugging Face Spaces Public API Scraper

> 🚀 **Export Hugging Face Spaces metadata in seconds. ID, author, SDK, likes, trending score, runtime, hardware, license, and tags, straight from the public huggingface.co/api/spaces endpoint.**

> 🕒 **Last updated:** 2026-06-05 · **📊 14 fields** per record · Full Hugging Face Spaces catalogue · Public API, no key required

The Hugging Face Spaces Public API Scraper turns the [huggingface.co/api/spaces](https://huggingface.co/api/spaces) public REST endpoint into a clean dataset. It calls the API with whatever sort, filter, and search parameters you supply, then flattens each Space into one tidy row.

Coverage spans the full Hugging Face Spaces catalogue, every public Space, every SDK (Gradio, Streamlit, Docker, static). Each row carries id, author, SDK, like count, trending score, runtime stage, hardware tier, creation and last-modified timestamps, tags, and license.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| 🤖 ML engineers | Discover top demo apps for inspiration |
| 📊 Researchers | Track Spaces growth over time |
| 🏢 Platform teams | Build an internal Spaces directory |
| 🧑‍🎓 Educators | Curate teaching demos |
| 📰 AI journalists | Source trending Spaces stories |
| 👩‍💻 Developers | Mirror Spaces data into your own DB |

### 📋 What the Hugging Face Spaces Public API Scraper does

- Calls `/api/spaces` with your chosen sort, search, author, and SDK filters.
- Flattens nested runtime and card data into top-level fields.
- Casts numbers to real numbers so they import cleanly into spreadsheets.
- Surfaces upstream errors as a single clean error record.
- Exports as CSV, Excel, JSON, JSONL, XML, RSS, or HTML.

> 💡 **Why it matters:** The Hugging Face API is open, but its responses nest runtime and card data several levels deep. This Actor flattens everything into a single row per Space, ready for analytics.

### 🎬 Full Demo

_🚧 Coming soon._

### ⚙️ Input

<table>
<tr><th>Field</th><th>Type</th><th>Required</th><th>Description</th></tr>
<tr><td><code>search</code></td><td>string</td><td>No</td><td>Free-text search across Spaces.</td></tr>
<tr><td><code>maxItems</code></td><td>integer</td><td>No</td><td>Maximum number of Spaces to return. Free users are limited to 10; paid users can collect up to 1,000,000. Prefill is 10.</td></tr>
<tr><td><code>sort</code></td><td>enum</td><td>No</td><td>likes, trending, createdAt, or lastModified.</td></tr>
<tr><td><code>direction</code></td><td>enum</td><td>No</td><td>Sort direction as a string: <code>-1</code> (descending, the default) or <code>1</code> (ascending).</td></tr>
<tr><td><code>author</code></td><td>string</td><td>No</td><td>Hugging Face author or organization.</td></tr>
<tr><td><code>sdk</code></td><td>enum</td><td>No</td><td>gradio, streamlit, docker, or static.</td></tr>
</table>

**Example 1, top liked Spaces:**
```json
{ "sort": "likes", "direction": "-1", "maxItems": 50 }
````

**Example 2, trending Gradio demos:**

```json
{ "sort": "trending", "sdk": "gradio", "maxItems": 100 }
```

> ⚠️ **Good to Know:** The Hugging Face API is public and free. No API key is needed. Rate limits are generous but exist, so prefer larger sort-based pulls over many small ones.

### 📊 Output

| Field | Type | Description |
|---|---|---|
| 🆔 `id` | string | Owner/space slug. |
| 👤 `author` | string | Owner of the Space. |
| 🧰 `sdk` | string | Gradio, Streamlit, Docker, or static. |
| ❤️ `likes` | number | Like count. |
| 🔥 `trendingScore` | number | Hugging Face trending score. |
| 📅 `createdAt` | string | Creation timestamp. |
| 🕒 `lastModified` | string | Last modification timestamp. |
| 🏷️ `tags` | array | Tags array. |
| 🚦 `runtime` | string | Runtime stage. |
| 🖥️ `hardware` | string | Hardware tier. |
| 🔒 `private` | boolean | Private flag. |
| 📜 `license` | string | License string from the Space card. |
| 🔗 `url` | string | Direct link. |
| 🕒 `scrapedAt` | string | When fetched. |
| ❌ `error` | string | Set if the upstream response was an error. |

**Sample record:**

```json
{
  "id": "stabilityai/stable-diffusion",
  "author": "stabilityai",
  "sdk": "gradio",
  "likes": 12340,
  "trendingScore": 87.4,
  "createdAt": "2022-08-22T12:00:00.000Z",
  "lastModified": "2026-05-20T08:14:55.000Z",
  "tags": ["text-to-image","diffusers"],
  "runtime": "RUNNING",
  "hardware": "a10g-small",
  "private": false,
  "license": "creativeml-openrail-m",
  "url": "https://huggingface.co/spaces/stabilityai/stable-diffusion",
  "scrapedAt": "2026-06-05T12:00:00.000Z",
  "error": null
}
```
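
To make the flattening concrete, here is a minimal sketch of how a nested Space object might be reduced to the row shown above. The nested input shape (`runtime.stage`, `runtime.hardware.current`, `cardData.license`) is an assumption about the upstream response, and `flatten_space` is a hypothetical helper, not the Actor's actual code.

```python
from datetime import datetime, timezone
from typing import Any


def flatten_space(raw: dict[str, Any]) -> dict[str, Any]:
    """Flatten one nested Space object (assumed shape) into a single row."""
    runtime = raw.get("runtime") or {}
    hardware = runtime.get("hardware")
    # Hardware may be nested ({"current": ...}) or already a plain string.
    hardware_tier = hardware.get("current") if isinstance(hardware, dict) else hardware
    card = raw.get("cardData") or {}
    space_id = raw.get("id", "")
    return {
        "id": space_id,
        "author": raw.get("author") or space_id.split("/")[0],
        "sdk": raw.get("sdk"),
        # Cast numeric fields so they import as numbers, not strings.
        "likes": int(raw.get("likes") or 0),
        "trendingScore": float(raw.get("trendingScore") or 0),
        "createdAt": raw.get("createdAt"),
        "lastModified": raw.get("lastModified"),
        "tags": list(raw.get("tags") or []),
        "runtime": runtime.get("stage"),
        "hardware": hardware_tier,
        "private": bool(raw.get("private", False)),
        "license": card.get("license"),
        "url": f"https://huggingface.co/spaces/{space_id}",
        "scrapedAt": datetime.now(timezone.utc).isoformat(),
        "error": None,
    }


if __name__ == "__main__":
    nested = {
        "id": "stabilityai/stable-diffusion",
        "sdk": "gradio",
        "likes": "12340",
        "runtime": {"stage": "RUNNING", "hardware": {"current": "a10g-small"}},
        "cardData": {"license": "creativeml-openrail-m"},
    }
    print(flatten_space(nested))
```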

### ✨ Why choose this Actor

| | Benefit |
|---|---|
| 🆓 | Public Hugging Face API, no key needed. |
| 🧹 | Flattens nested runtime and card data into one row. |
| 🔢 | Casts numbers for clean Excel and pandas imports. |
| 🛟 | Surfaces upstream errors as clean rows. |
| 🔌 | Sort, search, author, and SDK filters exposed. |
| 💾 | Push to dataset for CSV, Excel, JSON, XML, or RSS export. |

### 📈 How it compares to alternatives

| Approach | Setup | Pagination | Flattening | Export formats |
|---|---|---|---|---|
| Raw `curl` | 5 min | manual | none | manual |
| `huggingface_hub` (Python) | 15 min install | yes | partial | code |
| **This Actor** | 5 seconds | yes | yes | 7 formats |

### 🚀 How to use

1. Click **Try for free**.
2. Pick a sort and optional filters.
3. Click **Start**. Your dataset is ready in seconds.

### 💼 Business use cases

**🤖 Demo discovery.** Find top Gradio demos for your category to benchmark UX.

**📊 Catalogue analytics.** Track Spaces growth, license distribution, hardware usage.

**🏢 Internal directories.** Mirror Spaces data into your team wiki for shared discovery.

**📰 AI market journalism.** Build trending Spaces datasets for a feature.
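
For the catalogue analytics use case, a license or hardware distribution can be computed from exported records with the standard library alone. The sketch below assumes records shaped like the sample record in the Output section; `distribution` is a hypothetical helper and the three records are made-up illustrations, not real scrape results.

```python
from collections import Counter
from typing import Any, Iterable


def distribution(records: Iterable[dict[str, Any]], field: str) -> Counter:
    """Count the values of `field` across records, skipping error rows."""
    return Counter(
        record.get(field) or "unknown"
        for record in records
        if not record.get("error")
    )


if __name__ == "__main__":
    # Illustrative records only; real rows come from the Actor's dataset.
    sample = [
        {"id": "a/one", "license": "mit", "hardware": "cpu-basic", "error": None},
        {"id": "b/two", "license": "mit", "hardware": "a10g-small", "error": None},
        {"id": "c/three", "license": None, "hardware": "cpu-basic", "error": None},
    ]
    print(distribution(sample, "license").most_common())
    print(distribution(sample, "hardware").most_common())
```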

### 🔌 Automating Hugging Face Spaces Public API Scraper

- **Make / Zapier**: schedule a daily run.
- **Cron schedule**: native Apify scheduler.
- **Webhooks**: POST on completion.
- **Warehouse pipe**: native integrations move datasets straight into BigQuery, Snowflake, or Postgres.

### 🌟 Beyond business use cases

**🎓 Education.** Curate teaching demos.

**🧪 Personal research.** Discover what people are building.

**🤝 Non-profit and open data.** Track open-source AI activity.

**🧰 Tinkering and prototyping.** Seed a leaderboard or directory site.

### 🤖 Ask an AI assistant about this scraper

Drop this README into ChatGPT, Claude, or any AI assistant and ask it to design a Spaces analytics pipeline. The input fields, schema, and examples above contain everything an LLM needs.

### ❓ Frequently Asked Questions

**❓ API key needed?** No.

**❓ How many Spaces?** Hundreds of thousands, growing daily.

**❓ Filter by SDK?** Yes, gradio, streamlit, docker, static.

**❓ Filter by author?** Yes.

**❓ Sort options?** Likes, trending, createdAt, lastModified.

**❓ Rate limits?** Generous public limits.

**❓ Excel export?** Yes, via the Apify dataset UI.

**❓ Schema stability?** Core fields are stable.

**❓ Scheduling?** Yes, via Apify scheduler.

**❓ Public data only?** Yes.

### 🔌 Integrate with any app

Apify ships native integrations with Make, Zapier, Slack, Discord, Google Drive, Google Sheets, Gmail, Airbyte, Keboola, Telegram, GitHub, and any REST API or webhook endpoint. Trigger runs from a calendar event, a form submission, or a cron job, and pipe results straight into BigQuery, Snowflake, or a Postgres warehouse.

### 🔗 Recommended Actors

| Actor | What it does |
|---|---|
| [ParseForge Hugging Face Collections Scraper](https://apify.com/parseforge/huggingface-collections-scraper) | Public Hugging Face collections metadata. |
| [ParseForge Hugging Face Discussions Scraper](https://apify.com/parseforge/huggingface-discussions-scraper) | Discussion threads and PRs on Hugging Face repos. |
| [ParseForge ModelScope Models Scraper](https://apify.com/parseforge/modelscope-models-scraper) | ModelScope public models. |
| [ParseForge Civitai Models Scraper](https://apify.com/parseforge/civitai-models-scraper) | Civitai public models. |

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for 900+ production-grade scrapers across business intelligence, real estate, e-commerce, sports, finance, and public records.

***

**Disclaimer.** This Actor scrapes only publicly available data. ParseForge is not affiliated with, endorsed by, or sponsored by any of the third-party services referenced. Users are responsible for complying with the target site's terms of service and applicable law. [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp).

# Actor input Schema

## `search` (type: `string`):

Free-text search across Spaces titles and descriptions. Trimmed before being sent to the API.

## `maxItems` (type: `integer`):

Free users are limited to 10 items (preview). Paid users can collect up to 1,000,000 items.

## `sort` (type: `string`):

Sort order for the API.

## `direction` (type: `string`):

Sort direction.

## `author` (type: `string`):

Filter by Hugging Face author or organization (e.g. huggingface, openai-community).

## `sdk` (type: `string`):

Filter by the Space SDK.

## Actor input object example

```json
{
  "maxItems": 10,
  "sort": "likes",
  "direction": "-1"
}
```
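
Because `direction` is a string enum (`"-1"` or `"1"`) rather than a number, it can help to check an input object locally before starting a run. The sketch below mirrors the enum values and the `maxItems` bounds listed in the OpenAPI specification in this document; `validate_input` is a hypothetical helper, and the platform performs its own validation regardless.

```python
from typing import Any

# Enum values copied from the input schema in this document.
ALLOWED_VALUES = {
    "sort": {"likes", "trending", "createdAt", "lastModified"},
    "direction": {"-1", "1"},
    "sdk": {"gradio", "streamlit", "docker", "static"},
}


def validate_input(run_input: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the input looks valid."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        value = run_input.get(key)
        if value is not None and (not isinstance(value, str) or value not in allowed):
            problems.append(f"{key} must be one of {sorted(allowed)}, got {value!r}")
    max_items = run_input.get("maxItems")
    if max_items is not None:
        is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
        if not is_int or not 1 <= max_items <= 1_000_000:
            problems.append(f"maxItems must be an integer from 1 to 1000000, got {max_items!r}")
    for key in ("search", "author"):
        value = run_input.get(key)
        if value is not None and not isinstance(value, str):
            problems.append(f"{key} must be a string, got {value!r}")
    return problems


if __name__ == "__main__":
    print(validate_input({"maxItems": 10, "sort": "likes", "direction": "-1"}))
    print(validate_input({"direction": -1, "sdk": "flask"}))
```

The second call reports two problems: the numeric `-1` is not the string `"-1"`, and `flask` is not one of the listed SDKs.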

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/huggingface-spaces-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "maxItems": 10 }

# Run the Actor and wait for it to finish
run = client.actor("parseforge/huggingface-spaces-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
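
Once items are fetched, they can be written to CSV locally instead of using the dataset UI. The following sketch assumes each item follows the output table in this document and that error records carry a non-empty `error` field; `items_to_csv` is a hypothetical helper, and joining tags with `;` is just one possible convention.

```python
import csv
import io
from typing import Any, Iterable

# Column order taken from the Output table in this document.
FIELDS = [
    "id", "author", "sdk", "likes", "trendingScore", "createdAt", "lastModified",
    "tags", "runtime", "hardware", "private", "license", "url", "scrapedAt",
]


def items_to_csv(items: Iterable[dict[str, Any]]) -> tuple[str, list[dict[str, Any]]]:
    """Return CSV text for successful rows plus the list of error records."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    errors = []
    for item in items:
        if item.get("error"):
            errors.append(item)  # keep error records out of the CSV
            continue
        row = dict(item)
        row["tags"] = ";".join(item.get("tags") or [])  # flatten the tags array
        writer.writerow(row)
    return buffer.getvalue(), errors


if __name__ == "__main__":
    demo_items = [
        {"id": "stabilityai/stable-diffusion", "likes": 12340,
         "tags": ["text-to-image", "diffusers"], "error": None},
        {"error": "upstream error"},
    ]
    csv_text, error_rows = items_to_csv(demo_items)
    print(csv_text)
    print(f"{len(error_rows)} error record(s) skipped")
```

In practice, `demo_items` would be replaced by the items iterated from the run's dataset in the example above.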

## CLI example

```bash
echo '{
  "maxItems": 10
}' |
apify call parseforge/huggingface-spaces-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/huggingface-spaces-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Hugging Face Spaces Scraper",
        "description": "Query the Hugging Face Spaces catalog by keyword, author, SDK, and sort order. Records include id, author, SDK, likes, trending score, runtime, hardware, license, tags, created date, and Space URL. Handy for AI model discovery, demo curation, and trend reporting.",
        "version": "0.1",
        "x-build-id": "U5CoiK2X60M3C6CyS"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~huggingface-spaces-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-huggingface-spaces-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~huggingface-spaces-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-huggingface-spaces-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~huggingface-spaces-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-huggingface-spaces-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "search": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Free-text search across Spaces titles and descriptions. Trimmed before being sent to the API."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users are limited to 10 items (preview). Paid users can collect up to 1,000,000 items."
                    },
                    "sort": {
                        "title": "Sort by",
                        "enum": [
                            "likes",
                            "trending",
                            "createdAt",
                            "lastModified"
                        ],
                        "type": "string",
                        "description": "Sort order for the API.",
                        "default": "likes"
                    },
                    "direction": {
                        "title": "Direction",
                        "enum": [
                            "-1",
                            "1"
                        ],
                        "type": "string",
                        "description": "Sort direction.",
                        "default": "-1"
                    },
                    "author": {
                        "title": "Author",
                        "type": "string",
                        "description": "Filter by Hugging Face author or organization (e.g. huggingface, openai-community)."
                    },
                    "sdk": {
                        "title": "SDK",
                        "enum": [
                            "gradio",
                            "streamlit",
                            "docker",
                            "static"
                        ],
                        "type": "string",
                        "description": "Filter by the Space SDK."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
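
For projects without an Apify client, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library. The sketch below only builds the request; the server URL, path, `token` query parameter, and JSON body come from the specification, while `build_run_request` is a hypothetical helper and the JSON response format in the commented-out call is an assumption.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/parseforge~huggingface-spaces-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, run_input: dict[str, Any]) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{RUN_SYNC_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request("<YOUR_API_TOKEN>", {"maxItems": 10})
    print(request.get_method(), request.full_url)
    # Uncomment to send the request (needs a real token and network access).
    # The response body is assumed to be a JSON array of dataset items.
    # with urllib.request.urlopen(request) as response:
    #     items = json.load(response)
    #     print(len(items))
```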
