# Hugging Face Scraper: Trending AI Models, Datasets & Spaces (`scrapemint/huggingface-ai-models-scraper`) Actor

Track trending AI models, datasets, and Spaces on Hugging Face. One row per item with downloads, likes, trending score, tags, pipeline type, and license. Search by keyword, author, or tag. No login, no API key. Pay per row.

- **URL**: https://apify.com/scrapemint/huggingface-ai-models-scraper.md
- **Developed by:** [Ken M](https://apify.com/scrapemint) (community)
- **Categories:** AI, Developer tools
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage it consumes, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# MacOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Hugging Face Scraper: Trending AI Models, Datasets & Spaces

Track what is trending in AI right now. This Actor pulls models, datasets, and Spaces from the Hugging Face Hub with their downloads, likes, trending score, tags, pipeline type, and license. No login, no API key, no browser. One clean row per item, ready for a spreadsheet, a dashboard, or an alert.

Sort by trending score to see what the AI community is adopting this week, or by downloads and likes for all-time rankings. Filter by keyword, author, or tag to track one niche (OCR, text to speech, a specific license) or one organization (meta-llama, google, deepseek-ai).

### What you get

One row per model, dataset, or Space, with:

- `type` (`model`, `dataset`, or `space`), `id`, `name`, `author`
- `url` (direct link to the Hub page)
- `trendingScore`, `downloads`, `likes`
- `pipelineTag` (e.g. `text-generation`, `image-text-to-text`), `libraryName`, `sdk`
- `license`, `arxivIds`, `gated`, `tags`
- `createdAt`, `lastModified`, `scrapedAt`
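
For typed downstream code, the fields above can be mirrored in a small record type. This is a minimal sketch, not part of the Actor: the class name `HubRow`, the helper `from_item`, and the choice of which fields are optional are assumptions made for illustration, and the exact value types returned for fields such as `gated` are not specified here.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Optional


@dataclass
class HubRow:
    """Hypothetical container for one output row; names follow the list above."""

    type: str  # "model", "dataset", or "space"
    id: str
    name: Optional[str] = None
    author: Optional[str] = None
    url: Optional[str] = None
    trendingScore: Optional[float] = None
    downloads: Optional[int] = None
    likes: Optional[int] = None
    pipelineTag: Optional[str] = None
    libraryName: Optional[str] = None
    sdk: Optional[str] = None
    license: Optional[str] = None
    arxivIds: list[str] = field(default_factory=list)
    gated: Any = None  # assumption: value type is not documented, so left untyped
    tags: list[str] = field(default_factory=list)
    createdAt: Optional[str] = None
    lastModified: Optional[str] = None
    scrapedAt: Optional[str] = None

    @classmethod
    def from_item(cls, item: dict[str, Any]) -> "HubRow":
        """Build a row from a dataset item, ignoring keys this class does not know."""
        known = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in item.items() if key in known})
```

Dropping unknown keys in `from_item` keeps the sketch tolerant of fields that may be added to the output later.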

### Input

- `resourceTypes` (any of `models`, `datasets`, `spaces`; default `models`)
- `search` (optional keyword, e.g. `ocr`, `llama`)
- `author` (optional organization or user, e.g. `meta-llama`)
- `tags` (optional Hub tag filters, e.g. `text-generation`, `license:mit`)
- `sort` (`trendingScore`, `downloads`, `likes`, `createdAt`, `lastModified`; default `trendingScore`)
- `maxRowsPerType` (maximum rows to return for each selected resource type; default `50`)
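
The input fields above can be assembled and checked before a run is started. The following is a rough sketch: `build_input` is a hypothetical helper, the defaults are taken from the input schema further down, and the rule that `maxRowsPerType` must be at least 1 is an assumption rather than a documented constraint.

```python
from typing import Any, Optional, Sequence

ALLOWED_TYPES = {"models", "datasets", "spaces"}
ALLOWED_SORTS = {"trendingScore", "downloads", "likes", "createdAt", "lastModified"}


def build_input(
    resource_types: Sequence[str] = ("models",),
    sort: str = "trendingScore",
    max_rows_per_type: int = 50,
    search: Optional[str] = None,
    author: Optional[str] = None,
    tags: Optional[Sequence[str]] = None,
) -> dict[str, Any]:
    """Return an Actor input dict, raising ValueError on values the schema does not list."""
    unknown = set(resource_types) - ALLOWED_TYPES
    if unknown:
        raise ValueError(f"Unknown resource types: {sorted(unknown)}")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"Unknown sort: {sort!r}")
    if max_rows_per_type < 1:  # assumption: a run should request at least one row
        raise ValueError("maxRowsPerType must be at least 1")

    payload: dict[str, Any] = {
        "resourceTypes": list(resource_types),
        "sort": sort,
        "maxRowsPerType": max_rows_per_type,
    }
    # Optional filters are only included when set, so empty values are never sent.
    if search:
        payload["search"] = search
    if author:
        payload["author"] = author
    if tags:
        payload["tags"] = list(tags)
    return payload
```

The returned dict can be passed as the run input to any of the clients shown in the API section.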

### Example input

Top 50 trending models right now:

```json
{
  "resourceTypes": ["models"],
  "sort": "trendingScore",
  "maxRowsPerType": 50
}
```

Everything one lab has published, most downloaded first:

```json
{
  "resourceTypes": ["models", "datasets"],
  "author": "deepseek-ai",
  "sort": "downloads"
}
```

The text-to-speech niche, newest first:

```json
{
  "resourceTypes": ["models"],
  "tags": ["text-to-speech"],
  "sort": "createdAt",
  "maxRowsPerType": 100
}
```

### Example output

```json
{
  "type": "model",
  "id": "baidu/Unlimited-OCR",
  "name": "Unlimited-OCR",
  "author": "baidu",
  "url": "https://huggingface.co/baidu/Unlimited-OCR",
  "trendingScore": 701,
  "downloads": 758489,
  "likes": 1643,
  "pipelineTag": "image-text-to-text",
  "libraryName": "transformers",
  "license": "mit",
  "arxivIds": ["2606.23050"],
  "gated": false,
  "createdAt": "2026-06-19T09:40:33.000Z",
  "lastModified": "2026-06-28T06:20:01.000Z"
}
```

### Uses

- Weekly "what is trending in AI" reports, newsletters, and dashboards
- Competitive tracking of AI labs: what they release and how fast it is adopted
- Finding models for a task by pipeline tag, license, and real adoption numbers
- Dataset discovery for training and evaluation
- Market research on AI tooling categories (OCR, speech, agents, image generation)
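
For the reporting and tracking uses above, one possible approach is to compare two scheduled runs and rank items by how much their `trendingScore` changed. The sketch below assumes each snapshot is a list of output rows as plain dicts; the function name `trending_movers` and the choice to treat items missing from the earlier snapshot as starting from 0 are illustrative assumptions.

```python
from typing import Any


def trending_movers(
    previous: list[dict[str, Any]],
    current: list[dict[str, Any]],
    top_n: int = 10,
) -> list[tuple[str, str, float]]:
    """Return (type, id, score change) for the biggest risers between two snapshots."""
    # Key on (type, id) because a model and a dataset could share the same id.
    prev_scores = {
        (row.get("type"), row["id"]): row.get("trendingScore") or 0
        for row in previous
    }
    movers = []
    for row in current:
        key = (row.get("type"), row["id"])
        score = row.get("trendingScore") or 0
        # Assumption: an item absent from the previous snapshot starts from 0.
        movers.append((key[0], key[1], score - prev_scores.get(key, 0)))
    movers.sort(key=lambda mover: mover[2], reverse=True)
    return movers[:top_n]
```

New entries tend to rank high under this rule, which suits a "what appeared this week" report; a report on established items could filter them out first.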

### Pricing

Pay per row. The first 20 rows of every run are free so you can validate output before you scale up. You only pay for the rows you keep.
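
As a rough illustration of the free-row allowance, the billable row count for a single run can be estimated as below. The per-row price is not stated in this document, so `price_per_row_usd` is a placeholder parameter, and the helper names are hypothetical; treat the result as an estimate, not an invoice.

```python
FREE_ROWS_PER_RUN = 20  # from the pricing note above


def billable_rows(rows_returned: int) -> int:
    """Rows beyond the free allowance for one run (never negative)."""
    return max(0, rows_returned - FREE_ROWS_PER_RUN)


def estimate_cost_usd(rows_returned: int, price_per_row_usd: float) -> float:
    """Estimated cost of one run; price_per_row_usd is an assumed, caller-supplied value."""
    return billable_rows(rows_returned) * price_per_row_usd
```

Because the allowance applies per run, many small runs and one large run returning the same total number of rows would be estimated differently under this sketch.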

### Notes

- Data comes from the public Hugging Face Hub API. Trending score is Hugging Face's own ranking of current community momentum; it changes daily, so schedule the Actor to catch movers early.
- Spaces have no download counter; sorting Spaces by downloads falls back to likes.
- Multiple tag filters are combined with AND.
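
The last two notes can be expressed as small client-side helpers, for example when re-filtering or re-sorting rows that were already downloaded. This is only a sketch of the described semantics, with hypothetical function names; the Actor applies these rules itself, and its internal implementation is not shown in this document.

```python
from typing import Any, Iterable


def matches_all_tags(row: dict[str, Any], required_tags: Iterable[str]) -> bool:
    """AND semantics: the row must carry every required tag."""
    return set(required_tags) <= set(row.get("tags") or [])


def effective_sort(resource_type: str, sort: str) -> str:
    """Spaces have no download counter, so a downloads sort falls back to likes."""
    if resource_type == "spaces" and sort == "downloads":
        return "likes"
    return sort
```

For instance, a row tagged `text-generation` and `license:mit` matches a filter for both tags, while a row carrying only one of them does not.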

# Actor input Schema

## `resourceTypes` (type: `array`):

Which Hugging Face resource types to return: models, datasets, spaces. Pick one or more.

## `search` (type: `string`):

Optional. Keyword to search names and descriptions, e.g. ocr, llama, text to speech. Leave empty to get the top items by the chosen sort.

## `author` (type: `string`):

Optional. Restrict to one author or organization, e.g. meta-llama, google, openai.

## `tags` (type: `array`):

Optional. Filter by Hub tags, e.g. text-generation, image-classification, license:mit, arxiv:2606.23050. Multiple tags are combined with AND.

## `sort` (type: `string`):

Ranking to use. trendingScore surfaces what is hot right now; downloads and likes rank all-time popularity.

## `maxRowsPerType` (type: `integer`):

Maximum rows to return for each selected resource type. First 20 rows per run are free.

## `proxyConfiguration` (type: `object`):

Optional. The Hugging Face API is a tolerant public source, so proxy is off by default. Supply one only if you run very large pulls and hit rate limits.

## Actor input object example

```json
{
  "resourceTypes": [
    "models"
  ],
  "sort": "trendingScore",
  "maxRowsPerType": 50
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "resourceTypes": [
        "models"
    ],
    "sort": "trendingScore",
    "maxRowsPerType": 50
};

// Run the Actor and wait for it to finish
const run = await client.actor("scrapemint/huggingface-ai-models-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "resourceTypes": ["models"],
    "sort": "trendingScore",
    "maxRowsPerType": 50,
}

# Run the Actor and wait for it to finish
run = client.actor("scrapemint/huggingface-ai-models-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "resourceTypes": [
    "models"
  ],
  "sort": "trendingScore",
  "maxRowsPerType": 50
}' |
apify call scrapemint/huggingface-ai-models-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scrapemint/huggingface-ai-models-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Hugging Face Scraper: Trending AI Models, Datasets & Spaces",
        "description": "Track trending AI models, datasets, and Spaces on Hugging Face. One row per item with downloads, likes, trending score, tags, pipeline type, and license. Search by keyword, author, or tag. No login, no API key. Pay per row.",
        "version": "0.1",
        "x-build-id": "SQLLavInElB5sfHPf"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scrapemint~huggingface-ai-models-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scrapemint-huggingface-ai-models-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scrapemint~huggingface-ai-models-scraper/runs": {
            "post": {
                "operationId": "runs-sync-scrapemint-huggingface-ai-models-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scrapemint~huggingface-ai-models-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-scrapemint-huggingface-ai-models-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "resourceTypes": {
                        "title": "What to scrape",
                        "type": "array",
                        "description": "Which Hugging Face resource types to return: models, datasets, spaces. Pick one or more.",
                        "default": [
                            "models"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "search": {
                        "title": "Search keyword",
                        "type": "string",
                        "description": "Optional. Keyword to search names and descriptions, e.g. ocr, llama, text to speech. Leave empty to get the top items by the chosen sort."
                    },
                    "author": {
                        "title": "Author / organization",
                        "type": "string",
                        "description": "Optional. Restrict to one author or organization, e.g. meta-llama, google, openai."
                    },
                    "tags": {
                        "title": "Tag filters",
                        "type": "array",
                        "description": "Optional. Filter by Hub tags, e.g. text-generation, image-classification, license:mit, arxiv:2606.23050. Multiple tags are combined with AND.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "sort": {
                        "title": "Sort by",
                        "enum": [
                            "trendingScore",
                            "downloads",
                            "likes",
                            "createdAt",
                            "lastModified"
                        ],
                        "type": "string",
                        "description": "Ranking to use. trendingScore surfaces what is hot right now; downloads and likes rank all-time popularity.",
                        "default": "trendingScore"
                    },
                    "maxRowsPerType": {
                        "title": "Max rows per type",
                        "type": "integer",
                        "description": "Maximum rows to return for each selected resource type. First 20 rows per run are free.",
                        "default": 50
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional. The Hugging Face API is a tolerant public source, so proxy is off by default. Supply one only if you run very large pulls and hit rate limits."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
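
The `run-sync-get-dataset-items` path in the specification above can also be called without a client library. The following is a minimal sketch using only the Python standard library; the helper names and the 300-second timeout are assumptions, and it assumes the response body is a JSON array of dataset items, which the specification above does not spell out.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "scrapemint~huggingface-ai-models-scraper"


def build_sync_request(token: str, actor_input: dict[str, Any]) -> urllib.request.Request:
    """Prepare the POST request; the token goes in the query string, as in the spec."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_rows(token: str, actor_input: dict[str, Any], timeout: float = 300) -> Any:
    """Run the Actor synchronously and return the parsed JSON response."""
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_rows("<YOUR_API_TOKEN>", {"resourceTypes": ["models"], "sort": "trendingScore", "maxRowsPerType": 50})` would start a real run, so keep `maxRowsPerType` small while testing.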
