# Hugging Face Models Scraper - Low-cost💲🔥🤖📌 (`delectable_incubator/hugging-face-models-scraper-low-cost`) Actor

Scrape Hugging Face model listings 🤖📊 with a powerful AI model scraper. Extract model names, creators, downloads, likes, tags, update dates, model URLs, and popularity metrics from keyword searches. Ideal for AI research, model discovery, ecosystem monitoring and machine learning datasets 🚀

- **URL**: https://apify.com/delectable_incubator/hugging-face-models-scraper-low-cost.md
- **Developed by:** [Prime Scrape](https://apify.com/delectable_incubator) (community)
- **Categories:** AI, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $0.00005 / actor start

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.
Because this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

<p align="center">
<img src="https://i.ibb.co/jkNS73wX/readme.png" alt="Hugging Face Models Full-Text Search Scraper" width="100%">
</p>

---

## 🤖🔎 Hugging Face Models Full-Text Search Scraper | Bulk AI Model Search | Apify Actor

### 🚀 Extract Hugging Face Model Search Results in Seconds (No Code)

The **Hugging Face Models Full-Text Search Scraper (Apify Actor)** is a powerful, scalable and SEO-optimized AI research tool designed to extract **full-text search results from Hugging Face Models**.

It allows you to search model repositories in bulk, discover keyword occurrences inside model files, analyze AI ecosystems, monitor emerging architectures, and build structured machine learning intelligence datasets.

Perfect for AI researchers, ML engineers, data analysts, investors, competitive intelligence teams, and automation workflows.

---

### 🔥 Why This Hugging Face Models Scraper?

✔ Best Hugging Face Models Search Scraper on Apify

✔ Supports Bulk Keyword Search (Multi-Keyword Mode)

✔ Searches Models Only (No Datasets or Spaces)

✔ Extracts Repository-Level & File-Level Matches

✔ High-Speed Search Result Extraction

✔ Clean Structured JSON / CSV / Excel Output

✔ Perfect for AI Research & Trend Monitoring

✔ No Coding Required

---

### 🎯 What This Scraper Does

This Apify Actor automatically searches Hugging Face Models using one or multiple keywords and extracts detailed match information from model repositories.

#### 📌 Core Features

✅ Search Hugging Face Models only

✅ Bulk keyword processing

✅ Multi-topic AI research automation

✅ Extract repository-level search matches

✅ Extract file-level search matches

✅ Extract code snippets containing keywords

✅ Extract model tags & metadata

✅ Automatic pagination handling

✅ Structured dataset generation

✅ High-speed extraction engine

---

### ⚡ Input Configuration (Simple & Powerful)

#### 🔥 BULK KEYWORD MODE (SEO BOOST 🚀)

```json
{
  "keywords": [
    "bert",
    "llama",
    "stable diffusion",
    "mistral",
    "transformers",
    "vision language model",
    "rag",
    "quantization",
    "fine tuning",
    "multimodal"
  ],
  "maxItemsPerKeyword": 100
}
```
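
If you assemble the input programmatically, a small helper can clean the keyword list before the run. The sketch below is only an illustration: `build_actor_input` is a hypothetical helper, and the trimming and case-insensitive de-duplication are assumptions about what is convenient, not behavior of the Actor itself. The default of 60 mirrors the default in the input schema.

```python
def build_actor_input(keywords, max_items_per_keyword=60):
    """Return an Actor input dict with trimmed, de-duplicated keywords.

    Hypothetical helper: the cleaning rules are an assumption, not Actor behavior.
    """
    seen = set()
    cleaned = []
    for raw in keywords:
        keyword = raw.strip()
        key = keyword.lower()
        # Skip empty strings and case-insensitive repeats, keeping first-seen order.
        if keyword and key not in seen:
            seen.add(key)
            cleaned.append(keyword)
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")
    is_int = isinstance(max_items_per_keyword, int) and not isinstance(max_items_per_keyword, bool)
    if not is_int or max_items_per_keyword < 1:
        raise ValueError("maxItemsPerKeyword must be a positive integer")
    return {"keywords": cleaned, "maxItemsPerKeyword": max_items_per_keyword}
```

The returned dict can be passed as `run_input` to the Python client call shown in the API section.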

---

### 📊 Extracted Data

| Field       | Description                      |
| ----------- | -------------------------------- |
| owner       | Model repository owner           |
| repoName    | Model repository name            |
| repoHref    | Repository path                  |
| repoFullUrl | Full repository URL              |
| fileName    | Matched file name                |
| fileHref    | File path                        |
| fileFullUrl | Full file URL                    |
| matchCount  | Number of matches                |
| tags        | Parsed model tags                |
| tagsRaw     | Raw tag data                     |
| codeSnippet | Matching text snippet            |
| keyword     | Search keyword                   |
| sourceUrl   | Original Hugging Face search URL |
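
For typed access to these fields in Python, one option is a small dataclass. This is a sketch under assumptions: `ModelMatch` and `from_item` are hypothetical names, every field is treated as optional because a given record may not carry all of them, and the type of `tagsRaw` is unknown, so it is left as `Any`.

```python
from dataclasses import dataclass, field, fields
from typing import Any, List, Optional


@dataclass
class ModelMatch:
    """One dataset record; field names follow the table above."""
    owner: Optional[str] = None
    repoName: Optional[str] = None
    repoHref: Optional[str] = None
    repoFullUrl: Optional[str] = None
    fileName: Optional[str] = None
    fileHref: Optional[str] = None
    fileFullUrl: Optional[str] = None
    matchCount: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    tagsRaw: Any = None  # shape not documented, so no assumption is made
    codeSnippet: Optional[str] = None
    keyword: Optional[str] = None
    sourceUrl: Optional[str] = None

    @classmethod
    def from_item(cls, item):
        """Build a record from a dataset item, ignoring unknown keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in item.items() if k in known})
```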

---

### 💡 Use Cases (High Demand SEO Keywords)

This Hugging Face scraper is ideal for:

🤖 AI model research

📊 Machine learning intelligence

🔎 Open-source AI monitoring

📈 AI trend analysis

🧠 LLM ecosystem discovery

⚡ Hugging Face repository search

📚 Model documentation mining

🏢 Competitive AI intelligence

🔬 Architecture research

📡 AI keyword monitoring

📊 Dataset enrichment

🤖 AI training datasets

---

### 🚀 Key Features (Apify SEO Optimized)

⚡ Bulk keyword search support

🤖 Hugging Face model search automation

📌 Repository-level intelligence

📄 File-level keyword extraction

🔍 Code snippet extraction

📊 Structured output datasets

🌍 Large-scale AI ecosystem coverage

💾 Export-ready data

⚙️ Scalable cloud execution

🔁 Automated pagination

---

### 📤 Output Formats Supported

✔ JSON

✔ CSV

✔ Excel (XLSX)

✔ XML

✔ HTML
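
These formats can also be requested directly from the Apify dataset items endpoint through its `format` query parameter. The helper below is a minimal sketch: `dataset_export_url` is a hypothetical name, and it only builds the URL. Depending on how the dataset is shared, the request may also need your API token.

```python
from urllib.parse import urlencode

# Formats listed above, written the way the `format` parameter expects them.
SUPPORTED_FORMATS = {"json", "csv", "xlsx", "xml", "html"}


def dataset_export_url(dataset_id, fmt="json"):
    """Return the download URL for a dataset in the requested format."""
    fmt = fmt.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"
```

Here `dataset_id` would be the `defaultDatasetId` of a finished run.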

---

### 📦 Example Output

```json
{
  "owner": "google-bert",
  "repoName": "bert-base-uncased",
  "repoFullUrl": "https://huggingface.co/google-bert/bert-base-uncased",
  "fileName": "README.md",
  "fileFullUrl": "https://huggingface.co/google-bert/bert-base-uncased/blob/main/README.md?code=true",
  "matchCount": "40 matches",
  "tags": [
    "transformers",
    "pytorch",
    "onnx",
    "bert"
  ],
  "codeSnippet": "## BERT base model (uncased)",
  "keyword": "bert",
  "sourceUrl": "https://huggingface.co/search/full-text?q=bert&type=model"
}
```
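
In this example `matchCount` is a string (`"40 matches"`), so sorting or summing needs a numeric value first. A possible way to convert it, assuming the value always starts with the number and may use thousands separators (both are assumptions, and `parse_match_count` is a hypothetical helper):

```python
import re


def parse_match_count(value):
    """Return the first integer found in a value such as '40 matches', or None."""
    if isinstance(value, int):
        return value
    found = re.search(r"\d[\d,]*", str(value))
    if found is None:
        return None
    # Drop thousands separators before converting.
    return int(found.group().replace(",", ""))
```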

---

### 📊 Preconfigured Dataset Views

#### 🔹 Overview View

Clean table including:

• Owner

• Model Name

• Match Count

• Keyword

• Model URL

• Matched File URL

Perfect for fast AI model discovery.

#### 🔹 Detailed View

Extended dataset including:

• Repository paths

• File paths

• Match counts

• Model tags

• Raw metadata

• Code snippets

• Search source URLs

Ideal for:

🤖 AI research

🔎 Repository analysis

📈 Trend monitoring

📊 ML ecosystem intelligence

#### 🔹 By Keyword View

Grouped by keyword:

• Keyword

• Owner

• Repository

• Match Count

• Model URL

Perfect for comparing AI topics at scale.
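
A similar grouping can be reproduced locally on downloaded items. The sketch below is an approximation of the view, not its actual definition: `group_by_keyword` is a hypothetical helper, and mapping "Repository" to `repoName` and "Model URL" to `repoFullUrl` is an assumption based on the field table.

```python
from collections import defaultdict

# Assumed mapping of the view's columns to dataset field names.
VIEW_COLUMNS = ("owner", "repoName", "matchCount", "repoFullUrl")


def group_by_keyword(items):
    """Group dataset items by 'keyword', keeping only the view columns."""
    grouped = defaultdict(list)
    for item in items:
        row = {column: item.get(column) for column in VIEW_COLUMNS}
        grouped[item.get("keyword")].append(row)
    return dict(grouped)
```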

---

### 🌍 Why Use This Scraper?

📊 AI Ecosystem Intelligence

🤖 LLM & Foundation Model Research

🔎 Full-Text Repository Search Automation

📈 AI Trend Monitoring

🧠 Open Source AI Discovery

⚡ Large-Scale Keyword Analysis

📚 Repository Documentation Mining

🤖 Automation Ready

---

### 🔥 Why This is the BEST Hugging Face Models Search Scraper on Apify?

✔ Optimized for Apify marketplace ranking

✔ Bulk keyword support

✔ Repository-level intelligence

✔ File-level match extraction

✔ Code snippet extraction

✔ Structured export-ready output

✔ Enterprise-ready scalability

✔ Built for AI research workflows

---

### 💸 Pricing

This scraper runs on a **pay-per-result pricing model**.

You only pay for successfully extracted records.

💳 **Price:** $0.98 / 1,000 results
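
As a rough upper bound, the result charge for a run is the number of keywords times `maxItemsPerKeyword` times the per-result price. The sketch below uses only the price stated above. `estimate_max_result_cost` is a hypothetical helper, and it ignores any other charges that may apply, such as per-event fees or platform usage.

```python
from decimal import Decimal

PRICE_PER_1000_RESULTS = Decimal("0.98")  # USD, from the price stated above


def estimate_max_result_cost(keyword_count, max_items_per_keyword):
    """Upper bound in USD, assuming every keyword returns the maximum number of items."""
    max_results = keyword_count * max_items_per_keyword
    return max_results * PRICE_PER_1000_RESULTS / Decimal(1000)
```

For the bulk input with 10 keywords and `maxItemsPerKeyword` set to 100, that is at most 1,000 results, or $0.98.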

---

### ❓ FAQ (SEO BOOST SECTION)

#### Can I search multiple keywords at once?

Yes — bulk keyword mode is fully supported.

#### Does it search Datasets and Spaces?

No — this Actor searches Hugging Face Models only.

#### Can I extract code snippets?

Yes — matching snippets are extracted whenever available.

#### Can I analyze AI trends?

Yes — this scraper is designed for AI ecosystem intelligence and trend monitoring.

#### Is coding required?

No — 100% no-code Apify Actor.

#### Can I export data?

Yes — JSON, CSV, Excel, XML and HTML are supported.

---

### ⚠️ Disclaimer

This tool is an independent automation solution and is not affiliated with, endorsed by, or sponsored by Hugging Face.

---

### 🔗 Related Actors (AI & Developer Intelligence Suite)

We are building a full PrimeScrape AI Intelligence Suite:

👉 More AI, Developer, Repository & Research Scrapers Coming Soon 🚀

---

### 🌍 PrimeScrape Ecosystem

Built for large-scale:

🤖 AI Intelligence

📊 Research Automation

🔎 Repository Discovery

📈 Trend Monitoring

🧠 Machine Learning Research

⚙️ Workflow Automation

💾 Structured Data Collection

---

### 📬 Support

⭐⭐⭐⭐⭐ Leave a review if you like this scraper.

📩 Contact us for custom scraping solutions, enterprise automation, or AI intelligence projects.

# Actor input Schema

## `keywords` (type: `array`):

One or more keywords to search for models on Hugging Face. Each keyword is scraped independently. Example: 'stable-diffusion'.

## `maxItemsPerKeyword` (type: `integer`):

Maximum number of model results to collect per keyword.

## Actor input object example

```json
{
  "keywords": [
    "gpt",
    "mistral"
  ],
  "maxItemsPerKeyword": 60
}
````

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keywords": [
        "gpt",
        "mistral"
    ],
    "maxItemsPerKeyword": 60
};

// Run the Actor and wait for it to finish
const run = await client.actor("delectable_incubator/hugging-face-models-scraper-low-cost").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "keywords": [
        "gpt",
        "mistral",
    ],
    "maxItemsPerKeyword": 60,
}

# Run the Actor and wait for it to finish
run = client.actor("delectable_incubator/hugging-face-models-scraper-low-cost").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "keywords": [
    "gpt",
    "mistral"
  ],
  "maxItemsPerKeyword": 60
}' |
apify call delectable_incubator/hugging-face-models-scraper-low-cost --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=delectable_incubator/hugging-face-models-scraper-low-cost",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Hugging Face Models Scraper - Low-cost💲🔥🤖📌",
        "description": "Scrape Hugging Face model listings 🤖📊 with a powerful AI model scraper. Extract model names, creators, downloads, likes, tags, update dates, model URLs, and popularity metrics from keyword searches. Ideal for AI research, model discovery, ecosystem monitoring and machine learning datasets 🚀",
        "version": "0.0",
        "x-build-id": "lxx0ddhmbRA380Jhg"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/delectable_incubator~hugging-face-models-scraper-low-cost/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-delectable_incubator-hugging-face-models-scraper-low-cost",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/delectable_incubator~hugging-face-models-scraper-low-cost/runs": {
            "post": {
                "operationId": "runs-sync-delectable_incubator-hugging-face-models-scraper-low-cost",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/delectable_incubator~hugging-face-models-scraper-low-cost/run-sync": {
            "post": {
                "operationId": "run-sync-delectable_incubator-hugging-face-models-scraper-low-cost",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "keywords"
                ],
                "properties": {
                    "keywords": {
                        "title": "Search Keywords",
                        "type": "array",
                        "description": "One or more keywords to search for models on HuggingFace. Each keyword is scraped independently. Examples:  'stable-diffusion'.",
                        "items": {
                            "type": "string"
                        },
                        "default": [
                            "gpt",
                            "mistral"
                        ]
                    },
                    "maxItemsPerKeyword": {
                        "title": "Max Items per Keyword",
                        "type": "integer",
                        "description": "Maximum number of model results to collect per keyword",
                        "default": 60
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
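
If you prefer to call the REST API without a client library, the `run-sync-get-dataset-items` path from the specification can be used with the Python standard library. This is a minimal sketch: `build_run_sync_request` is a hypothetical helper, and it only constructs the request. The path, the `token` query parameter, and the JSON body come from the specification above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH_ID = "delectable_incubator~hugging-face-models-scraper-low-cost"


def build_run_sync_request(token, actor_input):
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. The call blocks until the run finishes, so for long runs the asynchronous `/runs` path may be a better fit.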
