HuggingFace Scraper (All-in-One) 🚀🤗🔎

Pricing

$24.99/month + usage

🟠 Easily collect Models, Datasets & Spaces from Hugging Face. Provide one or more search keywords and extract data across the entire HuggingFace ecosystem, including repository name, 👤 owner, 🔗 source search URL & more. Perfect for AI architecture research & full ecosystem intelligence 🚀🤖

Rating: 5.0 (1)

Developer: Storm_Scraper

Maintained by Community

Actor stats

Bookmarked: 1

Total users: 2

Monthly active users: 1

Last modified: 16 hours ago

HuggingFace All-in-One Full-Text Search Scraper 🚀🤗🔎

The HuggingFace All-in-One Full-Text Search Scraper is a powerful automation tool designed to extract Models, Datasets, and Spaces directly from full-text search results on Hugging Face.

Whether you're researching AI architectures, tracking dataset usage, analyzing open-source ML tools, or monitoring ecosystem trends, this scraper helps you collect structured, keyword-level intelligence across the entire HuggingFace platform.


🌟 What Makes It “All-in-One”?

Unlike single-type scrapers, this actor lets you:

✅ Search Models

✅ Search Datasets

✅ Search Spaces

✅ Combine all three in one run

✅ Process multiple keywords independently

✅ Respect maxItemsPerKeyword limits

✅ Extract repository-level + file-level match data

✅ Access pre-configured dataset views instantly


📊 Data Extracted

🔹 Core Fields

| Field | Description |
| --- | --- |
| contentType | Models / Datasets / Spaces |
| owner | Repository owner |
| repoName | Repository name |
| matchCount | Number of keyword matches (returned as text, such as "12 matches" in the output example) |
| keyword | Search keyword used |
| repoFullUrl | Full repository URL |
| fileFullUrl | URL of matched file |

🔹 Extended Fields

| Field | Description |
| --- | --- |
| repoHref | Relative repository path |
| fileName | File containing match |
| fileHref | File relative path |
| tags | Parsed tags |
| tagsRaw | Raw tags string |
| codeSnippet | Extracted matching snippet |
| searchTypes | Selected content types filter |
| sourceUrl | Original search results URL |
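
The sourceUrl field records the full-text search page a result came from. As a rough sketch of how such a URL can be rebuilt from a keyword and the selected content types, the snippet below follows the shape of the sourceUrl shown in the output example. The function name `build_source_url` and the `TYPE_PARAM` mapping are illustrative assumptions, not part of the actor; only the `dataset` and `space` parameter values are confirmed by that example, and `model` is a guess by analogy.

```python
from urllib.parse import urlencode

# Assumed mapping from searchTypes values to the "type" query parameter.
# Only "dataset" and "space" appear in the documented sourceUrl; "model" is a guess.
TYPE_PARAM = {"Models": "model", "Datasets": "dataset", "Spaces": "space"}


def build_source_url(keyword, search_types):
    """Rebuild a full-text search URL in the shape of the sourceUrl field."""
    params = [("q", keyword)] + [("type", TYPE_PARAM[t]) for t in search_types]
    return "https://huggingface.co/search/full-text?" + urlencode(params)


if __name__ == "__main__":
    print(build_source_url("bert", ["Datasets", "Spaces"]))
```

For the keyword bert with Datasets and Spaces selected, this reproduces the sourceUrl from the output example.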

🛠 How to Use

Simply:

1️⃣ Deploy the Actor on Apify

2️⃣ Provide one or more keywords (e.g., bert, llama, stable-diffusion)

3️⃣ Select content types: Models, Datasets, Spaces

4️⃣ Set maxItemsPerKeyword

5️⃣ Run the scraper

6️⃣ Export your results in:

✅ JSON

✅ CSV

✅ Excel

✅ XML

✅ HTML
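
Exports can also be fetched programmatically. The sketch below builds download URLs for the Apify dataset items endpoint, one per export format listed above. The helper name `export_url` and the dataset ID in the usage line are placeholders; private datasets additionally need an API token, which is left out here.

```python
from urllib.parse import urlencode

# Export labels from the list above mapped to the "format" values of the
# Apify dataset items endpoint ("Excel" corresponds to "xlsx").
EXPORT_FORMATS = {
    "JSON": "json",
    "CSV": "csv",
    "Excel": "xlsx",
    "XML": "xml",
    "HTML": "html",
}


def export_url(dataset_id, label):
    """Return the download URL for one dataset in the requested export format."""
    query = urlencode({"format": EXPORT_FORMATS[label]})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


if __name__ == "__main__":
    # "YOUR_DATASET_ID" is a placeholder for the run's default dataset ID.
    for label in EXPORT_FORMATS:
        print(label, export_url("YOUR_DATASET_ID", label))
```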


💸 Pricing

This scraper runs on a monthly subscription model.

You only pay for successful runs.

💳 Price: $24.99 / month


โš™๏ธ Input Configuration

๐Ÿ“ฅ Input Example

{
"keywords": ["bert", "llama"],
"searchTypes": ["Models", "Datasets", "Spaces"],
"maxItemsPerKeyword": 60
}

Input Fields

| Field | Type | Description |
| --- | --- | --- |
| keywords | Array | One or more search keywords (required). Each keyword is processed independently. |
| searchTypes | Array | Select Models, Datasets, Spaces, or combine them freely. |
| maxItemsPerKeyword | Integer | Maximum results per keyword (across all selected types combined). |
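
As a minimal sketch of how this input could be assembled and sent from a script, the snippet below validates the three fields and then starts a run through the `apify-client` Python package. The helper names `build_run_input` and `run_actor` are illustrative, the actor ID must be taken from the actor's Apify Store page (it is not stated here), and the rule that maxItemsPerKeyword must be at least 1 is an assumption.

```python
ALLOWED_TYPES = ("Models", "Datasets", "Spaces")


def build_run_input(keywords, search_types=ALLOWED_TYPES, max_items_per_keyword=60):
    """Validate and assemble the actor input described in the table above."""
    cleaned = [k.strip() for k in keywords if k and k.strip()]
    if not cleaned:
        raise ValueError("at least one keyword is required")
    unknown = [t for t in search_types if t not in ALLOWED_TYPES]
    if unknown:
        raise ValueError(f"unknown search types: {unknown}")
    if int(max_items_per_keyword) < 1:  # assumed lower bound
        raise ValueError("maxItemsPerKeyword must be a positive integer")
    return {
        "keywords": cleaned,
        "searchTypes": list(search_types),
        "maxItemsPerKeyword": int(max_items_per_keyword),
    }


def run_actor(token, actor_id, run_input):
    """Start a run and return its dataset items (requires `pip install apify-client`)."""
    from apify_client import ApifyClient  # third-party; imported only when needed

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    print(build_run_input(["bert", "llama"]))
```

Validating locally before calling `run_actor` avoids spending a run on an input the actor would reject.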

📤 Output Example

{
  "contentType": "dataset",
  "owner": "Giannis79",
  "repoName": "BERT_Journalism_Sentiment",
  "repoHref": "/datasets/Giannis79/BERT_Journalism_Sentiment",
  "repoFullUrl": "https://huggingface.co/datasets/Giannis79/BERT_Journalism_Sentiment",
  "fileName": "README.md",
  "fileHref": "/datasets/Giannis79/BERT_Journalism_Sentiment/blob/main/README.md?code=true",
  "fileFullUrl": "https://huggingface.co/datasets/Giannis79/BERT_Journalism_Sentiment/blob/main/README.md?code=true",
  "matchCount": "12 matches",
  "tags": [
    "region:us"
  ],
  "tagsRaw": "tags: region:us",
  "codeSnippet": "BERT Model Sentiment Analysis\nProject Overview\nThis repository contains scripts and resources for performing sentiment analysis on news articles referring to Russinan-Ukrainian 2022 War using a pre-trained BERT model. The goal is to classify the sentiment of each article as either Pro-Russian or Pro-Ukrainian and calculate a sentiment score.",
  "keyword": "bert",
  "searchTypes": [
    "Datasets",
    "Spaces"
  ],
  "sourceUrl": "https://huggingface.co/search/full-text?q=bert&type=dataset&type=space"
}
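
In the example above, matchCount is a string ("12 matches") and the full URLs are the relative hrefs joined onto https://huggingface.co. A small post-processing sketch along those lines is shown below. The added `matchCountInt` key and both helper names are hypothetical, introduced here only for illustration.

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://huggingface.co"


def parse_match_count(text):
    """Return the first integer in a string such as "12 matches", or None."""
    m = re.search(r"\d[\d,]*", str(text))
    return int(m.group(0).replace(",", "")) if m else None


def normalize(record):
    """Return a copy with a numeric match count and any missing full URLs filled in."""
    out = dict(record)
    # matchCountInt is a hypothetical extra key, not a field the actor emits.
    out["matchCountInt"] = parse_match_count(record.get("matchCount", ""))
    for href_key, url_key in (("repoHref", "repoFullUrl"), ("fileHref", "fileFullUrl")):
        if not out.get(url_key) and out.get(href_key):
            out[url_key] = urljoin(BASE_URL, out[href_key])
    return out


if __name__ == "__main__":
    sample = {
        "repoHref": "/datasets/Giannis79/BERT_Journalism_Sentiment",
        "matchCount": "12 matches",
    }
    print(normalize(sample))
```

Keeping the original matchCount string alongside the parsed integer makes sorting by match count possible without losing the raw value.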

📊 Preconfigured Dataset Views

The actor automatically generates structured dataset views:

🔹 Overview

Clean comparison table including:

- Type
- Owner
- Name
- Match Count
- Keyword
- Page URL
- Matched File

Perfect for high-level ecosystem analysis.

🔹 Detailed

Extended table including:

- Repository paths
- File-level matches
- Tags (parsed + raw)
- Code snippets
- Search URL

Ideal for:

🤖 AI architecture research

🔎 Code-level keyword discovery

📊 Dataset trend analysis

🧠 Model ecosystem intelligence

🔹 By Keyword

Grouped by keyword across all selected content types. Perfect for comparing topic coverage.

🔹 By Type

Grouped by Models / Datasets / Spaces.

Perfect for understanding distribution across the platform.
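
The By Keyword and By Type groupings can be approximated offline on exported records. The sketch below groups a list of result dictionaries by any field and counts results per keyword and content type. It illustrates the idea only and is not how the actor builds its views; the function names are made up for this example.

```python
from collections import Counter, defaultdict


def group_by(records, field):
    """Group result records by the value of one field, e.g. "keyword" or "contentType"."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.get(field)].append(rec)
    return dict(groups)


def count_by_keyword_and_type(records):
    """Count results for each (keyword, contentType) pair."""
    return Counter((rec.get("keyword"), rec.get("contentType")) for rec in records)


if __name__ == "__main__":
    # Illustrative records with made-up repository names.
    records = [
        {"keyword": "bert", "contentType": "dataset", "repoName": "example-a"},
        {"keyword": "bert", "contentType": "space", "repoName": "example-b"},
        {"keyword": "llama", "contentType": "dataset", "repoName": "example-c"},
    ]
    print({k: len(v) for k, v in group_by(records, "keyword").items()})
    print(count_by_keyword_and_type(records))
```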


🌍 Why Use This Scraper?

📊 Full HuggingFace Ecosystem Intelligence

🔎 Full-Text Repository Search Automation

🤖 Model + Dataset + App (Spaces) Discovery

📈 Technical Trend Monitoring

⚡ Scalable: from niche scans to large-scale AI research

🔁 Automation-Ready: schedule recurring monitoring


Disclaimer

This scraper is an independent automation tool and is not affiliated with, endorsed by, or sponsored by Hugging Face.


📫 Support & Contact

😊 Leave a 5-star rating ⭐⭐⭐⭐⭐ if you're satisfied

🌪️ Storm Scraper https://apify.com/scrapestorm

For questions, feature requests, or custom scraping solutions, contact us directly via Apify or email.