Pricing

from $2.00 / 1,000 papers scraped

HuggingFace Daily Papers Scraper

Scrapes AI/ML research papers from HuggingFace Daily Papers (huggingface.co/papers). Extracts title, authors, abstract, GitHub repo, star count, upvotes, AI summary, and keywords.

Developer

tzmyk

Actor stats

Last modified: 3 months ago

What it does

Scrapes today's trending papers or papers from a specific date range
Extracts full abstracts, GitHub repo URLs, star counts, upvote counts
Includes HuggingFace's AI-generated summary and keywords for each paper
Supports both fast mode (list-only) and full detail mode (with abstract + AI metadata)

Use cases

RAG / LLM data pipelines — feed fresh research papers into your vector database daily
AI trend monitoring — track which topics are trending in the research community
Competitive intelligence — monitor GitHub repos and star growth of new papers
Research assistants — power AI agents with up-to-date academic content
Newsletter automation — curate weekly AI research digests automatically

Input

Field	Type	Default	Description
`startDate`	string	—	Fetch papers from this date (YYYY-MM-DD). Leave empty for today's trending papers.
`endDate`	string	—	Fetch papers up to this date. Defaults to `startDate`.
`maxPapers`	integer	50	Maximum number of papers to scrape (1–500).
`includeFullDetail`	boolean	true	Fetch each paper's detail page for abstract, AI summary, keywords, and upvotes.

Example inputs

Today's trending papers (fast mode):

{
  "maxPapers": 50,
  "includeFullDetail": false
}

Full detail for a specific date:

{
  "startDate": "2026-03-20",
  "includeFullDetail": true
}

Date range:

{
  "startDate": "2026-03-18",
  "endDate": "2026-03-20",
  "maxPapers": 100
}
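
The input objects above can also be assembled and checked in code before a run is started. The following is a minimal sketch, not part of the actor itself: the helper name `build_input` is hypothetical, the field names and the 1–500 limit come from the Input table, and the rule that `endDate` needs a `startDate` is an assumption.

```python
from datetime import date
from typing import Optional


def build_input(
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    max_papers: int = 50,
    include_full_detail: bool = True,
) -> dict:
    """Build an input object using the field names from the Input table."""
    if not 1 <= max_papers <= 500:
        raise ValueError("maxPapers must be between 1 and 500")
    # Assumption: endDate defaults to startDate, so endDate alone is ambiguous.
    if end_date and not start_date:
        raise ValueError("endDate requires startDate")

    run_input = {"maxPapers": max_papers, "includeFullDetail": include_full_detail}
    if start_date:
        # date.fromisoformat raises ValueError for malformed dates.
        start = date.fromisoformat(start_date)
        run_input["startDate"] = start.isoformat()
        if end_date:
            end = date.fromisoformat(end_date)
            if end < start:
                raise ValueError("endDate must not be earlier than startDate")
            run_input["endDate"] = end.isoformat()
    return run_input
```

Called with no dates, the helper omits `startDate` and `endDate`, which matches the "today's trending papers" example.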

Output

Each paper is saved as a dataset item with the following fields:

{
  "id": "2603.19235",
  "title": "Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding",
  "publishedAt": "2026-03-19T17:59:58.000Z",
  "summary": "While Multimodal Large Language Models demonstrate impressive semantic capabilities...",
  "upvotes": 77,
  "githubRepo": "https://github.com/H-EmbodVis/VEGA-3D",
  "githubStars": 109,
  "authors": ["Xianjin Wu", "Dingkang Liang", "Tianrui Feng"],
  "arxivUrl": "https://arxiv.org/abs/2603.19235",
  "paperUrl": "https://huggingface.co/papers/2603.19235",
  "aiSummary": "A video diffusion model is repurposed as a latent world simulator...",
  "aiKeywords": ["multimodal large language models", "3D structural priors", "video diffusion model"],
  "scrapedAt": "2026-03-22T01:59:38.919Z"
}
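
For the RAG and digest use cases, each dataset item usually needs to be flattened into a text record before indexing. The sketch below shows one possible way to do that with the fields listed above. The helper names `to_document` and `top_by_upvotes` are hypothetical, and it assumes that `summary` holds the abstract and that detail-only fields may be absent in fast mode.

```python
from typing import Any


def to_document(paper: dict[str, Any]) -> dict[str, Any]:
    """Turn one dataset item into a text-plus-metadata record for indexing."""
    # Assumption: `summary` is the abstract; fall back to the AI summary.
    body = paper.get("summary") or paper.get("aiSummary") or ""
    keywords = ", ".join(paper.get("aiKeywords") or [])
    parts = (paper.get("title", ""), body, keywords)
    return {
        "id": paper["id"],
        "text": "\n".join(part for part in parts if part),
        "metadata": {
            "url": paper.get("paperUrl"),
            "upvotes": paper.get("upvotes") or 0,
            "githubRepo": paper.get("githubRepo"),
        },
    }


def top_by_upvotes(papers: list[dict[str, Any]], limit: int = 10) -> list[dict[str, Any]]:
    """Return the most upvoted papers first; a missing count is treated as 0."""
    return sorted(papers, key=lambda p: p.get("upvotes") or 0, reverse=True)[:limit]
```

The `text` field can then be embedded or summarized, while `metadata` keeps the link and popularity signals next to it.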

Features

No bot protection issues — HuggingFace serves clean HTML with no Cloudflare or CAPTCHA
Structured JSON extraction — data parsed directly from Svelte hydration payloads for reliability
Deduplication — papers are deduplicated across date ranges
Graceful error handling — individual paper failures are logged and skipped without stopping the run
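
The deduplication and error-handling behavior described above can be illustrated with a short sketch. This is not the actor's actual code: `collect_papers` and the `fetch_detail` callback are hypothetical names, and keying deduplication on the paper `id` is an assumption.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger("papers")


def collect_papers(
    paper_ids: Iterable[str],
    fetch_detail: Callable[[str], dict],
) -> list[dict]:
    """Fetch each paper once, skipping duplicates and logging failures."""
    seen: set[str] = set()
    results: list[dict] = []
    for paper_id in paper_ids:
        # The same paper can be listed on several dates; keep the first hit.
        if paper_id in seen:
            continue
        seen.add(paper_id)
        try:
            results.append(fetch_detail(paper_id))
        except Exception as exc:
            # One failing paper is logged and skipped; the run continues.
            logger.warning("Skipping paper %s: %s", paper_id, exc)
    return results
```

Passing the fetch step in as a callback keeps the loop easy to test with a stub in place of real HTTP requests.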

Notes

Setting `includeFullDetail` to `false` is significantly faster: the run fetches 1 list page instead of 1 list page plus N detail pages (one per paper)
HuggingFace typically publishes 20–50 papers per day
Papers older than ~2 weeks may not appear on the date archive pages
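
The request counts in the first note can be turned into a rough estimate. The sketch below is illustrative only: the function names are hypothetical, the `n_list_pages` parameter is an assumption for date ranges (the note itself only describes a single list page), and the cost figure uses the listed starting price of $2.00 per 1,000 papers, so actual charges may differ.

```python
def estimate_requests(n_papers: int, include_full_detail: bool, n_list_pages: int = 1) -> int:
    """Estimate page fetches: list pages, plus one detail page per paper in full mode."""
    detail_pages = n_papers if include_full_detail else 0
    return n_list_pages + detail_pages


def estimate_cost_usd(n_papers: int, price_per_1000: float = 2.00) -> float:
    """Estimate cost from the listed starting price per 1,000 papers."""
    return n_papers / 1000 * price_per_1000
```

For 50 papers, this gives 1 request in fast mode and 51 in full detail mode.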

Support

Found a bug or have a feature request? Leave a review or reach out via the Apify platform.
