Hugging Face Papers Scraper

Pricing

from $9.00 / 1,000 results

Scrape AI and machine learning research papers from Hugging Face Papers. Get titles, abstracts, authors with affiliations, upvotes, publication dates, arXiv IDs, and community discussion counts. Search by keyword or browse the daily papers.

Rating: 0.0 (0)

Developer: ParseForge (Maintained by Community)

Actor stats

  • Bookmarked: 1
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 5 days ago

📄 Hugging Face Papers Scraper

AI research moves fast. Hugging Face Papers curates the most discussed machine learning papers every day with community upvotes, author handles, and linked code repos. This tool pulls that curated feed plus search results into a structured dataset you can feed into a newsletter, research tracker, or literature review.

The Hugging Face Papers Scraper collects AI/ML research papers with titles, authors, abstracts, arXiv IDs, upvotes, GitHub repos, thumbnails, and keywords. Search by topic or grab the daily trending list.

✨ What Does It Do

  • 📚 Paper metadata - titles, abstracts, arXiv IDs, publication dates, and Hugging Face URLs
  • 👥 Author details - full author list with Hugging Face handles and verification status
  • ⭐ Community signals - upvotes, comment counts, and thumbnails
  • 💻 Code + project links - GitHub repository URLs and project pages when authors link them
  • 🏷️ AI keywords and summaries - auto-generated keywords and condensed summaries where available
  • 🔍 Two modes - search by keyword or pull today's trending feed

🔧 Input

  • Search Query (searchQuery) - keyword matched against paper titles and abstracts (e.g. transformer, diffusion model, LLM)
  • Mode (mode) - "search" for a keyword search, or "trending" for the daily curated papers
  • Max Items (maxItems) - maximum number of papers to return; free users are limited to 10, paid users to 1,000,000
{
  "searchQuery": "diffusion model",
  "mode": "search",
  "maxItems": 100
}
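
The input above can also be assembled and checked in code before a run is started. The following is a minimal sketch, not the Actor's documented client usage: `build_run_input` and `run_scraper` are hypothetical helper names, the rule that search mode needs a non-empty query is an assumption, and `actor_id` is a placeholder that has to be copied from the Actor's store page because it is not given here. The run call assumes the `apify-client` Python package is installed.

```python
from typing import Any, Dict, Iterator

VALID_MODES = {"search", "trending"}  # the two modes listed under Input


def build_run_input(search_query: str = "", mode: str = "search",
                    max_items: int = 100) -> Dict[str, Any]:
    """Return a run-input dict using the field names from the JSON example."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    # Assumption: a keyword search without a keyword is a configuration mistake.
    if mode == "search" and not search_query.strip():
        raise ValueError("search mode needs a non-empty searchQuery")
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")
    run_input: Dict[str, Any] = {"mode": mode, "maxItems": max_items}
    if search_query.strip():
        run_input["searchQuery"] = search_query.strip()
    return run_input


def run_scraper(token: str, actor_id: str,
                run_input: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Start the Actor, wait for it to finish, and yield its dataset items.

    Assumption: `actor_id` is a placeholder for the ID shown on the
    Actor's store page. Requires `pip install apify-client`.
    """
    # Imported lazily so input validation works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

Calling `build_run_input("diffusion model", "search", 100)` produces the same three fields as the JSON example, so a typo in the mode name fails locally instead of after a paid run has started.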

📊 Output

Each paper record contains 15+ fields. Download as JSON, CSV, or Excel.

| 📌 Field | 📄 Description |
|---|---|
| 🆔 arxivId | arXiv paper identifier |
| 📋 title | Paper title |
| 🔗 url | Hugging Face Papers page |
| 🔗 arxivUrl | arXiv abstract page |
| 📅 publishedAt | Publication date |
| ⬆️ upvotes | Community upvote count |
| 💬 numComments | Discussion comment count |
| 👥 authors | Array of authors with names and HF handles |
| 📝 summary | Paper abstract |
| 🖼️ thumbnail | Paper preview image |
| 💻 githubRepo | Linked code repository |
| 🌐 projectPage | Linked project website |
| 🏷️ aiKeywords | Auto-generated topic keywords |

{
  "arxivId": "2404.12345",
  "title": "Efficient Attention for Long-Context Language Models",
  "url": "https://huggingface.co/papers/2404.12345",
  "arxivUrl": "https://arxiv.org/abs/2404.12345",
  "publishedAt": "2026-04-09",
  "upvotes": 187,
  "numComments": 12,
  "numAuthors": 6,
  "firstAuthor": "Jane Smith",
  "authors": [
    { "name": "Jane Smith", "hfUser": "jsmith", "verified": true }
  ],
  "summary": "We introduce a novel attention mechanism...",
  "githubRepo": "https://github.com/example/long-attention",
  "projectPage": "https://example.github.io/long-attention",
  "aiKeywords": ["attention", "long-context", "efficiency"],
  "scrapedAt": "2026-04-10T12:00:00.000Z"
}
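
Once exported as JSON, records shaped like the sample above are easy to load into typed objects and rank. The sketch below is one possible way to do it, under the assumption that optional fields such as `githubRepo` may be missing or empty when nothing is linked. `Paper`, `parse_paper`, and `top_papers_with_code` are illustrative names, not part of the Actor.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class Paper:
    arxiv_id: str
    title: str
    upvotes: int
    num_comments: int
    github_repo: Optional[str]
    author_names: List[str]


def parse_paper(record: Dict[str, Any]) -> Paper:
    """Convert one raw record into a Paper, tolerating missing optional fields."""
    return Paper(
        arxiv_id=record["arxivId"],
        title=record["title"],
        upvotes=int(record.get("upvotes") or 0),
        num_comments=int(record.get("numComments") or 0),
        github_repo=record.get("githubRepo") or None,
        author_names=[a["name"] for a in (record.get("authors") or []) if a.get("name")],
    )


def top_papers_with_code(records: Iterable[Dict[str, Any]], limit: int = 5) -> List[Paper]:
    """Return the most upvoted papers that have a linked code repository."""
    papers = [parse_paper(r) for r in records]
    with_code = [p for p in papers if p.github_repo]
    return sorted(with_code, key=lambda p: p.upvotes, reverse=True)[:limit]
```

Filtering on `githubRepo` before ranking matches the "papers with open code" use case; dropping the filter gives a plain upvote ranking.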

💎 Why Choose the Hugging Face Papers Scraper?

| Feature | Our Tool | Manual Browsing |
|---|---|---|
| Daily trending feed | ✅ Yes | ✅ Yes |
| Keyword search | ✅ Yes | ⚠️ Limited UI |
| Bulk export | ✅ Up to 1M papers | ❌ One at a time |
| Author handles | ✅ Included | ⚠️ Click each profile |
| Linked code + project | ✅ Extracted | ⚠️ Scroll through page |
| Scheduled monitoring | ✅ Daily runs | ❌ Not possible |

📋 How to Use

  1. Sign Up - create a free account with $5 credit
  2. Configure - pick a keyword or trending mode and set your max items
  3. Run It - click Start and get curated AI papers in seconds

No coding, no daily manual browsing.

🎯 Business Use Cases

  • 📬 Research newsletters - auto-curate a weekly digest of the hottest ML papers
  • 🧠 AI labs - build an internal literature tracker for new diffusion, LLM, or RL work
  • 🎓 PhD students - monitor new papers in your subfield without daily site visits
  • 📊 Trend analysis - track which topics are gaining community upvotes over time
  • 💼 Recruiters - spot up-and-coming researchers by watching trending author handles
  • 💻 Dev tool makers - find papers with open code to feature in your product

โ“ FAQ

๐Ÿค– What is Hugging Face Papers? Hugging Face Papers is a curated feed of AI/ML research papers with community voting, author profiles, and links to code repos and project pages.

๐Ÿ” What's the difference between search and trending mode? Trending pulls today's curated daily papers chosen by the Hugging Face team and community. Search runs a keyword query across indexed papers.

โญ What does "upvotes" mean? Upvotes are community signals from Hugging Face users indicating which papers they think are most worth reading.

๐Ÿ’ป Are GitHub repos always available? Only when the paper's authors or the community have linked them. Many papers include code, but not all.

๐Ÿ” Can I run this daily? Yes. Set up a scheduled run in trending mode to keep a daily archive of the most discussed ML work.

🔗 Integrate Hugging Face Papers Scraper with any app

  • Make - automate paper digest generation
  • Zapier - push new papers to your reading list
  • Slack - post daily AI paper summaries
  • Google Sheets - track upvotes over time
  • Webhooks - trigger workflows on completion
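
For the Slack case, a daily summary can be built from the scraped records with a few lines of formatting. This is a rough sketch under stated assumptions: `format_digest` and `post_to_slack` are hypothetical helpers, the message uses Slack's `<url|label>` link markup, and `webhook_url` stands for a Slack incoming-webhook URL that you create yourself. It is not a description of how the listed integrations work internally.

```python
import json
import urllib.request
from typing import Any, Dict, Iterable


def _escape(text: str) -> str:
    """Escape the three characters Slack treats as control characters."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def format_digest(records: Iterable[Dict[str, Any]], limit: int = 5) -> str:
    """Build a Slack-formatted message listing the most upvoted papers."""
    ranked = sorted(records, key=lambda r: r.get("upvotes", 0), reverse=True)[:limit]
    lines = ["*Top Hugging Face papers*"]
    for rank, paper in enumerate(ranked, start=1):
        line = f"{rank}. <{paper['url']}|{_escape(paper['title'])}> ({paper.get('upvotes', 0)} upvotes)"
        if paper.get("githubRepo"):
            line += f" - <{paper['githubRepo']}|code>"
        lines.append(line)
    return "\n".join(lines)


def post_to_slack(webhook_url: str, text: str) -> int:
    """POST the message to a Slack incoming webhook and return the HTTP status."""
    payload = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the formatting separate from the network call means the digest text can be previewed or reused for an email newsletter without touching Slack.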

Looking for more data collection tools? Check out these related actors:

| Actor | Description | Link |
|---|---|---|
| Hugging Face Model Scraper | Collect AI model metadata | Link |
| Apple App Store Scraper | App listings and ratings | Link |
| Stripe App Marketplace Scraper | Stripe app listings | Link |
| AWS Marketplace Scraper | AWS product listings | Link |
| Hubspot Marketplace Scraper | Hubspot app listings | Link |

Pro Tip: 💡 Browse the full ParseForge catalog to find more data tools.

🆘 Need Help?

⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or connected to Hugging Face, arXiv, or any paper author. It collects only publicly available paper metadata.