Hugging Face Papers Scraper
Pricing
from $9.00 / 1,000 results
Scrape AI and machine learning research papers from Hugging Face Papers. Get titles, abstracts, authors with affiliations, upvotes, publication dates, arXiv IDs, and community discussion counts. Search by keyword or browse daily papers.
Developer
ParseForge
Last modified
5 days ago

Hugging Face Papers Scraper
AI research moves fast. Hugging Face Papers curates the most discussed machine learning papers every day with community upvotes, author handles, and linked code repos. This tool pulls that curated feed plus search results into a structured dataset you can feed into a newsletter, research tracker, or literature review.
The Hugging Face Papers Scraper collects AI/ML research papers with titles, authors, abstracts, arXiv IDs, upvotes, GitHub repos, thumbnails, and keywords. Search by topic or grab the daily trending list.
What Does It Do
- Paper metadata - titles, abstracts, arXiv IDs, publication dates, and Hugging Face URLs
- Author details - full author list with Hugging Face handles and verification status
- Community signals - upvotes, comment counts, and thumbnails
- Code + project links - GitHub repository URLs and project pages when authors link them
- AI keywords and summaries - auto-generated keywords and condensed summaries where available
- Two modes - search by keyword or pull today's trending feed
Input
- Search Query - keyword to match paper titles and abstracts (e.g. `transformer`, `diffusion model`, `LLM`)
- Mode - `search` for keyword search or `trending` for daily curated papers
- Max Items - free users get 10 papers, paid users up to 1,000,000
{"searchQuery": "diffusion model", "mode": "search", "maxItems": 100}
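The input object above can also be assembled and checked in code before a run. The following is a minimal Python sketch, assuming only the three field names shown in the example (`searchQuery`, `mode`, `maxItems`) and the 1,000,000 item ceiling stated above. The helper `build_input` is hypothetical and not part of the Actor, and leaving `searchQuery` out in trending mode is an assumption, since the source does not say how that field is treated there.

```python
import json

# Modes and item ceiling as described in the Input section.
VALID_MODES = {"search", "trending"}
MAX_ITEMS_CEILING = 1_000_000


def build_input(mode: str, max_items: int, search_query: str = "") -> dict:
    """Build an input object for the scraper (hypothetical helper).

    Assumption: a search query is only needed in "search" mode.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    if not 1 <= max_items <= MAX_ITEMS_CEILING:
        raise ValueError(f"max_items must be between 1 and {MAX_ITEMS_CEILING}")
    if mode == "search" and not search_query.strip():
        raise ValueError("search mode requires a non-empty search query")

    actor_input = {"mode": mode, "maxItems": max_items}
    if mode == "search":
        actor_input["searchQuery"] = search_query.strip()
    return actor_input


if __name__ == "__main__":
    # Reproduces the example input shown above.
    print(json.dumps(build_input("search", 100, "diffusion model"), indent=2))
```

Validating locally catches a mistyped mode or an out-of-range item count before any run is started.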
Output
Each paper record contains 15+ fields. Download as JSON, CSV, or Excel.
| Field | Description |
|---|---|
| arxivId | arXiv paper identifier |
| title | Paper title |
| url | Hugging Face Papers page |
| arxivUrl | arXiv abstract page |
| publishedAt | Publication date |
| upvotes | Community upvote count |
| numComments | Discussion comment count |
| numAuthors | Number of authors |
| firstAuthor | Name of the first listed author |
| authors | Array of authors with names, HF handles, and verification status |
| summary | Paper abstract |
| thumbnail | Paper preview image |
| githubRepo | Linked code repository |
| projectPage | Linked project website |
| aiKeywords | Auto-generated topic keywords |
| scrapedAt | Timestamp of when the record was scraped |
{"arxivId": "2404.12345", "title": "Efficient Attention for Long-Context Language Models", "url": "https://huggingface.co/papers/2404.12345", "arxivUrl": "https://arxiv.org/abs/2404.12345", "publishedAt": "2026-04-09", "upvotes": 187, "numComments": 12, "numAuthors": 6, "firstAuthor": "Jane Smith", "authors": [{"name": "Jane Smith", "hfUser": "jsmith", "verified": true}], "summary": "We introduce a novel attention mechanism...", "githubRepo": "https://github.com/example/long-attention", "projectPage": "https://example.github.io/long-attention", "aiKeywords": ["attention", "long-context", "efficiency"], "scrapedAt": "2026-04-10T12:00:00.000Z"}
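Records in this shape are straightforward to post-process once downloaded as JSON. The sketch below, in Python, keeps only papers that link a repository, ranks them by upvotes, and flattens the nested `authors` array into a single CSV column. The sample records are made up for illustration, the function names are hypothetical, and treating a missing `githubRepo` as either `null` or absent is an assumption about how the Actor represents unlinked papers.

```python
import csv
import io
import json

# Illustrative records shaped like the example output; values are made up.
SAMPLE_JSON = """
[
  {"arxivId": "2404.12345", "title": "Efficient Attention for Long-Context Language Models",
   "upvotes": 187, "authors": [{"name": "Jane Smith", "hfUser": "jsmith", "verified": true}],
   "githubRepo": "https://github.com/example/long-attention"},
  {"arxivId": "0000.00001", "title": "Placeholder Paper Without Code",
   "upvotes": 42, "authors": [{"name": "A. Author", "hfUser": null, "verified": false}],
   "githubRepo": null}
]
"""

COLUMNS = ["arxivId", "title", "upvotes", "authors", "githubRepo"]


def papers_with_code(records: list) -> list:
    """Keep papers that link a repository, highest upvotes first."""
    linked = [r for r in records if r.get("githubRepo")]
    return sorted(linked, key=lambda r: r.get("upvotes", 0), reverse=True)


def flatten(record: dict) -> dict:
    """Reduce one paper record to flat, spreadsheet-friendly columns."""
    authors = record.get("authors") or []
    return {
        "arxivId": record.get("arxivId", ""),
        "title": record.get("title", ""),
        "upvotes": record.get("upvotes", 0),
        "authors": "; ".join(a.get("name", "") for a in authors),
        "githubRepo": record.get("githubRepo") or "",
    }


def to_csv(rows: list) -> str:
    """Serialize flattened rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    records = json.loads(SAMPLE_JSON)
    print(to_csv([flatten(r) for r in papers_with_code(records)]))
```

Joining author names with a semicolon keeps one row per paper, which suits Google Sheets or Excel better than nested arrays.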
Why Choose the Hugging Face Papers Scraper?
| Feature | Our Tool | Manual Browsing |
|---|---|---|
| Daily trending feed | Yes | Yes |
| Keyword search | Yes | Limited UI |
| Bulk export | Up to 1M papers | One at a time |
| Author handles | Included | Click each profile |
| Linked code + project | Extracted | Scroll through page |
| Scheduled monitoring | Daily runs | Not possible |
How to Use
- Sign Up - create a free account with $5 credit
- Configure - pick a keyword or trending mode and set your max items
- Run It - click Start and get curated AI papers in seconds
No coding, no daily manual browsing.
Business Use Cases
- Research newsletters - auto-curate a weekly digest of the hottest ML papers
- AI labs - build an internal literature tracker for new diffusion, LLM, or RL work
- PhD students - monitor new papers in your subfield without daily site visits
- Trend analysis - track which topics are gaining community upvotes over time
- Recruiters - spot up-and-coming researchers by watching trending author handles
- Dev tool makers - find papers with open code to feature in your product
FAQ
What is Hugging Face Papers? Hugging Face Papers is a curated feed of AI/ML research papers with community voting, author profiles, and links to code repos and project pages.
What's the difference between search and trending mode? Trending pulls today's curated daily papers chosen by the Hugging Face team and community. Search runs a keyword query across indexed papers.
What does "upvotes" mean? Upvotes are community signals from Hugging Face users indicating which papers they think are most worth reading.
Are GitHub repos always available? Only when the paper's authors or the community have linked them. Many papers include code, but not all.
Can I run this daily? Yes. Set up a scheduled run in trending mode to keep a daily archive of the most discussed ML work.
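A daily archive built from scheduled runs can be maintained with a small amount of glue code. The following Python sketch merges each day's snapshot into an archive keyed by `arxivId` and computes how many upvotes each paper gained between two snapshots. Both function names are hypothetical, the sample values are made up, and counting a newly seen paper's gain from zero is a design assumption rather than something the Actor does.

```python
def merge_archive(archive: dict, snapshot: list) -> dict:
    """Return a new archive keyed by arxivId; the snapshot's records win."""
    merged = dict(archive)
    for paper in snapshot:
        merged[paper["arxivId"]] = paper
    return merged


def upvote_gains(previous: list, current: list) -> list:
    """Return (arxivId, gain) pairs, largest gain first.

    Papers absent from the previous snapshot are measured from zero.
    """
    before = {p["arxivId"]: p.get("upvotes", 0) for p in previous}
    gains = [
        (p["arxivId"], p.get("upvotes", 0) - before.get(p["arxivId"], 0))
        for p in current
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Two illustrative daily snapshots; values are made up.
    yesterday = [{"arxivId": "2404.12345", "upvotes": 150}]
    today = [
        {"arxivId": "2404.12345", "upvotes": 187},
        {"arxivId": "0000.00001", "upvotes": 42},
    ]
    archive = merge_archive(merge_archive({}, yesterday), today)
    print(len(archive))
    print(upvote_gains(yesterday, today))
```

Keying on `arxivId` keeps one entry per paper even when it appears in the trending feed on several days.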
Integrate Hugging Face Papers Scraper with any app
- Make - automate paper digest generation
- Zapier - push new papers to your reading list
- Slack - post daily AI paper summaries
- Google Sheets - track upvotes over time
- Webhooks - trigger workflows on completion
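As one possible shape for the Slack integration listed above, the Python sketch below turns scraped records into a message body for a Slack incoming webhook, which accepts a JSON object with a `text` field. It only builds the payload and does not send anything. The function names are hypothetical, the sample records are made up, and the message layout is an illustrative choice rather than what the built-in integration produces.

```python
import json


def escape_slack(text: str) -> str:
    """Escape the three characters Slack treats as control characters."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def build_slack_payload(records: list, limit: int = 3) -> dict:
    """Build a Slack message listing the most upvoted papers."""
    top = sorted(records, key=lambda r: r.get("upvotes", 0), reverse=True)[:limit]
    lines = ["Top Hugging Face papers today:"]
    for paper in top:
        # <url|label> is Slack's link syntax.
        title = escape_slack(paper["title"])
        lines.append(f"- <{paper['url']}|{title}> ({paper.get('upvotes', 0)} upvotes)")
    return {"text": "\n".join(lines)}


if __name__ == "__main__":
    # Illustrative records; values are made up.
    sample = [
        {"title": "Paper A", "url": "https://huggingface.co/papers/0000.00001", "upvotes": 5},
        {"title": "Paper B", "url": "https://huggingface.co/papers/0000.00002", "upvotes": 9},
    ]
    print(json.dumps(build_slack_payload(sample, limit=2), indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of an HTTP POST to a webhook URL you configure in Slack.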
Recommended Actors
Looking for more data collection tools? Check out these related actors:
| Actor | Description | Link |
|---|---|---|
| Hugging Face Model Scraper | Collect AI model metadata | Link |
| Apple App Store Scraper | App listings and ratings | Link |
| Stripe App Marketplace Scraper | Stripe app listings | Link |
| AWS Marketplace Scraper | AWS product listings | Link |
| HubSpot Marketplace Scraper | HubSpot app listings | Link |
Pro Tip: Browse the full ParseForge catalog to find more data tools.
Need Help?
- Check the FAQ section above for common questions
- Visit the Apify documentation for platform guides
- Contact us through the Tally contact form
Disclaimer
This Actor is an independent tool and is not affiliated with, endorsed by, or connected to Hugging Face, arXiv, or any paper author. It collects only publicly available paper metadata.