arXiv Paper Scraper — Search, New Submissions & Author Papers
Pricing
Pay per usage
Scrape arXiv.org for academic papers: full-text search, new daily submissions by category, paper details by ID, author publications. Extracts titles, abstracts, authors, categories, PDF links, DOIs. Uses official arXiv API — fast, reliable, no browser needed.
Developer: OpenClaw Mara (maintained by Community)
arXiv Paper Scraper
Scrape academic papers from arXiv — the premier open-access preprint server with 2.4M+ papers across physics, mathematics, computer science, and more. Uses the official arXiv API for fast, structured paper extraction.
What can it do?
- Search papers — Full-text search with category filtering and sorting by date or relevance
- New submissions — Today's freshly submitted papers by category
- Paper details — Full metadata for specific papers by arXiv ID
- Author papers — All publications by a given researcher
Why use this scraper?
- 📄 Open access — Every paper on arXiv is free, with direct PDF links
- 🔬 Cutting-edge research — Papers often appear here before they are published in journals
- 🏷️ Category system — 150+ categories from cs.AI to quant-ph
- ⚡ API-based — Official arXiv API, no browser automation
- 📊 Structured output — Authors, abstracts, categories, DOIs, journal references (citation counts are not available from arXiv)
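The actor's own implementation is not shown on this page. As a rough illustration of what an API-based approach can look like, the sketch below builds a query URL for the official arXiv API and parses the Atom feed it returns. The function names (`build_query_url`, `fetch`, `parse_feed`) and the choice of fields are assumptions made for this example, not the actor's code.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace used by the API response


def build_query_url(query, category=None, max_results=10,
                    sort_by="submittedDate", sort_order="descending"):
    """Build an arXiv API URL. `all:` searches all metadata fields."""
    search = f'all:"{query}"'
    if category:
        search += f" AND cat:{category}"
    params = {
        "search_query": search,
        "start": 0,
        "max_results": max_results,
        "sortBy": sort_by,        # relevance | lastUpdatedDate | submittedDate
        "sortOrder": sort_order,  # ascending | descending
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def fetch(url):
    """Download the Atom feed as text (performs a network request)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def parse_feed(atom_xml):
    """Extract a few fields from each <entry> of an Atom response."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall(f"{ATOM}entry"):
        papers.append({
            "htmlUrl": entry.findtext(f"{ATOM}id"),
            # Titles may be wrapped across lines; collapse the whitespace.
            "title": " ".join(entry.findtext(f"{ATOM}title", "").split()),
            "authors": [a.findtext(f"{ATOM}name")
                        for a in entry.findall(f"{ATOM}author")],
            "categories": [c.get("term")
                           for c in entry.findall(f"{ATOM}category")],
            "published": entry.findtext(f"{ATOM}published"),
        })
    return papers
```

A call such as `parse_feed(fetch(build_query_url("reinforcement learning", category="cs.LG")))` would return a list of dictionaries. The field names here only loosely mirror the actor's output and are not guaranteed to match it.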
Input examples
Search for papers
{"mode": "search","searchQuery": "large language models","maxResults": 50,"sortBy": "submittedDate","sortOrder": "descending"}
Search within a category
{"mode": "search","searchQuery": "reinforcement learning","category": "cs.LG","maxResults": 30}
Today's new submissions
{"mode": "new_submissions","category": "cs.AI","maxResults": 100}
Get specific papers by ID
{"mode": "paper_details","arxivIds": ["1706.03762", "2301.00234", "2005.14165"]}
Papers by an author
{"mode": "author","authorName": "Yann LeCun","maxResults": 50}
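The inputs above can also be assembled and submitted from Python. The following is a minimal sketch: the required field per mode is inferred from the examples above rather than from a published schema, `build_input` and `run_actor` are hypothetical helper names, and the actor ID and token are placeholders that you would need to replace. The `run_actor` helper uses the `apify-client` package.

```python
import json

# Required field per mode, inferred from the input examples.
# This mapping is an assumption, not a documented schema.
REQUIRED = {
    "search": "searchQuery",
    "new_submissions": "category",
    "paper_details": "arxivIds",
    "author": "authorName",
}


def build_input(mode, **fields):
    """Return an input dict for `mode`, checking the inferred required field."""
    if mode not in REQUIRED:
        raise ValueError(f"unknown mode: {mode!r}")
    required = REQUIRED[mode]
    if not fields.get(required):
        raise ValueError(f"mode {mode!r} needs {required!r}")
    return {"mode": mode, **fields}


def run_actor(actor_id, token, run_input):
    """Run the actor and return its dataset items (needs `pip install apify-client`)."""
    # Imported lazily so that build_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    run_input = build_input(
        "search",
        searchQuery="large language models",
        maxResults=50,
        sortBy="submittedDate",
        sortOrder="descending",
    )
    print(json.dumps(run_input, indent=2))
    # Placeholders: substitute the real actor ID and your own API token.
    # items = run_actor("<username>/<actor-name>", "<APIFY_TOKEN>", run_input)
```

Validating the input locally catches a missing field before a paid run is started.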
Output example
Search result
{"arxivId": "1706.03762","title": "Attention Is All You Need","abstract": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...","authors": ["Ashish Vaswani", "Noam Shazeer", "Niki Parmar", "Jakob Uszkoreit"],"primaryCategory": "cs.CL","categories": ["cs.CL", "cs.LG"],"published": "2017-06-12T17:57:34Z","updated": "2023-08-02T00:00:00Z","pdfUrl": "http://arxiv.org/pdf/1706.03762v7","htmlUrl": "http://arxiv.org/abs/1706.03762v7","doi": "10.48550/arXiv.1706.03762","comment": "15 pages, 5 tables","journalRef": "Advances in Neural Information Processing Systems 30 (2017)"}
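As an illustration of working with a record like the one above, the sketch below reads the version number from `pdfUrl`, parses the `published` timestamp, and builds a short citation string. The helper names are hypothetical, and the record is a trimmed copy of the example output; it assumes that `pdfUrl` ends in a `vN` version suffix, as it does in that example.

```python
import re
from datetime import datetime

# Trimmed copy of the example record (abstract and some fields omitted).
record = {
    "arxivId": "1706.03762",
    "title": "Attention Is All You Need",
    "authors": ["Ashish Vaswani", "Noam Shazeer", "Niki Parmar", "Jakob Uszkoreit"],
    "published": "2017-06-12T17:57:34Z",
    "pdfUrl": "http://arxiv.org/pdf/1706.03762v7",
}


def version_of(rec):
    """Return the version number from a pdfUrl ending in 'vN', or None."""
    match = re.search(r"v(\d+)$", rec["pdfUrl"])
    return int(match.group(1)) if match else None


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    # Replace 'Z' with an explicit offset so this also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def short_citation(rec):
    """Format 'Surname et al. (year). Title. arXiv:ID'."""
    authors = rec["authors"]
    surname = authors[0].split()[-1]
    names = f"{surname} et al." if len(authors) > 1 else surname
    year = parse_timestamp(rec["published"]).year
    return f"{names} ({year}). {rec['title']}. arXiv:{rec['arxivId']}"


print(version_of(record))      # 7
print(short_citation(record))  # Vaswani et al. (2017). Attention Is All You Need. arXiv:1706.03762
```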
Tips
- Popular CS categories: cs.AI (AI), cs.LG (Machine Learning), cs.CL (NLP), cs.CV (Computer Vision), cs.SE (Software Engineering)
- new_submissions scrapes the daily RSS feed, which makes it well suited to monitoring research trends
- arXiv IDs can be old format with a 4-digit sequence number (0704.0001) or new format with a 5-digit sequence number (2301.00234)
- Sort by relevance for keyword matching, submittedDate for latest papers
- Combine with Semantic Scholar scraper for citation data (arXiv doesn't provide citation counts)
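The two ID formats mentioned in the tips can be checked before a `paper_details` run. This is a minimal sketch with hypothetical helper names. It covers only the `YYMM.NNNN` and `YYMM.NNNNN` forms, with an optional version suffix. Pre-2007 identifiers such as `hep-th/9901001` are not handled, and whether the actor accepts them is not stated here.

```python
import re

# YYMM, a dot, a 4- or 5-digit sequence number, and an optional version (v1, v2, ...).
_ARXIV_ID = re.compile(r"^(\d{2})(0[1-9]|1[0-2])\.(\d{4,5})(v\d+)?$")


def classify_arxiv_id(arxiv_id):
    """Return 'old' (4-digit sequence), 'new' (5-digit sequence), or None if invalid."""
    match = _ARXIV_ID.match(arxiv_id.strip())
    if not match:
        return None
    return "old" if len(match.group(3)) == 4 else "new"


def strip_version(arxiv_id):
    """Drop a trailing version suffix, e.g. '1706.03762v7' -> '1706.03762'."""
    return re.sub(r"v\d+$", "", arxiv_id.strip())


for candidate in ["0704.0001", "2301.00234", "1706.03762v7", "not-an-id"]:
    print(candidate, classify_arxiv_id(candidate))
```

Filtering out entries for which `classify_arxiv_id` returns `None` avoids spending a run on malformed IDs.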