ArXiv Scraper - Extract Research Papers, Authors & Citations
Pricing
Pay per usage
Scrape arXiv papers, including titles, abstracts, authors, categories, and PDF links. Extract research papers from physics, mathematics, computer science, and more. Ideal for academic research, literature reviews, AI/ML paper tracking, and scientific trend analysis.
Developer: Fatih Dağüstü
Last modified: 2 days ago
arXiv Scraper — Extract Research Papers, Abstracts & Citations
Scrape arXiv for research papers, abstracts, author information, and citation data. Perfect for literature reviews, research monitoring, and academic data collection.
Why Use This Actor?
- Structured output — Get clean, structured data from arXiv's paper listings
- Bulk extraction — Scrape entire categories, search results, or author papers
- Beyond the API — Faster and more flexible than arXiv's rate-limited API
- Full metadata — Titles, abstracts, authors, categories, PDF links, and dates
Features
- Paper Search — Search arXiv papers by keyword, title, or author
- Category Browsing — Extract recent papers from any arXiv category (cs.AI, physics, math, etc.)
- Paper Details — Title, abstract, authors, submission date, categories, PDF/HTML links
- Author Papers — Get all papers by a specific author
- Date Filtering — Filter papers by submission date range
- Sorting — Sort by relevance, submission date, or last updated
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| searchQuery | string | Yes | Search term for papers (e.g., 'large language models', 'quantum computing'). |
| searchField | string | No | Which field to search in. |
| category | string | No | arXiv category (e.g., cs.AI, cs.LG, physics, math, q-bio). Leave empty for all. |
| sortBy | string | No | How to sort results. |
| maxItems | integer | No | Maximum number of papers to scrape. |
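
A minimal sketch of assembling the actor input in Python, using the parameter names from the table above. The helper `build_run_input` and its validation rules are illustrative assumptions, not part of the actor. The accepted values for `searchField` and `sortBy` are not listed here, so the sketch passes them through unchanged.

```python
from typing import Any, Dict, Optional


def build_run_input(
    search_query: str,
    search_field: Optional[str] = None,
    category: Optional[str] = None,
    sort_by: Optional[str] = None,
    max_items: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an input object keyed by the parameter names from the table.

    Hypothetical helper: the checks below are assumptions, not behaviour
    documented for the actor.
    """
    if not search_query or not search_query.strip():
        raise ValueError("searchQuery is required")
    if max_items is not None and max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    run_input: Dict[str, Any] = {"searchQuery": search_query.strip()}
    # Optional parameters are included only when set, so the actor's own
    # defaults apply to anything left out.
    optional = {
        "searchField": search_field,
        "category": category,
        "sortBy": sort_by,
        "maxItems": max_items,
    }
    run_input.update({k: v for k, v in optional.items() if v is not None})
    return run_input


run_input = build_run_input("large language models", category="cs.AI", max_items=100)
print(run_input)
```

The resulting dictionary can be serialized to JSON for the actor's input editor, or passed as the run input when starting the actor through the Apify API.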
Output Example
    {
      "type": "paper",
      "title": "Attention Is All You Need: Revisited for Modern LLM Architectures",
      "authors": ["Jane Smith", "John Doe", "Alice Johnson"],
      "abstract": "We present a comprehensive analysis of attention mechanisms in modern large language model architectures...",
      "categories": ["cs.CL", "cs.AI", "cs.LG"],
      "primaryCategory": "cs.CL",
      "submittedDate": "2026-02-15",
      "arxivId": "2602.12345",
      "pdfUrl": "https://arxiv.org/pdf/2602.12345",
      "abstractUrl": "https://arxiv.org/abs/2602.12345"
    }
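
Downstream code will often want typed records rather than raw dictionaries. The sketch below parses the example record above into a dataclass. The `Paper` class and `parse_paper` function are hypothetical names introduced for illustration, and the sketch assumes every field shown in the example is present in each record.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Paper:
    title: str
    authors: List[str]
    abstract: str
    categories: List[str]
    primary_category: str
    submitted: date
    arxiv_id: str
    pdf_url: str
    abstract_url: str


def parse_paper(record: dict) -> Paper:
    """Convert one output record into a Paper (raises KeyError on a missing field)."""
    return Paper(
        title=record["title"],
        authors=list(record["authors"]),
        abstract=record["abstract"],
        categories=list(record["categories"]),
        primary_category=record["primaryCategory"],
        # submittedDate is an ISO date string in the example, e.g. "2026-02-15".
        submitted=date.fromisoformat(record["submittedDate"]),
        arxiv_id=record["arxivId"],
        pdf_url=record["pdfUrl"],
        abstract_url=record["abstractUrl"],
    )


# The example record from this page, copied verbatim.
SAMPLE = """
{"type": "paper","title": "Attention Is All You Need: Revisited for Modern LLM Architectures","authors": ["Jane Smith", "John Doe", "Alice Johnson"],"abstract": "We present a comprehensive analysis of attention mechanisms in modern large language model architectures...","categories": ["cs.CL", "cs.AI", "cs.LG"],"primaryCategory": "cs.CL","submittedDate": "2026-02-15","arxivId": "2602.12345","pdfUrl": "https://arxiv.org/pdf/2602.12345","abstractUrl": "https://arxiv.org/abs/2602.12345"}
"""

paper = parse_paper(json.loads(SAMPLE))
print(paper.arxiv_id, paper.submitted.isoformat(), paper.primary_category)
```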
Use Cases
- Literature Review — Quickly collect papers for systematic reviews and meta-analyses
- Research Monitoring — Track new publications in your field daily
- Trend Analysis — Identify emerging research topics by analyzing paper volumes
- AI/ML Tracking — Monitor the latest machine learning and AI research
- Academic Dashboards — Build research tracking dashboards for your team
- Dataset Building — Create datasets of paper abstracts for NLP research
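
As a rough illustration of the trend-analysis use case, the sketch below counts papers per month and primary category. It relies only on the `submittedDate` and `primaryCategory` fields from the output format. The function name `monthly_counts` and the sample records are made up for the example and are not real scrape results.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def monthly_counts(records: Iterable[Dict[str, str]]) -> Counter:
    """Count records per (YYYY-MM, primaryCategory) pair."""
    counts: Counter = Counter()
    for record in records:
        month = record["submittedDate"][:7]  # "2026-02-15" -> "2026-02"
        key: Tuple[str, str] = (month, record["primaryCategory"])
        counts[key] += 1
    return counts


# Illustrative records only; real input would come from the actor's dataset.
records = [
    {"submittedDate": "2026-01-20", "primaryCategory": "cs.CL"},
    {"submittedDate": "2026-02-03", "primaryCategory": "cs.CL"},
    {"submittedDate": "2026-02-15", "primaryCategory": "cs.CL"},
    {"submittedDate": "2026-02-15", "primaryCategory": "cs.LG"},
]

counts = monthly_counts(records)
for (month, category), n in sorted(counts.items()):
    print(month, category, n)
```

Running the same count on a schedule and comparing months is one simple way to spot categories whose volume is growing.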
Cost Estimation
| Scale | Estimated Cost | Time |
|---|---|---|
| 100 papers | ~$0.05 | ~1 minute |
| 1,000 search results | ~$0.20 | ~4 minutes |
| Category listing (recent 500) | ~$0.10 | ~2 minutes |
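
The per-item rates implied by the table can be worked out directly. The sketch below uses only the figures above; because they are marked as approximate, the derived rates are rough guides rather than guaranteed prices. The names `COST_TABLE`, `per_item_cost`, and `items_per_minute` are introduced here for illustration.

```python
from decimal import Decimal

# Rows copied from the Cost Estimation table:
# (label, number of items, approximate cost in USD, approximate minutes)
COST_TABLE = [
    ("100 papers", 100, Decimal("0.05"), 1),
    ("1,000 search results", 1000, Decimal("0.20"), 4),
    ("Category listing (recent 500)", 500, Decimal("0.10"), 2),
]


def per_item_cost(items: int, cost: Decimal) -> Decimal:
    """Approximate cost in USD for a single item."""
    return cost / items


def items_per_minute(items: int, minutes: int) -> float:
    """Approximate throughput in items per minute."""
    return items / minutes


for label, items, cost, minutes in COST_TABLE:
    rate = per_item_cost(items, cost)
    speed = items_per_minute(items, minutes)
    print(f"{label}: ~${rate} per item, ~{speed:.0f} items per minute")
```

On these figures the smallest run has the highest per-item cost, which would be consistent with a fixed start-up overhead per run, although the table itself does not say so.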
Support
For issues, feature requests, or custom scraping needs, contact us at fatihdagustu20@gmail.com.
Built with Crawlee and Apify SDK. Maintained and updated regularly.