ArXiv Scraper - Extract Research Papers, Authors & Citations

Pricing

Pay per usage

Scrape ArXiv academic papers including titles, abstracts, authors, categories, and PDF links. Extract research papers from physics, mathematics, computer science, and more. Ideal for academic research, literature reviews, AI/ML paper tracking, and scientific trend analysis.

Rating: 0.0 (0)

Developer: Fatih Dağüstü (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 0
  • Last modified: 2 days ago

arXiv Scraper — Extract Research Papers, Abstracts & Citations

Scrape arXiv for research papers, abstracts, author information, and citation data. Perfect for literature reviews, research monitoring, and academic data collection.

Why Use This Actor?

  • Structured output — Get clean, structured data from arXiv's paper listings
  • Bulk extraction — Scrape entire categories, search results, or author papers
  • Beyond the API — Faster and more flexible than arXiv's rate-limited API
  • Full metadata — Titles, abstracts, authors, categories, PDF links, and dates

Features

  • Paper Search — Search arXiv papers by keyword, title, or author
  • Category Browsing — Extract recent papers from any arXiv category (cs.AI, physics, math, etc.)
  • Paper Details — Title, abstract, authors, submission date, categories, PDF/HTML links
  • Author Papers — Get all papers by a specific author
  • Date Filtering — Filter papers by submission date range
  • Sorting — Sort by relevance, submission date, or last updated

Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| searchQuery | string | Yes | Search term for papers (e.g., 'large language models', 'quantum computing'). |
| searchField | string | No | Which field to search in. |
| category | string | No | ArXiv category (e.g., cs.AI, cs.LG, physics, math, q-bio). Leave empty for all. |
| sortBy | string | No | How to sort results. |
| maxItems | integer | No | Maximum number of papers to scrape. |
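
As a rough illustration of how these parameters fit together, the Python sketch below assembles an input object using the parameter names from the table. The helper `build_run_input` and its validation rules are assumptions made for illustration and are not part of the Actor. Accepted values for `searchField` and `sortBy` are not listed here, so the sketch passes them through unchanged.

```python
import json
from typing import Any, Dict, Optional


def build_run_input(
    search_query: str,
    search_field: Optional[str] = None,
    category: Optional[str] = None,
    sort_by: Optional[str] = None,
    max_items: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an input object keyed by the parameter names in the table.

    Hypothetical helper: the checks below are assumptions, not documented
    behaviour of the Actor.
    """
    if not search_query or not search_query.strip():
        raise ValueError("searchQuery is required")
    if max_items is not None and max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    run_input: Dict[str, Any] = {"searchQuery": search_query.strip()}
    # Optional fields are included only when set, so the Actor's own
    # defaults apply to anything left out.
    optional = {
        "searchField": search_field,
        "category": category,
        "sortBy": sort_by,
        "maxItems": max_items,
    }
    run_input.update({key: value for key, value in optional.items() if value is not None})
    return run_input


if __name__ == "__main__":
    example = build_run_input("large language models", category="cs.AI", max_items=100)
    print(json.dumps(example, indent=2))
```

The resulting dictionary could then be supplied as the run input when starting the Actor, for example through the Apify Console or an API client.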

Output Example

{
  "type": "paper",
  "title": "Attention Is All You Need: Revisited for Modern LLM Architectures",
  "authors": ["Jane Smith", "John Doe", "Alice Johnson"],
  "abstract": "We present a comprehensive analysis of attention mechanisms in modern large language model architectures...",
  "categories": ["cs.CL", "cs.AI", "cs.LG"],
  "primaryCategory": "cs.CL",
  "submittedDate": "2026-02-15",
  "arxivId": "2602.12345",
  "pdfUrl": "https://arxiv.org/pdf/2602.12345",
  "abstractUrl": "https://arxiv.org/abs/2602.12345"
}
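
For downstream processing it can help to load each record into a typed structure. The following is a minimal sketch, assuming every record carries the fields shown in the example above and that `submittedDate` is always an ISO `YYYY-MM-DD` string. The `Paper` class and `parse_paper` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, List


@dataclass(frozen=True)
class Paper:
    """Typed view of one output record (field names mirror the example)."""
    arxiv_id: str
    title: str
    authors: List[str]
    abstract: str
    categories: List[str]
    primary_category: str
    submitted: date
    pdf_url: str
    abstract_url: str


def parse_paper(record: Dict[str, Any]) -> Paper:
    """Convert a raw record into a Paper; raises KeyError if a field is missing."""
    return Paper(
        arxiv_id=record["arxivId"],
        title=record["title"],
        authors=list(record["authors"]),
        abstract=record["abstract"],
        categories=list(record["categories"]),
        primary_category=record["primaryCategory"],
        submitted=date.fromisoformat(record["submittedDate"]),
        pdf_url=record["pdfUrl"],
        abstract_url=record["abstractUrl"],
    )


# Record copied from the output example, with the abstract shortened.
SAMPLE_RECORD: Dict[str, Any] = {
    "type": "paper",
    "title": "Attention Is All You Need: Revisited for Modern LLM Architectures",
    "authors": ["Jane Smith", "John Doe", "Alice Johnson"],
    "abstract": "We present a comprehensive analysis of attention mechanisms...",
    "categories": ["cs.CL", "cs.AI", "cs.LG"],
    "primaryCategory": "cs.CL",
    "submittedDate": "2026-02-15",
    "arxivId": "2602.12345",
    "pdfUrl": "https://arxiv.org/pdf/2602.12345",
    "abstractUrl": "https://arxiv.org/abs/2602.12345",
}

if __name__ == "__main__":
    paper = parse_paper(SAMPLE_RECORD)
    print(paper.arxiv_id, paper.submitted.isoformat(), paper.primary_category)
```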

Use Cases

  • Literature Review — Quickly collect papers for systematic reviews and meta-analyses
  • Research Monitoring — Track new publications in your field daily
  • Trend Analysis — Identify emerging research topics by analyzing paper volumes
  • AI/ML Tracking — Monitor the latest machine learning and AI research
  • Academic Dashboards — Build research tracking dashboards for your team
  • Dataset Building — Create datasets of paper abstracts for NLP research
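
As one possible take on the Trend Analysis use case, the sketch below counts papers per month and primary category from scraped records. It assumes each record has the `submittedDate` and `primaryCategory` fields shown in the output example. The function name `monthly_volumes` and the sample records are made up for illustration.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Tuple


def monthly_volumes(records: Iterable[Dict[str, Any]]) -> "Counter[Tuple[str, str]]":
    """Count papers per (YYYY-MM, primaryCategory) pair."""
    counts: "Counter[Tuple[str, str]]" = Counter()
    for record in records:
        # submittedDate is assumed to be "YYYY-MM-DD"; keep the "YYYY-MM" prefix.
        month = record["submittedDate"][:7]
        counts[(month, record["primaryCategory"])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical records, reduced to the two fields the function reads.
    sample = [
        {"submittedDate": "2026-02-15", "primaryCategory": "cs.CL"},
        {"submittedDate": "2026-02-20", "primaryCategory": "cs.CL"},
        {"submittedDate": "2026-03-01", "primaryCategory": "cs.LG"},
    ]
    for (month, category), count in sorted(monthly_volumes(sample).items()):
        print(month, category, count)
```

Comparing these counts across consecutive months gives a simple view of which categories are growing.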

Cost Estimation

| Scale | Estimated Cost | Time |
|---|---|---|
| 100 papers | ~$0.05 | ~1 minute |
| 1,000 search results | ~$0.20 | ~4 minutes |
| Category listing (recent 500) | ~$0.10 | ~2 minutes |
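
To compare the rows, the short sketch below derives an approximate per-paper cost and throughput from the figures in the table. The `CostRow` type and helper functions are illustrative names. The numbers are the table's own estimates, not a pricing formula.

```python
from typing import List, NamedTuple


class CostRow(NamedTuple):
    label: str
    papers: int
    cost_usd: float
    minutes: float


# Figures copied from the Cost Estimation table; all are approximate.
ROWS: List[CostRow] = [
    CostRow("100 papers", 100, 0.05, 1),
    CostRow("1,000 search results", 1000, 0.20, 4),
    CostRow("Category listing (recent 500)", 500, 0.10, 2),
]


def per_paper_cost(row: CostRow) -> float:
    """Estimated cost in USD for a single paper at this scale."""
    return row.cost_usd / row.papers


def papers_per_minute(row: CostRow) -> float:
    """Estimated number of papers scraped per minute at this scale."""
    return row.papers / row.minutes


if __name__ == "__main__":
    for row in ROWS:
        print(f"{row.label}: ~${per_paper_cost(row):.4f} per paper, "
              f"~{papers_per_minute(row):.0f} papers per minute")
```

By these estimates the per-paper cost is roughly $0.0005 for the smallest run and roughly $0.0002 for the two larger ones, so extrapolating linearly from the 100-paper row would likely overestimate the cost of bigger jobs.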

Support

For issues, feature requests, or custom scraping needs, contact us at fatihdagustu20@gmail.com.


Built with Crawlee and Apify SDK. Maintained and updated regularly.