arXiv Research Papers Tracker
Pricing
Pay per usage
Search and extract academic papers from arXiv by category, keyword, and date range. Returns each paper's title, authors, abstract, categories, published date, and PDF URL. Ideal for AI/ML research monitoring and training-data collection.
Developer
陈俊杰
Maintained by Community
An Apify Actor that searches and extracts academic papers from arXiv by category, keyword, and date range. Ideal for AI/ML research monitoring, literature reviews, and training-data collection.
Features
- Category search: search one or more arXiv categories (e.g. cs.AI,cs.LG,stat.ML).
- Keyword filtering: narrow results to papers whose title or abstract contains specific terms.
- Pagination: automatically fetches up to 200 results, with a polite 3-second delay between pages.
- Rich output: returns title, authors, abstract, categories, published/updated dates, PDF URL, and arXiv ID.
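The category search, keyword filter, and paginated fetching described above can be sketched against the public arXiv API as follows. This is a minimal illustration, not the Actor's actual source: the page size of 50, the choice to AND keyword terms together, and the helper names `build_search_query`, `page_urls`, and `fetch_all` are all assumptions. Only the 200-result cap and the 3-second delay come from the feature list.

```python
import time
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"  # public arXiv API endpoint
PAGE_SIZE = 50       # assumed page size; the Actor's real value is not documented
DELAY_SECONDS = 3    # delay between pages, as stated in the feature list
HARD_LIMIT = 200     # documented cap on the number of results


def build_search_query(categories: str, keywords: str = "") -> str:
    """Combine category codes (OR) with keyword terms (AND) in arXiv query syntax."""
    cats = [c.strip() for c in categories.split(",") if c.strip()]
    if not cats:
        raise ValueError("at least one category code is required")
    query = "(" + " OR ".join(f"cat:{c}" for c in cats) + ")"
    terms = keywords.split()
    if terms:
        # Assumption: every term must appear in the title or the abstract.
        query += " AND " + " AND ".join(f"(ti:{t} OR abs:{t})" for t in terms)
    return query


def page_urls(categories, keywords="", max_results=50, sort_by="submittedDate"):
    """Yield one request URL per page until the (capped) result count is covered."""
    total = min(max_results, HARD_LIMIT)
    query = build_search_query(categories, keywords)
    for start in range(0, total, PAGE_SIZE):
        params = {
            "search_query": query,
            "start": start,
            "max_results": min(PAGE_SIZE, total - start),
            "sortBy": sort_by,
            "sortOrder": "descending",
        }
        yield f"{ARXIV_API}?{urlencode(params)}"


def fetch_all(fetch, *args, sleep=time.sleep, **kwargs):
    """Call fetch(url) once per page, sleeping between consecutive pages."""
    pages = []
    for i, url in enumerate(page_urls(*args, **kwargs)):
        if i:
            sleep(DELAY_SECONDS)
        pages.append(fetch(url))
    return pages
```

Passing `fetch` and `sleep` in as parameters keeps the sketch free of network access, so any HTTP client can be plugged in and the delay can be stubbed out in tests.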
Input
| Field | Type | Default | Description |
|---|---|---|---|
| categories | string | cs.AI,cs.LG,stat.ML | Comma-separated arXiv category codes |
| keywords | string | (optional) | Space-separated search terms (title/abstract) |
| max_results | integer | 50 | Maximum number of papers (≤ 200) |
| sort_by | enum | submittedDate | submittedDate or relevance |
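To make the defaults and limits in the table concrete, here is a small sketch of how an input object might be normalized before use. The function `normalize_input` and its behaviour on invalid values (raising on an unknown `sort_by`, clamping `max_results` into 1..200) are assumptions for illustration; the Actor's own validation is not documented here.

```python
DEFAULTS = {
    "categories": "cs.AI,cs.LG,stat.ML",
    "keywords": "",
    "max_results": 50,
    "sort_by": "submittedDate",
}
ALLOWED_SORT = {"submittedDate", "relevance"}
MAX_RESULTS_CAP = 200  # upper bound from the input table


def normalize_input(raw: dict) -> dict:
    """Apply documented defaults and limits to a raw input dict (hypothetical helper)."""
    # Missing or None fields fall back to the documented defaults.
    merged = {**DEFAULTS, **{k: v for k, v in raw.items() if v is not None}}
    if merged["sort_by"] not in ALLOWED_SORT:
        raise ValueError(f"sort_by must be one of {sorted(ALLOWED_SORT)}")
    # Assumption: out-of-range values are clamped rather than rejected.
    merged["max_results"] = max(1, min(int(merged["max_results"]), MAX_RESULTS_CAP))
    # Tidy the category list: strip whitespace and drop empty entries.
    merged["categories"] = ",".join(
        c.strip() for c in merged["categories"].split(",") if c.strip()
    )
    return merged


example = normalize_input({"categories": "cs.CL, cs.IR", "max_results": 500})
```

In this example, `example["max_results"]` ends up as 200 and `example["categories"]` as `cs.CL,cs.IR`, with the remaining fields taken from the defaults.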
Output
Each result is a JSON object pushed to the Apify dataset with the following fields:
| Field | Type | Description |
|---|---|---|
| id | string | arXiv identifier (e.g. 2101.12345) |
| url | string | arXiv abstract page URL |
| title | string | Paper title |
| authors | string[] | List of author names |
| abstract | string | Paper abstract / summary |
| categories | string | Comma-separated category codes |
| primary_category | string | Primary arXiv category |
| published | string | Original publication date (ISO 8601) |
| updated | string | Last update date (ISO 8601) |
| pdf_url | string | Direct link to the PDF |
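As an illustration of how these fields relate to the Atom feed returned by the arXiv API, the sketch below maps a single `<entry>` element to a record with the field names from the table. The sample entry is made-up placeholder data, `entry_to_record` is a hypothetical helper, and stripping the version suffix from the ID is an assumption based on the example identifier in the table.

```python
import re
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

# Made-up placeholder entry in the shape of an arXiv API Atom response.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:arxiv="http://arxiv.org/schemas/atom">
  <id>http://arxiv.org/abs/2101.12345v1</id>
  <updated>2021-02-01T00:00:00Z</updated>
  <published>2021-01-29T00:00:00Z</published>
  <title>An Illustrative
    Title</title>
  <summary>  Placeholder abstract. </summary>
  <author><name>A. Author</name></author>
  <author><name>B. Author</name></author>
  <link href="http://arxiv.org/abs/2101.12345v1" rel="alternate" type="text/html"/>
  <link title="pdf" href="http://arxiv.org/pdf/2101.12345v1" rel="related"/>
  <arxiv:primary_category term="cs.LG"/>
  <category term="cs.LG"/>
  <category term="stat.ML"/>
</entry>"""


def entry_to_record(entry: ET.Element) -> dict:
    """Convert one Atom <entry> into a dict using the output field names."""
    def text(tag: str) -> str:
        # Collapse the line breaks and indentation found in feed text.
        return " ".join(entry.findtext(tag, default="", namespaces=NS).split())

    url = text("atom:id")
    # Assumption: the version suffix (v1, v2, ...) is dropped from the ID.
    arxiv_id = re.sub(r"v\d+$", "", url.split("/abs/", 1)[-1])
    primary = entry.find("arxiv:primary_category", NS)
    pdf_url = next(
        (link.get("href") for link in entry.findall("atom:link", NS)
         if link.get("title") == "pdf"),
        "",
    )
    return {
        "id": arxiv_id,
        "url": url,
        "title": text("atom:title"),
        "authors": [
            a.findtext("atom:name", default="", namespaces=NS).strip()
            for a in entry.findall("atom:author", NS)
        ],
        "abstract": text("atom:summary"),
        "categories": ",".join(
            c.get("term", "") for c in entry.findall("atom:category", NS)
        ),
        "primary_category": primary.get("term", "") if primary is not None else "",
        "published": text("atom:published"),
        "updated": text("atom:updated"),
        "pdf_url": pdf_url,
    }


record = entry_to_record(ET.fromstring(SAMPLE_ENTRY))
```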
Common arXiv Category Codes
Computer Science (cs.*)
| Code | Description |
|---|---|
| cs.AI | Artificial Intelligence |
| cs.AR | Hardware Architecture |
| cs.CC | Computational Complexity |
| cs.CE | Computational Engineering, Finance, and Science |
| cs.CL | Computation and Language (NLP) |
| cs.CR | Cryptography and Security |
| cs.CV | Computer Vision and Pattern Recognition |
| cs.CY | Computers and Society |
| cs.DB | Databases |
| cs.DC | Distributed, Parallel, and Cluster Computing |
| cs.DL | Digital Libraries |
| cs.DS | Data Structures and Algorithms |
| cs.ET | Emerging Technologies |
| cs.GL | General Literature |
| cs.GT | Computer Science and Game Theory |
| cs.HC | Human-Computer Interaction |
| cs.IR | Information Retrieval |
| cs.IT | Information Theory |
| cs.LG | Machine Learning |
| cs.LO | Logic in Computer Science |
| cs.MA | Multiagent Systems |
| cs.NE | Neural and Evolutionary Computing |
| cs.NI | Networking and Internet Architecture |
| cs.PL | Programming Languages |
| cs.RO | Robotics |
| cs.SE | Software Engineering |
| cs.SI | Social and Information Networks |
| cs.SY | Systems and Control |
Statistics (stat.*)
| Code | Description |
|---|---|
| stat.AP | Applications |
| stat.CO | Computation |
| stat.ME | Methodology |
| stat.ML | Machine Learning |
| stat.TH | Statistics Theory |
Mathematics (math.*)
| Code | Description |
|---|---|
| math.NA | Numerical Analysis |
| math.OC | Optimization and Control |
| math.PR | Probability |
| math.ST | Statistics Theory |
Physics (physics.*) & Other
| Code | Description |
|---|---|
| physics.* | Various physics sub-disciplines |
| q-fin.* | Quantitative Finance |
| q-bio.* | Quantitative Biology |
| eess.* | Electrical Engineering and Systems Science |
See the full arXiv category list.
Local Development
# Clone / navigate to the project
cd ~/apify-actors/arxiv-papers-scraper
# Install dependencies
pip install -r requirements.txt
# Run the actor (requires Apify API token when using Apify platform features)
python -m src
To run with custom input via the Apify CLI:
apify run
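For local runs, the Apify CLI usually reads the Actor input from an `INPUT.json` file in the project's default key-value store. The sketch below writes such a file using the input fields documented above. The path `storage/key_value_stores/default/INPUT.json` is an assumption that depends on the CLI version in use, and `write_local_input` is a hypothetical helper, not part of this project.

```python
import json
from pathlib import Path


def write_local_input(project_dir, actor_input: dict) -> Path:
    """Write actor_input where a local `apify run` is assumed to look for it."""
    # Assumed location of the default key-value store for local runs.
    path = Path(project_dir) / "storage" / "key_value_stores" / "default" / "INPUT.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(actor_input, indent=2), encoding="utf-8")
    return path


# Example call (run from the project root before `apify run`):
# write_local_input(".", {
#     "categories": "cs.CL,cs.IR",
#     "keywords": "retrieval",
#     "max_results": 100,
#     "sort_by": "submittedDate",
# })
```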
License
MIT