AI Papers Tracker (arXiv + PWC)
Pricing
Pay per usage
Track new AI / agent / LLM research papers from arXiv + Papers With Code, filterable by keywords. Ranked by trending score (recency + match + category + code attached). Daily refresh for researchers and operators.
Developer
Yanlong Mu
Maintained by Community
AI Papers Tracker (arXiv + Papers With Code)
Track new AI / agent / LLM research papers from arXiv + Papers With Code, filtered by your own keyword list. Returns a ranked, daily-refreshable dataset of the most relevant recent work.
What does AI Papers Tracker do?
AI, agent, and LLM research now produces hundreds of new arXiv preprints per week, too many to triage by hand. This Actor tracks two high-signal feeds: arXiv (cs.AI / cs.CL / cs.LG / cs.MA / cs.SE) and Papers With Code (papers that ship runnable code). It filters them by your own keyword list (e.g. llm agent, agent verification, code generation evaluation) and returns a deduplicated, scored, ranked dataset.
You get papers from the last 30 days (the default window), ranked by trending score: recency + how many of your keywords matched + relevant arXiv category + whether code is attached. Pair it with the Apify scheduler to run it daily and you have a personal arXiv digest scoped to your niche. Apify gives you free scheduling, a REST API, webhooks, and integrations (Slack / Make / Zapier) on top.
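The deduplication step mentioned above can be sketched as follows. This is a minimal illustration, not the Actor's actual code: the function name `dedupe_by_arxiv_id`, the fallback to `url` for rows without an arXiv ID, and the merge rules (union of `matchedKeywords`, logical OR of `hasCode`) are all assumptions.

```python
from typing import Iterable


def dedupe_by_arxiv_id(rows: Iterable[dict]) -> list[dict]:
    """Merge rows that share an arxivId, keeping the first row seen as the base."""
    merged: dict[str, dict] = {}
    for row in rows:
        # Assumption: rows without an arXiv ID are keyed by their URL instead.
        key = row.get("arxivId") or row["url"]
        if key not in merged:
            merged[key] = {**row, "matchedKeywords": list(row.get("matchedKeywords", []))}
            continue
        kept = merged[key]
        # Union the keywords so a paper found by two searches lists both.
        for kw in row.get("matchedKeywords", []):
            if kw not in kept["matchedKeywords"]:
                kept["matchedKeywords"].append(kw)
        # A paper has code if any source says so.
        kept["hasCode"] = kept.get("hasCode", False) or row.get("hasCode", False)
    return list(merged.values())
```

Merging rather than discarding duplicates matters for ranking, because the number of matched keywords feeds into the trending score.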
Why use AI Papers Tracker?
- AI researchers: a personalized "what's new in my subfield this week" digest, no manual scraping
- Journalists & analysts: discover papers worth covering before they hit Twitter
- Founders & operators: track benchmarks (agent coding, verification, eval) so your product does not lag the frontier
- Investors: map which labs / authors keep publishing in a thesis area
- Course / curriculum builders: weekly refresh of recommended reading
How to use AI Papers Tracker
- Open the Input tab
- Edit the Keywords list: phrases that should appear in titles / abstracts (e.g. `RAG evaluation`, `tool-use agent`, `code review LLM`)
- Set Days back (default 30) and Max results (default 30)
- Click Start
- Download the Dataset or the human-readable `ai-papers-report.md` from the Storage tab
- Schedule it daily / weekly in the Schedules tab for a continuous feed
Input
- `keywords`: array of phrases. Each phrase is searched independently across arXiv title / abstract + PWC search. Results are deduped by arXiv ID.
- `daysBack`: only include papers submitted in the last N days (1-365)
- `maxResults`: cap the dataset to the top N scored papers (1-200)
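If you start runs programmatically, it can help to validate the input before sending it. The sketch below builds an input object using the three documented field names and ranges. The helper name `build_run_input` and the case-insensitive removal of repeated keywords are assumptions for illustration, not part of the Actor.

```python
def build_run_input(keywords, days_back=30, max_results=30):
    """Return an input dict using the documented field names and ranges."""
    cleaned = []
    for kw in keywords:
        kw = kw.strip()
        # Skip blanks and case-insensitive repeats; each keyword is a separate search.
        if kw and kw.lower() not in (c.lower() for c in cleaned):
            cleaned.append(kw)
    if not cleaned:
        raise ValueError("keywords must contain at least one non-empty phrase")
    if not 1 <= days_back <= 365:
        raise ValueError("daysBack must be between 1 and 365")
    if not 1 <= max_results <= 200:
        raise ValueError("maxResults must be between 1 and 200")
    return {"keywords": cleaned, "daysBack": days_back, "maxResults": max_results}


run_input = build_run_input(["RAG evaluation", "tool-use agent", "code review LLM"])
```

The resulting dict can be pasted into the Input tab as JSON or passed to whichever API client you use.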
Output
Each row in the dataset:
    {
      "title": "MAST: Multi-Agent System Failure Modes",
      "authors": ["Cemri et al."],
      "abstract": "First 400 chars of the abstract…",
      "arxivId": "2503.13657",
      "url": "https://arxiv.org/abs/2503.13657",
      "submittedAt": "2026-05-15",
      "categories": ["cs.AI", "cs.MA"],
      "source": "arxiv",
      "hasCode": true,
      "matchedKeywords": ["agent verification", "llm agent"],
      "trendingScore": 78
    }
You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.
Data table
| Field | Type | Description |
|---|---|---|
| title | string | Paper title |
| authors | string[] | Authors list (full for arXiv, first few for PWC) |
| abstract | string | Truncated to 400 chars |
| arxivId | string \| null | arXiv identifier (e.g. 2503.13657) when available |
| url | string | Direct link to the paper page |
| submittedAt | date | YYYY-MM-DD submission date |
| categories | string[] | arXiv categories (e.g. cs.AI) |
| source | string | arxiv or paperswithcode |
| hasCode | boolean | Paper appears on Papers With Code (i.e. ships code) |
| matchedKeywords | string[] | Which of your keywords matched this paper |
| trendingScore | number | 0-100 composite ranking score |
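Once you have the rows, a few lines of Python are enough to slice them. The sketch below keeps only papers that ship code and clear a score floor, using the documented `hasCode` and `trendingScore` fields. The function name, the thresholds, and the two "Hypothetical" sample rows are illustrative assumptions.

```python
def top_papers_with_code(rows, min_score=50, limit=5):
    """Keep rows that ship code and meet a score floor, highest score first."""
    keep = [
        r for r in rows
        if r.get("hasCode") and r.get("trendingScore", 0) >= min_score
    ]
    keep.sort(key=lambda r: r["trendingScore"], reverse=True)
    return keep[:limit]


# Rows shaped like the documented output; the second and third are made up.
sample_rows = [
    {"title": "MAST: Multi-Agent System Failure Modes", "source": "arxiv",
     "hasCode": True, "trendingScore": 78},
    {"title": "Hypothetical paper without code", "source": "arxiv",
     "hasCode": False, "trendingScore": 91},
    {"title": "Hypothetical low-scoring paper", "source": "paperswithcode",
     "hasCode": True, "trendingScore": 40},
]

for paper in top_papers_with_code(sample_rows):
    print(f'{paper["trendingScore"]:>3}  {paper["title"]}')
```

With a JSON export of the dataset, the same function works on the list returned by `json.load`.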
Trending score (max 100)
| Dimension | Max | What it measures |
|---|---|---|
| Recency | 50 | Closer to today within the window = higher |
| Keyword relevance | 32 | Number of distinct keywords matched (8 points each, so the cap is reached at 4 keywords) |
| Category fit | 10 | Paper is listed in cs.AI / cs.CL / cs.LG / cs.MA / cs.SE |
| Code attached | 8 | Paper appears on Papers With Code |
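To make the weighting concrete, here is one way the four dimensions could combine. Only the maximum points per dimension and the 8 points per keyword come from the table. The linear recency decay, the all-or-nothing category bonus, and the function name `trending_score` are assumptions, so the Actor's real scores may differ.

```python
from datetime import date

TRACKED_CATEGORIES = {"cs.AI", "cs.CL", "cs.LG", "cs.MA", "cs.SE"}


def trending_score(submitted_at: date, today: date, days_back: int,
                   matched_keywords: list[str], categories: list[str],
                   has_code: bool) -> int:
    """Combine the four documented dimensions into a 0-100 score."""
    age_days = (today - submitted_at).days
    # Assumption: recency decays linearly from 50 (today) to 0 (edge of window).
    freshness = min(1.0, max(0.0, 1 - age_days / days_back))
    recency = 50 * freshness
    # Documented: 8 points per distinct keyword, capped at 32.
    keyword = min(32, 8 * len(set(matched_keywords)))
    # Assumption: the full 10 points if any tracked category is present.
    category = 10 if TRACKED_CATEGORIES & set(categories) else 0
    code = 8 if has_code else 0
    return round(recency + keyword + category + code)


score = trending_score(
    submitted_at=date(2026, 5, 15), today=date(2026, 5, 18), days_back=30,
    matched_keywords=["agent verification", "llm agent"],
    categories=["cs.AI", "cs.MA"], has_code=True,
)
```

Under these assumptions a three-day-old paper with two keyword matches, a tracked category, and code scores 45 + 16 + 10 + 8 = 79.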
Pricing / Cost estimation
Pay-per-result: roughly $0.05 per 10 papers scored ($0.005 per paper). A daily refresh tracking 4 keywords over 30 days typically returns 30-60 results, which works out to about $0.15-$0.30 per run, well under $1/day. The Apify free trial covers your first run end-to-end so you can validate output before subscribing.
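The arithmetic behind that estimate can be reproduced in a few lines. This sketch uses only the quoted rate of roughly $0.05 per 10 papers. The function name and the 30-runs-per-month default are assumptions, and actual billing may differ.

```python
PRICE_PER_PAPER_USD = 0.05 / 10  # from "roughly $0.05 per 10 papers scored"


def estimate_cost(results_per_run, runs_per_month=30):
    """Return (cost per run, cost per month) in USD, rounded to cents."""
    per_run = results_per_run * PRICE_PER_PAPER_USD
    return round(per_run, 2), round(per_run * runs_per_month, 2)


low = estimate_cost(30)   # lower end of the typical 30-60 results
high = estimate_cost(60)  # upper end
```

At 30 to 60 results per daily run, this gives $0.15 to $0.30 per run, or $4.50 to $9.00 per month.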
Tips / Advanced options
- Cluster keywords semantically: do not add both `LLM` and `large language model`; they will overlap ~80% and lengthen your run for little new coverage.
- arXiv rate limit: this Actor sleeps 3.1 seconds between arXiv calls (arXiv's published rate-limit courtesy delay). With 8+ keywords expect a ~30s run.
- Schedule weekly + diff: for a real "what's new this week" digest, schedule weekly and diff the dataset against last week's run.
- Combine with `mcp-server-catalog`: this Actor catalogues the research; that one catalogues the tools. Together they give a full landscape view.
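The weekly diff suggested above can be sketched as follows, assuming you have exported two runs as JSON arrays. The function names, the file names in the usage comment, and the fallback to `url` when `arxivId` is null are assumptions.

```python
import json
from pathlib import Path


def load_rows(path):
    """Read a dataset exported as a JSON array of rows."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def new_since_last_run(previous_rows, current_rows):
    """Return rows in current_rows that were not present in previous_rows."""
    def key(row):
        # Assumption: fall back to the URL when a row has no arXiv ID.
        return row.get("arxivId") or row["url"]

    seen = {key(r) for r in previous_rows}
    return [r for r in current_rows if key(r) not in seen]


# Usage with hypothetical export file names:
# fresh = new_since_last_run(load_rows("last_week.json"), load_rows("this_week.json"))
```

Keying on the identifier rather than comparing whole rows avoids false positives when a paper's `trendingScore` changes between runs.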
FAQ, disclaimers, and support
Why arXiv + Papers With Code specifically?
arXiv is the full firehose of preprints (titles + abstracts indexed). Papers With Code is the subset that ships code (a strong quality + reproducibility signal). Together they cover most of what an AI engineer or researcher actually needs to read.
Is the data current?
Both APIs return live data at run time. Schedule daily runs to keep the feed at most a day old.
Why is paper X not in the results?
Either (a) it was submitted outside the daysBack window, (b) none of your keywords matched the title or abstract, or (c) it is published in a non-cs.* arXiv category (e.g. stat.ML; open an issue to request it).
Legality
The arXiv API and Papers With Code REST API are both public, documented APIs with no rate limit beyond the arXiv 3-second courtesy delay (which this Actor respects). No login, no scraping of restricted content.
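For readers curious what a polite arXiv client looks like, here is a minimal sketch using only the standard library. The endpoint and the `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters are part of the public arXiv API. The exact query shape (title or abstract phrase, restricted to the five tracked categories) and the helper names are assumptions about how such a search might be built, not the Actor's actual code.

```python
import time
import urllib.parse
import urllib.request

ARXIV_API = "http://export.arxiv.org/api/query"
CATEGORIES = ["cs.AI", "cs.CL", "cs.LG", "cs.MA", "cs.SE"]


def build_query_url(keyword, max_results=50):
    """Build an arXiv API URL searching titles and abstracts for one phrase."""
    cats = " OR ".join(f"cat:{c}" for c in CATEGORIES)
    search = f'(ti:"{keyword}" OR abs:"{keyword}") AND ({cats})'
    params = {
        "search_query": search,
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def fetch_feeds(keywords, delay_seconds=3.1):
    """Fetch one Atom feed per keyword, pausing between consecutive calls."""
    feeds = []
    for i, kw in enumerate(keywords):
        if i:  # no need to wait before the first request
            time.sleep(delay_seconds)
        with urllib.request.urlopen(build_query_url(kw), timeout=30) as resp:
            feeds.append(resp.read().decode("utf-8"))
    return feeds
```

With eight keywords, the seven pauses alone take about 22 seconds, which is consistent with the run time quoted in the tips.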
Support
Open an issue on the Issues tab of this Actor's Apify Console page, or via the author's GitHub.
Built by Ian Mu (github.com/ianymu) — author of verify-before-stop, a Claude Code harness that gates session-stop on real, verified output.