Pricing

Pay per usage

arXiv Papers Scraper with AI Topic Tags

Search arXiv.org for academic papers by keyword, author, or category. Get clean structured data with optional AI topic tagging via Claude. Perfect for literature reviews, research monitoring, and academic datasets.

Developer

Andrei

Last modified

4 days ago

What this actor does

arXiv hosts more than 2 million papers, but its search interface is clunky and offers no direct way to export results. This actor addresses both problems:

Full arXiv search syntax — search by keyword, title, abstract, authors, or category
Category filter — restrict to specific fields (cs.AI, math.PR, physics.bio-ph, etc.)
AI topic tagging — Claude reads each abstract and assigns 3-5 relevant tags (optional, BYOK)
Citation extraction — pulls cited references from paper metadata when available
Retry logic — handles arXiv API rate limits and transient errors gracefully
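
As a rough illustration of what the search syntax and category filter translate to, the sketch below builds a query URL for arXiv's public Atom API. The field prefixes (all:, ti:, au:, abs:, cat:) and the search_query, start, max_results, sortBy, and sortOrder parameters are part of the arXiv API. The helper name build_arxiv_url and the way it combines terms are assumptions made for illustration, not this actor's actual code.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def build_arxiv_url(query, category=None, max_results=50, sort_by="relevance", start=0):
    """Build an arXiv API query URL (hypothetical helper, not the actor's code)."""
    if ":" in query:
        # The caller already used field prefixes such as ti: or au:, so keep it as-is.
        search = query
    else:
        # Plain keywords: require every word to match in any field.
        search = " AND ".join(f"all:{word}" for word in query.split())
    if category:
        search = f"({search}) AND cat:{category}"
    params = {
        "search_query": search,
        "start": start,
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"

url = build_arxiv_url("transformer attention", category="cs.AI", max_results=20)
```

A query such as au:vaswani AND ti:attention passes through unchanged, so the full arXiv syntax stays available to callers who need it.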

Quick start

Just search for something:

{
  "searchQuery": "transformer attention mechanism",
  "maxResults": 20
}

That's it. The actor will return up to 20 papers matching your query with full metadata.
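
To run the same input from code rather than the Apify Console, one option is Apify's REST endpoint for synchronous runs. The sketch below only prepares the request, using the Python standard library. The actor ID username~actor-name and the token are placeholders (this page does not state the actor's ID), and build_run_request is a hypothetical helper name.

```python
import json
import urllib.request

def build_run_request(actor_id, token, run_input):
    """Prepare (but do not send) a synchronous run request for an Apify actor."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

request = build_run_request(
    "username~actor-name",  # placeholder: copy the real ID from the actor's page
    "YOUR_APIFY_TOKEN",     # placeholder: your personal Apify API token
    {"searchQuery": "transformer attention mechanism", "maxResults": 20},
)
# To execute the run and read the dataset items:
#   papers = json.load(urllib.request.urlopen(request))
```

Synchronous runs suit small queries like this one. For large maxResults values, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.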

Input fields

searchQuery (required) — Search terms (keyword, author, title, or arXiv ID)
category — Filter by arXiv category code (cs.AI, math.ST, etc., leave empty for all)
maxResults — Number of papers to fetch (default 50, max 1000)
sortBy — Sort by relevance, lastUpdatedDate, or submittedDate (default relevance)
enableAiTags — Generate AI topic tags for each paper (default false)
anthropicApiKey — Your Anthropic API key (BYOK, required if AI tags enabled)
extractCitations — Pull cited references metadata when available (default true)
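
The constraints listed above can be checked on the client side before a run is started. The following is a minimal sketch, assuming the defaults and limits exactly as documented here. The function name build_input and the lower bound of 1 for maxResults are assumptions.

```python
ALLOWED_SORT = {"relevance", "lastUpdatedDate", "submittedDate"}

def build_input(search_query, category="", max_results=50, sort_by="relevance",
                enable_ai_tags=False, anthropic_api_key=None, extract_citations=True):
    """Validate the documented fields and return an input dict for the actor."""
    if not search_query.strip():
        raise ValueError("searchQuery is required")
    if not 1 <= max_results <= 1000:  # lower bound of 1 is an assumption
        raise ValueError("maxResults must be between 1 and 1000")
    if sort_by not in ALLOWED_SORT:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORT)}")
    if enable_ai_tags and not anthropic_api_key:
        raise ValueError("anthropicApiKey is required when enableAiTags is true")

    run_input = {
        "searchQuery": search_query,
        "maxResults": max_results,
        "sortBy": sort_by,
        "enableAiTags": enable_ai_tags,
        "extractCitations": extract_citations,
    }
    if category:
        run_input["category"] = category
    if enable_ai_tags:
        run_input["anthropicApiKey"] = anthropic_api_key
    return run_input

example = build_input("graph neural networks", category="cs.LG", max_results=100)
```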

Output format

Each item in the dataset:

{
  "id": "2412.01234",
  "title": "Attention Is All You Need: A Survey",
  "authors": ["Vaswani A.", "Shazeer N."],
  "abstract": "The dominant sequence transduction models...",
  "publishedDate": "2024-12-01",
  "updatedDate": "2024-12-15",
  "pdfUrl": "https://arxiv.org/pdf/2412.01234.pdf",
  "absUrl": "https://arxiv.org/abs/2412.01234",
  "categories": ["cs.LG", "cs.AI"],
  "primaryCategory": "cs.LG",
  "doi": "10.xxxx/yyyy",
  "comment": "Accepted at NeurIPS 2024",
  "journalRef": null,
  "aiTags": ["transformer architecture", "attention mechanism", "survey paper"],
  "citationCount": 12
}

The aiTags field is present only when enableAiTags is set to true.
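
For readers curious how fields like these relate to arXiv's raw Atom feed, here is a sketch of one possible mapping using the standard library's XML parser. The Atom and arXiv namespaces and element names (summary, primary_category, journal_ref, and so on) are those used by the arXiv API. The function entry_to_item, the version-suffix stripping, and the sample entry are illustrative assumptions, not the actor's implementation. The aiTags and citationCount fields are not part of the Atom feed and are omitted.

```python
import re
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom", "arxiv": "http://arxiv.org/schemas/atom"}

def _text(entry, path):
    """Return whitespace-normalized text of a child element, or None if absent."""
    node = entry.find(path, NS)
    return " ".join(node.text.split()) if node is not None and node.text else None

def entry_to_item(entry):
    """Map one Atom <entry> to the item shape shown above (hypothetical mapping)."""
    raw_id = _text(entry, "atom:id").rsplit("/abs/", 1)[-1]  # e.g. "2412.01234v1"
    paper_id = re.sub(r"v\d+$", "", raw_id)                  # drop the version suffix
    pdf = entry.find("atom:link[@title='pdf']", NS)
    primary = entry.find("arxiv:primary_category", NS)
    return {
        "id": paper_id,
        "title": _text(entry, "atom:title"),
        "authors": [n.text for n in entry.findall("atom:author/atom:name", NS)],
        "abstract": _text(entry, "atom:summary"),
        "publishedDate": _text(entry, "atom:published")[:10],
        "updatedDate": _text(entry, "atom:updated")[:10],
        "pdfUrl": pdf.get("href") if pdf is not None else None,
        "absUrl": f"https://arxiv.org/abs/{paper_id}",
        "categories": [c.get("term") for c in entry.findall("atom:category", NS)],
        "primaryCategory": primary.get("term") if primary is not None else None,
        "doi": _text(entry, "arxiv:doi"),
        "comment": _text(entry, "arxiv:comment"),
        "journalRef": _text(entry, "arxiv:journal_ref"),
    }

# Made-up entry for demonstration; real feeds carry more attributes per element.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom" xmlns:arxiv="http://arxiv.org/schemas/atom">
  <entry>
    <id>http://arxiv.org/abs/2412.01234v1</id>
    <title>An Example
      Title</title>
    <summary>Example abstract.</summary>
    <published>2024-12-01T00:00:00Z</published>
    <updated>2024-12-15T00:00:00Z</updated>
    <author><name>A. Author</name></author>
    <link title="pdf" href="http://arxiv.org/pdf/2412.01234v1" type="application/pdf"/>
    <arxiv:primary_category term="cs.LG"/>
    <category term="cs.LG"/>
    <category term="cs.AI"/>
  </entry>
</feed>"""

items = [entry_to_item(e) for e in ET.fromstring(SAMPLE).findall("atom:entry", NS)]
```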

Use cases

Literature review — Pull all papers on your research topic from the last 6 months in one query.

Research monitoring — Schedule daily runs to track new arXiv submissions in your field.

Dataset building — Collect abstracts and metadata for training NLP models on academic text.

Trend analysis — Aggregate AI tags across thousands of papers to spot emerging research topics.

Citation tracking — Build citation graphs from extracted references for bibliometric studies.
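
The trend analysis use case comes down to counting tags across dataset items. A minimal sketch, assuming items shaped like the output example with an optional aiTags list. The helper name top_tags and the lowercase normalization are assumptions.

```python
from collections import Counter

def top_tags(items, n=10):
    """Count AI tags across dataset items, ignoring case and untagged items."""
    counts = Counter(
        tag.lower()
        for item in items
        for tag in (item.get("aiTags") or [])
    )
    return counts.most_common(n)

papers = [
    {"aiTags": ["transformer architecture", "attention mechanism"]},
    {"aiTags": ["Attention Mechanism"]},
    {"title": "Paper fetched without AI tagging"},
]
print(top_tags(papers, n=1))  # [('attention mechanism', 2)]
```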

Technical notes

Uses arXiv's official Atom API — fully ToS-compliant, no scraping
Automatic retry with exponential backoff for rate limits (arXiv's API terms ask for no more than one request every 3 seconds)
AI tagging uses Claude Haiku 4.5 (fast and cheap, ~$0.001 per paper)
arXiv metadata, including abstracts, is released under CC0 (public domain dedication); the papers themselves carry their own per-paper licenses
Citation extraction works only for papers with structured reference metadata
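
Retry with exponential backoff, as mentioned in the notes above, generally follows the pattern sketched below. This is not the actor's code: the function name fetch_with_backoff, the 3-second base delay, the attempt limit, and the choice of which errors count as retryable are all assumptions. The opener and sleep parameters exist only so the logic can be exercised without network access.

```python
import time
import urllib.error
import urllib.request

def fetch_with_backoff(url, max_attempts=5, base_delay=3.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch a URL, retrying transient failures with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.URLError as exc:
            # HTTPError subclasses URLError and carries a status code;
            # plain URLError (no code) usually means a network-level failure.
            code = getattr(exc, "code", None)
            retryable = code is None or code == 429 or code >= 500
            if not retryable or attempt == max_attempts - 1:
                raise
            # Delays grow as base_delay * 1, 2, 4, 8, ...
            sleep(base_delay * 2 ** attempt)
```

Client errors such as 400 or 404 are raised immediately in this sketch, since repeating a malformed request will not make it succeed.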

Pricing

Currently free during early access. Pay-per-paper pricing will be enabled later.

Support

Found a bug or have a feature request? Contact the developer through the actor's page on Apify.
