Pricing

$1.00 / 1,000 papers

arXiv Paper Scraper

Search and scrape academic papers from arXiv. Extract titles, authors, abstracts, categories, PDF links and publication dates by keyword, category or author. Ideal for research, literature reviews and building ML training datasets.

Developer

Technical Dost Solutions

ArXiv Paper Scraper — Search & Export Research Papers (Titles, Abstracts, Authors, PDF Links)

ArXiv Paper Scraper is a fast, reliable way to search arXiv.org and export research papers as clean, structured JSON. Give it a search query or arXiv categories (like cs.AI, cs.LG, stat.ML) and the ArXiv Paper Scraper returns titles, abstracts, authors, publication dates, subject categories, DOIs, and direct PDF links — ready for spreadsheets, dashboards, literature reviews, or AI/ML trend monitoring.

Built for researchers, data scientists, ML engineers, librarians, and anyone who needs to scrape arXiv papers in bulk without writing API-parsing code. It uses the official arXiv API under the hood, so the metadata comes straight from arXiv and requests stay within arXiv's rate limits.

Features

Search by keyword: query across titles, abstracts, and authors, not the full text of the PDFs (e.g. large language models, diffusion models, protein folding).
Filter by arXiv category: narrow results to one or more categories such as cs.AI, cs.LG, stat.ML, physics.gen-ph, math.OC.
Bulk export: pull up to 1,000 papers per run with automatic batching and pagination.
Sort control: order by submission date, last-updated date, or relevance, ascending or descending.
Rich structured output: every paper includes title, abstract/summary, authors, primary + all categories, published/updated timestamps, DOI, journal reference, and PDF/HTML/abstract links.
Clean JSON / CSV / Excel: export from the dataset in any of these formats, or pull via the Apify API into Zapier, Make, Google Sheets, and more.
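
The batching and pagination mentioned in the features can be pictured as a series of (start, max_results) windows sent to the arXiv API. The sketch below is an illustration under assumptions, not the Actor's actual code: the helper name `plan_batches` and the batch size of 100 are hypothetical, while `start` and `max_results` are the paging parameters the arXiv API accepts.

```python
from typing import Iterator, Tuple


def plan_batches(max_results: int, batch_size: int = 100) -> Iterator[Tuple[int, int]]:
    """Yield (start, count) windows that together cover max_results papers.

    batch_size=100 is an illustrative assumption, not the Actor's real setting.
    """
    if max_results < 1 or batch_size < 1:
        raise ValueError("max_results and batch_size must be positive")
    start = 0
    while start < max_results:
        # The last window may be smaller than batch_size.
        count = min(batch_size, max_results - start)
        yield start, count
        start += count


if __name__ == "__main__":
    for start, count in plan_batches(250):
        print(f"start={start} max_results={count}")
```

A real client would also pause between windows to stay within arXiv's rate limits; the exact delay the Actor uses is not stated here.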

Input

Field	Type	Description
`searchQuery`	string	Search terms for finding papers (e.g. `machine learning`).
`categories`	array	arXiv categories to filter (e.g. `cs.AI`, `cs.LG`, `stat.ML`). Optional.
`maxResults`	integer	Maximum number of papers to extract (1–1000, default 100).
`sortBy`	string	`submittedDate`, `lastUpdatedDate`, or `relevance`.
`sortOrder`	string	`descending` or `ascending`.
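
If you build input objects programmatically, it can help to check them against the table before starting a run. The following is a minimal sketch, assuming the constraints listed above; `normalize_input` is a hypothetical helper, and the defaults for `sortBy` and `sortOrder` are assumptions because the table does not state them.

```python
from typing import Any, Dict

ALLOWED_SORT_BY = {"submittedDate", "lastUpdatedDate", "relevance"}
ALLOWED_SORT_ORDER = {"descending", "ascending"}


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Validate an input object against the documented fields and fill defaults."""
    query = str(raw.get("searchQuery", "")).strip()
    categories = list(raw.get("categories") or [])
    # Assumption: a run needs a query, at least one category, or both.
    if not query and not categories:
        raise ValueError("provide searchQuery, categories, or both")

    max_results = int(raw.get("maxResults", 100))  # documented default is 100
    if not 1 <= max_results <= 1000:
        raise ValueError("maxResults must be between 1 and 1000")

    # Assumed defaults: the input table lists allowed values but no defaults.
    sort_by = raw.get("sortBy", "submittedDate")
    sort_order = raw.get("sortOrder", "descending")
    if sort_by not in ALLOWED_SORT_BY:
        raise ValueError(f"unsupported sortBy: {sort_by!r}")
    if sort_order not in ALLOWED_SORT_ORDER:
        raise ValueError(f"unsupported sortOrder: {sort_order!r}")

    return {
        "searchQuery": query,
        "categories": categories,
        "maxResults": max_results,
        "sortBy": sort_by,
        "sortOrder": sort_order,
    }
```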

Example input:

{
  "searchQuery": "large language models",
  "categories": ["cs.CL", "cs.AI"],
  "maxResults": 200,
  "sortBy": "submittedDate",
  "sortOrder": "descending"
}
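
For orientation, here is one way an input like this could map onto a request to the arXiv API. This is a sketch under assumptions, not the Actor's documented behavior: the helper names are hypothetical, and combining the keyword phrase with the categories as `all:"..." AND (cat:... OR cat:...)` is a guess at the mapping. The endpoint and the `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters are part of the public arXiv API.

```python
from typing import List, Optional
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_search_query(search_query: str, categories: Optional[List[str]] = None) -> str:
    """Combine a keyword phrase and category filters into arXiv query syntax."""
    parts = []
    if search_query.strip():
        # Assumption: the keywords are searched as an exact phrase in all fields.
        parts.append(f'all:"{search_query.strip()}"')
    if categories:
        # Assumption: a paper matches if it belongs to any listed category.
        cats = " OR ".join(f"cat:{c}" for c in categories)
        parts.append(f"({cats})")
    return " AND ".join(parts)


def build_url(
    search_query: str,
    categories: Optional[List[str]] = None,
    start: int = 0,
    max_results: int = 100,
    sort_by: str = "submittedDate",
    sort_order: str = "descending",
) -> str:
    """Return a percent-encoded arXiv API URL for one page of results."""
    params = {
        "search_query": build_search_query(search_query, categories),
        "start": start,
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": sort_order,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_url("large language models", ["cs.CL", "cs.AI"], max_results=200))
```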

Output

Each paper is stored as one dataset item:

{
  "id": "2406.01234v1",
  "title": "A Survey of Large Language Models for Scientific Discovery",
  "summary": "We review recent advances in applying large language models to...",
  "authors": ["Jane Doe", "John Smith"],
  "published": "2024-06-03T17:59:00Z",
  "updated": "2024-06-05T12:10:00Z",
  "categories": ["cs.CL", "cs.AI"],
  "primaryCategory": "cs.CL",
  "links": {
    "abstract": "http://arxiv.org/abs/2406.01234v1",
    "pdf": "http://arxiv.org/pdf/2406.01234v1",
    "html": "http://arxiv.org/abs/2406.01234v1"
  },
  "doi": null,
  "comment": "12 pages, 4 figures",
  "journalRef": null,
  "scrapedAt": "2024-06-06T09:00:00Z"
}
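
If you post-process the JSON yourself, the nested `authors`, `categories`, and `links` fields usually need flattening before they fit a spreadsheet row. The sketch below shows one possible approach using only the field names from the item above; the column selection, the `; ` separator, and the helper names `flatten` and `to_csv` are illustrative choices, not part of the Actor.

```python
import csv
import io
from typing import Any, Dict, Iterable

COLUMNS = ["id", "title", "authors", "published", "primaryCategory", "categories", "pdf", "doi"]


def flatten(item: Dict[str, Any]) -> Dict[str, str]:
    """Turn one dataset item into a flat row of strings."""
    links = item.get("links") or {}
    return {
        "id": item.get("id", ""),
        # Collapse line breaks and repeated spaces inside titles.
        "title": " ".join(str(item.get("title", "")).split()),
        "authors": "; ".join(item.get("authors") or []),
        "published": item.get("published", ""),
        "primaryCategory": item.get("primaryCategory", ""),
        "categories": "; ".join(item.get("categories") or []),
        "pdf": links.get("pdf", ""),
        "doi": item.get("doi") or "",  # null DOIs become empty cells
    }


def to_csv(items: Iterable[Dict[str, Any]]) -> str:
    """Serialize dataset items to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(flatten(item))
    return buffer.getvalue()
```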

How to scrape arXiv papers

Enter a Search Query (e.g. reinforcement learning) and/or one or more Categories (e.g. cs.LG).
Set Maximum Results and choose how to Sort them.
Click Start — the ArXiv Paper Scraper queries the official arXiv API, paginates through results, and writes each paper to the dataset.
Download the results as JSON, CSV, or Excel, or fetch them through the Apify API and pipe them into your tools.
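
For the last step, results can be fetched from the Apify API's dataset items endpoint. The sketch below only builds the request URL so it runs without network access; `dataset_items_url` is a hypothetical helper, the dataset ID is a placeholder you would take from your own run, and the request itself still needs your Apify API token.

```python
from typing import Optional
from urllib.parse import urlencode

APIFY_API = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the URL for downloading a run's dataset items in a given format."""
    # Only the formats named in this README are allowed in this sketch.
    if fmt not in {"json", "csv", "xlsx"}:
        raise ValueError(f"unsupported format: {fmt!r}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{APIFY_API}/datasets/{dataset_id}/items?{urlencode(params)}"


if __name__ == "__main__":
    # "YOUR_DATASET_ID" is a placeholder, not a real dataset.
    print(dataset_items_url("YOUR_DATASET_ID", fmt="csv", limit=50))
```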

FAQ

Which arXiv categories can I use? Any official arXiv category code, such as cs.AI, cs.LG, cs.CL, cs.CV, stat.ML, math.OC, physics.gen-ph, q-bio.GN, or econ.EM. You can pass several at once.

How many papers can I get per run? Up to 1,000 per run. The scraper batches requests automatically and respects arXiv's rate limits.

Can I get the full PDF text? The scraper returns the direct PDF link for each paper (links.pdf). You can feed those links into a PDF-text extractor if you need the full body.

Is this allowed? Yes — it uses the public, official arXiv API and honors its usage guidelines, including rate limiting between batches.

Can I automate it? Yes. Schedule runs on Apify or trigger them via the API, and connect the output to Zapier, Make, Google Sheets, Slack, or a database.

Pricing

This Actor uses pay-per-event pricing — you only pay for what you run, with no monthly subscription. Most search-and-export jobs cost a fraction of a cent in compute. See the Pricing tab on this Actor's page for current per-event rates.
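
As a rough budgeting aid, the per-paper charge can be estimated from the listed rate of $1.00 per 1,000 papers. This is a sketch that assumes the per-paper event is the only charge and that the rate has not changed; `estimate_cost` is a hypothetical helper, and the Pricing tab remains the authoritative source.

```python
def estimate_cost(papers: int, rate_per_thousand: float = 1.00) -> float:
    """Estimate the per-paper charge in USD for a run.

    rate_per_thousand defaults to the listed $1.00 / 1,000 papers; any other
    billed events are outside this estimate.
    """
    if papers < 0:
        raise ValueError("papers must not be negative")
    return round(papers * rate_per_thousand / 1000, 4)


if __name__ == "__main__":
    for n in (100, 200, 1000):
        print(f"{n} papers: about ${estimate_cost(n):.2f}")
```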
