Pricing

from $4.99 / 1,000 results

arXiv Research Paper Scraper

Extract comprehensive research paper data from arXiv search results including titles, authors, abstracts, categories, and more.

Developer

Coding Frontned

Features

Full paper metadata — arXiv ID, title, authors, abstract, categories, dates
PDF & abstract links — direct links to papers
Pagination — automatically iterates through pages to reach maxItems
Deduplication — no duplicate papers across pages
Flexible search — search by all fields, title, author, abstract, category, etc.
Sorting — sort by relevance, submission date, or last updated date
No anti-bot issues — arXiv is an open academic resource

Input Parameters

Field	Type	Default	Description
`query`	string	(required)	Search query (e.g. "large language models", "quantum computing")
`searchType`	string	`"all"`	Search field: `all`, `ti` (title), `au` (author), `abs` (abstract), `cat` (category)
`sortBy`	string	`"relevance"`	Sort by: `relevance`, `lastUpdatedDate`, `submittedDate`
`sortOrder`	string	`"descending"`	Sort order: `descending`, `ascending`
`maxItems`	integer	`50`	Maximum number of papers to extract (1–1000)
`proxyConfiguration`	object	—	Apify proxy config

Example INPUT.json

{
    "query": "large language models",
    "searchType": "all",
    "sortBy": "submittedDate",
    "sortOrder": "descending",
    "maxItems": 50
}
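
The constraints in the parameter table can be checked locally before a run. The following is a minimal Python sketch, not part of the actor itself: `build_input` and the constant names are hypothetical, and the allowed values, defaults, and the 1 to 1000 range are copied from the table above. The table lists five search types, but the actor may accept others, so treat that set as an assumption.

```python
import json
from typing import Any

# Allowed values copied from the Input Parameters table (assumed exhaustive).
SEARCH_TYPES = {"all", "ti", "au", "abs", "cat"}
SORT_BY = {"relevance", "lastUpdatedDate", "submittedDate"}
SORT_ORDER = {"descending", "ascending"}


def build_input(
    query: str,
    search_type: str = "all",
    sort_by: str = "relevance",
    sort_order: str = "descending",
    max_items: int = 50,
) -> dict[str, Any]:
    """Return an INPUT.json-style dict, raising ValueError on invalid values."""
    if not query.strip():
        raise ValueError("query is required")
    checks = [
        ("searchType", search_type, SEARCH_TYPES),
        ("sortBy", sort_by, SORT_BY),
        ("sortOrder", sort_order, SORT_ORDER),
    ]
    for name, value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    return {
        "query": query,
        "searchType": search_type,
        "sortBy": sort_by,
        "sortOrder": sort_order,
        "maxItems": max_items,
    }


# Reproduces the example INPUT.json shown above.
example = build_input("large language models", sort_by="submittedDate")
print(json.dumps(example, indent=4))
```

`proxyConfiguration` is left out of the sketch because the table does not describe its fields.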

Output Fields

Field	Type	Description
`position`	integer	Rank in results (1-based)
`arxivId`	string	arXiv paper ID (e.g. `2401.12345`)
`title`	string	Full paper title
`authors`	array	List of author names
`abstract`	string	Full paper abstract
`primaryCategory`	string	Primary subject category (e.g. `cs.AI`)
`categories`	array	All subject categories
`submittedDate`	string	Original submission date (e.g. `15 January, 2025`)
`updatedDate`	string	Last updated date
`abstractUrl`	string	URL to the abstract page
`pdfUrl`	string	Direct link to the PDF
`comments`	string	Author comments (e.g. "20 pages, 5 figures")
`journalRef`	string	Journal reference if published
`doi`	string	DOI if available
`reportNumber`	string	Report number if available
`searchQuery`	string	Query used for this result
`scrapedAt`	string	ISO 8601 timestamp of when the record was scraped

Example Output

{
    "position": 1,
    "arxivId": "2501.12345",
    "title": "Scaling Laws for Neural Language Models",
    "authors": ["Jared Kaplan", "Sam McCandlish"],
    "abstract": "We study empirical scaling laws for language model performance...",
    "primaryCategory": "cs.LG",
    "categories": ["cs.LG", "cs.CL", "stat.ML"],
    "submittedDate": "15 January, 2025",
    "updatedDate": null,
    "abstractUrl": "https://arxiv.org/abs/2501.12345",
    "pdfUrl": "https://arxiv.org/pdf/2501.12345",
    "comments": "35 pages, 14 figures",
    "journalRef": null,
    "doi": null,
    "reportNumber": null,
    "searchQuery": "large language models",
    "scrapedAt": "2025-05-01T12:00:00.000Z"
}
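
Downstream code usually needs to cope with the `null` fields and the textual date format shown in the example. Here is a small Python sketch of one way to do that. The helper `parse_arxiv_date` is hypothetical, and the `"%d %B, %Y"` format is inferred from the single example value above, so other date shapes may appear in real output. `%B` matches English month names only under an English or default C locale.

```python
import json
from datetime import date, datetime
from typing import Optional


def parse_arxiv_date(value: Optional[str]) -> Optional[date]:
    """Parse dates shaped like "15 January, 2025".

    Returns None for null values and for strings in any other format.
    """
    if not value:
        return None
    try:
        return datetime.strptime(value, "%d %B, %Y").date()
    except ValueError:
        return None


# A trimmed copy of the example record, using the same field names.
record = json.loads("""
{
    "arxivId": "2501.12345",
    "submittedDate": "15 January, 2025",
    "updatedDate": null,
    "doi": null
}
""")

submitted = parse_arxiv_date(record["submittedDate"])
updated = parse_arxiv_date(record["updatedDate"])
# Optional fields arrive as None after json.loads, so use .get with a fallback.
doi = record.get("doi") or "n/a"
print(record["arxivId"], submitted, updated, doi)
```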

Pagination

arXiv returns 25 results per page. The scraper automatically steps through pages using the `start` offset parameter until `maxItems` is reached or no more results are available.
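
The paging loop described above, combined with the deduplication listed under Features, can be sketched as follows. This is an illustration under stated assumptions rather than the actor's actual code: `collect` and `fake_fetch` are hypothetical names, `fetch_page` stands in for whatever request the scraper makes for a given `start` offset, and deduplication by `arxivId` is an assumption about how duplicates are detected.

```python
from typing import Callable

PAGE_SIZE = 25  # page size stated in the text above


def collect(fetch_page: Callable[[int], list[dict]], max_items: int) -> list[dict]:
    """Walk pages by start offset, skipping papers whose arxivId was already seen."""
    seen: set[str] = set()
    papers: list[dict] = []
    start = 0
    while len(papers) < max_items:
        page = fetch_page(start)
        if not page:  # no more results available
            break
        for paper in page:
            if paper["arxivId"] in seen:
                continue
            seen.add(paper["arxivId"])
            papers.append(paper)
            if len(papers) == max_items:
                break
        start += PAGE_SIZE
    return papers


def fake_fetch(start: int) -> list[dict]:
    """Stand-in data source: 60 papers, each page repeating the previous page's last item."""
    ids = [f"2501.{i:05d}" for i in range(60)]
    first = max(start - 1, 0)
    return [{"arxivId": x} for x in ids[first:start + PAGE_SIZE]]


limited = collect(fake_fetch, 40)       # stops once maxItems is reached
everything = collect(fake_fetch, 1000)  # stops when a page comes back empty
print(len(limited), len(everything))
```

Advancing `start` by the page size rather than by the number of unique papers kept means a page containing duplicates does not cause later results to be re-requested.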

Use Cases

Academic research monitoring — track new papers in your field
Trend analysis — identify emerging topics and research directions
Author profiling — collect all papers by specific authors
Citation database — build reference datasets for research tools
Competitive intelligence — monitor publications from research groups
AI/ML dataset creation — collect paper abstracts for NLP training

Notes

arXiv is a free, open-access resource — no authentication needed
Results may vary slightly between runs because arXiv indexes new submissions continuously
The `abstract` field contains the full abstract text
Use `searchType: "au"` to search by author name (e.g. "Hinton, Geoffrey")
Use `searchType: "cat"` with category codes like "cs.AI", "math.CO", "hep-th"
