Arxiv Papers Scraper

Pricing

Pay per usage

Developer

Khrystyna Skotte

Maintained by Community

arXiv Papers Scraper - Academic Preprints Search

Search arXiv preprints via the public Atom API. Returns title, authors, abstract, categories, published date, updated date, DOI, journal reference, and PDF link. Filter by category, author, or keyword.

Quick start

Run with the default input. No configuration is needed for a first try:

{
  "query": "cat:cs.AI",
  "sortBy": "submittedDate",
  "sortOrder": "descending",
  "maxItems": 10
}

Or trigger via the Apify API:

curl -X POST "https://api.apify.com/v2/acts/<USERNAME>~arxiv-papers-scraper/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{}'
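
The same request can be built from Python. The sketch below uses only the standard library and mirrors the curl command; `build_run_request` and `start_run` are hypothetical helper names, and `<USERNAME>` and `YOUR_TOKEN` remain placeholders to fill in.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2/acts"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an Actor run."""
    url = f"{API_BASE}/{actor_id}/runs?token={token}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_run(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `start_run(build_run_request("<USERNAME>~arxiv-papers-scraper", "YOUR_TOKEN", {}))` would start a run with the default input, as the empty `-d '{}'` body does in the curl example.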

Input parameters

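Field | Type | Default | Description
query | string | cat:cs.AI | arXiv search query. Use 'cat:cs.AI', 'au:Hinton', 'all:transformer', or combinations like 'cat:cs.AI AND au:Hinton'.
sortBy | string | submittedDate | Field to sort by. The arXiv API accepts relevance, lastUpdatedDate, and submittedDate.
sortOrder | string | descending | Sort direction: ascending or descending.
maxItems | integer | 10 | Maximum number of papers to fetch. The arXiv API allows up to 30,000 results per query but rate-limits requests.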
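
To see how these inputs relate to an arXiv API request, here is a minimal sketch that turns them into a query URL. The parameter names `search_query`, `sortBy`, `sortOrder`, and `max_results` are the arXiv API's own; that the Actor maps its input to them in exactly this way is an assumption, and `build_arxiv_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"
SORT_BY = {"relevance", "lastUpdatedDate", "submittedDate"}
SORT_ORDER = {"ascending", "descending"}


def build_arxiv_url(query: str = "cat:cs.AI",
                    sort_by: str = "submittedDate",
                    sort_order: str = "descending",
                    max_items: int = 10) -> str:
    """Translate Actor-style input into an arXiv API query URL (assumed mapping)."""
    if sort_by not in SORT_BY:
        raise ValueError(f"sortBy must be one of {sorted(SORT_BY)}")
    if sort_order not in SORT_ORDER:
        raise ValueError(f"sortOrder must be one of {sorted(SORT_ORDER)}")
    params = {
        "search_query": query,
        "sortBy": sort_by,
        "sortOrder": sort_order,
        "max_results": max_items,
    }
    # urlencode percent-encodes ':' and turns spaces into '+'.
    return f"{ARXIV_API}?{urlencode(params)}"
```

For example, `build_arxiv_url("cat:cs.AI AND au:Hinton")` yields a URL whose `search_query` value is `cat%3Acs.AI+AND+au%3AHinton`.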

Output fields

Each dataset record contains at least these fields:

Field | Type | Description
arxivId | text | arXiv ID
title | text | Title
authors | text | Authors
categories | text | Categories
publishedAt | date | Published
pdfUrl | link | PDF

Each record also includes a scrapedAt timestamp and any source-specific extra fields, which are visible in the All fields view of the Apify Console.
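
When consuming exported records downstream, it can help to check that the documented fields are present before relying on them. The following is a small sketch; `missing_fields` is a hypothetical helper, and the sample record uses made-up placeholder values rather than real Actor output.

```python
# Field names taken from the output table above.
REQUIRED_FIELDS = ("arxivId", "title", "authors", "categories", "publishedAt", "pdfUrl")


def missing_fields(record: dict) -> list[str]:
    """Return the documented field names that are absent from a record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


# Illustrative record only; every value here is a placeholder.
sample = {
    "arxivId": "0000.00000",
    "title": "Example title",
    "authors": "Ada Example, Bob Example",
    "categories": "cs.AI",
    "publishedAt": "2024-01-01T00:00:00Z",
    "scrapedAt": "2024-01-02T00:00:00Z",
}

print(missing_fields(sample))
```

The sample deliberately omits `pdfUrl`, so the check reports that one field.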

Pricing

Pay-per-event:

  • Actor start: $0.0500 per event, charged when the Actor starts.
  • Paper record: $0.0050 per event, charged per paper scraped.
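
As a worked example of the two event prices, the sketch below estimates the cost of one run. It assumes the start event is charged once per run and that no other charges apply, which the listing does not state explicitly; `estimate_run_cost` is a hypothetical helper.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.05")    # charged when the Actor starts
PAPER_RECORD_USD = Decimal("0.005")  # charged per paper scraped


def estimate_run_cost(papers: int) -> Decimal:
    """Estimate the cost in USD of one run that scrapes `papers` records."""
    if papers < 0:
        raise ValueError("papers must be non-negative")
    return ACTOR_START_USD + PAPER_RECORD_USD * papers
```

Under these assumptions, the default run of 10 papers costs $0.05 + 10 × $0.005 = $0.10, and a run of 1,000 papers costs $5.05.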

How it works

This Actor calls arXiv's public Atom API and parses the response. Results stream to the dataset and can be exported as JSON, CSV, Excel, or RSS via the Apify Console or API.
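
The parsing step can be pictured with a short sketch that reads an Atom feed using Python's standard library. This is not the Actor's actual code, which the listing does not show. The sample feed is a trimmed, made-up entry, `parse_feed` is a hypothetical function, and returning authors and categories as lists is an assumption.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace

# Illustrative feed with placeholder values, not real arXiv output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <published>2024-01-01T00:00:00Z</published>
    <title>Example
      title</title>
    <author><name>Ada Example</name></author>
    <author><name>Bob Example</name></author>
    <category term="cs.AI"/>
    <link title="pdf" href="http://arxiv.org/pdf/0000.00000v1"/>
  </entry>
</feed>"""


def parse_feed(xml_text: str) -> list[dict]:
    """Convert Atom entries into records shaped like the output table."""
    root = ET.fromstring(xml_text)
    records = []
    for entry in root.findall(f"{ATOM}entry"):
        abs_url = entry.findtext(f"{ATOM}id", default="")
        # The PDF link is the <link> element whose title attribute is "pdf".
        pdf_url = next(
            (link.get("href") for link in entry.findall(f"{ATOM}link")
             if link.get("title") == "pdf"),
            None,
        )
        records.append({
            "arxivId": abs_url.rsplit("/abs/", 1)[-1],
            # Collapse line breaks and repeated spaces inside the title.
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "authors": [a.findtext(f"{ATOM}name") for a in entry.findall(f"{ATOM}author")],
            "categories": [c.get("term") for c in entry.findall(f"{ATOM}category")],
            "publishedAt": entry.findtext(f"{ATOM}published"),
            "pdfUrl": pdf_url,
        })
    return records
```

Running `parse_feed(SAMPLE_FEED)` produces one record whose keys match the field names in the output table.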

Use cases

  • Bulk research and lead generation
  • Feeding data warehouses, BI tools, or LLM/RAG pipelines
  • Scheduled monitoring of changes over time
  • Combining with other Apify Actors via Make, Zapier, n8n, or native Apify integrations

Compliance

Use this Actor only for purposes that comply with the source site's terms of service and applicable law. Respect rate limits. The author is not affiliated with the source.

Support

For questions, bugs, or feature requests, message the author via the Apify Console.