Arxiv Paper Search

Search ArXiv for research papers and receive the results as structured output. Fast, reliable, and cost-effective.

Pricing: Pay per usage

Rating: 0.0 (0)

Developer: Donny (maintained by Community)

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 16 hours ago

What it does

ArXiv Paper Search allows you to programmatically search and extract research papers from ArXiv, the world's largest open-access preprint repository. The actor queries the ArXiv API to retrieve paper metadata including titles, author lists, abstracts, publication dates, subject categories, PDF links, and DOI identifiers. You can search by topic keywords, filter by author name, and control how results are sorted. This makes it easy to build datasets of academic research papers for analysis, monitoring, or integration into other workflows.

Why use it

Keeping up with the rapidly growing volume of academic research is a significant challenge for researchers, data scientists, and technology professionals. This actor automates the discovery process by letting you programmatically query ArXiv and receive structured, machine-readable results. Whether you need to track new publications in a specific field, monitor a particular author's output, or build a comprehensive bibliography for a literature review, this actor eliminates the manual effort of browsing and copying data from ArXiv's web interface.

How it works

  1. The actor builds a search query from the provided keywords and optional author name filter.
  2. It sends the query to the ArXiv API at https://export.arxiv.org/api/query with the specified sort order and result limit.
  3. The XML response from ArXiv is parsed to extract individual paper entries.
  4. For each paper, it extracts the title, authors, abstract, publication date, subject categories, PDF URL, and DOI.
  5. All extracted data is structured and pushed to the Apify dataset for download or further processing.
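The steps above can be sketched in Python using only the standard library. This is a minimal illustration, not the actor's actual source: the function names `build_query_url` and `parse_feed` are hypothetical, and the way keywords and the author filter are combined (`all:` and `au:` field prefixes joined with `AND`) is an assumption about how the query might be built. The Atom and arXiv XML namespaces are the ones the ArXiv API uses in its responses.

```python
import urllib.parse
import xml.etree.ElementTree as ET

API_URL = "https://export.arxiv.org/api/query"
# Namespaces used in the Atom feed returned by the ArXiv API.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}


def build_query_url(search_query, author_name="", max_results=25, sort_by="relevance"):
    """Steps 1-2: combine keywords and an optional author filter into a request URL.

    Assumption: keywords are searched across all fields as a quoted phrase.
    """
    terms = f'all:"{search_query}"'
    if author_name:
        terms += f' AND au:"{author_name}"'
    params = {
        "search_query": terms,
        "start": 0,
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": "descending",
    }
    return f"{API_URL}?{urllib.parse.urlencode(params)}"


def _text(element, path):
    """Return the whitespace-normalized text of a child element, or '' if absent."""
    value = element.findtext(path, default="", namespaces=NS)
    return " ".join(value.split())


def parse_feed(xml_text):
    """Steps 3-4: parse the Atom XML and extract one record per paper."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall("atom:entry", NS):
        # The PDF link is the <link> element whose title attribute is "pdf".
        pdf_url = ""
        for link in entry.findall("atom:link", NS):
            if link.get("title") == "pdf":
                pdf_url = link.get("href", "")
        authors = [_text(a, "atom:name") for a in entry.findall("atom:author", NS)]
        papers.append({
            "title": _text(entry, "atom:title"),
            "authors": ", ".join(authors),
            "abstract": _text(entry, "atom:summary"),
            "publishedDate": _text(entry, "atom:published"),
            "categories": [c.get("term") for c in entry.findall("atom:category", NS)],
            "pdfUrl": pdf_url,
            "doi": _text(entry, "arxiv:doi"),  # empty when the paper has no DOI
        })
    return papers
```

To try it against the live API, the response body can be fetched with `urllib.request.urlopen(build_query_url("large language models")).read()` and passed to `parse_feed`. Step 5 (pushing records to the Apify dataset) is omitted here because it depends on the Apify SDK.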

Input parameters

Parameter    Type     Default                Description
searchQuery  String   large language models  Search term for finding papers
authorName   String   (empty)                Optional filter by author name
maxResults   Integer  25                     Maximum number of papers to return (1-100)
sortBy       String   relevance              Sort order: relevance, lastUpdatedDate, or submittedDate
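As an illustration of how these parameters and defaults fit together, the sketch below merges a partial input with the documented defaults and checks the documented constraints. The helper name `normalize_input` and its error behavior are hypothetical; the actor's own validation may differ.

```python
# Defaults and allowed values taken from the parameter table above.
DEFAULTS = {
    "searchQuery": "large language models",
    "authorName": "",
    "maxResults": 25,
    "sortBy": "relevance",
}
SORT_OPTIONS = {"relevance", "lastUpdatedDate", "submittedDate"}


def normalize_input(raw):
    """Fill in missing parameters and reject values outside the documented ranges."""
    unknown = set(raw) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    merged = {**DEFAULTS, **raw}
    if merged["sortBy"] not in SORT_OPTIONS:
        raise ValueError(f"sortBy must be one of {sorted(SORT_OPTIONS)}")
    max_results = int(merged["maxResults"])
    if not 1 <= max_results <= 100:
        raise ValueError("maxResults must be between 1 and 100")
    merged["maxResults"] = max_results
    return merged
```

For example, `normalize_input({"authorName": "Doe", "sortBy": "submittedDate"})` keeps the default search term and result limit while applying the author filter and sort order.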

Output fields

Field          Type    Description
title          String  Paper title
authors        String  Comma-separated list of authors
abstract       String  Paper abstract
publishedDate  String  Publication date
categories     Array   ArXiv subject categories
pdfUrl         String  Direct link to PDF download
doi            String  DOI identifier if available
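Because `authors` is a single comma-separated string while `categories` is an array, downstream code usually handles the two differently. The sketch below shows one way to do that; the helper names are hypothetical and any record values you pass in are your own data, not output guaranteed by the actor.

```python
def split_authors(record):
    """Turn the comma-separated authors string into a list of names."""
    return [name.strip() for name in record["authors"].split(",") if name.strip()]


def in_category(records, code):
    """Keep only the records whose categories array contains the given code."""
    return [r for r in records if code in r.get("categories", [])]
```

Note that splitting on commas assumes no individual author name contains a comma; a name with a suffix such as "Jr." written after a comma would be split in two.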

Cost estimate

This actor uses only API calls with no browser rendering, making it extremely cost-efficient. A typical run costs under $0.001 in platform credits. The default 512 MB memory is sufficient for all queries.

Tips

  • Use ArXiv category codes in your search (e.g., "cs.AI" for Artificial Intelligence, "cs.CL" for Computation and Language) for precise filtering.
  • Set sortBy to submittedDate to find the most recent papers in your field.
  • Schedule daily runs to stay updated on new publications in your area of research.
  • Combine with Hugging Face Model Scraper to match papers with their corresponding model implementations.
  • Also check out OpenAI Status Monitor for monitoring AI service availability.
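When the actor is scheduled daily, consecutive runs will often return many of the same papers. One way to surface only the new ones is to remember which `pdfUrl` values have already been seen. The sketch below is a hypothetical post-processing step that runs on the downloaded dataset, not a feature of the actor; the function name `filter_new` and the JSON state file are assumptions.

```python
import json
from pathlib import Path


def filter_new(records, seen_path):
    """Return records whose pdfUrl was not seen in earlier runs, and update the state file."""
    path = Path(seen_path)
    # Load the set of previously seen PDF URLs, if a state file exists.
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    fresh = []
    for record in records:
        if record["pdfUrl"] not in seen:
            seen.add(record["pdfUrl"])  # also de-duplicates within a single run
            fresh.append(record)
    path.write_text(json.dumps(sorted(seen)))
    return fresh
```

Calling `filter_new(records, "seen_papers.json")` after each run returns only papers that have not appeared before. Using `pdfUrl` as the key is a design choice: it is present for every paper, whereas `doi` is only filled in when available.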