Pricing

Pay per event

OpenAlex Scraper

Extract research papers from OpenAlex — titles, authors, citations, institutions, and open access links.

Developer

Stas Persiianenko

Last modified

4 days ago

OpenAlex Academic Papers Scraper

Search OpenAlex — the world's largest open catalog of academic research — and extract structured data for papers, authors, citations, and institutions across 250M+ works.

What does OpenAlex Academic Papers Scraper do?

This actor searches the OpenAlex database and returns detailed metadata for academic research papers. OpenAlex indexes over 250 million scholarly works from all fields of research. For each paper, it extracts:

Bibliographic data: title, DOI, publication date, journal, volume, issue, pages
Author details: names, ORCID IDs, institutional affiliations, country codes
Citation metrics: cited-by count, number of references
Open access: OA status, free PDF links, license information
Abstracts: full abstract text reconstructed from the OpenAlex inverted index
Topics & keywords: research topics and keyword classifications
Source metadata: journal name, ISSN, publisher

Why use OpenAlex Academic Papers Scraper?

250M+ works — the largest open academic database, successor to Microsoft Academic Graph
No API key needed — OpenAlex is completely free and open
Rich filtering — filter by year, citation count, and open access status
Full abstracts — reconstructed from OpenAlex's inverted index format
Citation sorting — find the most influential papers in any field
Author & institution data — ORCID IDs and institutional affiliations included
Structured output — clean JSON ready for analysis or integration

Use cases

Literature reviews: Find the most cited papers on any research topic
Research trend analysis: Track publication volume and citation patterns over time
Academic evaluation: Analyze citation impact for researchers and institutions
Competitive intelligence: Monitor competitor research output and focus areas
Patent analysis: Cross-reference academic publications with patent portfolios
Grant applications: Support funding proposals with citation and impact data
Pharma R&D: Systematic literature reviews for drug development
VC due diligence: Evaluate research depth behind deep-tech startups

Who is it for?

Academic researchers who need repeatable literature-search exports for reviews, grant proposals, and citation analysis.
R&D and pharma teams monitoring new work around diseases, compounds, or research methods.
VC and competitive-intelligence analysts validating the research depth behind universities, startups, and technology categories.
Data teams building enrichment pipelines that need DOI, citation, author, institution, and open-access metadata without manual searching.
AI and knowledge-base builders collecting structured paper metadata and abstracts for downstream summarization or retrieval workflows.

How to scrape academic papers from OpenAlex

Go to OpenAlex Academic Papers Scraper on Apify Store.
Enter one or more search queries (e.g., "machine learning", "CRISPR gene editing").
Optionally filter by publication year, minimum citations, or open access.
Choose sort order: relevance, most cited, or newest first.
Set maximum results per query (1–500).
Click Start and download your data as JSON, CSV, or Excel.

Input parameters

Parameter	Type	Description
`searchQueries`	Array	Search terms for papers (required). Example: ["machine learning", "CRISPR"]
`publicationYear`	String	Year filter: single year ("2024") or range ("2020-2025")
`minCitations`	Integer	Only papers with at least this many citations (default: 0)
`openAccessOnly`	Boolean	Only return open access papers (default: false)
`sortBy`	String	Sort by "relevance", "cited_by_count", or "publication_date"
`maxResults`	Integer	Max papers per query, 1–500 (default: 20)
`openAlexApiKey`	String	Optional free OpenAlex API key. Supplying one avoids the rate limits applied to anonymous searches when OpenAlex is under heavy load
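
The constraints in the table can be checked locally before a run is started. The sketch below is a hypothetical helper (`build_run_input` is not part of the actor or of `apify-client`) that assembles an input dictionary and rejects values the table rules out. It assumes the year filter is either one four-digit year or two four-digit years joined by a hyphen, as in the examples above.

```python
import re

ALLOWED_SORT = {"relevance", "cited_by_count", "publication_date"}
YEAR_PATTERN = re.compile(r"^\d{4}(-\d{4})?$")  # "2024" or "2020-2025"


def build_run_input(search_queries, publication_year=None, min_citations=0,
                    open_access_only=False, sort_by=None, max_results=20):
    """Hypothetical helper: validate values against the parameter table."""
    if isinstance(search_queries, str) or not search_queries:
        raise ValueError("searchQueries must be a non-empty list of strings")
    if not 1 <= max_results <= 500:
        raise ValueError("maxResults must be between 1 and 500")
    if min_citations < 0:
        raise ValueError("minCitations must not be negative")

    run_input = {
        "searchQueries": list(search_queries),
        "minCitations": min_citations,
        "openAccessOnly": open_access_only,
        "maxResults": max_results,
    }
    # Optional fields are only added when the caller supplies them.
    if publication_year is not None:
        if not YEAR_PATTERN.match(publication_year):
            raise ValueError('publicationYear must look like "2024" or "2020-2025"')
        run_input["publicationYear"] = publication_year
    if sort_by is not None:
        if sort_by not in ALLOWED_SORT:
            raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORT)}")
        run_input["sortBy"] = sort_by
    return run_input
```

The returned dictionary can be passed as `run_input` to the Python client call shown under API usage.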

Output example

Each paper in the dataset contains these fields:

{
  "openAlexId": "W2741809807",
  "doi": "10.48550/arxiv.1706.03762",
  "title": "Attention Is All You Need",
  "publicationYear": 2017,
  "publicationDate": "2017-06-12",
  "type": "article",
  "language": "en",
  "authors": "Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones",
  "authorDetails": [
    {
      "name": "Ashish Vaswani",
      "orcid": "",
      "institution": "Google",
      "country": "US",
      "position": "first"
    }
  ],
  "journalName": "Advances in Neural Information Processing Systems",
  "journalIssn": "1049-5258",
  "publisher": "Neural Information Processing Systems Foundation",
  "volume": "30",
  "issue": "",
  "firstPage": "",
  "lastPage": "",
  "citedByCount": 6494,
  "referencedWorksCount": 38,
  "isOpenAccess": true,
  "openAccessUrl": "https://arxiv.org/pdf/1706.03762",
  "openAccessStatus": "gold",
  "abstractText": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...",
  "topics": ["Natural Language Processing Techniques", "Topic Modeling"],
  "keywords": ["attention mechanism", "transformer", "neural machine translation"],
  "openAlexUrl": "https://openalex.org/W2741809807",
  "doiUrl": "https://doi.org/10.48550/arxiv.1706.03762",
  "pdfUrl": "https://arxiv.org/pdf/1706.03762",
  "searchQuery": "transformer attention mechanism",
  "relevanceScore": 0.95,
  "scrapedAt": "2026-03-03T12:00:00.000Z"
}
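
Nested fields such as `authorDetails` need flattening before a record fits a spreadsheet row. A minimal sketch, assuming records shaped like the example above; `flatten_paper`, `papers_to_csv`, and the chosen columns are illustrative and not something the actor provides.

```python
import csv
import io


def flatten_paper(paper):
    """Reduce one dataset item to flat, spreadsheet-friendly columns."""
    details = paper.get("authorDetails") or []
    # Pick the entry marked as first author, or fall back to an empty dict.
    first = next((a for a in details if a.get("position") == "first"), {})
    return {
        "title": paper.get("title", ""),
        "doi": paper.get("doi", ""),
        "citedByCount": paper.get("citedByCount", 0),
        "firstAuthor": first.get("name", ""),
        "firstAuthorInstitution": first.get("institution", ""),
        "topics": "; ".join(paper.get("topics") or []),
    }


def papers_to_csv(papers):
    """Serialize flattened records to CSV text with a header row."""
    rows = [flatten_paper(p) for p in papers]
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# A cut-down copy of the record above, used only to show the call.
sample = {
    "title": "Attention Is All You Need",
    "doi": "10.48550/arxiv.1706.03762",
    "citedByCount": 6494,
    "authorDetails": [
        {"name": "Ashish Vaswani", "institution": "Google", "position": "first"}
    ],
    "topics": ["Natural Language Processing Techniques", "Topic Modeling"],
}
print(papers_to_csv([sample]))
```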

How much does it cost to scrape OpenAlex?

OpenAlex Academic Papers Scraper uses a pay-per-event pricing model:

Event	Price
Run started	$0.001
Per paper extracted	$0.001

Cost examples:

50 papers: $0.001 + (50 × $0.001) = $0.051
200 papers: $0.001 + (200 × $0.001) = $0.201
500 papers: $0.001 + (500 × $0.001) = $0.501
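
The same arithmetic as a small function, for budgeting runs of other sizes. This is a sketch that assumes one "Run started" event per run and the prices listed in the table; `estimate_cost` is an illustrative name, and `Decimal` is used so the totals come out exact rather than as rounded floats.

```python
from decimal import Decimal

RUN_START_FEE = Decimal("0.001")  # "Run started" event, charged once per run
PER_PAPER_FEE = Decimal("0.001")  # "Per paper extracted" event


def estimate_cost(paper_count):
    """Estimated USD cost of one run that extracts paper_count papers."""
    if paper_count < 0:
        raise ValueError("paper_count must not be negative")
    return RUN_START_FEE + PER_PAPER_FEE * paper_count


for count in (50, 200, 500):
    print(count, estimate_cost(count))
```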

API usage

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('automation-lab/openalex-scraper').call({
    searchQueries: ['machine learning'],
    publicationYear: '2023-2025',
    minCitations: 100,
    sortBy: 'cited_by_count',
    maxResults: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(paper => {
    console.log(`${paper.citedByCount} cites | ${paper.title}`);
});

Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

run = client.actor('automation-lab/openalex-scraper').call(run_input={
    'searchQueries': ['machine learning'],
    'publicationYear': '2023-2025',
    'minCitations': 100,
    'sortBy': 'cited_by_count',
    'maxResults': 50,
})

dataset = client.dataset(run['defaultDatasetId']).list_items().items
for paper in dataset:
    print(f"{paper['citedByCount']} cites | {paper['title']}")
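
For trend analysis, the fetched items can be aggregated locally. The sketch below assumes each item carries the `publicationYear` and `citedByCount` fields shown in the output example; `summarize_by_year` is an illustrative name, and the sample records are made up for the demonstration.

```python
from collections import defaultdict


def summarize_by_year(items):
    """Return {year: (paper_count, total_citations)} ordered by year."""
    totals = defaultdict(lambda: [0, 0])
    for paper in items:
        year = paper.get("publicationYear")
        if year is None:
            continue  # skip records without a publication year
        totals[year][0] += 1
        totals[year][1] += paper.get("citedByCount") or 0
    return {year: tuple(totals[year]) for year in sorted(totals)}


# Made-up records, only to show the shape of the result.
sample_items = [
    {"publicationYear": 2023, "citedByCount": 120},
    {"publicationYear": 2024, "citedByCount": 45},
    {"publicationYear": 2023, "citedByCount": 300},
]
print(summarize_by_year(sample_items))  # {2023: (2, 420), 2024: (1, 45)}
```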

Integrations

Connect OpenAlex Scraper with other tools using Apify integrations:

Google Sheets — Export citation data to spreadsheets for analysis
Slack / Email — Get alerts when new papers match your search criteria
Webhooks — Trigger downstream processing when extraction completes
Zapier / Make — Connect to 5,000+ apps for automated research workflows
Amazon S3 / Google Cloud — Archive large literature datasets

Tips and best practices

Use specific search terms — "transformer attention mechanism" returns more relevant results than just "AI"
Filter by citations — set minCitations to find influential, well-cited papers
Year ranges — use "2020-2025" format to focus on recent research
Open access filter — enable openAccessOnly to get papers with free PDF downloads
Sort by citations — "cited_by_count" surfaces the most impactful papers first
Multiple queries — search for multiple topics in a single run to compare across fields
Abstracts — OpenAlex stores abstracts as inverted indexes; this actor reconstructs the full text automatically
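
An inverted index maps each word to the list of positions at which it occurs in the abstract. The reconstruction can be sketched as follows; this illustrates the idea and is not the actor's actual implementation, and `reconstruct_abstract` is a name chosen here.

```python
def reconstruct_abstract(inverted_index):
    """Rebuild plain text from a {word: [positions]} inverted index."""
    if not inverted_index:
        return ""
    words_by_position = {}
    for word, positions in inverted_index.items():
        for position in positions:
            words_by_position[position] = word
    # Join the words in position order; positions absent from the index are skipped.
    return " ".join(words_by_position[p] for p in sorted(words_by_position))


example_index = {"to": [0, 4], "be": [1, 5], "or": [2], "not": [3]}
print(reconstruct_abstract(example_index))  # to be or not to be
```

Because absent positions are simply skipped, an index with gaps yields a shorter text instead of an error.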

Data source

All data comes from OpenAlex, a free and open catalog of the world's scholarly research. OpenAlex indexes over 250 million works from journals, conference proceedings, preprints, and other academic sources. Data is updated daily and available under CC0 (public domain).

Use with AI agents via MCP

OpenAlex Scraper is available as a tool for AI assistants via the Model Context Protocol (MCP).

Setup for Claude Code

claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/openalex-scraper"

Setup for Claude Desktop, Cursor, or VS Code

Add this to your MCP config file:

{
    "mcpServers": {
        "apify": {
            "url": "https://mcp.apify.com?tools=automation-lab/openalex-scraper"
        }
    }
}

Example prompts

"Search OpenAlex for climate change research papers"
"Get papers by this author from OpenAlex"
"Find the top-cited open access papers about quantum computing from the last 3 years"

cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~openalex-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "searchQueries": ["machine learning"],
    "sortBy": "cited_by_count",
    "maxResults": 50
  }'
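
Note that the cURL call addresses the actor as `automation-lab~openalex-scraper`, with a tilde where the client libraries use a slash. A small sketch of building that URL in Python; `build_run_url` is a hypothetical helper, and the token is URL-encoded as a precaution.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"


def build_run_url(actor_name, token):
    """Build the run-start URL used in the cURL example."""
    actor_id = actor_name.replace("/", "~")  # "user/actor" becomes "user~actor"
    query = urlencode({"token": token})
    return f"{API_BASE}/acts/{quote(actor_id, safe='~')}/runs?{query}"


print(build_run_url("automation-lab/openalex-scraper", "YOUR_API_TOKEN"))
```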

Legality

Scraping publicly available data is generally considered legal in the United States, following the Ninth Circuit Court of Appeals ruling in hiQ Labs v. LinkedIn. This actor accesses only publicly available information and does not require authentication. Always review and comply with the target website's Terms of Service before scraping. For personal data, ensure compliance with GDPR, CCPA, and other applicable privacy regulations.

FAQ

Q: How does OpenAlex compare to Google Scholar? A: OpenAlex provides structured API access to 250M+ works with DOIs, ORCID IDs, and institutional data. Google Scholar has broader web coverage but no structured API.

Q: Are abstracts always available? A: No. OpenAlex provides abstracts through its abstract_inverted_index field, and only for works where one is available. This actor returns the reconstructed text in the abstractText field.

Q: Can I search for specific authors? A: Use the author's name in your search query. OpenAlex's full-text search includes author names.

Q: What is the "relevance score"? A: A score from 0 to 1 indicating how well the paper matches your search query, calculated by OpenAlex's search engine.

Q: Why are some abstracts empty or garbled? A: OpenAlex stores abstracts as inverted indexes (word-position maps). This actor reconstructs the full text automatically, but a small number of papers have malformed indexes that produce incomplete abstracts. If the abstract is critical, check the paper via its DOI link.

Q: The scraper returns 0 results for my search query. A: OpenAlex's search is case-insensitive but sensitive to special characters. Remove quotes, parentheses, and special characters from your query. Also verify your publicationYear filter is not excluding all results.
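
The cleanup suggested in the last answer can be automated. A minimal sketch, assuming that keeping letters, digits, underscores, hyphens, and spaces is enough; which characters actually affect OpenAlex search is not specified here, so the pattern is a starting point to adjust. `sanitize_query` is an illustrative name.

```python
import re


def sanitize_query(query):
    """Strip quotes, parentheses, and other special characters from a query."""
    # Replace anything that is not a word character, whitespace, or hyphen.
    cleaned = re.sub(r"[^\w\s-]", " ", query)
    # Collapse the runs of whitespace left behind and trim the ends.
    return re.sub(r"\s+", " ", cleaned).strip()


print(sanitize_query('"CRISPR" (gene editing)'))  # CRISPR gene editing
```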

arXiv Scraper — scrape preprint papers from arXiv
CrossRef Scraper — extract scholarly article metadata via CrossRef
ClinicalTrials Scraper — extract clinical trial data from ClinicalTrials.gov
NASA Images Scraper — search and extract NASA images with full metadata
OpenFDA Scraper — extract FDA drug adverse event reports
Open Food Facts Scraper — scrape food product nutrition data
