OpenAlex Works Scraper — Academic Publications API

Pricing

from $3.00 / 1,000 results

Extract academic publications from the OpenAlex API. Filter by publication year, author ORCID, institution, concept, and open access status. Returns structured bibliographic data including authors, citations, and abstracts.

Rating: 0.0 (0)

Developer: Compute Edge (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 2 days ago

OpenAlex Works Scraper

About this Actor

The OpenAlex Works Scraper extracts academic publications from the OpenAlex API — the world's largest open bibliography with over 250 million scholarly works. This Actor lets you search, filter, and download bibliographic data including titles, DOIs, authors, citations, and abstract content for research analysis, literature reviews, institutional reporting, and academic data mining.

Key Features

  • Full-text Search: Find works by keywords (e.g., "machine learning", "climate change")
  • Advanced Filtering: Filter by publication year, author ORCID, research institution, research concept, and open access status
  • Structured Output: Extract titles, DOIs, publication years, citation counts, authors, institutions, and abstracts
  • Citation Analytics: Get citation counts and open access availability for impact assessment
  • Author & Institution Data: Export author names, ORCIDs, institutional affiliations, and author counts
  • No Login Required: Uses the public OpenAlex API — zero credentials needed
  • Polite Pool Support: Optional polite pool email for faster, more consistent API responses

Data Fields Extracted

Field                 Type      Description
id                    string    OpenAlex work ID (e.g., W2741809807)
title                 string    Work title
doi                   string    Digital Object Identifier (if available)
publication_year      number    Year of publication
publication_date      string    Full publication date (YYYY-MM-DD)
type                  string    Work type (e.g., journal-article, conference-paper)
cited_by_count        number    Total citations from other works
is_oa                 boolean   Open access status (true/false)
oa_url                string    URL to open access PDF (if available)
authors               string    Comma-separated list of author names
institutions          string    Comma-separated list of affiliated institutions
source_display_name   string    Journal or publication venue name
primary_topic         string    Primary research topic/concept
abstract              string    Work abstract (reconstructed from OpenAlex inverted index)
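
The abstract field is described as reconstructed from the OpenAlex inverted index. OpenAlex serves abstracts as an `abstract_inverted_index` object that maps each word to the positions where it occurs. The sketch below shows one way such a reconstruction can work; `rebuild_abstract` is a hypothetical helper written for illustration, and the Actor's own implementation may differ.

```python
def rebuild_abstract(inverted_index):
    """Turn {"word": [positions, ...]} back into plain text."""
    if not inverted_index:
        return ""  # many works carry no abstract at all
    # Flatten to (position, word) pairs, then order them by position.
    placed = [
        (position, word)
        for word, positions in inverted_index.items()
        for position in positions
    ]
    return " ".join(word for _, word in sorted(placed))


# Illustrative input, not real OpenAlex data.
sample = {"the": [0, 3], "model": [1], "fits": [2], "data": [4]}
print(rebuild_abstract(sample))  # the model fits the data
```

A word that appears several times has several positions, which is why the index is flattened before sorting.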

How to Scrape Academic Works with OpenAlex

Step 1: Basic Search (No Filters)

The simplest way to use this Actor: run it with no filters to get recent publications.

  1. Click "Use" to open the input form
  2. Leave all fields blank (defaults to all recent works, limited to 200 results)
  3. Click "Start"
  4. Results appear in the Dataset tab

Step 2: Search by Keyword

  1. In the "Search Query" field, enter your research topic (e.g., "quantum computing")
  2. Leave other filters blank
  3. Click "Start"
  4. Actor returns all works matching your query (up to max results limit)

Examples:

  • "renewable energy" — Find renewable energy research
  • COVID-19 vaccine — Find pandemic vaccine studies
  • machine learning oncology — Combine keywords for specific topics

Step 3: Filter by Publication Year

  1. Enter search query (or leave blank for all recent works)
  2. In "Filter by Publication Year" field, enter:
    • Single year: 2024
    • Year range: 2020-2024
  3. Click "Start"
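
As a rough illustration of how a year or year range could be turned into an OpenAlex query, the sketch below maps the two input forms onto the documented `from_publication_date` and `to_publication_date` filters. The function name `year_filter` is hypothetical, and whether the Actor translates the field this way is an assumption.

```python
import re


def year_filter(value):
    """Map "2024" or "2020-2024" to an OpenAlex date-range filter string."""
    match = re.fullmatch(r"\s*(\d{4})(?:\s*-\s*(\d{4}))?\s*", value)
    if not match:
        raise ValueError(f"expected YYYY or YYYY-YYYY, got {value!r}")
    start = int(match.group(1))
    end = int(match.group(2) or start)  # a single year is a one-year range
    if start > end:
        raise ValueError("start year must not be after end year")
    return f"from_publication_date:{start}-01-01,to_publication_date:{end}-12-31"


print(year_filter("2020-2024"))
```

Validating the range before sending it avoids spending a run on a query that can only return zero results.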

Step 4: Filter by Author (ORCID)

Find all works by a specific author using their ORCID (Open Researcher and Contributor ID):

  1. Go to orcid.org and search for the author
  2. Copy their ORCID (e.g., 0000-0002-1234-5678)
  3. Paste into "Filter by Author ORCID" field
  4. Click "Start"
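
A mistyped ORCID is a common cause of empty results. The last character of an ORCID is a check digit computed with the ISO 7064 MOD 11-2 algorithm, so an identifier can be checked locally before a run. The sketch below assumes the hyphenated 19-character form; `is_valid_orcid` is a hypothetical helper, not part of the Actor.

```python
import re


def is_valid_orcid(orcid):
    """Check the format and the ISO 7064 MOD 11-2 check digit of an ORCID."""
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for char in digits[:-1]:  # every digit except the check digit
        total = (total + int(char)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[-1] == expected


# 0000-0002-1825-0097 is the sample identifier used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False
```

A passing check only shows the identifier is well formed; it does not confirm that the ORCID belongs to the intended author.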

Step 5: Filter by Institution

Extract all publications from a specific research institution:

  1. Find the institution's ROR ID at ror.org (e.g., 042nb2s44 for MIT)
  2. Paste into "Filter by Institution ID" field
  3. Click "Start"
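
ROR IDs are often copied either as a bare 9-character ID or as a full `https://ror.org/...` URL. The sketch below normalizes both forms and builds the corresponding OpenAlex filter, `authorships.institutions.ror`. The helper names are hypothetical, and which forms the Actor's input field accepts is an assumption.

```python
import re

# Character set and layout taken from the published ROR identifier pattern.
ROR_PATTERN = re.compile(r"0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}")


def normalize_ror(value):
    """Accept a bare ROR ID or a ror.org URL and return the canonical URL."""
    candidate = value.strip().lower()
    candidate = re.sub(r"^(https?://)?ror\.org/", "", candidate)
    if not ROR_PATTERN.fullmatch(candidate):
        raise ValueError(f"not a ROR ID: {value!r}")
    return f"https://ror.org/{candidate}"


def institution_filter(value):
    """Build the OpenAlex works filter for one institution."""
    return f"authorships.institutions.ror:{normalize_ror(value)}"


print(institution_filter("042nb2s44"))  # 042nb2s44 is MIT's ROR ID
```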

Step 6: Filter by Research Concept

Find works tagged with a specific research topic:

  1. In "Filter by Concept" field, enter a research area (e.g., "Machine Learning", "Cancer Research")
  2. Click "Start"

Step 7: Open Access Only

To download only freely available publications:

  1. Enable "Open Access Only" toggle
  2. Fill in any search/filter fields as above
  3. Click "Start"

Step 8: Advanced Query (Multiple Filters)

Combine filters for precise queries:

  1. Search Query: cancer immunotherapy
  2. Publication Year: 2023-2024
  3. Concept: Immunology
  4. Open Access Only: Enabled
  5. Click "Start"

Result: Open access immunology papers on cancer immunotherapy published in 2023 or 2024.
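
For readers curious how combined inputs map onto a single OpenAlex request, the sketch below joins several filters with commas, which OpenAlex treats as a logical AND, and adds the `search` parameter. `build_works_url` is a hypothetical helper. The concept filter is left out because OpenAlex filters concepts by ID rather than by name, and how the Actor resolves a concept name is not documented here.

```python
from urllib.parse import urlencode

WORKS_ENDPOINT = "https://api.openalex.org/works"


def build_works_url(search=None, filters=None, per_page=200):
    """Combine a search term and a dict of filters into one request URL."""
    params = {}
    if search:
        params["search"] = search
    if filters:
        # Comma-separated filters are ANDed together by OpenAlex.
        params["filter"] = ",".join(f"{key}:{value}" for key, value in filters.items())
    params["per-page"] = per_page  # OpenAlex allows at most 200 per page
    return f"{WORKS_ENDPOINT}?{urlencode(params)}"


url = build_works_url(
    search="cancer immunotherapy",
    filters={
        "from_publication_date": "2023-01-01",
        "to_publication_date": "2024-12-31",
        "open_access.is_oa": "true",
    },
)
print(url)
```

Each added filter narrows the result set, so a query that returns nothing is usually fixed by removing one filter at a time.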

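Step 9: Join the Polite Pool (Optional)

For large queries, add your email to access the OpenAlex polite pool:

  1. Enter your email in "Polite Pool Email" field
  2. OpenAlex then serves your requests from the polite pool, which it describes as faster and more consistent than the common pool; the documented caps (100,000 requests per day and 10 requests per second) apply to both pools
  3. Useful for extracting 5,000+ works in a single run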
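
The email reaches OpenAlex as the `mailto` query parameter. The sketch below shows that parameter being added, together with a small client-side throttle that spaces requests out. `polite_params` and `Throttle` are hypothetical helpers, and the 10 requests per second used in the example is an assumption taken from OpenAlex's published limits, not a value the Actor is confirmed to enforce.

```python
import time


def polite_params(params, email):
    """Return a copy of the query parameters with the polite-pool email added."""
    return {**params, "mailto": email}


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = float("-inf")  # the first call never waits

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self.clock()
        delay = max(self.next_allowed - now, 0.0)
        if delay > 0:
            self.sleep(delay)
            now += delay
        self.next_allowed = now + self.min_interval
        return delay


throttle = Throttle(max_per_second=10)
params = polite_params({"search": "quantum computing"}, "researcher@university.edu")
```

Calling `throttle.wait()` before each request keeps a long run under the chosen rate without any server-side feedback.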

Pricing

Cost per work extracted: $0.004/result

  • A 1,000-work dataset costs ~$4.00 (plus Apify compute credits)
  • Includes Actor start fee ($0.00005) + per-result pricing
  • No hidden fees — you only pay for results actually returned
  • OpenAlex API itself is free; cost is only for Apify compute and Actor markup
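
The pricing figures above reduce to simple arithmetic: one start fee plus a per-result charge. The sketch below reproduces the 1,000-work example using those two stated numbers. `estimate_cost` is a hypothetical helper, and Apify compute credits are excluded because they depend on the run.

```python
from decimal import Decimal

START_FEE = Decimal("0.00005")  # Actor start fee quoted above
PER_RESULT = Decimal("0.004")   # price per work quoted above


def estimate_cost(results):
    """Estimated Actor charge in USD for one run, excluding compute credits."""
    return START_FEE + PER_RESULT * results


print(estimate_cost(1000))  # 4.00005
```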

Compute efficiency:

  • Typical 200-work run completes in 30-60 seconds
  • Default memory limit: 512 MB (sufficient for up to 10,000 works per run)
  • At 512 MB, a 30-60 second run consumes roughly 0.004-0.008 compute units (1 compute unit = 1 GB of memory for 1 hour)

Input / Output Example

Input (JSON)

{
  "search": "machine learning",
  "filterPublicationYear": "2023-2024",
  "filterConcept": "Artificial Intelligence",
  "isOpenAccess": true,
  "maxResults": 500,
  "politeEmail": "researcher@university.edu"
}

Output (First 2 records)

[
  {
    "id": "W3145678901",
    "title": "Transformers in Natural Language Processing: A Survey",
    "doi": "10.1234/nlp.2024.001",
    "publication_year": 2024,
    "publication_date": "2024-03-15",
    "type": "journal-article",
    "cited_by_count": 284,
    "is_oa": true,
    "oa_url": "https://example.com/paper.pdf",
    "authors": "Jane Smith, John Doe, Alice Johnson",
    "institutions": "MIT, Stanford University, Harvard University",
    "source_display_name": "Nature Machine Intelligence",
    "primary_topic": "Machine Learning",
    "abstract": "This paper surveys recent advances in transformer architectures..."
  },
  {
    "id": "W3145678902",
    "title": "Efficient Vision Transformers for Mobile Devices",
    "doi": "10.1234/cv.2024.002",
    "publication_year": 2024,
    "publication_date": "2024-02-01",
    "type": "conference-paper",
    "cited_by_count": 156,
    "is_oa": true,
    "oa_url": "https://example.com/paper2.pdf",
    "authors": "Prof. David Chen, Emma Wilson",
    "institutions": "UC Berkeley, Google Research",
    "source_display_name": "CVPR 2024",
    "primary_topic": "Computer Vision",
    "abstract": "Mobile deployment of vision transformers requires careful optimization..."
  }
]
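
Once exported, records in this shape can be processed with the standard library alone. The sketch below ranks records by citation count and splits the comma-separated `authors` string into a list. `top_cited` is a hypothetical helper, and the split assumes that no author name contains a comma.

```python
def top_cited(records, n=10):
    """Return the n most-cited records with authors split into a list."""
    ranked = sorted(records, key=lambda r: r.get("cited_by_count") or 0, reverse=True)
    return [
        {
            "title": record["title"],
            "cited_by_count": record.get("cited_by_count") or 0,
            # "A, B, C" -> ["A", "B", "C"]; empty or missing strings give []
            "authors": [a.strip() for a in (record.get("authors") or "").split(",") if a.strip()],
        }
        for record in ranked[:n]
    ]


# Two trimmed records in the output format shown above.
records = [
    {"title": "Efficient Vision Transformers for Mobile Devices",
     "cited_by_count": 156, "authors": "Prof. David Chen, Emma Wilson"},
    {"title": "Transformers in Natural Language Processing: A Survey",
     "cited_by_count": 284, "authors": "Jane Smith, John Doe, Alice Johnson"},
]
for row in top_cited(records, n=1):
    print(row["cited_by_count"], row["title"], row["authors"])
```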

FAQ

Q: Do I need an OpenAlex account?
A: No. The OpenAlex API is completely free and requires no login. This Actor uses only the public API.

Q: What's the maximum number of results I can extract?
A: The maxResults field accepts values from 1 to 10,000. OpenAlex returns results in pages, so a query matching millions of works is cut off at your limit; set maxResults to 10,000 for the largest sample a single run allows.
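
To illustrate how a result cap interacts with pagination, the sketch below collects pages until the cap is reached or the results run out. It follows OpenAlex's cursor paging convention, where the first request uses `cursor=*`, each response supplies the next cursor, and a page holds at most 200 works. `collect_works` and the `fetch_page` callback are hypothetical; the Actor's internal paging strategy is not documented here.

```python
def collect_works(fetch_page, max_results, per_page=200):
    """Gather up to max_results works.

    fetch_page(cursor, page_size) must return (results, next_cursor),
    with next_cursor set to None once there are no further pages.
    """
    works = []
    cursor = "*"  # OpenAlex's marker for the first page
    while cursor and len(works) < max_results:
        # Never request more than is still needed to reach the cap.
        page_size = min(per_page, max_results - len(works))
        results, cursor = fetch_page(cursor, page_size)
        if not results:
            break
        works.extend(results)
    return works[:max_results]
```

Passing the HTTP call in as `fetch_page` keeps the loop easy to test with a stub before pointing it at the live API.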

Q: How do I find an author's ORCID?
A: Visit orcid.org, search for the author name, and copy the ORCID from their profile (format: XXXX-XXXX-XXXX-XXXX).

Q: How do I find an institution's ROR ID?
A: Visit ror.org, search for the institution, and copy the ROR ID from the result.

Q: Can I export the data to Excel or CSV?
A: Yes. After the run completes, click the "Download" button in the Dataset tab to export as CSV or JSON.

Q: Why are some abstracts empty?
A: Not all OpenAlex works have abstracts. Open access papers are more likely to include abstracts. Enable the "Open Access Only" filter to maximize abstract coverage.

Q: What if my query returns 0 results?
A: Try broadening your filters:

  • Remove year restrictions
  • Check spelling of author ORCID or concept name
  • Use fewer combined filters
  • Test your search term on openalex.org directly

Q: How often is OpenAlex data updated?
A: OpenAlex updates several times per week. Most recent publications appear within 1–3 weeks of publication.

Q: Can I use this data commercially?
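A: Yes. OpenAlex metadata is released under CC0 (public domain), so the extracted records are yours to use, including commercially. No attribution is required, though it is recommended as a best practice. The publications linked through oa_url remain under their own licenses.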

This Actor queries the free, public OpenAlex API. OpenAlex is an open bibliographic catalog maintained by the nonprofit OurResearch. The Actor does not:

  • Use authentication or bypass any protections
  • Extract private personal data or email addresses (author names, ORCIDs, and affiliations come from published scholarly metadata)
  • Violate any terms of service
  • Require user credentials

OpenAlex Terms: By using OpenAlex, you agree to the OpenAlex Terms of Service. Data is provided under CC0 1.0 Universal (Public Domain Dedication).

Responsible Use: Please respect OpenAlex's rate limits. Join the polite pool (via email) for high-volume queries to avoid temporary blocks.


Need help? Open an issue on the Apify Actor page or email support@apify.com.

Last updated: May 2026