OpenAlex Research Paper Search
Pricing
from $2.00 / 1,000 papers fetched
Search 250M+ academic papers, journal articles & scholarly works via OpenAlex API. Filter by keyword, publication year, citation count & open access. Returns authors, affiliations, DOI, concepts. Free, no API key.
Developer: ryan clinton
Search and extract structured data from over 250 million academic papers, journal articles, and scholarly works using the OpenAlex open database. Filter by keyword, publication year, citation count, and open access status -- no API key required, completely free to query.
OpenAlex Research Paper Search is an Apify actor that queries the OpenAlex open scholarly database to find academic research papers, journal articles, conference proceedings, and other scholarly works. OpenAlex indexes over 250 million works from tens of thousands of publishers and repositories worldwide, making it one of the most comprehensive open academic databases available. The actor performs full-text search across titles, abstracts, and paper content, then returns clean, structured JSON for each result -- including authors, institutional affiliations, journal information, citation counts, DOIs, open access URLs, and top research concepts.
Because OpenAlex is entirely free with no authentication, this actor requires zero setup. Provide your search query and optional filters, and get publication-ready academic data in seconds. Whether you are conducting a literature review, tracking citation trends, monitoring new research in your field, or building data pipelines for bibliometric analysis, this actor delivers the structured scholarly data you need at scale.
Why use OpenAlex Research Paper Search?
- No API key or account needed -- OpenAlex is a free, open scholarly database. This actor handles all API communication, pagination, and data normalization out of the box.
- Structured, analysis-ready output -- Raw API responses are cleaned and transformed into a consistent JSON schema with author names, affiliations, journal details, citation metrics, DOIs, and open access links.
- Scalable extraction -- Retrieve up to 10,000 papers per run with automatic multi-page pagination. The actor manages rate limiting and page fetching transparently.
- Cloud execution with scheduling -- Run on Apify infrastructure without installing anything locally. Schedule recurring searches to monitor new publications weekly or monthly.
- Seamless integrations -- Output feeds directly into Google Sheets, Slack, Zapier, Make, webhooks, and other downstream tools via Apify's built-in integration ecosystem.
- Low cost -- Runs on 256 MB of memory, and a typical 100-paper search completes in seconds. The underlying OpenAlex API is free, so you only pay minimal Apify compute costs.
Key features
- Keyword search across the titles, abstracts, and full text of 250M+ scholarly works
- Publication year filter to focus results on a specific year (e.g., only 2025 papers)
- Citation threshold filter to surface only high-impact, highly-cited research above a minimum count
- Open access filter to return only freely available papers with direct PDF/download URLs
- Flexible sorting by relevance score, citation count, or publication date
- Rich metadata extraction -- authors, institutional affiliations, journal name, publisher, DOI, open access URL, and top 5 research concepts per paper
- Configurable result limits from 1 to 10,000 papers per run
- Automatic pagination with 200 results per API page for efficient large-scale collection
- Concept tagging -- each paper includes the top 5 OpenAlex concepts ranked by relevance score
- Deduplication of affiliations -- institutional affiliations are extracted and deduplicated across all authors
How to use
Apify Console
- Go to the OpenAlex Research Paper Search actor page on Apify.
- Click Start to open the input configuration.
- Enter your Search Query -- this searches across titles, abstracts, and full text.
- Optionally set Publication Year, Minimum Citations, or enable Open Access Only.
- Choose a Sort By option: Relevance (default), Most Cited, or Most Recent.
- Set Max Results to control how many papers to retrieve (default: 100).
- Click Start and wait for the run to complete.
- Download results from the Dataset tab in JSON, CSV, or Excel format.
Python
    from apify_client import ApifyClient

    client = ApifyClient("YOUR_API_TOKEN")

    run = client.actor("kbV7IqCW7tszfXB96").call(run_input={
        "searchQuery": "CRISPR gene editing",
        "publicationYear": 2024,
        "minCitations": 10,
        "openAccessOnly": True,
        "sortBy": "cited_by_count:desc",
        "maxResults": 200,
    })

    for paper in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(f"{paper['title']} ({paper['citedByCount']} citations)")
JavaScript
    import { ApifyClient } from "apify-client";

    const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

    const run = await client.actor("kbV7IqCW7tszfXB96").call({
        searchQuery: "CRISPR gene editing",
        publicationYear: 2024,
        minCitations: 10,
        openAccessOnly: true,
        sortBy: "cited_by_count:desc",
        maxResults: 200,
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((paper) => {
        console.log(`${paper.title} (${paper.citedByCount} citations)`);
    });
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| searchQuery | String | Yes | -- | Keyword search across titles, abstracts, and full text. Example: "machine learning healthcare" |
| publicationYear | Integer | No | -- | Filter results to a specific publication year. Example: 2024 |
| minCitations | Integer | No | -- | Only include papers with at least this many citations. Example: 50 |
| openAccessOnly | Boolean | No | false | When enabled, returns only papers that are freely available as open access |
| sortBy | String | No | relevance_score:desc | Sort order: relevance_score:desc, cited_by_count:desc, or publication_date:desc |
| maxResults | Integer | No | 100 | Maximum number of papers to return, between 1 and 10,000 |
Example input (JSON)
    {
        "searchQuery": "transformer neural network architecture",
        "publicationYear": 2023,
        "minCitations": 25,
        "openAccessOnly": true,
        "sortBy": "cited_by_count:desc",
        "maxResults": 500
    }
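Before starting a run, it can help to check an input against the defaults and ranges in the parameter table. The following is a minimal sketch of such a check, not the actor's own validation code; `normalize_input` and `ALLOWED_SORTS` are hypothetical names introduced for illustration.

```python
# Sort orders listed in the input parameter table.
ALLOWED_SORTS = {"relevance_score:desc", "cited_by_count:desc", "publication_date:desc"}


def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and range checks to a run input.

    Hypothetical helper: the actor's real validation logic is not shown in
    this README, so this only mirrors the parameter table.
    """
    query = str(raw.get("searchQuery", "")).strip()
    if not query:
        raise ValueError("searchQuery is required")

    sort_by = raw.get("sortBy", "relevance_score:desc")
    if sort_by not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sortBy: {sort_by!r}")

    max_results = int(raw.get("maxResults", 100))
    if not 1 <= max_results <= 10_000:
        raise ValueError("maxResults must be between 1 and 10,000")

    clean = {
        "searchQuery": query,
        "openAccessOnly": bool(raw.get("openAccessOnly", False)),
        "sortBy": sort_by,
        "maxResults": max_results,
    }
    # Optional filters are only included when the caller supplied them.
    for key in ("publicationYear", "minCitations"):
        if raw.get(key) is not None:
            clean[key] = int(raw[key])
    return clean


print(normalize_input({
    "searchQuery": "transformer neural network architecture",
    "maxResults": 500,
}))
```

The returned dictionary can be passed as `run_input` to the client calls shown in the How to use section.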
Tips for input configuration
- Multi-word queries like "deep learning medical imaging" return more relevant results than single broad keywords.
- Combine filters -- use publication year and minimum citations together to find high-impact recent papers.
- Start small -- begin with 100 results to validate your query, then increase maxResults for comprehensive collection.
- Sort by citations when exploring a new field to identify landmark papers and key references first.
Output
Each paper in the output dataset contains 14 structured fields:
    {
        "openAlexId": "https://openalex.org/W2741809807",
        "doi": "https://doi.org/10.1038/s41586-021-03819-2",
        "title": "Highly accurate protein structure prediction with AlphaFold",
        "publicationYear": 2021,
        "citedByCount": 18542,
        "type": "article",
        "authors": [
            "John Jumper",
            "Richard Evans",
            "Alexander Pritzel",
            "Tim Green",
            "Michael Figurnov"
        ],
        "authorAffiliations": [
            "DeepMind Technologies",
            "European Molecular Biology Laboratory"
        ],
        "journalName": "Nature",
        "publisherName": "Springer Nature",
        "isOpenAccess": true,
        "oaUrl": "https://www.nature.com/articles/s41586-021-03819-2.pdf",
        "concepts": [
            "Protein structure prediction",
            "Computational biology",
            "Artificial intelligence",
            "Deep learning",
            "Structural biology"
        ],
        "extractedAt": "2026-02-19T14:30:00.000Z"
    }
Output fields reference
| Field | Type | Description |
|---|---|---|
| openAlexId | String | Unique OpenAlex identifier URL for the work |
| doi | String/null | Digital Object Identifier URL, if available |
| title | String | Full title of the paper |
| publicationYear | Integer | Year the paper was published |
| citedByCount | Integer | Total number of citations recorded in OpenAlex |
| type | String | Work type (e.g., article, book-chapter, dissertation, preprint) |
| authors | String[] | List of author display names |
| authorAffiliations | String[] | Deduplicated list of institutional affiliations across all authors |
| journalName | String/null | Name of the journal or venue |
| publisherName | String/null | Name of the publisher organization |
| isOpenAccess | Boolean | Whether the paper is freely available |
| oaUrl | String/null | Direct URL to the open access version, if available |
| concepts | String[] | Top 5 OpenAlex concepts ranked by relevance score |
| extractedAt | String | ISO 8601 timestamp of when the data was extracted |
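Because every record shares the same fields, a downloaded dataset can be summarized with a few lines of standard-library Python. The sketch below is illustrative only: `summarize` is a hypothetical helper, and the two sample records are placeholders rather than real papers.

```python
from collections import Counter
from statistics import mean


def summarize(papers: list[dict]) -> dict:
    """Compute a few aggregate figures from output records (hypothetical helper)."""
    if not papers:
        return {"papers": 0, "openAccessShare": 0.0, "meanCitations": 0, "topConcepts": []}
    # Count how often each concept tag appears across all papers.
    concept_counts = Counter(c for p in papers for c in p.get("concepts", []))
    return {
        "papers": len(papers),
        "openAccessShare": sum(1 for p in papers if p["isOpenAccess"]) / len(papers),
        "meanCitations": mean(p["citedByCount"] for p in papers),
        "topConcepts": concept_counts.most_common(3),
    }


# Placeholder records that use only the documented field names.
sample = [
    {"title": "Paper A", "citedByCount": 10, "isOpenAccess": True, "concepts": ["Biology", "Genetics"]},
    {"title": "Paper B", "citedByCount": 30, "isOpenAccess": False, "concepts": ["Biology"]},
]
print(summarize(sample))
```

In practice, `papers` would be the list of items read from the run's dataset instead of the placeholder list.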
Use cases
- Systematic literature reviews -- Search and collect thousands of papers on a specific topic with structured metadata for analysis in tools like Zotero, Mendeley, or custom databases.
- Bibliometric analysis -- Track citation counts, identify top-cited papers, and analyze publication trends across years, journals, and institutions.
- Research trend monitoring -- Schedule recurring runs to detect new publications in your field and receive alerts when papers matching your criteria appear.
- Grant writing and proposals -- Quickly gather evidence of the research landscape in a specific area, including key authors, institutions, and publication volumes.
- Academic data pipelines -- Feed structured paper data into data warehouses, dashboards, or machine learning models for research intelligence applications.
- Open access discovery -- Filter for open access papers to build reading lists of freely available research without journal subscription barriers.
- Competitive intelligence for R&D -- Monitor what competitors, partner institutions, or specific research groups are publishing and how often their work is cited.
- Citation network analysis -- Use citation counts and concept tags to map relationships between research topics and identify emerging interdisciplinary fields.
- Course material curation -- Educators can search for highly-cited papers on specific topics to build reading lists and course bibliographies.
- Journalism and science communication -- Reporters can quickly find authoritative, highly-cited sources on scientific topics for fact-checking and story research.
API & integration
Python
    from apify_client import ApifyClient

    client = ApifyClient("YOUR_API_TOKEN")

    run = client.actor("kbV7IqCW7tszfXB96").call(run_input={
        "searchQuery": "climate change mitigation strategies",
        "minCitations": 100,
        "sortBy": "cited_by_count:desc",
        "maxResults": 50,
    })

    dataset = client.dataset(run["defaultDatasetId"])
    for paper in dataset.iterate_items():
        authors = ", ".join(paper["authors"][:3])
        print(f"[{paper['citedByCount']}] {paper['title']} -- {authors}")
JavaScript
    import { ApifyClient } from "apify-client";

    const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

    const run = await client.actor("kbV7IqCW7tszfXB96").call({
        searchQuery: "climate change mitigation strategies",
        minCitations: 100,
        sortBy: "cited_by_count:desc",
        maxResults: 50,
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((paper) => {
        const authors = paper.authors.slice(0, 3).join(", ");
        console.log(`[${paper.citedByCount}] ${paper.title} -- ${authors}`);
    });
cURL
    curl "https://api.apify.com/v2/acts/kbV7IqCW7tszfXB96/runs" \
      -X POST \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_API_TOKEN" \
      -d '{
        "searchQuery": "climate change mitigation strategies",
        "minCitations": 100,
        "sortBy": "cited_by_count:desc",
        "maxResults": 50
      }'
Available integrations
- Google Sheets -- Export paper data to spreadsheets for collaborative literature review
- Slack / Email -- Get notifications when new papers matching your criteria are published
- Zapier / Make -- Route results into custom workflows for research tracking
- Webhooks -- Push data to your own API endpoints for automated processing
- Amazon S3 / Google Cloud Storage -- Archive results to cloud storage for long-term analysis
How it works
- Input validation -- The actor validates the search query and configuration parameters.
- URL construction -- Builds the OpenAlex API request URL with search terms, filters (year, citations, open access), sort order, and field selection.
- API pagination -- Fetches results in pages of 200 using the OpenAlex polite pool (mailto parameter) for reliable throughput.
- Data transformation -- Each raw API result is cleaned and normalized: authorships are flattened to author name arrays, institutional affiliations are extracted and deduplicated, top 5 concepts are selected by score, and journal/publisher metadata is extracted from primary location data.
- Incremental storage -- Transformed papers are pushed to the Apify dataset in batches as each page completes.
- Completion -- The actor stops when it reaches the requested maximum results, exhausts available matches, or hits the 10,000-result API limit.
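The URL construction and pagination steps above can be sketched roughly as follows. This is an assumption about how such a request might be assembled, not the actor's actual source: `build_page_url` and `pages_needed` are hypothetical names, the mapping from input fields to filters is a guess, and the query parameters (`search`, `filter`, `sort`, `per-page`, `page`, `mailto`) follow the public OpenAlex works endpoint as commonly documented.

```python
import math
from urllib.parse import urlencode

OPENALEX_WORKS_URL = "https://api.openalex.org/works"
PAGE_SIZE = 200          # results per API page, as described above
API_RESULT_CAP = 10_000  # paginated-access limit described above


def build_page_url(params: dict, page: int, mailto: str) -> str:
    """Build one paged works request from actor-style input (assumed mapping)."""
    filters = []
    if params.get("publicationYear") is not None:
        filters.append(f"publication_year:{params['publicationYear']}")
    if params.get("minCitations"):
        # "At least N citations" expressed as "more than N - 1".
        filters.append(f"cited_by_count:>{params['minCitations'] - 1}")
    if params.get("openAccessOnly"):
        filters.append("open_access.is_oa:true")

    query = {
        "search": params["searchQuery"],
        "sort": params.get("sortBy", "relevance_score:desc"),
        "per-page": PAGE_SIZE,
        "page": page,
        "mailto": mailto,  # identifies the caller for the polite pool
    }
    if filters:
        query["filter"] = ",".join(filters)
    return f"{OPENALEX_WORKS_URL}?{urlencode(query)}"


def pages_needed(max_results: int) -> int:
    """Number of 200-result pages required, capped at the API limit."""
    return math.ceil(min(max_results, API_RESULT_CAP) / PAGE_SIZE)


example = {"searchQuery": "CRISPR gene editing", "publicationYear": 2024, "minCitations": 10}
print(build_page_url(example, page=1, mailto="you@example.com"))
print(pages_needed(500))
```

With these assumptions, a request for 500 papers needs three pages, and the last page is trimmed to the requested maximum.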
    Input Query + Filters
            |
            v
    [Build API URL]
            |
            v
    [Fetch Page 1..N] -----> OpenAlex API (200 per page)
            |
            v
    [Transform Results]
      - Flatten authors
      - Deduplicate affiliations
      - Extract top 5 concepts
      - Map journal/publisher
            |
            v
    [Push to Dataset] -----> JSON / CSV / Excel
            |
            v
    [Next Page or Done]
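The transform stage in the flow above might look something like the sketch below. It is not the actor's real code: `transform_work` is a hypothetical function, and the raw field names (`authorships`, `institutions`, `concepts`, `primary_location`, `open_access`) are assumed from the public shape of an OpenAlex work object.

```python
from datetime import datetime, timezone


def transform_work(work: dict) -> dict:
    """Map one raw OpenAlex work object onto the 14 output fields (sketch)."""
    authorships = work.get("authorships") or []

    # Flatten authorships to a list of author display names.
    authors = [
        a["author"]["display_name"]
        for a in authorships
        if (a.get("author") or {}).get("display_name")
    ]

    # Collect institution names across all authors, keeping first-seen order.
    affiliations: list[str] = []
    for a in authorships:
        for inst in a.get("institutions") or []:
            name = inst.get("display_name")
            if name and name not in affiliations:
                affiliations.append(name)

    # Keep the five highest-scoring concepts.
    ranked = sorted(
        work.get("concepts") or [],
        key=lambda c: c.get("score") or 0,
        reverse=True,
    )
    concepts = [c["display_name"] for c in ranked[:5]]

    # Journal and publisher come from the primary location's source, if any.
    source = (work.get("primary_location") or {}).get("source") or {}
    open_access = work.get("open_access") or {}

    return {
        "openAlexId": work.get("id"),
        "doi": work.get("doi"),
        "title": work.get("title"),
        "publicationYear": work.get("publication_year"),
        "citedByCount": work.get("cited_by_count", 0),
        "type": work.get("type"),
        "authors": authors,
        "authorAffiliations": affiliations,
        "journalName": source.get("display_name"),
        "publisherName": source.get("host_organization_name"),
        "isOpenAccess": bool(open_access.get("is_oa")),
        "oaUrl": open_access.get("oa_url"),
        "concepts": concepts,
        "extractedAt": datetime.now(timezone.utc).isoformat(),
    }
```

Missing or null sections of the raw record fall back to empty lists or `None`, which matches the String/null types in the output fields reference.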
Performance & cost
The actor runs on 256 MB memory and calls a free API, making it extremely cost-effective.
| Scenario | Papers | Approx. Duration | Approx. Cost |
|---|---|---|---|
| Quick search | 100 | 5-10 seconds | < $0.01 |
| Medium batch | 500 | 15-20 seconds | ~$0.01 |
| Large batch | 1,000 | 20-40 seconds | ~$0.01 |
| Full extraction | 5,000 | 2-4 minutes | ~$0.02-0.03 |
| Maximum run | 10,000 | 4-8 minutes | ~$0.03-0.05 |
Costs reflect Apify platform compute charges only. The OpenAlex API itself is completely free. Actual costs depend on your Apify subscription plan and current platform pricing.
Limitations
- 10,000 results maximum -- The OpenAlex API limits paginated access to 10,000 results per query. For larger datasets, split searches by publication year or use multiple targeted queries.
- No author-specific search -- The search query matches across titles, abstracts, and full text. There is no dedicated author name filter; include author names in the search query for approximate matching.
- Single year filter -- Publication year filtering supports a single year only, not date ranges. Run separate queries for multi-year analysis.
- Citation counts are cumulative -- Citation numbers reflect totals at the time of the API call and may differ from counts on publisher websites due to different data sources.
- Concept coverage varies -- OpenAlex assigns concepts algorithmically. Newer or niche papers may have fewer or less precise concept tags.
- Rate limiting -- While the actor uses the OpenAlex polite pool for higher throughput, very rapid successive runs may encounter temporary rate limits from the API.
- English-centric search -- The full-text search works best with English-language queries. Non-English papers are indexed but search relevance may vary.
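The 10,000-result cap and the single-year filter can both be worked around by issuing one run per publication year and merging the results. The sketch below shows one way to prepare those inputs; `inputs_for_years` and `merge_unique` are hypothetical helpers, and the actual runs would still be started with the client calls shown earlier.

```python
def inputs_for_years(base_input: dict, first_year: int, last_year: int) -> list[dict]:
    """Create one run input per publication year in an inclusive range."""
    if first_year > last_year:
        raise ValueError("first_year must not be after last_year")
    return [
        {**base_input, "publicationYear": year}
        for year in range(first_year, last_year + 1)
    ]


def merge_unique(batches: list[list[dict]]) -> list[dict]:
    """Combine result batches, keeping the first record seen per openAlexId."""
    merged: dict[str, dict] = {}
    for batch in batches:
        for paper in batch:
            merged.setdefault(paper["openAlexId"], paper)
    return list(merged.values())


base = {"searchQuery": "climate change mitigation strategies", "maxResults": 10_000}
for run_input in inputs_for_years(base, 2021, 2023):
    # Each of these dictionaries would be passed as run_input to one actor run.
    print(run_input["publicationYear"], run_input["maxResults"])
```

Deduplicating on `openAlexId` matters mainly when overlapping targeted queries are combined; runs split strictly by year should not normally return the same work twice.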
Responsible use
- Respect OpenAlex terms of service -- This actor accesses a free, open API. Use reasonable request volumes and avoid unnecessary repeated extraction of the same data.
- Cite sources properly -- When using extracted paper data in research, publications, or applications, cite the original papers using the DOI links provided in the output.
- Do not misrepresent data -- Citation counts and metadata are snapshots at extraction time. Do not present them as definitive or real-time metrics without noting the extraction date.
- Schedule responsibly -- When setting up recurring runs, use reasonable intervals (daily or weekly) rather than continuous polling to be a good citizen of the OpenAlex API ecosystem.
- Comply with copyright -- This actor extracts metadata about scholarly works, not full-text content. Accessing full papers via open access URLs is subject to each publisher's terms.
FAQ
Do I need an API key to use this actor?
No. OpenAlex is a free, open academic database that requires no API key or authentication. This actor works immediately with no additional setup.
How current is the OpenAlex data?
OpenAlex updates its index continuously and typically includes new papers within days of publication. The database covers works from the 1600s through the present, with the most comprehensive coverage of modern scholarly literature.
What types of academic works does this actor find?
OpenAlex indexes journal articles, conference papers, book chapters, dissertations, preprints, datasets, and other scholarly work types. The work type appears in the type field of each output record.
Can I search for papers by a specific author?
Include the author's name in the search query for approximate matching. For more targeted author-based searches, consider using the OpenAlex Research Papers actor.
What is the maximum number of results per run?
You can retrieve up to 10,000 papers per run. For larger datasets, run multiple searches with different filter combinations, such as splitting by publication year.
How are concepts assigned to papers?
OpenAlex uses an automated algorithm to tag each paper with relevant concepts and a confidence score. This actor returns the top 5 concepts by score for each paper.
Can I filter by journal or publisher?
Not directly through this actor's input. However, you can post-process the output dataset to filter by journalName or publisherName fields.
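As a rough illustration of that post-processing step, the sketch below filters downloaded records by journal name. `filter_by_journal` is a hypothetical helper, the match is an exact case-insensitive comparison, and the sample records are placeholders.

```python
def filter_by_journal(papers: list[dict], journal_names: list[str]) -> list[dict]:
    """Keep papers whose journalName matches one of the given names (case-insensitive)."""
    wanted = {name.casefold() for name in journal_names}
    return [
        paper for paper in papers
        # journalName can be null, so fall back to an empty string.
        if (paper.get("journalName") or "").casefold() in wanted
    ]


# Placeholder records that use only the documented field names.
records = [
    {"title": "Paper A", "journalName": "Nature"},
    {"title": "Paper B", "journalName": None},
    {"title": "Paper C", "journalName": "Science"},
]
for paper in filter_by_journal(records, ["nature"]):
    print(paper["title"])
```

The same pattern works for `publisherName` by swapping the field that is compared.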
How do I get only open access papers?
Set the openAccessOnly input parameter to true. The output will include only papers where isOpenAccess is true, and the oaUrl field will contain a direct link to the freely available version.
Can I schedule this actor to run automatically?
Yes. Use Apify's built-in scheduling to run the actor daily, weekly, or at any custom interval. Combine with Slack or email integrations to receive notifications when new papers are found.
How accurate are the citation counts?
Citation counts come from the OpenAlex database, which aggregates data from multiple sources. They are generally reliable for comparative analysis but may differ from counts on Google Scholar, Scopus, or Web of Science due to different coverage and update frequencies.
What happens if my search returns no results?
The actor will complete successfully with an empty dataset. Try broadening your search query, removing filters, or checking for typos in your search terms.
Can I export results to CSV or Excel?
Yes. After the run completes, go to the Dataset tab in Apify Console and download in JSON, CSV, Excel, XML, or RSS format. You can also access the data programmatically via the Apify API.
Related actors
| Actor | Description |
|---|---|
| OpenAlex Research Papers | Alternative OpenAlex actor with additional search and filtering options |
| PubMed Biomedical Literature Search | Search biomedical and life science papers via the PubMed/NCBI database |
| Semantic Scholar Paper Search | Search papers using Semantic Scholar's AI-powered academic knowledge graph |
| Crossref Academic Paper Search | Search scholarly metadata via the Crossref DOI registry |
| ArXiv Preprint Paper Search | Search preprints on arXiv across physics, mathematics, computer science, and more |
| CORE Open Access Papers | Search millions of open access research papers from repositories worldwide |
| DBLP Publication Search | Search computer science publications from the DBLP bibliography database |
| Europe PMC Literature Search | Search European biomedical and life science literature via Europe PMC |