CORE Open Access Paper Search
Pricing
from $2.00 / 1,000 papers fetched
Search 300M+ open access academic papers via CORE API. Find research papers by keywords, year range & language. Extract titles, authors, abstracts, DOIs, citation counts, journal names, fields of study & PDF download links. Ideal for literature reviews & research monitoring.
Developer

ryan clinton
Search and extract open access academic papers from CORE -- the world's largest aggregator of open access research with over 300 million metadata records and 40+ million full-text papers. Filter by keyword, year range, and language. Optionally restrict results to papers with downloadable full-text PDFs. Requires a free API key from core.ac.uk.
What does CORE Open Access Paper Search do?
CORE Open Access Paper Search is an Apify actor that connects to the CORE API v3 to search and retrieve structured metadata from the world's largest collection of open access research outputs. CORE harvests content from over 10,000 institutional repositories, journal publishers, and preprint servers across the globe, providing programmatic access to more than 300 million metadata records and over 40 million full-text papers.
This actor lets you search that massive corpus by keywords, filter results by publication year range and language code, and optionally restrict output to only papers that have a downloadable full-text PDF. Each result includes 16 structured fields covering the paper title, author list, abstract, DOI, journal name, publisher, field of study, citation count, document type, language, and direct links to both the CORE page and the downloadable PDF.
The actor handles multi-page API responses automatically, using offset-based pagination with a built-in 200 ms delay between requests to stay within CORE's usage policies. You can retrieve up to 500 papers per run.
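The pagination behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actor's actual source: `collect_papers` and `fetch_page` are hypothetical names, `fetch_page` stands in for a single CORE API request, and the page size of 100 is assumed rather than documented here.

```python
import time
from typing import Callable, Dict, List

MAX_RESULTS_CAP = 500  # per-run limit stated in this README
DELAY_SECONDS = 0.2    # 200 ms pause between page requests


def collect_papers(
    fetch_page: Callable[[int, int], List[Dict]],
    max_results: int = 50,
    page_size: int = 100,  # assumed page size, not confirmed by the source
    delay: float = DELAY_SECONDS,
) -> List[Dict]:
    """Gather up to max_results records using offset-based pagination.

    fetch_page(offset, limit) is a hypothetical stand-in for one API call.
    """
    target = min(max_results, MAX_RESULTS_CAP)
    papers: List[Dict] = []
    offset = 0
    while len(papers) < target:
        limit = min(page_size, target - len(papers))
        page = fetch_page(offset, limit)
        if not page:
            break  # no more matches
        papers.extend(page)
        offset += len(page)
        if len(page) < limit:
            break  # a short page means the result set is exhausted
        if len(papers) < target:
            time.sleep(delay)  # pause before requesting the next page
    return papers


# Usage with an in-memory fake instead of a real API call.
fake_corpus = [{"coreId": i} for i in range(230)]


def fake_fetch(offset: int, limit: int) -> List[Dict]:
    return fake_corpus[offset:offset + limit]


first_150 = collect_papers(fake_fetch, max_results=150, delay=0)
print(len(first_150))
```

Passing the fetch function in as a parameter keeps the loop testable without network access; a real implementation would replace `fake_fetch` with an HTTP call.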
Key capabilities:
- Search across 300M+ open access metadata records and 40M+ full-text papers
- Filter by keyword query, publication year range, and language code
- Restrict to papers with downloadable full-text PDFs only
- Get 16 structured metadata fields per paper including DOI, authors, abstract, citation count, and download URL
- Automatic pagination and rate limiting built in
- Dry-run mode when no API key is provided, with instructions on how to register for free
Why use CORE Open Access Paper Search on Apify?
Running this actor on the Apify platform gives you several advantages over calling the CORE API directly:
- No infrastructure needed. The actor runs in the cloud. No servers to manage, no dependencies to install, no pagination logic to write.
- Scheduled runs. Configure the actor to run on a daily, weekly, or custom schedule to automatically monitor new publications matching your query.
- Built-in integrations. Export results directly to Google Sheets, Slack, Zapier, Make, webhooks, or any other system through the Apify integration ecosystem.
- Scalable data collection. Retrieve up to 500 papers per run with automatic pagination across multiple API pages, all handled transparently.
- Structured output. Results come as clean, normalized JSON records ready for analysis, database import, or feeding into downstream actors and workflows.
- API and SDK access. Trigger runs and retrieve results programmatically using the Apify API or official Python and JavaScript client libraries.
- Dataset management. Store, version, and export datasets in JSON, CSV, Excel, XML, or RSS formats directly from the Apify console.
How to get a free CORE API key
This actor requires a CORE API key for live searches. The key is completely free to obtain:
- Visit https://core.ac.uk/services/api
- Click "Register" and create an account
- After registration, your API key will be available in your CORE dashboard
- Copy the key and paste it into the `apiKey` field when configuring this actor
The free tier provides generous daily request limits that are more than sufficient for most research and data collection workflows.
If you run the actor without providing an API key, it performs a dry run -- returning a message that confirms your query configuration and explains how to register for a key. This lets you verify your input settings before committing to a live search.
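The choice between a live search and a dry run might look roughly like the sketch below. The exact shape of the actor's dry-run message is not documented here, so the function name `plan_run` and the keys `mode`, `settings`, and `message` are hypothetical.

```python
from typing import Any, Dict

REGISTER_URL = "https://core.ac.uk/services/api"


def plan_run(actor_input: Dict[str, Any]) -> Dict[str, Any]:
    """Decide between a live search and a dry run based on the API key."""
    api_key = (actor_input.get("apiKey") or "").strip()
    # Echo back every setting except the key itself.
    settings = {k: v for k, v in actor_input.items() if k != "apiKey"}
    if api_key:
        return {"mode": "live", "settings": settings}
    return {
        "mode": "dry-run",
        "settings": settings,
        "message": f"No API key provided. Register for a free key at {REGISTER_URL}.",
    }


result = plan_run({"query": "graph neural networks", "maxResults": 10})
print(result["mode"], "-", result["message"])
```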
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `apiKey` | String | No | -- | Your CORE API key. Register free at core.ac.uk/services/api. Without a key, the actor performs a dry run. |
| `query` | String | Yes | -- | Keywords to search for in academic papers. Supports Boolean operators (AND, OR, NOT). |
| `yearFrom` | Integer | No | -- | Filter papers published from this year onwards (e.g., 2020). |
| `yearTo` | Integer | No | -- | Filter papers published up to and including this year (e.g., 2025). |
| `language` | String | No | -- | ISO 639-1 language code to filter results (e.g., "en", "de", "fr", "es", "zh"). |
| `fullTextOnly` | Boolean | No | false | When enabled, only papers with a downloadable full-text PDF are returned. |
| `maxResults` | Integer | No | 50 | Maximum number of papers to retrieve per run (up to 500). |
Input example
```json
{
  "apiKey": "YOUR_CORE_API_KEY",
  "query": "large language models",
  "yearFrom": 2022,
  "yearTo": 2025,
  "language": "en",
  "fullTextOnly": true,
  "maxResults": 100
}
```
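Before starting a run, an input object like the one above can be checked locally. The sketch below is a hypothetical helper (`validate_input` is not part of the actor) that mirrors the constraints listed in the parameter table. The lower bound of 1 for `maxResults` and the two-letter check for `language` are assumptions.

```python
from typing import Any, Dict, List


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_input(actor_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input looks fine."""
    problems: List[str] = []

    query = actor_input.get("query")
    if not isinstance(query, str) or not query.strip():
        problems.append("query is required and must be a non-empty string")

    year_from = actor_input.get("yearFrom")
    year_to = actor_input.get("yearTo")
    for name, value in (("yearFrom", year_from), ("yearTo", year_to)):
        if value is not None and not _is_int(value):
            problems.append(f"{name} must be an integer")
    if _is_int(year_from) and _is_int(year_to) and year_from > year_to:
        problems.append("yearFrom must not be later than yearTo")

    language = actor_input.get("language")
    if language is not None and not (
        isinstance(language, str) and len(language) == 2 and language.isalpha()
    ):
        problems.append("language should be a two-letter ISO 639-1 code")

    max_results = actor_input.get("maxResults", 50)
    if not _is_int(max_results) or not 1 <= max_results <= 500:
        problems.append("maxResults must be an integer between 1 and 500")

    return problems


example = {
    "apiKey": "YOUR_CORE_API_KEY",
    "query": "large language models",
    "yearFrom": 2022,
    "yearTo": 2025,
    "language": "en",
    "fullTextOnly": True,
    "maxResults": 100,
}
print(validate_input(example))
```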
Output format
Each paper in the output dataset is a JSON object with 16 fields:
| Field | Type | Description |
|---|---|---|
| `coreId` | Number | Unique CORE identifier for the paper |
| `doi` | String or null | Digital Object Identifier, if available |
| `title` | String | Title of the paper |
| `authors` | Array of Strings | List of author names |
| `abstract` | String or null | Paper abstract text |
| `yearPublished` | Number or null | Year of publication |
| `publisher` | String or null | Publisher name |
| `journalName` | String or null | Name of the journal |
| `downloadUrl` | String or null | Direct URL to download the full-text PDF |
| `sourceFulltextUrls` | Array of Strings | Additional URLs where the full text is available |
| `fieldOfStudy` | String or null | Primary field of study |
| `citationCount` | Number or null | Number of citations |
| `language` | String or null | Language code of the paper |
| `documentType` | String or null | Type of document (e.g., research-article, thesis) |
| `coreUrl` | String | URL to the paper's page on core.ac.uk |
| `extractedAt` | String | ISO 8601 timestamp of when the data was extracted |
Output example
```json
{
  "coreId": 287146253,
  "doi": "10.1038/s41586-023-06221-2",
  "title": "Scaling language models: Methods, analysis & insights from training Gopher",
  "authors": [
    "Jack W. Rae",
    "Sebastian Borgeaud",
    "Trevor Cai",
    "Katie Millican",
    "Jordan Hoffmann"
  ],
  "abstract": "Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge. This paper presents an analysis of Transformer-based language model performance across a wide range of model scales...",
  "yearPublished": 2022,
  "publisher": "Nature Publishing Group",
  "journalName": "Nature",
  "downloadUrl": "https://core.ac.uk/download/287146253.pdf",
  "sourceFulltextUrls": ["https://arxiv.org/pdf/2112.11446"],
  "fieldOfStudy": "Computer Science",
  "citationCount": 1542,
  "language": "en",
  "documentType": "research-article",
  "coreUrl": "https://core.ac.uk/works/287146253",
  "extractedAt": "2026-02-10T14:30:00.000Z"
}
```
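Because several output fields can be null, downstream code usually benefits from a small typed wrapper. The sketch below is one possible way to load a record. The `Paper` class and its `short_citation` helper are hypothetical and cover only a subset of the 16 fields.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Paper:
    core_id: int
    title: str
    core_url: str
    extracted_at: str
    authors: List[str] = field(default_factory=list)
    doi: Optional[str] = None
    year_published: Optional[int] = None
    download_url: Optional[str] = None
    citation_count: Optional[int] = None

    @classmethod
    def from_record(cls, record: Dict[str, Any]) -> "Paper":
        """Build a Paper from one dataset item, tolerating null fields."""
        return cls(
            core_id=record["coreId"],
            title=record["title"],
            core_url=record["coreUrl"],
            extracted_at=record["extractedAt"],
            authors=list(record.get("authors") or []),
            doi=record.get("doi"),
            year_published=record.get("yearPublished"),
            download_url=record.get("downloadUrl"),
            citation_count=record.get("citationCount"),
        )

    def short_citation(self) -> str:
        """Format 'First Author et al. (year). Title' with fallbacks for nulls."""
        first = self.authors[0] if self.authors else "Unknown author"
        suffix = " et al." if len(self.authors) > 1 else ""
        year = self.year_published if self.year_published is not None else "n.d."
        return f"{first}{suffix} ({year}). {self.title}"


# A trimmed version of the output example above.
record = {
    "coreId": 287146253,
    "title": "Scaling language models: Methods, analysis & insights from training Gopher",
    "authors": ["Jack W. Rae", "Sebastian Borgeaud"],
    "yearPublished": 2022,
    "doi": None,
    "coreUrl": "https://core.ac.uk/works/287146253",
    "extractedAt": "2026-02-10T14:30:00.000Z",
}
paper = Paper.from_record(record)
print(paper.short_citation())
```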
How to use CORE Open Access Paper Search
Step 1: Get your free API key
Register at core.ac.uk/services/api to obtain a free CORE API key. The registration takes less than a minute.
Step 2: Configure your search
Enter your API key, search query, and any optional filters. You can test your configuration first by leaving the API key blank -- the actor will perform a dry run and confirm your query settings without making any API calls.
Step 3: Run the actor
Click "Start" in the Apify console, or trigger the run programmatically via the API. The actor will search CORE, paginate through all matching results, and push structured paper records to the output dataset.
Step 4: Export your results
Download the dataset in JSON, CSV, Excel, XML, or RSS format. You can also connect integrations to automatically forward results to Google Sheets, Slack, Zapier, Make, or your own webhook endpoint.
How much does it cost to run?
CORE Open Access Paper Search is extremely lightweight. It makes HTTP API calls to the CORE v3 endpoint without any browser rendering, so compute costs are minimal.
| Scenario | Papers | Approximate run time | Estimated Apify cost |
|---|---|---|---|
| Quick test | 10 | 5-10 seconds | ~$0.001 |
| Standard run | 50 | 10-30 seconds | ~$0.002 |
| Medium batch | 200 | 30-60 seconds | ~$0.005 |
| Maximum run | 500 | 1-2 minutes | ~$0.01 |
- Memory usage: 256 MB RAM
- CORE API key: Free to register with generous daily request limits
- No browser required: Pure API calls keep costs extremely low
Programmatic access
You can trigger this actor and retrieve results programmatically using the Apify API or the official client libraries.
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_API_TOKEN")

# Actor input: same fields as in the input parameters table
run_input = {
    "apiKey": "YOUR_CORE_API_KEY",
    "query": "transformer neural networks",
    "yearFrom": 2023,
    "yearTo": 2025,
    "language": "en",
    "fullTextOnly": True,
    "maxResults": 100,
}

# Start the actor and wait for the run to finish
run = client.actor("Jh4Y6VfuSZkxkF8eq").call(run_input=run_input)

# Iterate over the papers stored in the run's default dataset
for paper in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{paper['title']} ({paper['yearPublished']})")
    print(f" DOI: {paper['doi']}")
    print(f" Authors: {', '.join(paper['authors'])}")
    print(f" Download: {paper['downloadUrl']}")
    print()
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_APIFY_API_TOKEN" });

const run = await client.actor("Jh4Y6VfuSZkxkF8eq").call({
  apiKey: "YOUR_CORE_API_KEY",
  query: "renewable energy storage",
  yearFrom: 2022,
  yearTo: 2025,
  fullTextOnly: true,
  maxResults: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const paper of items) {
  console.log(`${paper.title} (${paper.yearPublished})`);
  console.log(` DOI: ${paper.doi}`);
  console.log(` Authors: ${paper.authors.join(", ")}`);
  console.log(` Download: ${paper.downloadUrl}`);
}
```
cURL
```shell
curl -X POST "https://api.apify.com/v2/acts/Jh4Y6VfuSZkxkF8eq/runs?token=YOUR_APIFY_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "apiKey": "YOUR_CORE_API_KEY",
    "query": "CRISPR gene editing",
    "yearFrom": 2020,
    "fullTextOnly": true,
    "maxResults": 50
  }'
```
Tips for best results
- Use specific search terms. Broad queries like "science" or "biology" will match millions of records. Use precise phrases, combine multiple keywords, or use Boolean operators (AND, OR, NOT) directly in the query field for more targeted results.
- Combine year filters with keywords. If you are tracking recent developments in a field, set `yearFrom` to the current year or the last few years. This dramatically narrows the result set and improves relevance.
- Enable the full-text filter when you need PDFs. If your workflow involves downloading and reading actual papers, set `fullTextOnly` to true. This ensures every result in your output has a working `downloadUrl` pointing to the full-text PDF.
- Use language filtering for non-English research. CORE indexes papers in dozens of languages. Use the language filter with ISO 639-1 codes (e.g., "de" for German, "fr" for French, "zh" for Chinese, "es" for Spanish) to find research that may be underrepresented in English-centric databases.
- Test with a small `maxResults` first. Start with 10-20 results to verify your query returns relevant papers before scaling up to 500. This saves time and lets you iterate on your search terms quickly.
- Schedule regular runs. Set up a recurring schedule on Apify to monitor new publications matching your query on a daily or weekly basis. Combine with Slack or email integrations to get notified when new papers are found.
- Use Boolean operators in queries. The CORE API supports AND, OR, and NOT operators directly in the query string. For example, `"deep learning" AND "medical imaging" NOT survey` will find deep learning papers about medical imaging while excluding survey papers.
- Leverage the dry-run mode. Before entering your API key, run the actor without one to confirm that your query and filter settings are configured correctly. The dry-run output will show you the exact query that would be sent to CORE.
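Boolean queries like the one in the tips above can be assembled programmatically. The helper below is a hypothetical convenience (`build_query` is not part of the actor or the CORE API). It only uses the AND and NOT operators mentioned in this README and quotes multi-word terms as phrases.

```python
from typing import Iterable


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they read as phrases."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_query(all_of: Iterable[str], none_of: Iterable[str] = ()) -> str:
    """Join required terms with AND and append excluded terms with NOT."""
    query = " AND ".join(quote(term) for term in all_of)
    for term in none_of:
        query = f"{query} NOT {quote(term)}".strip()
    return query


query = build_query(["deep learning", "medical imaging"], none_of=["survey"])
print(query)
```

The resulting string can be passed as the `query` input parameter.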
FAQ
Do I need a CORE API key to use this actor?
Yes, a CORE API key is required for live searches. Without one, the actor performs a dry run and returns a message explaining how to register. The key is completely free -- register at core.ac.uk/services/api and you will receive your key immediately.
What is CORE and how is it different from Google Scholar?
CORE (COnnecting REpositories) is the world's largest aggregator of open access research papers, harvesting content from over 10,000 data providers worldwide. It indexes more than 300 million metadata records and over 40 million full-text papers. Unlike Google Scholar, CORE focuses exclusively on open access content, although only a subset of its records (the 40+ million full-text papers) include the full text itself. CORE also provides a structured API, making it ideal for programmatic access and bulk data retrieval.
Can I download the full PDF of papers?
Many papers in CORE have direct PDF download links. When you enable the fullTextOnly filter, the actor only returns papers that have a confirmed downloadable full-text URL. The downloadUrl field in the output contains the direct link to the PDF file. Additionally, the sourceFulltextUrls array may contain alternative download locations from the original repository or publisher.
How many papers can I retrieve per run?
The actor supports up to 500 papers per run. For larger datasets, you can run the actor multiple times with different queries, year ranges, or language filters, and merge the results using Apify's dataset management features or your own downstream processing pipeline.
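One way to work around the 500-paper limit is to split a year range into one run per year and merge the results afterwards. The sketch below assumes that approach. `split_by_year` and `merge_runs` are hypothetical helpers, and deduplicating on `coreId` is an assumption about what counts as the same paper.

```python
from typing import Any, Dict, Iterable, List


def split_by_year(base_input: Dict[str, Any], year_from: int, year_to: int) -> List[Dict[str, Any]]:
    """Create one actor input per publication year, each at the per-run maximum."""
    return [
        {**base_input, "yearFrom": year, "yearTo": year, "maxResults": 500}
        for year in range(year_from, year_to + 1)
    ]


def merge_runs(runs: Iterable[Iterable[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Concatenate records from several runs, keeping the first copy of each coreId."""
    seen = set()
    merged: List[Dict[str, Any]] = []
    for records in runs:
        for record in records:
            core_id = record["coreId"]
            if core_id not in seen:
                seen.add(core_id)
                merged.append(record)
    return merged


inputs = split_by_year({"query": "CRISPR gene editing"}, 2020, 2022)
merged = merge_runs([
    [{"coreId": 1}, {"coreId": 2}],
    [{"coreId": 2}, {"coreId": 3}],
])
print(len(inputs), [record["coreId"] for record in merged])
```

Each dictionary in `inputs` can be passed as `run_input` to a separate actor run, and the dataset items from those runs fed into `merge_runs`.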
What fields can I use for filtering?
You can filter by keyword query (which searches across titles, abstracts, and full text), publication year range (yearFrom and yearTo), and language code. The CORE API also supports advanced query syntax -- you can use Boolean operators (AND, OR, NOT) directly in the search query field for more precise control over your results.
What happens if a search returns zero results?
If your query has no matches, the actor will complete successfully and produce an empty dataset. Try broadening your search terms, removing year or language filters, or disabling the full-text filter to increase the number of matches.
How often is the CORE index updated?
CORE continuously harvests new content from its data providers. New papers are typically indexed within days of being deposited in a participating repository. Scheduling this actor to run regularly will help you capture newly indexed papers as they appear.
What languages are supported?
CORE indexes papers in dozens of languages. Use standard ISO 639-1 language codes in the language field: "en" (English), "de" (German), "fr" (French), "es" (Spanish), "pt" (Portuguese), "zh" (Chinese), "ja" (Japanese), "ko" (Korean), "ru" (Russian), "it" (Italian), "nl" (Dutch), "pl" (Polish), and many more.
Use cases
Systematic literature reviews
Researchers can use this actor to build comprehensive literature review datasets. Search by topic keywords, filter to a specific year range, and export the results to a spreadsheet for screening and annotation. The structured output with DOIs and download links makes it easy to locate and retrieve the full papers.
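For screening in a spreadsheet, the dataset items can also be flattened into a CSV with extra columns for reviewer decisions. This is a minimal sketch using only the standard library. The column selection and the `decision` and `notes` columns are illustrative choices, not something the actor produces.

```python
import csv
import io
from typing import Any, Dict, Iterable

COLUMNS = ["coreId", "title", "authors", "yearPublished", "doi", "journalName", "downloadUrl"]
SCREENING_COLUMNS = ["decision", "notes"]  # hypothetical columns for manual review


def to_screening_csv(records: Iterable[Dict[str, Any]]) -> str:
    """Render records as CSV text with empty screening columns appended."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS + SCREENING_COLUMNS, lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        row = {key: record.get(key) for key in COLUMNS}
        # Author lists are joined into one cell so each paper stays on one row.
        row["authors"] = "; ".join(record.get("authors") or [])
        row.update({key: "" for key in SCREENING_COLUMNS})
        writer.writerow(row)
    return buffer.getvalue()


csv_text = to_screening_csv([
    {"coreId": 1, "title": "A study, revisited", "authors": ["X", "Y"], "yearPublished": 2020, "doi": None},
])
print(csv_text)
```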
Research monitoring and alerting
Schedule the actor to run daily or weekly with your research topic as the query. Connect a Slack or email integration to get notified whenever new open access papers matching your interests are published. This is particularly useful for staying current in fast-moving fields.
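Scheduled runs return every matching paper each time, so an alerting workflow typically needs to remember what it has already seen. The sketch below keeps a list of known `coreId` values in a local JSON file. The `filter_new` helper and the file-based state are assumptions for illustration, and any persistent store would work equally well.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def filter_new(records: List[Dict[str, Any]], state_file: Path) -> List[Dict[str, Any]]:
    """Return records whose coreId was not seen in earlier calls, then update the state."""
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    fresh = [record for record in records if record["coreId"] not in seen]
    seen.update(record["coreId"] for record in fresh)
    state_file.write_text(json.dumps(sorted(seen)))
    return fresh
```

On each scheduled run, pass the dataset items through `filter_new` and forward only the returned records to Slack or email.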
Academic dataset construction
Build structured datasets of academic papers for bibliometric analysis, scientometric research, or training machine learning models. The 16 output fields provide rich metadata including citation counts, fields of study, and document types that are valuable for quantitative research analysis.
Competitive intelligence in research
Track what competitors, collaborators, or specific institutions are publishing by combining author names or institution keywords in your search queries. Monitor publication trends in your field to identify emerging topics and key contributors.
Open access compliance monitoring
Universities and research funders can use this actor to verify that funded research is being deposited in open access repositories. Search by grant keywords or author names and check the availability of full-text PDFs.
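A simple compliance check is the share of returned records that carry a full-text link. The sketch below assumes the run used `fullTextOnly` set to false, so that records without a PDF are included. `full_text_coverage` is a hypothetical helper and the sample records are made up.

```python
from typing import Any, Dict, Iterable


def full_text_coverage(records: Iterable[Dict[str, Any]]) -> float:
    """Fraction of records with a non-empty downloadUrl (0.0 when there are no records)."""
    items = list(records)
    if not items:
        return 0.0
    with_pdf = sum(1 for item in items if item.get("downloadUrl"))
    return with_pdf / len(items)


# Made-up sample records for illustration only.
sample = [
    {"coreId": 1, "downloadUrl": "https://core.ac.uk/download/1.pdf"},
    {"coreId": 2, "downloadUrl": None},
    {"coreId": 3, "downloadUrl": None},
    {"coreId": 4, "downloadUrl": "https://core.ac.uk/download/4.pdf"},
]
coverage = full_text_coverage(sample)
print(f"{coverage:.0%} of records have a full-text PDF")
```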
Content curation and knowledge management
Build curated collections of open access papers for educational resources, reading lists, or internal knowledge bases. The structured metadata makes it easy to organize and categorize papers by field of study, year, or document type.
Integrations
This actor works seamlessly with the Apify platform's integration ecosystem:
- Google Sheets -- Automatically export paper metadata to a spreadsheet for collaborative review and analysis.
- Slack -- Get real-time notifications when new papers matching your query are found during scheduled runs.
- Email -- Receive email digests of newly discovered papers on a recurring schedule.
- Zapier / Make -- Trigger downstream workflows whenever new academic papers are collected.
- Webhooks -- Push results to your own API endpoint for custom processing and storage.
- Amazon S3 -- Store datasets in your own S3 bucket for long-term archival and analysis.
- Google Drive -- Save output files directly to Google Drive for team access.
- GitHub -- Use the Apify API in CI/CD pipelines or research automation scripts.
Related actors
If you are working with academic research data, these related Apify actors may be useful for your workflow:
| Actor | Description |
|---|---|
| Semantic Scholar Paper Search | Search Semantic Scholar for AI-powered academic paper discovery with citation graphs and influence scores. |
| OpenAlex Research Paper Search | Search the OpenAlex database for academic works, authors, institutions, and research topics. |
| PubMed Biomedical Literature Search | Search PubMed and MEDLINE for biomedical and life science research papers with MeSH term filtering. |
| Crossref Academic Paper Search | Search Crossref for scholarly metadata across all academic disciplines with DOI resolution. |
| ArXiv Preprint Paper Search | Search ArXiv for preprint papers in physics, mathematics, computer science, and quantitative biology. |
| Europe PMC Literature Search | Search Europe PMC for life science literature, patents, and clinical guidelines. |
| DBLP Publication Search | Search DBLP for computer science publications, conference proceedings, and journal articles. |
| ORCID Researcher Search | Look up researchers by ORCID ID to find their publication history and affiliations. |