Pricing

from $2.00 / 1,000 papers fetched

CORE Open Access Paper Search

Search 300M+ open access academic papers via CORE API. Find research papers by keywords, year range & language. Extract titles, authors, abstracts, DOIs, citation counts, journal names, fields of study & PDF download links. Ideal for literature reviews & research monitoring.

Developer

ryan clinton

What does CORE Open Access Paper Search do?

CORE Open Access Paper Search is an Apify actor that connects to the CORE API v3 to search and retrieve structured metadata from the world's largest collection of open access research outputs. CORE harvests content from over 10,000 institutional repositories, journal publishers, and preprint servers across the globe, providing programmatic access to more than 300 million metadata records and over 40 million full-text papers.

This actor lets you search that massive corpus by keywords, filter results by publication year range and language code, and optionally restrict output to only papers that have a downloadable full-text PDF. Each result includes 16 structured fields covering the paper title, author list, abstract, DOI, journal name, publisher, field of study, citation count, document type, language, and direct links to both the CORE page and the downloadable PDF.

The actor handles multi-page API responses automatically using offset-based pagination with built-in 200ms delays between requests to stay within CORE's usage policies. You can retrieve up to 500 papers per run.
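The pagination behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual source: `fetch_all`, `fetch_page`, and the page size of 100 are hypothetical names and values chosen for the example, and the real HTTP call is replaced by a stand-in function so the snippet runs offline.

```python
import time
from typing import Callable, Dict, List

Page = List[Dict]


def fetch_all(fetch_page: Callable[[int, int], Page],
              max_results: int = 50,
              page_size: int = 100,   # assumed page size, not taken from the actor
              delay_s: float = 0.2) -> Page:
    """Collect up to max_results records using offset-based pagination."""
    max_results = min(max_results, 500)  # per-run cap stated in the text
    results: Page = []
    offset = 0
    while len(results) < max_results:
        limit = min(page_size, max_results - len(results))
        page = fetch_page(offset, limit)
        if not page:
            break  # nothing left to fetch
        results.extend(page)
        offset += len(page)
        if len(page) < limit:
            break  # short page: the result set is exhausted
        if len(results) < max_results:
            time.sleep(delay_s)  # pause between consecutive requests
    return results


# Stand-in for the real API call: serves slices of 230 fake records.
FAKE_CORPUS = [{"coreId": i} for i in range(230)]


def fake_fetch(offset: int, limit: int) -> Page:
    return FAKE_CORPUS[offset:offset + limit]


papers = fetch_all(fake_fetch, max_results=250, delay_s=0.0)
print(len(papers))  # 230, because the fake corpus runs out before the limit
```

Passing the fetch function in as a parameter keeps the loop testable without network access; in a real client it would wrap the HTTP request.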

Key capabilities:

Search across 300M+ open access metadata records and 40M+ full-text papers
Filter by keyword query, publication year range, and language code
Restrict to papers with downloadable full-text PDFs only
Get 16 structured metadata fields per paper including DOI, authors, abstract, citation count, and download URL
Automatic pagination and rate limiting built in
Dry-run mode when no API key is provided, with instructions on how to register for free

Why use CORE Open Access Paper Search on Apify?

Running this actor on the Apify platform gives you several advantages over calling the CORE API directly:

No infrastructure needed. The actor runs in the cloud. No servers to manage, no dependencies to install, no pagination logic to write.
Scheduled runs. Configure the actor to run on a daily, weekly, or custom schedule to automatically monitor new publications matching your query.
Built-in integrations. Export results directly to Google Sheets, Slack, Zapier, Make, webhooks, or any other system through the Apify integration ecosystem.
Scalable data collection. Retrieve up to 500 papers per run with automatic pagination across multiple API pages, all handled transparently.
Structured output. Results come as clean, normalized JSON records ready for analysis, database import, or feeding into downstream actors and workflows.
API and SDK access. Trigger runs and retrieve results programmatically using the Apify API or official Python and JavaScript client libraries.
Dataset management. Store, version, and export datasets in JSON, CSV, Excel, XML, or RSS formats directly from the Apify console.

How to get a free CORE API key

This actor requires a CORE API key for live searches. The key is completely free to obtain:

Visit https://core.ac.uk/services/api
Click "Register" and create an account
After registration, your API key will be available in your CORE dashboard
Copy the key and paste it into the apiKey field when configuring this actor

The free tier provides generous daily request limits that are more than sufficient for most research and data collection workflows.

If you run the actor without providing an API key, it performs a dry run -- returning a message that confirms your query configuration and explains how to register for a key. This lets you verify your input settings before committing to a live search.

Input parameters

Parameter	Type	Required	Default	Description
`apiKey`	String	No	--	Your CORE API key. Register free at core.ac.uk/services/api. Without a key, the actor performs a dry run.
`query`	String	Yes	--	Keywords to search for in academic papers. Supports Boolean operators (AND, OR, NOT).
`yearFrom`	Integer	No	--	Filter papers published from this year onwards (e.g., 2020).
`yearTo`	Integer	No	--	Filter papers published up to and including this year (e.g., 2025).
`language`	String	No	--	ISO 639-1 language code to filter results (e.g., "en", "de", "fr", "es", "zh").
`fullTextOnly`	Boolean	No	false	When enabled, only papers with a downloadable full-text PDF are returned.
`maxResults`	Integer	No	50	Maximum number of papers to retrieve per run (up to 500).
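Before starting a run, it can help to check an input object against the constraints in the table. The sketch below is a hypothetical client-side helper (`check_input` is not part of the actor or the Apify SDK); the actor may validate input differently.

```python
from typing import Any, Dict, List

MAX_RESULTS_CAP = 500  # per-run limit from the parameter table


def check_input(run_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input looks fine."""
    problems: List[str] = []

    query = run_input.get("query")
    if not isinstance(query, str) or not query.strip():
        problems.append("query is required and must be a non-empty string")

    year_from = run_input.get("yearFrom")
    year_to = run_input.get("yearTo")
    for name, value in (("yearFrom", year_from), ("yearTo", year_to)):
        if value is not None and not isinstance(value, int):
            problems.append(f"{name} must be an integer")
    if isinstance(year_from, int) and isinstance(year_to, int) and year_from > year_to:
        problems.append("yearFrom must not be later than yearTo")

    language = run_input.get("language")
    if language is not None:
        if not (isinstance(language, str) and len(language) == 2 and language.isalpha()):
            problems.append("language should be a two-letter ISO 639-1 code")

    max_results = run_input.get("maxResults", 50)  # 50 is the documented default
    if not isinstance(max_results, int) or not 1 <= max_results <= MAX_RESULTS_CAP:
        problems.append(f"maxResults must be between 1 and {MAX_RESULTS_CAP}")

    return problems


print(check_input({"query": "large language models", "maxResults": 100}))
```

A missing `apiKey` is deliberately not reported as a problem here, since the actor treats it as a request for a dry run.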

Input example

{
    "apiKey": "YOUR_CORE_API_KEY",
    "query": "large language models",
    "yearFrom": 2022,
    "yearTo": 2025,
    "language": "en",
    "fullTextOnly": true,
    "maxResults": 100
}

Output format

Each paper in the output dataset is a JSON object with 16 fields:

Field	Type	Description
`coreId`	Number	Unique CORE identifier for the paper
`doi`	String or null	Digital Object Identifier, if available
`title`	String	Title of the paper
`authors`	Array of Strings	List of author names
`abstract`	String or null	Paper abstract text
`yearPublished`	Number or null	Year of publication
`publisher`	String or null	Publisher name
`journalName`	String or null	Name of the journal
`downloadUrl`	String or null	Direct URL to download the full-text PDF
`sourceFulltextUrls`	Array of Strings	Additional URLs where the full text is available
`fieldOfStudy`	String or null	Primary field of study
`citationCount`	Number or null	Number of citations
`language`	String or null	Language code of the paper
`documentType`	String or null	Type of document (e.g., research-article, thesis)
`coreUrl`	String	URL to the paper's page on core.ac.uk
`extractedAt`	String	ISO 8601 timestamp of when the data was extracted
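For typed downstream code, the 16 fields above can be mirrored in a `TypedDict`. This is a sketch based only on the table: the class name `CorePaper` and the helper `describe` are illustrative, and mapping "Number" to `int` is an assumption.

```python
from typing import List, Optional, TypedDict


class CorePaper(TypedDict):
    coreId: int
    doi: Optional[str]
    title: str
    authors: List[str]
    abstract: Optional[str]
    yearPublished: Optional[int]
    publisher: Optional[str]
    journalName: Optional[str]
    downloadUrl: Optional[str]
    sourceFulltextUrls: List[str]
    fieldOfStudy: Optional[str]
    citationCount: Optional[int]
    language: Optional[str]
    documentType: Optional[str]
    coreUrl: str
    extractedAt: str


def describe(paper: CorePaper) -> str:
    """One-line summary that tolerates the nullable fields."""
    year = paper["yearPublished"] if paper["yearPublished"] is not None else "n.d."
    doi = paper["doi"] or "no DOI"
    return f"{paper['title']} ({year}), {doi}"
```

Marking the nullable fields as `Optional` makes a type checker flag code that assumes, for example, that every record has a DOI or a download link.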

Output example

{
    "coreId": 287146253,
    "doi": "10.1038/s41586-023-06221-2",
    "title": "Scaling language models: Methods, analysis & insights from training Gopher",
    "authors": [
        "Jack W. Rae",
        "Sebastian Borgeaud",
        "Trevor Cai",
        "Katie Millican",
        "Jordan Hoffmann"
    ],
    "abstract": "Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge. This paper presents an analysis of Transformer-based language model performance across a wide range of model scales...",
    "yearPublished": 2022,
    "publisher": "Nature Publishing Group",
    "journalName": "Nature",
    "downloadUrl": "https://core.ac.uk/download/287146253.pdf",
    "sourceFulltextUrls": [
        "https://arxiv.org/pdf/2112.11446"
    ],
    "fieldOfStudy": "Computer Science",
    "citationCount": 1542,
    "language": "en",
    "documentType": "research-article",
    "coreUrl": "https://core.ac.uk/works/287146253",
    "extractedAt": "2026-02-10T14:30:00.000Z"
}

How to use CORE Open Access Paper Search

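Step 1: Get your free API key

Register at https://core.ac.uk/services/api and copy the key from your CORE dashboard, as described under "How to get a free CORE API key" above.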

Step 2: Configure your search

Enter your API key, search query, and any optional filters. You can test your configuration first by leaving the API key blank -- the actor will perform a dry run and confirm your query settings without making any API calls.

Step 3: Run the actor

Click "Start" in the Apify console, or trigger the run programmatically via the API. The actor will search CORE, paginate through the matching results until it reaches maxResults (or runs out of matches), and push structured paper records to the output dataset.

Step 4: Export your results

Download the dataset in JSON, CSV, Excel, XML, or RSS format. You can also connect integrations to automatically forward results to Google Sheets, Slack, Zapier, Make, or your own webhook endpoint.

How much does it cost to run?

CORE Open Access Paper Search is extremely lightweight. It makes HTTP API calls to the CORE v3 endpoint without any browser rendering, so compute costs are minimal.

Scenario	Papers	Approximate run time	Estimated Apify cost
Quick test	10	5-10 seconds	~$0.001
Standard run	50	10-30 seconds	~$0.002
Medium batch	200	30-60 seconds	~$0.005
Maximum run	500	1-2 minutes	~$0.01

Memory usage: 256 MB RAM
CORE API key: Free to register with generous daily request limits
No browser required: Pure API calls keep costs extremely low
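The table above estimates platform compute cost. Separately, the listing quotes a price "from $2.00 / 1,000 papers fetched". How the two combine on an invoice is not stated here, so the sketch below only works out the per-result figure, assuming the quoted starting rate applies linearly; `estimated_result_charge` is a hypothetical helper.

```python
PRICE_PER_1000_PAPERS_USD = 2.00  # "from" price quoted on the listing (assumed linear)


def estimated_result_charge(papers_fetched: int,
                            price_per_1000: float = PRICE_PER_1000_PAPERS_USD) -> float:
    """Per-result charge in USD for a given number of fetched papers."""
    return round(papers_fetched * price_per_1000 / 1000, 4)


for n in (10, 50, 200, 500):
    print(n, estimated_result_charge(n))
# 10 0.02
# 50 0.1
# 200 0.4
# 500 1.0
```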

Programmatic access

You can trigger this actor and retrieve results programmatically using the Apify API or the official client libraries.

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_APIFY_API_TOKEN")

# Actor input; see "Input parameters" for the available fields
run_input = {
    "apiKey": "YOUR_CORE_API_KEY",
    "query": "transformer neural networks",
    "yearFrom": 2023,
    "yearTo": 2025,
    "language": "en",
    "fullTextOnly": True,
    "maxResults": 100,
}

# Start the actor and wait for the run to finish
run = client.actor("Jh4Y6VfuSZkxkF8eq").call(run_input=run_input)

# Iterate over the paper records in the run's default dataset
for paper in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{paper['title']} ({paper['yearPublished']})")
    print(f"  DOI: {paper['doi']}")  # may be None when no DOI is available
    print(f"  Authors: {', '.join(paper['authors'])}")
    print(f"  Download: {paper['downloadUrl']}")
    print()

JavaScript

import { ApifyClient } from "apify-client";

// Authenticate with your Apify API token
const client = new ApifyClient({ token: "YOUR_APIFY_API_TOKEN" });

// Start the actor with the given input and wait for the run to finish
const run = await client.actor("Jh4Y6VfuSZkxkF8eq").call({
    apiKey: "YOUR_CORE_API_KEY",
    query: "renewable energy storage",
    yearFrom: 2022,
    yearTo: 2025,
    fullTextOnly: true,
    maxResults: 50,
});

// Read the paper records from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();

for (const paper of items) {
    console.log(`${paper.title} (${paper.yearPublished})`);
    console.log(`  DOI: ${paper.doi}`); // may be null when no DOI is available
    console.log(`  Authors: ${paper.authors.join(", ")}`);
    console.log(`  Download: ${paper.downloadUrl}`);
}

cURL

curl -X POST "https://api.apify.com/v2/acts/Jh4Y6VfuSZkxkF8eq/runs?token=YOUR_APIFY_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "apiKey": "YOUR_CORE_API_KEY",
    "query": "CRISPR gene editing",
    "yearFrom": 2020,
    "fullTextOnly": true,
    "maxResults": 50
  }'

Tips for best results

Use specific search terms. Broad queries like "science" or "biology" will match millions of records. Use precise phrases, combine multiple keywords, or use Boolean operators (AND, OR, NOT) directly in the query field for more targeted results.
Combine year filters with keywords. If you are tracking recent developments in a field, set yearFrom to the current year or the last few years. This dramatically narrows the result set and improves relevance.
Enable the full-text filter when you need PDFs. If your workflow involves downloading and reading actual papers, set fullTextOnly to true. This ensures every result in your output has a working downloadUrl pointing to the full-text PDF.
Use language filtering for non-English research. CORE indexes papers in dozens of languages. Use the language filter with ISO 639-1 codes (e.g., "de" for German, "fr" for French, "zh" for Chinese, "es" for Spanish) to find research that may be underrepresented in English-centric databases.
Test with a small maxResults first. Start with 10-20 results to verify your query returns relevant papers before scaling up to 500. This saves time and lets you iterate on your search terms quickly.
Schedule regular runs. Set up a recurring schedule on Apify to monitor new publications matching your query on a daily or weekly basis. Combine with Slack or email integrations to get notified when new papers are found.
Use Boolean operators in queries. The CORE API supports AND, OR, and NOT operators directly in the query string. For example: "deep learning" AND "medical imaging" NOT survey will find deep learning papers about medical imaging while excluding survey papers.
Leverage the dry-run mode. Before entering your API key, run the actor without one to confirm that your query and filter settings are configured correctly. The dry-run output will show you the exact query that would be sent to CORE.
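Boolean queries like the one in the tips above can be assembled from term lists rather than typed by hand. The helper below is a hypothetical convenience (`build_query` is not part of the actor); it only uses the AND and NOT operators mentioned in this document and quotes multi-word phrases.

```python
from typing import Iterable


def quote(term: str) -> str:
    """Wrap multi-word phrases in double quotes; leave single words bare."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_query(all_of: Iterable[str] = (), none_of: Iterable[str] = ()) -> str:
    """Join required terms with AND and append excluded terms with NOT."""
    query = " AND ".join(quote(t) for t in all_of)
    for term in none_of:
        query += f" NOT {quote(term)}"
    return query.strip()


print(build_query(all_of=["deep learning", "medical imaging"], none_of=["survey"]))
# "deep learning" AND "medical imaging" NOT survey
```

The returned string can be passed as the `query` input field.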

FAQ

Do I need a CORE API key to use this actor?

Yes, a CORE API key is required for live searches. Without one, the actor performs a dry run and returns a message explaining how to register. The key is completely free -- register at core.ac.uk/services/api and you will receive your key immediately.

What is CORE and how is it different from Google Scholar?

CORE (COnnecting REpositories) is the world's largest aggregator of open access research papers, harvesting content from over 10,000 data providers worldwide. It indexes more than 300 million metadata records and over 40 million full-text papers. Unlike Google Scholar, CORE focuses exclusively on open access content, and for the records that include full text the paper is freely available to read and download. CORE also provides a structured API, making it ideal for programmatic access and bulk data retrieval.

Can I download the full PDF of papers?

Many papers in CORE have direct PDF download links. When you enable the fullTextOnly filter, the actor only returns papers that have a confirmed downloadable full-text URL. The downloadUrl field in the output contains the direct link to the PDF file. Additionally, the sourceFulltextUrls array may contain alternative download locations from the original repository or publisher.

How many papers can I retrieve per run?

The actor supports up to 500 papers per run. For larger datasets, you can run the actor multiple times with different queries, year ranges, or language filters, and merge the results using Apify's dataset management features or your own downstream processing pipeline.
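One way to go beyond 500 papers, following the approach above, is to split the year range into smaller windows, run the actor once per window, and merge the batches by `coreId`. The helpers below are a hypothetical sketch of the splitting and merging steps only; starting the runs is left to the client code shown under "Programmatic access".

```python
from typing import Dict, Iterable, List, Tuple


def year_windows(year_from: int, year_to: int, span: int = 1) -> List[Tuple[int, int]]:
    """Split an inclusive year range into consecutive inclusive windows."""
    if span < 1 or year_from > year_to:
        raise ValueError("need span >= 1 and year_from <= year_to")
    return [(start, min(start + span - 1, year_to))
            for start in range(year_from, year_to + 1, span)]


def merge_unique(batches: Iterable[Iterable[Dict]]) -> List[Dict]:
    """Concatenate batches, keeping the first record seen for each coreId."""
    seen = set()
    merged: List[Dict] = []
    for batch in batches:
        for paper in batch:
            if paper["coreId"] not in seen:
                seen.add(paper["coreId"])
                merged.append(paper)
    return merged


print(year_windows(2020, 2025, span=2))
# [(2020, 2021), (2022, 2023), (2024, 2025)]
```

Each `(yearFrom, yearTo)` pair becomes the year filter for one run. If a single window still hits the 500-paper cap, a narrower window or an added language filter would be needed.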

What fields can I use for filtering?

You can filter by keyword query (which searches across titles, abstracts, and full text), publication year range (yearFrom and yearTo), and language code. The CORE API also supports advanced query syntax -- you can use Boolean operators (AND, OR, NOT) directly in the search query field for more precise control over your results.

What happens if a search returns zero results?

If your query has no matches, the actor will complete successfully and produce an empty dataset. Try broadening your search terms, removing year or language filters, or disabling the full-text filter to increase the number of matches.

How often is the CORE index updated?

CORE continuously harvests new content from its data providers. New papers are typically indexed within days of being deposited in a participating repository. Scheduling this actor to run regularly will help you capture newly indexed papers as they appear.

What languages are supported?

CORE indexes papers in dozens of languages. Use standard ISO 639-1 language codes in the language field: "en" (English), "de" (German), "fr" (French), "es" (Spanish), "pt" (Portuguese), "zh" (Chinese), "ja" (Japanese), "ko" (Korean), "ru" (Russian), "it" (Italian), "nl" (Dutch), "pl" (Polish), and many more.

Use cases

Systematic literature reviews

Researchers can use this actor to build comprehensive literature review datasets. Search by topic keywords, filter to a specific year range, and export the results to a spreadsheet for screening and annotation. The structured output with DOIs and download links makes it easy to locate and retrieve the full papers.
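Apify can export the dataset to CSV directly. When a screening sheet needs only a few columns, or the author list flattened into one cell, a small script can do the same. The column choice and the name `papers_to_csv` below are illustrative assumptions.

```python
import csv
import io
from typing import Dict, Iterable

# Columns chosen for a screening sheet; adjust as needed.
COLUMNS = ["title", "authors", "yearPublished", "doi", "journalName", "downloadUrl"]


def papers_to_csv(papers: Iterable[Dict]) -> str:
    """Render paper records as CSV text with authors joined into one cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for paper in papers:
        row = dict(paper)
        row["authors"] = "; ".join(paper.get("authors") or [])
        writer.writerow(row)  # None values are written as empty cells
    return buffer.getvalue()


sample = [{"coreId": 1, "title": "A, B", "authors": ["X Y", "Z"],
           "yearPublished": 2022, "doi": None, "journalName": "J",
           "downloadUrl": None}]
print(papers_to_csv(sample))
```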

Research monitoring and alerting

Schedule the actor to run daily or weekly with your research topic as the query. Connect a Slack or email integration to get notified whenever new open access papers matching your interests are published. This is particularly useful for staying current in fast-moving fields.
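Because a scheduled run may return papers already seen in earlier runs, an alerting script would typically keep a record of known `coreId` values and report only the new ones. The following is a minimal sketch under that assumption; the file-based storage and the function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List, Set


def load_seen(path: Path) -> Set[int]:
    """Load previously seen coreIds, or an empty set on the first run."""
    return set(json.loads(path.read_text())) if path.exists() else set()


def save_seen(path: Path, seen: Set[int]) -> None:
    """Persist the seen coreIds for the next scheduled run."""
    path.write_text(json.dumps(sorted(seen)))


def filter_new(papers: Iterable[Dict], seen: Set[int]) -> List[Dict]:
    """Return papers whose coreId is not yet in seen, and record them."""
    fresh: List[Dict] = []
    for paper in papers:
        if paper["coreId"] not in seen:
            seen.add(paper["coreId"])
            fresh.append(paper)
    return fresh
```

A run would call `load_seen`, pass the dataset items through `filter_new`, send notifications for the returned papers, and then call `save_seen`.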

Academic dataset construction

Build structured datasets of academic papers for bibliometric analysis, scientometric research, or training machine learning models. The 16 output fields provide rich metadata including citation counts, fields of study, and document types that are valuable for quantitative research analysis.

Competitive intelligence in research

Track what competitors, collaborators, or specific institutions are publishing by combining author names or institution keywords in your search queries. Monitor publication trends in your field to identify emerging topics and key contributors.

Open access compliance monitoring

Universities and research funders can use this actor to verify that funded research is being deposited in open access repositories. Search by grant keywords or author names and check the availability of full-text PDFs.
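A compliance check along these lines could summarise how many returned records carry a full-text link. The sketch below assumes the run was made with `fullTextOnly` set to false (otherwise every record has a link) and treats a non-empty `downloadUrl` as the indicator; both function names are hypothetical.

```python
from typing import Dict, Iterable, List


def full_text_share(papers: Iterable[Dict]) -> float:
    """Fraction of records that have a non-empty downloadUrl."""
    papers = list(papers)
    if not papers:
        return 0.0
    with_pdf = sum(1 for p in papers if p.get("downloadUrl"))
    return with_pdf / len(papers)


def missing_full_text(papers: Iterable[Dict]) -> List[str]:
    """Titles of records without a downloadUrl, for manual follow-up."""
    return [p["title"] for p in papers if not p.get("downloadUrl")]


records = [
    {"title": "Paper A", "downloadUrl": "https://core.ac.uk/download/1.pdf"},
    {"title": "Paper B", "downloadUrl": None},
    {"title": "Paper C", "downloadUrl": "https://core.ac.uk/download/3.pdf"},
    {"title": "Paper D", "downloadUrl": None},
]
print(full_text_share(records))    # 0.5
print(missing_full_text(records))  # ['Paper B', 'Paper D']
```

The records here are made-up placeholders, not real CORE entries.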

Content curation and knowledge management

Build curated collections of open access papers for educational resources, reading lists, or internal knowledge bases. The structured metadata makes it easy to organize and categorize papers by field of study, year, or document type.

Integrations

This actor works seamlessly with the Apify platform's integration ecosystem:

Google Sheets -- Automatically export paper metadata to a spreadsheet for collaborative review and analysis.
Slack -- Get real-time notifications when new papers matching your query are found during scheduled runs.
Email -- Receive email digests of newly discovered papers on a recurring schedule.
Zapier / Make -- Trigger downstream workflows whenever new academic papers are collected.
Webhooks -- Push results to your own API endpoint for custom processing and storage.
Amazon S3 -- Store datasets in your own S3 bucket for long-term archival and analysis.
Google Drive -- Save output files directly to Google Drive for team access.
GitHub -- Use the Apify API in CI/CD pipelines or research automation scripts.

Related actors

If you are working with academic research data, these related Apify actors may be useful for your workflow:

Actor	Description
Semantic Scholar Paper Search	Search Semantic Scholar for AI-powered academic paper discovery with citation graphs and influence scores.
OpenAlex Research Paper Search	Search the OpenAlex database for academic works, authors, institutions, and research topics.
PubMed Biomedical Literature Search	Search PubMed and MEDLINE for biomedical and life science research papers with MeSH term filtering.
Crossref Academic Paper Search	Search Crossref for scholarly metadata across all academic disciplines with DOI resolution.
ArXiv Preprint Paper Search	Search ArXiv for preprint papers in physics, mathematics, computer science, and quantitative biology.
Europe PMC Literature Search	Search Europe PMC for life science literature, patents, and clinical guidelines.
DBLP Publication Search	Search DBLP for computer science publications, conference proceedings, and journal articles.
ORCID Researcher Search	Look up researchers by ORCID ID to find their publication history and affiliations.
