# Europe PMC Literature Scraper (`parseforge/europepmc-scraper`) Actor

Scrape Europe PMC for biomedical research papers. Search by title, author, MeSH terms, journal. Get DOI, abstract, full-text URLs, citations, references, open-access status. No API key required.

- **URL**: https://apify.com/parseforge/europepmc-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Education, Business, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $27.60 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage, only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🧬 Europe PMC Literature Scraper

> 🚀 **Export the biomedical literature index in seconds.** Search **40+ million records** across PubMed, PubMed Central, life-science preprints, agricultural literature, and patents. Filter by title, author, MeSH term, DOI, journal, open access, or free-text. No API key, no registration.

> 🕒 **Last updated:** 2026-05-13 · **📊 40+ fields** per record · **🧬 40M+ biomedical records** · **📚 PubMed + PMC + preprints + patents**

The **Europe PMC Literature Scraper** wraps the official Europe PMC REST API (`ebi.ac.uk/europepmc/webservices/rest/search`) and returns one row per article with **40+ fields**, including DOI, PMID, PMCID, abstract, full-text URLs, MeSH terms, keywords, journal, citation count, open-access status, and licensing. The underlying corpus is published by **Europe PMC**, the European mirror of PubMed Central, maintained by EMBL-EBI and funded by 32 life-science research funders worldwide.

The index covers **MEDLINE/PubMed**, **PubMed Central (full text)**, **Agricola** (USDA agricultural literature), **bioRxiv** and **medRxiv** preprints, **CTX patents**, and Europe PMC-curated content. Free-text and field-qualified queries (TITLE, AUTH, MESH, DOI, PMID, AFFILIATION, JOURNAL) compose freely with boolean operators. This Actor returns structured records ready to download as CSV, Excel, JSON, or XML.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Biomedical researchers, systematic-review teams, bibliometrics analysts, pharma intelligence, scientific publishers, science journalists, OA advocacy, ML training pipelines | Literature reviews, MeSH-term mining, author publication tracking, journal impact studies, drug-target evidence harvesting, training-set assembly |

---

### 📋 What the Europe PMC Scraper does

One programmable interface to the full Europe PMC search service:

- 🔍 **Field-qualified queries.** `TITLE:`, `AUTH:`, `AFFILIATION:`, `JOURNAL:`, `MESH:`, `DOI:`, `PMID:`, `PMCID:`, `OPEN_ACCESS:`, plus boolean operators (`AND`, `OR`, `NOT`) and quoted phrases.
- 📚 **Three response shapes.** `core` returns the full record with abstract, full-text URLs, and metadata. `lite` returns compact fields. `idlist` returns IDs only for ultra-fast scans.
- ⏱️ **Sort options.** Relevance (default), newest first, oldest first, or most cited.
- 🔁 **Cursor-mark pagination.** Fully automatic. Walks the entire result set efficiently for large queries.

Output captures the publication metadata (PMID, PMCID, DOI, source, journal title, ISSN, volume, issue, page info, publication year and date), full author list, abstract text, affiliation, language, publication types, MeSH headings, keywords, grant count, citation count, full-text URLs, license, open-access flag, and indexing dates.

> 💡 **Why it matters:** Europe PMC is the deepest open-access biomedical literature index in the world. The web UI is great for one-off lookups, but systematic reviews, bibliometric studies, and ML training-set assembly need flat rows. This Actor turns the search service into a downloadable dataset in one run.

---

### 🎬 Full Demo

_🚧 Coming soon: a 3-minute walkthrough showing how to build a MeSH query and export a literature review dataset._

---

### ⚙️ Input

<table>
<thead>
<tr><th>Input</th><th>Type</th><th>Default</th><th>Behavior</th></tr>
</thead>
<tbody>
<tr><td><code>query</code></td><td>string</td><td><code>"cancer immunotherapy"</code></td><td>Europe PMC query string. Supports field qualifiers and booleans.</td></tr>
<tr><td><code>maxItems</code></td><td>integer</td><td><code>10</code></td><td>Records to return. Free plan caps at 10, paid plan at 1,000,000.</td></tr>
<tr><td><code>resultType</code></td><td>enum</td><td><code>"core"</code></td><td><code>core</code> = full record, <code>lite</code> = compact, <code>idlist</code> = IDs only.</td></tr>
<tr><td><code>sort</code></td><td>enum</td><td><code>""</code></td><td>Empty = relevance, <code>date_desc</code> = newest, <code>date_asc</code> = oldest, <code>cited</code> = most cited.</td></tr>
</tbody>
</table>

**Example: 50 most-cited CRISPR papers.**

```json
{
    "query": "CRISPR",
    "sort": "cited",
    "resultType": "core",
    "maxItems": 50
}
````

**Example: up to 100 open-access papers mentioning mRNA by author "Doudna J", newest first.**

```json
{
    "query": "AUTH:\"Doudna J\" AND mRNA AND OPEN_ACCESS:Y",
    "sort": "date_desc",
    "resultType": "core",
    "maxItems": 100
}
```

> ⚠️ **Good to Know:** field qualifiers are case-sensitive (use `AUTH`, not `auth`). Quoted phrases preserve word order (`"breast cancer"`). The `MESH:` qualifier matches MeSH headings exactly. Use the Europe PMC search UI at europepmc.org to prototype complex queries before plugging them in.
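
The qualifier rules above can be captured in a few small helpers. This is a minimal sketch, not part of the Actor: `field`, `all_of`, and `any_of` are hypothetical names, and the only behaviour assumed is what this README states (uppercase qualifiers, quoted phrases, `AND` / `OR`).

```python
def field(qualifier: str, value: str) -> str:
    """Format one field-qualified term, e.g. AUTH:"Doudna J"."""
    if '"' in value:
        # Escaping rules for embedded quotes are not documented here, so refuse them.
        raise ValueError("value must not contain double quotes")
    # Multi-word values are quoted so that word order is preserved.
    term = f'"{value}"' if any(ch.isspace() for ch in value) else value
    # Qualifiers are written in uppercase (AUTH, not auth).
    return f"{qualifier.upper()}:{term}"


def all_of(*terms: str) -> str:
    """Join terms with AND."""
    return " AND ".join(terms)


def any_of(*terms: str) -> str:
    """Join terms with OR, parenthesised so the group composes safely with AND."""
    return "(" + " OR ".join(terms) + ")"


query = all_of(field("auth", "Doudna J"), any_of("mRNA", "vaccine"), field("open_access", "Y"))
```

The resulting `query` string can be passed as the `query` input field.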

***

### 📊 Output

Each record contains **40+ fields**. Download the dataset as CSV, Excel, JSON, or XML.

#### 🧾 Schema (selected fields)

| Field | Type | Example |
|---|---|---|
| 📛 `title` | string | `"RPA Combined With CRISPR/Cas12a for Rapid... MRSA Detection"` |
| 🆔 `id` | string | `"42002396"` |
| 🏷️ `source` | string \| null | `"MED"` |
| 🔗 `url` | string | `"https://europepmc.org/article/MED/42002396"` |
| 🆔 `pmid` | string | `"42002396"` |
| 🆔 `pmcid` | string \| null | `"PMC13092367"` |
| 🆔 `doi` | string \| null | `"10.1002/jmr.70035"` |
| 👤 `authorString` | string \| null | `"Chen L, Luo J, Zhang H, Zhao P."` |
| 👥 `authorList` | string\[] | `["Chen L", "Luo J", "Zhang H", "Zhao P"]` |
| 📓 `journalTitle` | string \| null | `"Journal of molecular recognition : JMR"` |
| 🆔 `journalIssn` | string \| null | `"0952-3499"` |
| 📅 `pubYear` | string \| null | `"2026"` |
| 📅 `pubDate` | ISO 8601 \| null | `"2026-05-01"` |
| 📝 `abstractText` | string \| null | `"The increasing issue of infections caused by..."` |
| 🏢 `affiliation` | string \| null | `"Department of Laboratory Medicine, Yuebei People's Hospital..."` |
| 🌐 `language` | string \| null | `"eng"` |
| 🏷️ `publicationTypes` | string\[] | `["research-article", "Journal Article"]` |
| 🧬 `meshTerms` | string\[] | `["CRISPR-Cas Systems", "Methicillin-Resistant Staphylococcus aureus", ...]` |
| 🏷️ `keywords` | string\[] | `["Detection", "RPA", "MRSA", "Crispr/cas12a"]` |
| 💰 `grantsCount` | integer \| null | `8` |
| 📊 `citedByCount` | integer \| null | `0` |
| 📄 `hasPDF` | string \| null | `"Y"` |
| 🔓 `isOpenAccess` | string \| null | `"Y"` |
| 📜 `license` | string \| null | `"cc by"` |
| 🔗 `fullTextUrls` | string\[] | `["https://doi.org/10.1002/jmr.70035", "https://europepmc.org/articles/PMC13092367", ...]` |
| 📅 `firstIndexDate` | ISO 8601 \| null | `"2026-04-20"` |
| 📅 `firstPublicationDate` | ISO 8601 \| null | `"2026-05-01"` |
| 🕒 `scrapedAt` | ISO 8601 | `"2026-05-12T21:31:34.280Z"` |

Additional fields when present: `journalVolume`, `issue`, `pageInfo`, `publicationStatus`, `hasBook`, `hasSuppl`, `hasReferences`, `hasTextMinedTerms`, `hasDbCrossReferences`, `inEPMC`, `inPMC`, `dateOfRevision`.
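
Several fields arrive as strings (`pubYear`, and the `"Y"` flags `hasPDF` and `isOpenAccess`), which is awkward for filtering in pandas or SQL. The sketch below shows one way to coerce them after download. `normalize_record` is a hypothetical helper, and it assumes that any flag value other than `"Y"` (for example `"N"`) means false; check that against your own dataset.

```python
from typing import Any, Optional


def yn_to_bool(value: Optional[str]) -> Optional[bool]:
    """Map "Y" to True, any other string to False, and keep None as None."""
    if value is None:
        return None
    return value.strip().upper() == "Y"


def normalize_record(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of one dataset row with typed flags and an integer year."""
    out = dict(record)
    for key in ("hasPDF", "isOpenAccess"):
        out[key] = yn_to_bool(record.get(key))
    year = record.get("pubYear")
    out["pubYear"] = int(year) if year else None
    return out
```

Only the two flags shown in the schema table are converted; the optional `has*` fields are left untouched because their value format is not documented here.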

#### 📦 Sample record

<details>
<summary><strong>🧬 Open-access biomedical research article with full metadata</strong></summary>

```json
{
    "title": "RPA Combined With CRISPR/Cas12a for Rapid and Ultrasensitive Detection Dual-Gene of Methicillin-Resistant Staphylococcus aureus (MRSA).",
    "id": "42002396",
    "source": "MED",
    "url": "https://europepmc.org/article/MED/42002396",
    "pmid": "42002396",
    "pmcid": "PMC13092367",
    "doi": "10.1002/jmr.70035",
    "authorString": "Chen L, Luo J, Zhang H, Zhao P.",
    "authorList": ["Chen L", "Luo J", "Zhang H", "Zhao P"],
    "journalTitle": "Journal of molecular recognition : JMR",
    "journalIssn": "0952-3499",
    "journalVolume": "39",
    "issue": "3",
    "pubYear": "2026",
    "pubDate": "2026-05-01",
    "abstractText": "The increasing issue of infections caused by methicillin-resistant Staphylococcus aureus (MRSA) necessitates rapid and reliable diagnostic methods...",
    "affiliation": "Department of Laboratory Medicine, Yuebei People's Hospital Affiliated to Shantou University Medical College, Shaoguan, China.",
    "language": "eng",
    "publicationTypes": ["research-article", "Journal Article"],
    "meshTerms": ["Humans", "Staphylococcal Infections", "Endodeoxyribonucleases", "Penicillin-Binding Proteins", "CRISPR-Cas Systems", "CRISPR-Associated Proteins"],
    "keywords": ["Detection", "RPA", "MRSA", "Crispr/cas12a"],
    "grantsCount": 8,
    "citedByCount": 0,
    "hasPDF": "Y",
    "isOpenAccess": "Y",
    "license": "cc by",
    "fullTextUrls": [
        "https://doi.org/10.1002/jmr.70035",
        "https://europepmc.org/articles/PMC13092367",
        "https://europepmc.org/articles/PMC13092367?pdf=render"
    ],
    "firstIndexDate": "2026-04-20",
    "firstPublicationDate": "2026-05-01",
    "scrapedAt": "2026-05-12T21:31:34.280Z"
}
```

</details>

***

### ✨ Why choose this Actor

| | Capability |
|---|---|
| 🧬 | **40M+ records.** PubMed, PMC, preprints, patents, Agricola. The full Europe PMC index. |
| 🔍 | **Field-qualified queries.** TITLE, AUTH, MESH, DOI, PMID, AFFILIATION, JOURNAL, OPEN\_ACCESS, plus booleans. |
| 📚 | **3 response shapes.** Full record (`core`), compact (`lite`), or IDs only (`idlist`). |
| 📊 | **40+ fields per record.** DOI, abstract, full-text URLs, MeSH, keywords, citation count, license. |
| 🔓 | **Open-access aware.** `isOpenAccess`, `license`, and full-text URLs surfaced per record. |
| ⏱️ | **Cursor-mark pagination.** Efficient walk across the entire result set, automatic. |
| ⚡ | **Fast.** 10 articles in under 3 seconds, 10,000 records in under a minute. |
| 🚫 | **No authentication.** Europe PMC publishes under open licenses. No API key needed. |

> 📊 Europe PMC is the canonical European mirror of biomedical literature. The REST API is the source, this Actor turns it into rows.

***

### 📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| **⭐ Europe PMC Scraper** *(this Actor)* | $5 free credit, then pay-per-use | **40M+ records** | **Live per run** | field qualifiers, booleans, open-access, sort | ⚡ 2 min |
| Manual export from europepmc.org | Free | 1,000 records per export | On demand | UI filters | 🐢 Slow, no automation |
| PubMed `eutils` (NCBI) | Free | PubMed only, rate-limited | Live | Custom XML | 🛠️ Hours of engineering |
| Commercial bibliographic databases | $$$$ | Curated | Vendor cadence | Vendor-specific | ⏳ Days |

Pick this Actor when you want a programmable interface to the full Europe PMC index with consistent flat-row output.

***

### 🚀 How to use

1. 📝 **Sign up.** [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp) (takes 2 minutes).
2. 🌐 **Open the Actor.** Go to the Europe PMC Literature Scraper page on the Apify Store.
3. 🔍 **Build a query.** Free-text or use field qualifiers (`AUTH:"Doudna J" AND CRISPR`).
4. 📚 **Pick a response shape.** `core` for full metadata, `lite` for compact, `idlist` for IDs only.
5. 🚀 **Run it.** Click **Start** and let the Actor collect your data.
6. 📥 **Download.** Grab your results in the **Dataset** tab as CSV, Excel, JSON, or XML.

> ⏱️ Total time from signup to downloaded dataset: **3-5 minutes.** No coding required.

***

### 💼 Business use cases

<table>
<tr>
<td width="50%" valign="top">

#### 💊 Pharma & Biotech Intelligence

- Drug-target evidence harvesting
- Competitor pipeline literature monitoring
- Clinical-trial-adjacent reference sets
- KOL (key opinion leader) discovery via affiliation

</td>
<td width="50%" valign="top">

#### 📚 Systematic Reviews

- Inclusion / exclusion screening pools
- MeSH-term traversal for review protocols
- Cited-by tracking for snowball sampling
- Open-access full-text URL collection

</td>
</tr>
<tr>
<td width="50%" valign="top">

#### 📊 Bibliometrics & Research Ops

- Citation networks for impact studies
- Journal-level publication trends
- Institution affiliation analysis
- Funding-source landscape reports

</td>
<td width="50%" valign="top">

#### 🤖 ML & NLP Training

- Biomedical NER training corpora
- Abstract-summarization training sets
- MeSH-classification benchmark assembly
- Author-disambiguation training data

</td>
</tr>
</table>

***

### 🔌 Automating Europe PMC Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

- 🟢 **Node.js.** Install the `apify-client` NPM package.
- 🐍 **Python.** Use the `apify-client` PyPI package.
- 📚 See the [Apify API documentation](https://docs.apify.com/api/v2) for full details.

The [Apify Schedules feature](https://docs.apify.com/platform/schedules) lets you trigger this Actor on any cron interval. Daily literature alerts on a saved query keep your research front-of-mind.
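
For scheduled runs it usually helps to restrict each run to a recent date window so that the same papers are not fetched every day. The sketch below builds such a query with the `FIRST_PDATE:` qualifier listed in the FAQ. The range syntax `[YYYY-MM-DD TO YYYY-MM-DD]` is an assumption taken from the Europe PMC search syntax, and `with_date_window` / `last_n_days` are hypothetical helper names.

```python
from datetime import date, timedelta


def with_date_window(query: str, start: date, end: date) -> str:
    """Restrict a query to records first published between start and end (inclusive)."""
    if start > end:
        raise ValueError("start must not be after end")
    # The original query is parenthesised so its own OR clauses are not affected.
    return f"({query}) AND FIRST_PDATE:[{start.isoformat()} TO {end.isoformat()}]"


def last_n_days(query: str, days: int, today: date) -> str:
    """Convenience wrapper for a rolling window that ends today."""
    return with_date_window(query, today - timedelta(days=days), today)
```

A daily schedule could pass `last_n_days("CRISPR", 1, date.today())` as the `query` input.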

***

### 🌟 Beyond business use cases

The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Reproducible literature-search appendices for papers
- Open-data assignments for bibliometrics coursework
- Cross-disciplinary citation network studies
- Funding-impact evaluations for grant reports

</td>
<td width="50%">

#### 🎨 Personal and creative

- Side projects on biomedical knowledge graphs
- Science-communication content backed by real papers
- Reading-list builders for graduate students
- Personal alerting on niche research topics

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Open-access advocacy benchmarks
- Public-interest journalism on drug research
- Patient-advocacy literature collections
- NGO reports on global-health publication trends

</td>
<td width="50%">

#### 🧪 Experimentation

- Train domain-specific LLMs on biomedical abstracts
- Validate RAG pipelines with real citations
- Prototype agents that answer literature questions
- Test recommender systems with citation graphs

</td>
</tr>
</table>

***

### 🤖 Ask an AI assistant about this scraper

Open a ready-to-send prompt about this ParseForge Actor in the AI of your choice:

- 💬 [**ChatGPT**](https://chat.openai.com/?q=How%20do%20I%20use%20the%20Europe%20PMC%20Literature%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🧠 [**Claude**](https://claude.ai/new?q=How%20do%20I%20use%20the%20Europe%20PMC%20Literature%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🔍 [**Perplexity**](https://perplexity.ai/search?q=How%20do%20I%20use%20the%20Europe%20PMC%20Literature%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🅒 [**Copilot**](https://copilot.microsoft.com/?q=How%20do%20I%20use%20the%20Europe%20PMC%20Literature%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)

***

### ❓ Frequently Asked Questions

#### 🧩 How does it work?

The Actor hits the official Europe PMC REST search endpoint (`ebi.ac.uk/europepmc/webservices/rest/search`), uses cursor-mark pagination to walk through every result, and returns one structured row per article. No HTML scraping, no captcha, no setup.
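
To illustrate what cursor-mark pagination looks like, here is a rough standard-library sketch. It is not the Actor's source code. The parameter names (`format`, `resultType`, `pageSize`, `cursorMark`), the `www.` host prefix, and the response keys (`resultList.result`, `nextCursorMark`) are assumptions based on the public Europe PMC REST documentation, and `fetch_page` / `iter_results` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Iterator

SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def fetch_page(query: str, cursor_mark: str, page_size: int, result_type: str) -> dict[str, Any]:
    """Fetch one page of search results as parsed JSON."""
    params = urllib.parse.urlencode({
        "query": query,
        "format": "json",
        "resultType": result_type,
        "pageSize": page_size,
        "cursorMark": cursor_mark,
    })
    with urllib.request.urlopen(f"{SEARCH_URL}?{params}", timeout=30) as response:
        return json.load(response)


def iter_results(
    query: str,
    max_items: int,
    page_size: int = 100,
    result_type: str = "core",
    fetch: Callable[[str, str, int, str], dict[str, Any]] = fetch_page,
) -> Iterator[dict[str, Any]]:
    """Yield up to max_items records, following nextCursorMark from page to page."""
    cursor_mark = "*"  # "*" requests the first page
    yielded = 0
    while yielded < max_items:
        page = fetch(query, cursor_mark, page_size, result_type)
        results = page.get("resultList", {}).get("result", [])
        if not results:
            return
        for record in results:
            yield record
            yielded += 1
            if yielded >= max_items:
                return
        next_mark = page.get("nextCursorMark")
        # A missing or unchanged cursor means the result set is exhausted.
        if not next_mark or next_mark == cursor_mark:
            return
        cursor_mark = next_mark
```

The `fetch` parameter exists so the loop can be exercised with a stub instead of live HTTP calls.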

#### 🔍 Which query qualifiers are supported?

`TITLE:`, `AUTH:`, `AFFILIATION:`, `JOURNAL:`, `MESH:`, `DOI:`, `PMID:`, `PMCID:`, `OPEN_ACCESS:`, `FIRST_PDATE:`, plus boolean operators `AND`, `OR`, `NOT` and quoted phrases. Mix freely. Example: `AUTH:"Doudna J" AND (mRNA OR vaccine) AND OPEN_ACCESS:Y`.

#### 📚 What does each result type return?

- `core` returns the full record with abstract, full-text URLs, MeSH terms, keywords, license, and all metadata.
- `lite` returns compact fields (title, authors, journal, DOI, year) for fast scans.
- `idlist` returns IDs only, useful for downstream batch queries against other services.

#### 📊 How big is the Europe PMC index?

Europe PMC indexes over 40 million biomedical literature records as of 2026, including all of MEDLINE/PubMed, full-text PubMed Central content, life-science preprints from bioRxiv and medRxiv, Agricola, and patents.

#### 🔓 How do I find only open-access papers?

Add `AND OPEN_ACCESS:Y` to your query. The output also includes per-record `isOpenAccess`, `license`, and `fullTextUrls` fields for fine-grained filtering.
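
As a sketch of that per-record filtering, the generator below keeps only open-access rows whose `license` is on an allow-list. `reusable_open_access` is a hypothetical helper; `"cc by"` is the only license string shown in this README, so any other values you add to the allow-list should be checked against your own dataset first.

```python
from typing import Any, Iterable, Iterator


def reusable_open_access(
    records: Iterable[dict[str, Any]],
    allowed_licenses: tuple[str, ...] = ("cc by",),
) -> Iterator[dict[str, Any]]:
    """Yield records flagged open access whose license is in allowed_licenses."""
    allowed = {name.lower() for name in allowed_licenses}
    for record in records:
        # license may be null in the dataset, so fall back to an empty string.
        license_name = (record.get("license") or "").strip().lower()
        if record.get("isOpenAccess") == "Y" and license_name in allowed:
            yield record
```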

#### ⏱️ How does sorting work?

`date_desc` sorts by first publication date descending (newest first). `date_asc` sorts ascending (oldest first). `cited` sorts by citation count descending. Empty `sort` uses Europe PMC relevance ranking.

#### ⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run this Actor on any cron interval. Daily alerts on a saved query keep your research current.

#### ⚖️ Is this data legal to use?

Europe PMC content is freely available for research and educational use. Full-text articles may be under various open licenses (CC BY, CC BY-NC, etc.). The `license` field in each record tells you which one. Review the specific license before commercial redistribution of full text.

#### 💼 Can I use this data commercially?

Metadata (title, abstract, authors, DOI) is generally freely usable. Full-text reuse depends on the per-article license. Open-access articles with CC BY are the safest for commercial use.

#### 💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit to 1,000,000 records.

#### 🔁 What happens if a run fails?

Apify automatically retries transient errors. If a run still fails, you can inspect the log in the Runs tab, fix the input (usually a malformed query), and re-run.
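
When running through the API it is worth checking the run status before reading the dataset. The helper below is a small sketch that assumes the run is a dict shaped like the one returned by `.call()` in the Python example of this document, with a `status` key whose success value is `"SUCCEEDED"`. `dataset_id_or_raise` is a hypothetical name.

```python
from typing import Any, Optional


def dataset_id_or_raise(run: Optional[dict[str, Any]]) -> str:
    """Return the default dataset ID of a finished run, or raise if it did not succeed."""
    if run is None:
        raise RuntimeError("Actor run did not return any run information")
    status = run.get("status")
    if status != "SUCCEEDED":
        # Typical causes are a malformed query or a timeout; the run log has details.
        raise RuntimeError(f"Run {run.get('id', '<unknown>')} ended with status {status!r}")
    return run["defaultDatasetId"]
```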

#### 🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.

***

### 🔌 Integrate with any app

Europe PMC Scraper connects to any cloud service via [Apify integrations](https://apify.com/integrations):

- [**Make**](https://docs.apify.com/platform/integrations/make) - Automate multi-step workflows
- [**Zapier**](https://docs.apify.com/platform/integrations/zapier) - Connect with 5,000+ apps
- [**Slack**](https://docs.apify.com/platform/integrations/slack) - Daily literature alerts in your channels
- [**Airbyte**](https://docs.apify.com/platform/integrations/airbyte) - Pipe abstracts into your warehouse
- [**GitHub**](https://docs.apify.com/platform/integrations/github) - Trigger runs from commits and releases
- [**Google Drive**](https://docs.apify.com/platform/integrations/drive) - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push fresh abstracts into your knowledge base, or alert your team in Slack on new papers.
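
A webhook receiver mostly needs to pull the dataset ID out of the incoming payload. The sketch below assumes the default Apify webhook payload shape, with an `eventType` such as `ACTOR.RUN.SUCCEEDED` and a `resource` object holding the run's `defaultDatasetId`. If you configure a custom payload template, adjust the keys accordingly. `dataset_id_from_webhook` is a hypothetical name.

```python
from typing import Any, Optional


def dataset_id_from_webhook(payload: dict[str, Any]) -> Optional[str]:
    """Return the dataset ID for a succeeded run, or None for any other event."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```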

***

### 🔗 Recommended Actors

- [**🤖 Hugging Face Model Scraper**](https://apify.com/parseforge/hugging-face-model-scraper) - ML model registry metadata
- [**🇪🇺 Eurostat Statistics Scraper**](https://apify.com/parseforge/destatis-genesis-scraper) - 7,500+ Eurostat datasets
- [**📊 ClinicalTrials.gov Scraper**](https://apify.com/parseforge/clinicaltrials-gov-scraper) - Clinical trial registry
- [**📚 Figshare Research Output Scraper**](https://apify.com/parseforge/figshare-scraper) - Open research datasets
- [**🔬 OSF Open Science Framework Scraper**](https://apify.com/parseforge/osf-scraper) - Open-science project metadata

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more reference-data scrapers.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) to request a new scraper, propose a custom data project, or report an issue.

***

> **⚠️ Disclaimer:** this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Europe PMC, EMBL-EBI, the European Bioinformatics Institute, the National Center for Biotechnology Information, or any of the 32 funders supporting Europe PMC. All trademarks mentioned are the property of their respective owners. Only publicly available open data from the official Europe PMC REST API is collected.

# Actor input Schema

## `query` (type: `string`):

Europe PMC query string. Supports field qualifiers (TITLE:, AUTH:, JOURNAL:, AFFILIATION:, MESH:, DOI:, PMID:, PMCID:), booleans (AND/OR/NOT), and quoted phrases. Examples: 'CRISPR', 'AUTH:"Doudna J"', 'mRNA vaccine AND (OPEN\_ACCESS:Y)'.

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000

## `resultType` (type: `string`):

lite = compact fields. core = full record with abstract, full-text URLs, etc. idlist = IDs only.

## `sort` (type: `string`):

Sort order. Leave blank for relevance.

## Actor input object example

```json
{
  "query": "cancer immunotherapy",
  "maxItems": 10,
  "resultType": "core",
  "sort": ""
}
```
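
If you assemble the input in code, a small validator can catch typos before a run is started. This is a sketch that mirrors the enums and the `maxItems` bounds listed in this document. `build_actor_input` is a hypothetical helper, not part of any Apify library.

```python
from typing import Any

RESULT_TYPES = {"core", "lite", "idlist"}
SORT_OPTIONS = {"", "date_desc", "date_asc", "cited"}


def build_actor_input(
    query: str,
    max_items: int = 10,
    result_type: str = "core",
    sort: str = "",
) -> dict[str, Any]:
    """Return an input dict for the Actor, rejecting values outside the documented ranges."""
    if not query.strip():
        raise ValueError("query must be a non-empty string")
    if not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")
    if result_type not in RESULT_TYPES:
        raise ValueError(f"resultType must be one of {sorted(RESULT_TYPES)}")
    if sort not in SORT_OPTIONS:
        raise ValueError(f"sort must be one of {sorted(SORT_OPTIONS)}")
    return {
        "query": query,
        "maxItems": max_items,
        "resultType": result_type,
        "sort": sort,
    }
```

The returned dict can be passed as `run_input` to the Python client or serialized to JSON for the CLI.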

# Actor output Schema

## `overview` (type: `string`):

Overview of scraped data

## `fullData` (type: `string`):

Complete dataset

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "query": "cancer immunotherapy",
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/europepmc-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "query": "cancer immunotherapy",
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/europepmc-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "query": "cancer immunotherapy",
  "maxItems": 10
}' |
apify call parseforge/europepmc-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/europepmc-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Europe PMC Literature Scraper",
        "description": "Scrape Europe PMC for biomedical research papers. Search by title, author, MeSH terms, journal. Get DOI, abstract, full-text URLs, citations, references, open-access status. No API key required.",
        "version": "0.0",
        "x-build-id": "GlZfkNTsh3LdPMWje"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~europepmc-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-europepmc-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~europepmc-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-europepmc-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~europepmc-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-europepmc-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "query"
                ],
                "properties": {
                    "query": {
                        "title": "Query",
                        "type": "string",
                        "description": "Europe PMC query string. Supports field qualifiers (TITLE:, AUTH:, JOURNAL:, AFFILIATION:, MESH:, DOI:, PMID:, PMCID:), booleans (AND/OR/NOT), and quoted phrases. Examples: 'CRISPR', 'AUTH:\"Doudna J\"', 'mRNA vaccine AND (OPEN_ACCESS:Y)'.",
                        "default": "cancer immunotherapy"
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: limited to 10 items (preview). Paid users: optional, up to 1,000,000 items."
                    },
                    "resultType": {
                        "title": "Result type",
                        "enum": [
                            "core",
                            "lite",
                            "idlist"
                        ],
                        "type": "string",
                        "description": "lite = compact fields. core = full record with abstract, full-text URLs, etc. idlist = IDs only.",
                        "default": "core"
                    },
                    "sort": {
                        "title": "Sort",
                        "enum": [
                            "",
                            "date_desc",
                            "date_asc",
                            "cited"
                        ],
                        "type": "string",
                        "description": "Sort order. Leave blank for relevance.",
                        "default": ""
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
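
The `inputSchema` above can be checked on the client side before a run is started. The following is a minimal sketch rather than part of the Actor: the helper name `build_input` and its keyword arguments are hypothetical, and rejecting a blank `query` is an extra assumption (the schema only requires a string). The enum values, defaults, and the `maxItems` bounds are copied from the schema.

```python
from typing import Any, Dict, Optional

# Constraints copied from the inputSchema in the OpenAPI definition.
RESULT_TYPES = {"core", "lite", "idlist"}
SORT_ORDERS = {"", "date_desc", "date_asc", "cited"}
MAX_ITEMS_LIMIT = 1_000_000


def build_input(
    query: str,
    max_items: Optional[int] = None,
    result_type: str = "core",
    sort: str = "",
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and return an Actor input dict."""
    # "query" is the only required property; treating blank strings as
    # invalid is an assumption of this sketch, not a schema rule.
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    if result_type not in RESULT_TYPES:
        raise ValueError(f"resultType must be one of {sorted(RESULT_TYPES)}")
    if sort not in SORT_ORDERS:
        raise ValueError(f"sort must be one of {sorted(SORT_ORDERS)}")

    payload: Dict[str, Any] = {
        "query": query,
        "resultType": result_type,
        "sort": sort,
    }

    # "maxItems" is optional; when given it must be an integer in [1, 1000000].
    if max_items is not None:
        if isinstance(max_items, bool) or not isinstance(max_items, int):
            raise ValueError("maxItems must be an integer")
        if not 1 <= max_items <= MAX_ITEMS_LIMIT:
            raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
        payload["maxItems"] = max_items

    return payload


if __name__ == "__main__":
    # Example: the 50 most-cited records for an author query.
    print(build_input('AUTH:"Doudna J"', max_items=50, sort="cited"))
```

The returned dictionary matches the shape the schema describes, so it could be sent as the JSON request body of the run-sync operation or passed as `run_input` to the Python client's `call` method.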
