Europe PMC Scraper

Pricing

from $3.00 / 1,000 results

Scrape Europe PMC, a repository of 42M+ biomedical literature records including PubMed, PubMed Central, patents, and preprints. Search publications, get article details by PMID or DOI, and retrieve citation and reference lists.

Rating

5.0

(7)

Developer

Crawler Bros

Maintained by Community

7 days ago

Last modified

Extract biomedical literature from Europe PMC — one of the world's most comprehensive repositories of life science publications, covering 42 million+ records including PubMed/MEDLINE, PubMed Central full-text articles, patents, preprints, theses, and more.

What Is Europe PMC?

Europe PMC is a free, open access repository of biomedical and life sciences literature maintained by the European Bioinformatics Institute (EMBL-EBI). It aggregates content from multiple sources including PubMed, PubMed Central (PMC), clinical trial records, patents, preprints, and theses, which makes it one of the most comprehensive freely available biomedical literature databases.

What This Actor Does

This actor queries the Europe PMC REST API to:

  • Search publications across all sources by keyword, year range, source database, and open access status
  • Retrieve a specific article by PubMed ID (PMID) or DOI
  • Get citation lists — all articles citing a given PMID
  • Get reference lists — all references of a given PMID

No authentication or API key is required.

Modes

| Mode | Description | Key Parameters |
|------|-------------|----------------|
| search | Full-text search across 42M+ publications | query, source, isOpenAccess, sortBy, fromYear, toYear |
| byPMID | Get a specific article by PubMed ID | pmid |
| byDOI | Get a specific article by DOI | doi |
| citations | Get articles that cite a PMID | pmid |
| references | Get references of a PMID | pmid |
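
The modes above line up with operations exposed by Europe PMC's public REST service. The following is a minimal sketch of how each mode might translate into a request URL. It assumes the base URL `https://www.ebi.ac.uk/europepmc/webservices/rest` and that ID lookups are expressed as field queries (`EXT_ID`, `DOI`); the helper name `build_url` is hypothetical, and the actor's actual internals are not documented here.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of the Europe PMC REST service.
BASE = "https://www.ebi.ac.uk/europepmc/webservices/rest"


def build_url(mode: str, *, query: str = "", pmid: str = "", doi: str = "",
              page_size: int = 25) -> str:
    """Hypothetical helper: map an actor mode to a Europe PMC request URL."""
    params = {"format": "json", "pageSize": page_size}
    if mode == "search":
        params["query"] = query
    elif mode == "byPMID":
        # Assumption: a PMID lookup is a search on EXT_ID within the MED source.
        params["query"] = f"EXT_ID:{pmid} AND SRC:MED"
    elif mode == "byDOI":
        params["query"] = f'DOI:"{doi}"'
    elif mode in ("citations", "references"):
        # Citation and reference lists hang off the record: /{source}/{id}/{mode}
        return f"{BASE}/MED/{quote(pmid)}/{mode}?{urlencode(params)}"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{BASE}/search?{urlencode(params)}"


if __name__ == "__main__":
    print(build_url("citations", pmid="25781006"))
```

Building the URL separately from sending the request keeps the mapping easy to inspect before any network traffic happens.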

Input Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| mode | Select | Operating mode (required) |
| query | String | Search query. Supports field operators like TITLE:malaria AND OPEN_ACCESS:y |
| pmid | String | PubMed ID (for byPMID, citations, references modes) |
| doi | String | Article DOI (for byDOI mode) |
| source | Select | Filter by source: MED, PMC, PAT, ETH, HIR, CTX, AGR, CBA, PPR |
| isOpenAccess | Boolean | Return only open access articles |
| sortBy | Select | Sort: relevance (default), cited (most cited), date (most recent) |
| fromYear | Integer | Filter from this publication year |
| toYear | Integer | Filter up to this publication year |
| maxItems | Integer | Maximum records (1–1000, default 50) |
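
Before starting a run, it can help to check an input object against the constraints in the table. The sketch below is one way to do that. The function name `normalize_input` is hypothetical, and the rules that `search` needs a `query` and that `fromYear` must not exceed `toYear` are assumptions rather than documented behaviour.

```python
VALID_MODES = {"search", "byPMID", "byDOI", "citations", "references"}
PMID_MODES = {"byPMID", "citations", "references"}


def normalize_input(raw: dict) -> dict:
    """Hypothetical validator: check an input dict and fill in the maxItems default."""
    mode = raw.get("mode")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if mode == "search" and not raw.get("query"):
        raise ValueError("search mode needs a query")  # assumed requirement
    if mode in PMID_MODES and not raw.get("pmid"):
        raise ValueError(f"{mode} mode needs a pmid")
    if mode == "byDOI" and not raw.get("doi"):
        raise ValueError("byDOI mode needs a doi")

    value = raw.get("maxItems")
    max_items = 50 if value is None else int(value)  # documented default is 50
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")

    from_year, to_year = raw.get("fromYear"), raw.get("toYear")
    if from_year is not None and to_year is not None and from_year > to_year:
        raise ValueError("fromYear must not be later than toYear")  # assumed rule

    return {**raw, "maxItems": max_items}


if __name__ == "__main__":
    print(normalize_input({"mode": "search", "query": "malaria"}))
```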

Source Database Codes

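| Code | Description |
|------|-------------|
| MED | PubMed/MEDLINE |
| PMC | Europe PMC full-text articles |
| PAT | Patents |
| ETH | EThOS British Library theses |
| HIR | NHS Evidence (Health Information Resources) |
| CTX | CiteXplore |
| AGR | Agricola |
| CBA | Chinese Biological Abstracts |
| PPR | Preprints |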

Output Fields

| Field | Type | Description |
|-------|------|-------------|
| pmid | String | PubMed ID |
| pmcid | String | PubMed Central ID (e.g. PMC4371661) |
| doi | String | Digital Object Identifier |
| title | String | Publication title |
| authors | Array | Author names |
| journalName | String | Journal full name |
| journalIssn | String | Journal ISSN |
| pubYear | Integer | Publication year |
| abstract | String | Full abstract text |
| language | String | Publication language (e.g. eng) |
| isOpenAccess | Boolean | Whether freely available |
| hasPDF | Boolean | Whether PDF is available |
| citedByCount | Integer | Number of citing articles |
| pubType | Array | Publication types (e.g. research-article) |
| fullTextUrls | Array | Full-text access URLs |
| europePmcUrl | String | Europe PMC article page URL |
| scrapedAt | String | ISO 8601 scrape timestamp |
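
Once a run has finished, the records can be post-processed like any list of dictionaries. The sketch below keeps only open access records and ranks them by `citedByCount`. The `Record` type covers just a subset of the fields listed above, the helper `top_open_access` is hypothetical, and the sample records are made up purely for illustration.

```python
from typing import TypedDict


class Record(TypedDict, total=False):
    """Subset of the output fields; every key is optional in this sketch."""
    pmid: str
    doi: str
    title: str
    pubYear: int
    isOpenAccess: bool
    citedByCount: int


def top_open_access(items: list[Record], limit: int = 10) -> list[Record]:
    """Return open access records, most cited first."""
    open_items = [r for r in items if r.get("isOpenAccess")]
    # Treat a missing or null citation count as zero.
    ranked = sorted(open_items, key=lambda r: r.get("citedByCount") or 0, reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    # Made-up placeholder records, not real Europe PMC data.
    sample: list[Record] = [
        {"pmid": "1", "title": "A", "isOpenAccess": True, "citedByCount": 3},
        {"pmid": "2", "title": "B", "isOpenAccess": False, "citedByCount": 90},
        {"pmid": "3", "title": "C", "isOpenAccess": True, "citedByCount": 12},
    ]
    for record in top_open_access(sample):
        print(record["pmid"], record["citedByCount"])
```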

Example Input

{
  "mode": "search",
  "query": "malaria",
  "maxItems": 10
}

{
  "mode": "search",
  "query": "CRISPR gene editing",
  "source": "MED",
  "isOpenAccess": true,
  "sortBy": "cited",
  "fromYear": 2018,
  "maxItems": 50
}

{
  "mode": "byPMID",
  "pmid": "25781006"
}

{
  "mode": "citations",
  "pmid": "25781006",
  "maxItems": 100
}

Advanced Search Queries

Europe PMC supports field-specific searches:

  • TITLE:malaria — search in title only
  • ABSTRACT:vaccine — search in abstract only
  • AUTH:Smith — filter by author name
  • JOURNAL:"Nature" — filter by journal name
  • OPEN_ACCESS:y — open access only (same as isOpenAccess: true)
  • HAS_PDF:y — filter to records with PDF available
  • SRC:PPR — preprints only

Combine with AND, OR, NOT:

TITLE:malaria AND OPEN_ACCESS:y AND PUB_YEAR:[2020 TO 2024]
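
Combined queries like the one above can also be assembled programmatically. The following is a minimal sketch of a hypothetical `build_query` helper that joins the documented field operators with AND. The sentinel years used for an open-ended range (1800 and 3000) are assumptions, and this is not necessarily how the actor translates its own `source`, `isOpenAccess`, `fromYear`, and `toYear` parameters.

```python
from typing import Optional


def build_query(terms: str, *, source: Optional[str] = None, open_access: bool = False,
                from_year: Optional[int] = None, to_year: Optional[int] = None) -> str:
    """Hypothetical helper: combine search terms with field filters using AND."""
    parts = [f"({terms})"]  # parentheses keep any OR inside the terms grouped
    if source:
        parts.append(f"SRC:{source}")
    if open_access:
        parts.append("OPEN_ACCESS:y")
    if from_year is not None or to_year is not None:
        # Assumed sentinels for an open-ended range.
        low = from_year if from_year is not None else 1800
        high = to_year if to_year is not None else 3000
        parts.append(f"PUB_YEAR:[{low} TO {high}]")
    return " AND ".join(parts)


if __name__ == "__main__":
    print(build_query("TITLE:malaria", open_access=True, from_year=2020, to_year=2024))
```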

FAQs

Is an API key required? No. The Europe PMC API is fully public and requires no authentication.

What is the rate limit? The actor applies a polite 1-second delay between requests to avoid overloading the server.
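
A fixed delay between requests can be sketched as follows. The generator `fetch_politely` and its parameters are hypothetical; `fetch` stands in for whatever function performs the HTTP request, and `sleep` is injectable so the pacing can be tested without actually waiting.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def fetch_politely(urls: Iterable[str], fetch: Callable[[str], T],
                   delay_seconds: float = 1.0,
                   sleep: Callable[[float], object] = time.sleep) -> Iterator[T]:
    """Yield fetch(url) for each URL, pausing between consecutive requests."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # no pause before the first request
        yield fetch(url)


if __name__ == "__main__":
    # str.upper stands in for a real HTTP call in this demonstration.
    for result in fetch_politely(["a", "b"], str.upper, delay_seconds=0.1):
        print(result)
```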

Can I get the full text of articles? The fullTextUrls field contains links to full-text versions where available. Open access articles typically provide free HTML and PDF access.

How do I get citations for an article? Use mode=citations with the article's PMID. The actor returns the articles that cite the specified paper, up to the maxItems limit.

What is the difference between PMC and MED sources? MED (MEDLINE/PubMed) contains abstracts and metadata. PMC (PubMed Central) contains full-text articles deposited in Europe PMC.

Are preprints included? Yes — use source=PPR to filter specifically for preprints (bioRxiv, medRxiv, etc.) or include them in general searches.

How far back does the data go? Coverage varies by source. MEDLINE includes articles from the 1940s onward; some journals have data from the early 1800s.

How many records can I retrieve? Set maxItems up to 1000 per run. Broad search terms can match millions of records, so use fromYear/toYear and source filters to narrow results.
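
To illustrate how a cap such as maxItems interacts with paged results, here is a minimal sketch of cursor-based collection. It assumes the Europe PMC search response shape (`resultList.result` plus `nextCursorMark`, with `*` as the initial cursor). The function `collect` is hypothetical, and `fetch_page` stands in for a function that requests one page for a given cursor.

```python
from typing import Any, Callable


def collect(fetch_page: Callable[[str], dict], max_items: int = 50) -> list[Any]:
    """Follow cursor marks until max_items records are gathered or results run out."""
    items: list[Any] = []
    cursor = "*"  # assumed initial cursor value
    while len(items) < max_items:
        page = fetch_page(cursor)
        results = page.get("resultList", {}).get("result", [])
        if not results:
            break
        # Take only as many records as are still needed to reach the cap.
        items.extend(results[: max_items - len(items)])
        next_cursor = page.get("nextCursorMark")
        if not next_cursor or next_cursor == cursor:
            break  # no further pages
        cursor = next_cursor
    return items


if __name__ == "__main__":
    # Fake pages keyed by cursor, standing in for real API responses.
    pages = {
        "*": {"resultList": {"result": [1, 2, 3]}, "nextCursorMark": "a"},
        "a": {"resultList": {"result": [4, 5, 6]}, "nextCursorMark": "b"},
        "b": {"resultList": {"result": []}},
    }
    print(collect(pages.__getitem__, max_items=5))
```

Stopping as soon as the cap is reached avoids requesting pages whose records would be discarded anyway.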