PubMed Research Article Scraper

Pricing

from $4.00 / 1,000 results

Search PubMed with official NCBI E-utilities and export article, author, journal, PMID, DOI, publication type, and biomedical topic rows.

Developer

naoki anzai

Maintained by Community

PubMed Research Intelligence

Search PubMed through official NCBI E-utilities and export flattened rows for biomedical articles, authors, and journals. V1 uses JSON ESearch and ESummary endpoints only, so no browser scraping or login is required.
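As a rough illustration of the two JSON endpoints named above, the sketch below builds ESearch and ESummary URLs without sending any request. The base URL and query parameters (db, term, retmode, retmax, sort, datetype, mindate, maxdate, id, email, api_key) are standard E-utilities parameters; the helper names buildESearchUrl and buildESummaryUrl are hypothetical and are not taken from the actor's source.

```javascript
// Sketch only: builds E-utilities URLs; it does not send any request.
const EUTILS_BASE = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils';

// Adds the optional NCBI contact email and API key when they are provided.
function addIdentity(params, { email, apiKey }) {
  if (email) params.set('email', email);
  if (apiKey) params.set('api_key', apiKey);
}

// Hypothetical helper: ESearch URL that returns matching PMIDs as JSON.
function buildESearchUrl({ term, retmax = 25, sort, mindate, maxdate, email, apiKey }) {
  const params = new URLSearchParams({
    db: 'pubmed',
    term,
    retmode: 'json',
    retmax: String(retmax),
  });
  if (sort) params.set('sort', sort);
  // E-utilities expects mindate and maxdate together; how the actor handles
  // a one-sided range (only fromDate or only toDate) is not documented here.
  if (mindate && maxdate) {
    params.set('datetype', 'pdat'); // filter on publication date
    params.set('mindate', mindate);
    params.set('maxdate', maxdate);
  }
  addIdentity(params, { email, apiKey });
  return `${EUTILS_BASE}/esearch.fcgi?${params}`;
}

// Hypothetical helper: ESummary URL that returns document summaries as JSON.
function buildESummaryUrl({ pmids, email, apiKey }) {
  const params = new URLSearchParams({
    db: 'pubmed',
    id: pmids.join(','),
    retmode: 'json',
  });
  addIdentity(params, { email, apiKey });
  return `${EUTILS_BASE}/esummary.fcgi?${params}`;
}
```

Calling buildESearchUrl({ term: 'cancer immunotherapy', retmax: 10, sort: 'relevance' }) yields a URL whose JSON response lists PMIDs under esearchresult.idlist; those IDs can then be passed to buildESummaryUrl.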

Use cases

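Use this actor for biomedical literature research: search PubMed by topic (queries may use PubMed search syntax, including MeSH terms) and export citation metadata for the matching articles. It needs no authentication, is built on the official API, and keeps a stable output schema. V1 returns ESummary metadata only, not abstract text (see Limitations).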

Inputs

| Field | Default | Notes |
| --- | --- | --- |
| searchTerms | [] | PubMed queries such as cancer immunotherapy. |
| pmids | [] | Direct PubMed IDs to summarize. |
| fromDate / toDate | empty | Publication date filter: YYYY, YYYY/MM, or YYYY/MM/DD. |
| sort | relevance | relevance, pub_date, or most_recent. |
| maxResultsPerQuery | 25 | PMIDs retrieved for each query. |
| maxArticles | 100 | Global unique article cap. |
| email | empty | Optional NCBI contact email. |
| apiKey | empty | Optional NCBI API key for higher rate limits. |
| delivery | dataset | dataset or webhook. |
| dryRun | false | Skip dataset/webhook delivery. |

At least one searchTerms or pmids value is required.
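The input rules above could be enforced along the following lines. This is a minimal sketch: the defaults and allowed values are copied from the Inputs list, but the function names normalizeInput and capUniquePmids, the error messages, and the first-seen deduplication order are assumptions, not the actor's actual code.

```javascript
// Defaults copied from the Inputs list; the validation behavior is an assumption.
const DEFAULTS = {
  searchTerms: [],
  pmids: [],
  sort: 'relevance',
  maxResultsPerQuery: 25,
  maxArticles: 100,
  delivery: 'dataset',
  dryRun: false,
};
const DATE_PATTERN = /^\d{4}(\/\d{2}(\/\d{2})?)?$/; // YYYY, YYYY/MM, or YYYY/MM/DD

// Hypothetical helper: merges defaults and rejects inputs the docs rule out.
function normalizeInput(raw = {}) {
  const input = { ...DEFAULTS, ...raw };
  if (input.searchTerms.length === 0 && input.pmids.length === 0) {
    throw new Error('Provide at least one searchTerms or pmids value.');
  }
  for (const key of ['fromDate', 'toDate']) {
    if (input[key] && !DATE_PATTERN.test(input[key])) {
      throw new Error(`${key} must be YYYY, YYYY/MM, or YYYY/MM/DD.`);
    }
  }
  if (!['relevance', 'pub_date', 'most_recent'].includes(input.sort)) {
    throw new Error('sort must be relevance, pub_date, or most_recent.');
  }
  if (!['dataset', 'webhook'].includes(input.delivery)) {
    throw new Error('delivery must be dataset or webhook.');
  }
  return input;
}

// Hypothetical helper: applies a global unique cap across per-query PMID lists,
// keeping PMIDs in first-seen order.
function capUniquePmids(pmidLists, maxArticles) {
  const seen = new Set();
  for (const list of pmidLists) {
    for (const pmid of list) {
      if (seen.size >= maxArticles) return [...seen];
      seen.add(String(pmid));
    }
  }
  return [...seen];
}
```

With two queries that both return PMID 2, capUniquePmids([['1', '2'], ['2', '3']], 3) keeps each article once, which is one plausible reading of "global unique article cap".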

Dataset Rows

article_summary

  • PMID, title, journal, publication date
  • DOI, PMC, publication types, language
  • first/last author, author count
  • PubMed URL and NLM identifiers

author_signal

  • PMID, article title, author name
  • author order, first/last flags, author type

journal_summary

  • PMID, article title, journal name/abbreviation
  • NLM unique ID, ISSN/eISSN, publication date
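To make the row types above more concrete, here is a sketch of how one ESummary document could be flattened into an article_summary row and per-author author_signal rows. The input field names (uid, title, fulljournalname, pubdate, authors, articleids, pubtype, lang) reflect the ESummary JSON layout as commonly returned for PubMed and should be verified against a live response. The output field names and both function names are hypothetical and are not the actor's published schema.

```javascript
// Hypothetical row builders; output field names are illustrative only.

// Looks up one identifier (for example 'doi' or 'pmc') in the articleids array.
function findArticleId(doc, idtype) {
  const hit = (doc.articleids || []).find((entry) => entry.idtype === idtype);
  return hit ? hit.value : null;
}

// Builds a single flattened article row from one ESummary document.
function toArticleSummary(doc) {
  const authors = doc.authors || [];
  return {
    rowType: 'article_summary',
    pmid: doc.uid,
    title: doc.title,
    journal: doc.fulljournalname,
    publicationDate: doc.pubdate,
    doi: findArticleId(doc, 'doi'),
    pmc: findArticleId(doc, 'pmc'),
    publicationTypes: doc.pubtype || [],
    languages: doc.lang || [],
    firstAuthor: authors.length ? authors[0].name : null,
    lastAuthor: authors.length ? authors[authors.length - 1].name : null,
    authorCount: authors.length,
    pubmedUrl: `https://pubmed.ncbi.nlm.nih.gov/${doc.uid}/`,
  };
}

// Builds one row per author, with 1-based order and first/last flags.
function toAuthorSignals(doc) {
  const authors = doc.authors || [];
  return authors.map((author, index) => ({
    rowType: 'author_signal',
    pmid: doc.uid,
    articleTitle: doc.title,
    authorName: author.name,
    authorOrder: index + 1,
    isFirstAuthor: index === 0,
    isLastAuthor: index === authors.length - 1,
    authorType: author.authtype,
  }));
}
```

A journal_summary row could be derived the same way from the journal-related fields of the same document.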

Example Input

{
  "searchTerms": ["cancer immunotherapy", "GLP-1 obesity"],
  "fromDate": "2024",
  "sort": "relevance",
  "maxResultsPerQuery": 10,
  "maxArticles": 20,
  "delivery": "dataset",
  "dryRun": false
}

Sample output

Each run produces structured dataset rows (see the Dataset Rows section above for the field list). Run the actor once with the example input to see a live sample before scheduling.

Local Development

npm install
npm test
node src/index.js

Limitations

  • V1 does not parse EFetch XML abstracts. It intentionally uses ESearch/ESummary JSON for reliable deployment without XML dependencies.
  • NCBI rate limits are stricter without an API key (3 requests per second, versus 10 per second with a key). Set email and apiKey for larger scheduled jobs.
  • PubMed metadata can be incomplete for very new records.
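The rate-limit point above can be handled with a small client-side throttle. The sketch below assumes NCBI's documented limits of 3 requests per second without an API key and 10 per second with one. The names minIntervalMs and createThrottle are hypothetical, and this is not necessarily how the actor paces its requests.

```javascript
// NCBI's documented limits: 3 requests/second without an API key, 10 with one.
function minIntervalMs(apiKey) {
  const requestsPerSecond = apiKey ? 10 : 3;
  return Math.ceil(1000 / requestsPerSecond);
}

// Hypothetical throttle: runs async tasks one at a time and leaves at least
// intervalMs between the start of consecutive tasks.
function createThrottle(intervalMs) {
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  let nextSlot = 0; // earliest timestamp at which the next task may start
  let queue = Promise.resolve();

  return function schedule(task) {
    const run = queue.then(async () => {
      const wait = Math.max(0, nextSlot - Date.now());
      if (wait > 0) await sleep(wait);
      nextSlot = Date.now() + intervalMs;
      return task();
    });
    queue = run.catch(() => {}); // keep the queue alive after a failed task
    return run;
  };
}
```

A caller would create one throttle per run, for example const schedule = createThrottle(minIntervalMs(input.apiKey)), and wrap each E-utilities request in schedule(() => fetch(url)).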

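Input Examples

The examples below use targets, maxResultsPerTarget, snapshotKey, and emitChangedOnly, none of which appear in the Inputs list; check them against the actor's input schema before use.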

Example: Single-target audit

{
  "targets": [
    "example-target-1"
  ],
  "maxResultsPerTarget": 30
}

Example: Bulk portfolio

{
  "targets": [
    "target-1",
    "target-2",
    "target-3"
  ],
  "maxResultsPerTarget": 50,
  "snapshotKey": "pubmed-research-intelligence-state"
}

Example: Recurring delta watch

{
  "targets": [
    "target-1"
  ],
  "snapshotKey": "pubmed-research-intelligence-state",
  "emitChangedOnly": true
}