PubMed Abstract Scraper
Pricing
$8.00 / 1,000 results
Scrape PubMed abstracts by keyword with optional date filtering. Returns title, authors, DOI, abstract, journal, and publication date as structured JSON.
Developer: azureblue
Last modified: 3 days ago
Extract structured PubMed abstracts by keyword — with optional date filtering, DOI links, and author lists.
Search PubMed's 35+ million biomedical citations and retrieve clean, structured JSON output: title, authors, DOI, abstract text, journal name, and publication date. No API key required.
What does this Actor do?
This Actor queries the NCBI PubMed database using the official E-utilities API. Given a search keyword (supports MeSH terms, gene names, author filters, and free-text queries), it retrieves abstracts and returns them as structured JSON records — one object per article.
Ideal for systematic literature reviews, research trend analysis, citation monitoring, and building medical knowledge bases.
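To make the keyword-plus-date search concrete, here is a minimal Python sketch of how such a request could be expressed as an E-utilities ESearch URL. It is an illustration only, not the Actor's actual implementation: the function name `build_esearch_url` and the open-ended placeholder dates are assumptions, while the endpoint and the `db`, `term`, `retmax`, `retmode`, `datetype`, `mindate`, and `maxdate` parameters are standard ESearch parameters.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(keyword, max_results=100, date_from=None, date_to=None):
    """Return an ESearch URL for a PubMed keyword search (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "term": keyword,
        "retmax": max_results,
        "retmode": "json",
    }
    if date_from or date_to:
        # ESearch expects YYYY/MM/DD and wants mindate and maxdate supplied together.
        # The fallback dates below are assumed placeholders for an open-ended range.
        params["datetype"] = "pdat"  # filter on publication date
        params["mindate"] = (date_from or "1800-01-01").replace("-", "/")
        params["maxdate"] = (date_to or "3000-12-31").replace("-", "/")
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_esearch_url("long COVID symptoms", 500, date_from="2021-01-01"))
```

ESearch returns only PubMed IDs; the abstracts themselves would come from a follow-up EFetch request for those IDs.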
Use Cases
1. Systematic Literature Review Automation
A researcher studying COVID-19 long-term effects needs 500 recent abstracts. Input: keyword: "long COVID symptoms", dateFrom: "2021-01-01", maxResults: 500. The Actor returns all matching abstracts in minutes instead of hours of manual PubMed browsing — ready to import into Zotero, Rayyan, or Excel.
2. Competitor & Drug Pipeline Monitoring
A pharma team tracks new publications about a competitor's drug every week. They run the Actor with keyword: "semaglutide cardiovascular" on a weekly schedule and feed the output into a dashboard, so new trial results, safety signals, and opinion pieces surface within a week of being indexed in PubMed.
3. Medical Education Content Generation
A medical education platform wants to keep its question bank up to date. The Actor retrieves abstracts of the latest guidelines (e.g. keyword: "ESC heart failure guidelines 2024"), and the abstracts are fed to an LLM pipeline that generates new multiple-choice questions, keeping the bank current without manual curation.
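For the scheduled monitoring described in use case 2, a downstream script needs to tell new articles apart from ones already seen in earlier runs. The following Python sketch shows one way to do that by tracking PMIDs. The function name `new_records` and the in-memory `seen_pmids` set are assumptions for illustration; a real pipeline would persist the set between runs.

```python
def new_records(current_run, seen_pmids):
    """Return records whose PMID has not been seen before, preserving order.

    `seen_pmids` is updated in place so it can be reused for the next run.
    """
    fresh = []
    for record in current_run:
        pmid = record["pmid"]
        if pmid not in seen_pmids:
            fresh.append(record)
            seen_pmids.add(pmid)
    return fresh


if __name__ == "__main__":
    seen = {"38123456"}
    this_week = [{"pmid": "38123456"}, {"pmid": "38999999"}]
    print(new_records(this_week, seen))
```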
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| keyword | String | ✅ Yes | none | PubMed search query (MeSH, gene, free text, etc.) |
| maxResults | Integer | No | 100 | Max abstracts to retrieve (1–10,000) |
| dateFrom | String | No | none | Filter: published on/after this date (YYYY-MM-DD) |
| dateTo | String | No | today | Filter: published on/before this date (YYYY-MM-DD) |
Example Input
    {
      "keyword": "myocardial infarction reperfusion therapy",
      "maxResults": 50,
      "dateFrom": "2022-01-01",
      "dateTo": "2024-12-31"
    }
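Before starting a run, it can help to check an input object against the constraints in the table above. The Python sketch below is one possible client-side check, not part of the Actor itself. The function name `validate_input` is hypothetical, and the rule that `dateFrom` must not be later than `dateTo` is an assumption that the page does not state.

```python
import re
from datetime import date

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_input(actor_input):
    """Return a list of problems found in an input dict (empty list means OK)."""
    errors = []

    keyword = actor_input.get("keyword")
    if not isinstance(keyword, str) or not keyword.strip():
        errors.append("keyword is required and must be a non-empty string")

    max_results = actor_input.get("maxResults", 100)
    is_int = isinstance(max_results, int) and not isinstance(max_results, bool)
    if not is_int or not 1 <= max_results <= 10_000:
        errors.append("maxResults must be an integer between 1 and 10,000")

    parsed = {}
    for field in ("dateFrom", "dateTo"):
        value = actor_input.get(field)
        if value is None:
            continue
        if not isinstance(value, str) or not DATE_PATTERN.fullmatch(value):
            errors.append(f"{field} must use the YYYY-MM-DD format")
            continue
        try:
            parsed[field] = date.fromisoformat(value)
        except ValueError:
            errors.append(f"{field} is not a valid calendar date")

    # Assumed rule: a reversed date range is treated as an error.
    if len(parsed) == 2 and parsed["dateFrom"] > parsed["dateTo"]:
        errors.append("dateFrom must not be later than dateTo")
    return errors


if __name__ == "__main__":
    print(validate_input({"keyword": "myocardial infarction reperfusion therapy",
                          "maxResults": 50,
                          "dateFrom": "2022-01-01",
                          "dateTo": "2024-12-31"}))
```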
Output
Each result is saved to the Default Dataset as a JSON object:
    {
      "pmid": "38123456",
      "title": "Outcomes of primary PCI vs thrombolysis in STEMI: a meta-analysis",
      "authors": ["Müller A", "Schmidt B", "Jensen C"],
      "doi": "10.1016/j.jacc.2023.11.042",
      "abstract": "Background: Primary percutaneous coronary intervention (PCI) is the standard of care for ST-elevation myocardial infarction... Conclusions: Primary PCI significantly reduces 30-day mortality compared to thrombolysis (OR 0.63, 95% CI 0.54–0.74).",
      "pubDate": "2024-01-15",
      "journal": "Journal of the American College of Cardiology"
    }
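Records in this shape are straightforward to flatten for spreadsheet tools. As a rough sketch, the Python below converts a list of such records into CSV text, joining the `authors` array into one cell. The function name `records_to_csv` and the `"; "` author separator are assumptions chosen for illustration; missing fields are written as empty cells.

```python
import csv
import io

FIELDS = ["pmid", "title", "authors", "doi", "abstract", "pubDate", "journal"]


def records_to_csv(records):
    """Serialize output records to CSV text with one row per article."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        row = {name: record.get(name, "") for name in FIELDS}
        # The authors array becomes a single semicolon-separated cell.
        row["authors"] = "; ".join(record.get("authors") or [])
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{
        "pmid": "38123456",
        "title": "Outcomes of primary PCI vs thrombolysis in STEMI: a meta-analysis",
        "authors": ["Müller A", "Schmidt B", "Jensen C"],
        "doi": "10.1016/j.jacc.2023.11.042",
        "pubDate": "2024-01-15",
        "journal": "Journal of the American College of Cardiology",
    }]
    print(records_to_csv(sample))
```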
Pricing
This Actor uses Pay-Per-Result pricing at $0.008 per scraped abstract.
| Volume | Estimated Cost |
|---|---|
| 100 abstracts | ~$0.80 |
| 1,000 abstracts | ~$8.00 |
| 10,000 abstracts | ~$80.00 |
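The estimates in the table follow directly from the per-result price. A small Python sketch of the arithmetic, using `Decimal` to avoid floating-point rounding in currency values (the function name `estimate_cost` is hypothetical):

```python
from decimal import Decimal

PRICE_PER_RESULT = Decimal("0.008")  # USD per scraped abstract, as stated above


def estimate_cost(result_count):
    """Return the estimated cost in USD for a given number of abstracts."""
    if result_count < 0:
        raise ValueError("result_count must not be negative")
    return PRICE_PER_RESULT * result_count


if __name__ == "__main__":
    for count in (100, 1_000, 10_000):
        print(f"{count} abstracts: ${estimate_cost(count):.2f}")
```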
Technical Details
- Data source: NCBI E-utilities API (official, public, no key required)
- Rate limiting: 1 request/second (conservative; NCBI allows 3/sec without key)
- Retry logic: Up to 3 automatic retries with exponential backoff
- Batch size: 20 articles per API call (matches the ESearch default `retmax` of 20; E-utilities itself accepts larger values)
- Output format: Structured JSON, one object per abstract
- Node.js: v22 LTS
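The batching and retry behaviour listed above can be sketched in a few lines. The Python below is an illustration of the general pattern (fixed-size batches, up to 3 retries with doubling delays), not the Actor's own Node.js code. The names `chunked` and `with_retries`, the 1-second base delay, and the choice to retry on any exception are all assumptions.

```python
import time


def chunked(items, size=20):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def with_retries(call, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run `call()`; on failure retry up to `max_retries` times.

    The wait doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    pmids = [str(n) for n in range(45)]
    for batch in chunked(pmids):
        # A real client would fetch the batch here; this only reports its size.
        print(with_retries(lambda: len(batch)))
        time.sleep(1)  # stay at about 1 request per second
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.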
Supported Query Syntax
PubMed supports advanced search syntax. Examples:
- `BRCA1[gene] AND cancer`: gene name filter
- `Smith J[author] AND cardiology`: author filter
- `"heart failure"[MeSH] AND randomized controlled trial[pt]`: MeSH + publication type
- `aspirin AND (myocardial infarction OR stroke)`: boolean logic
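When queries like these are generated programmatically, small helpers can keep the field tags and parentheses consistent. The Python sketch below is one possible approach; the helper names `tag`, `any_of`, and `all_of` are hypothetical and simply assemble the query string that would be passed as the `keyword` input.

```python
def tag(term, field):
    """Attach a PubMed field tag, e.g. tag("BRCA1", "gene") gives BRCA1[gene]."""
    return f"{term}[{field}]"


def any_of(*terms):
    """Combine terms with OR, wrapped in parentheses to keep precedence explicit."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Combine terms with AND."""
    return " AND ".join(terms)


if __name__ == "__main__":
    print(all_of(tag("BRCA1", "gene"), "cancer"))
    print(all_of("aspirin", any_of("myocardial infarction", "stroke")))
```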
Support
Issues or feature requests? Open a ticket via the Apify console.