Pricing

$15.00 / 1,000 clinical trial rows

Pharma Research & Clinical Trial Monitor

Pull PubMed papers and ClinicalTrials.gov studies at scale. PMIDs, DOIs, abstracts, MeSH terms, NCT IDs, phases, sponsors, enrollment, primary outcomes, results. One row per record. Pay per row.

Developer

Ken M

Last modified: 13 days ago

Pharma Research & Clinical Trial Monitor: PubMed + ClinicalTrials.gov

Pull biomedical literature and clinical trial records at scale. Mixes PubMed papers and ClinicalTrials.gov studies in one run. PMIDs, DOIs, full abstracts, MeSH terms, author affiliations, ORCIDs, journal metadata, NCT IDs, trial phases, sponsors, enrollment, primary outcomes, posted results, and live citation counts via NCBI iCite. One row per record. Pay per row.

Built for pharma competitive intelligence teams, biotech analysts watching pipeline shifts, regulatory affairs staff tracking submissions, medical writers building systematic reviews, KOL mappers profiling investigators, CRO BD teams scouting active sites, science journalists tracing claims, AI teams training biomedical LLMs, and grant writers building reference packs.

Keywords this actor ranks for: pubmed api, pubmed scraper, clinicaltrials.gov api, biomedical literature search, drug pipeline monitor, clinical trial scraper, MeSH term extractor, NCT ID lookup, KOL mapping, pharma competitive intelligence, FDA pipeline tracker, oncology trial monitor, biomedical citation api, pharma BI feed.

Why this actor

Other tools	This actor
PubMed E-utilities raw: free but XML parsing, rate limits, no trial data	Both data sources in one normalized JSON row
ClinicalTrials.gov UI export: 1000 row cap, manual click	Unbounded, programmatic, paginates for you
TrialTrove / Citeline: $20K plus per seat per year	Pay per row, no minimum
Cortellis: enterprise contract only	Pay per row, no contract
BiopharmaCatalyst: free but no historical depth, US only	Global, full history, posted results included
Roll your own scraper: maintain 3 parsers, handle rate limits	Maintained selectors plus iCite enrichment built in

How it works

flowchart LR
    A[PubMed queries<br/>or PMIDs<br/>or CT.gov queries<br/>or NCT IDs] --> B[Source router]
    B --> C[NCBI esearch<br/>term + filters]
    B --> D[CT.gov v2 search<br/>query.term + filters]
    B --> E[Direct PMID list]
    B --> F[Direct NCT ID list]
    C --> G[NCBI efetch<br/>XML batches of 100]
    G --> H[Parse PubmedArticle]
    D --> I[Parse studies]
    E --> G
    F --> J[CT.gov single study]
    H --> K{Enrichment toggles?}
    K -->|fetchAbstracts| L[Full abstract text]
    K -->|fetchMeshTerms| M[MeSH headings + qualifiers]
    K -->|fetchReferences| N[ELink refs + citedin]
    K -->|always on| O[iCite citation counts +<br/>relative citation ratio]
    H --> P[(One row per paper)]
    I --> Q[(One row per trial)]
    J --> Q
    O --> P

PubMed records flow through E-utilities (esearch returns PMIDs, efetch returns XML). ClinicalTrials.gov records come from the v2 REST API (JSON, paginated by token). Both sources are public and free at the API level. iCite citation counts are pulled from the iCite API run by the NIH Office of Portfolio Analysis (OPA) and joined to PubMed rows automatically.
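The PubMed half of this flow comes down to URL construction against the public E-utilities endpoints. The following is a minimal sketch under that assumption, not the actor's own code: it builds an esearch URL for a query and splits a PMID list into efetch batches of 100, as in the diagram. The helper names `chunk`, `esearch_url`, and `efetch_urls` are illustrative.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities base URL.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def chunk(items, size=100):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def esearch_url(term, retmax=100, api_key=None):
    """Build an esearch URL that returns matching PMIDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    if api_key:
        params["api_key"] = api_key
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"

def efetch_urls(pmids, batch_size=100):
    """Build one efetch URL per batch of PMIDs; each returns PubmedArticle XML."""
    urls = []
    for batch in chunk(pmids, batch_size):
        params = {"db": "pubmed", "id": ",".join(batch), "retmode": "xml"}
        urls.append(f"{EUTILS}/efetch.fcgi?{urlencode(params)}")
    return urls
```

A list of 250 PMIDs yields three efetch URLs (100, 100, and 50 IDs), which is why larger runs benefit from the higher rate limit an API key provides.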

What you get per row

flowchart LR
    P[Paper row] --> P1[Identity<br/>pmid doi pmcid]
    P --> P2[Title + abstract]
    P --> P3[Authors<br/>names, affiliations, ORCIDs]
    P --> P4[Journal<br/>title ISO ISSN volume issue pages]
    P --> P5[Dates<br/>publicationDate publicationYear]
    P --> P6[Topics<br/>meshTerms keywords]
    P --> P7[Funding<br/>grants by agency]
    P --> P8[Citations<br/>citationCount RCR via iCite]
    T[Trial row] --> T1[Identity<br/>nctId url]
    T --> T2[Status + dates]
    T --> T3[Sponsors<br/>lead + collaborators + class]
    T --> T4[Design<br/>phase studyType allocation masking]
    T --> T5[Cohort<br/>enrollment eligibility sex age]
    T --> T6[Conditions + interventions]
    T --> T7[Outcomes<br/>primary + secondary + timeFrames]
    T --> T8[Locations<br/>facility city country status]
    T --> T9[Results section<br/>when posted]

PMIDs and NCT IDs are stable identifiers. The actor dedupes across runs by both, so a daily cron pulls only new records.

Quick start

Track new oncology trials this week

{
  "clinicalTrialsQueries": ["non small cell lung cancer"],
  "studyStatus": ["RECRUITING", "NOT_YET_RECRUITING"],
  "phases": ["PHASE2", "PHASE3"],
  "dateFrom": "2026-04-29",
  "maxRecords": 200
}
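For a scheduled job, the hard-coded `dateFrom` above would need to roll forward each run. A minimal sketch, assuming you generate the input yourself before starting the actor; `weekly_trial_input` is a hypothetical helper, not part of the actor.

```python
from datetime import date, timedelta

def weekly_trial_input(query, today=None, days=7, max_records=200):
    """Build an actor input whose dateFrom is `days` before `today`."""
    today = today or date.today()
    return {
        "clinicalTrialsQueries": [query],
        "studyStatus": ["RECRUITING", "NOT_YET_RECRUITING"],
        "phases": ["PHASE2", "PHASE3"],
        # ISO date string (YYYY-MM-DD), the same shape as the example above.
        "dateFrom": (today - timedelta(days=days)).isoformat(),
        "maxRecords": max_records,
    }
```

With `today=date(2026, 5, 6)` the function produces `"dateFrom": "2026-04-29"`, the same value as the example input.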

Daily PubMed feed for a therapeutic area

{
  "pubmedQueries": ["GLP-1 receptor agonist obesity"],
  "publicationTypes": ["Clinical Trial", "Randomized Controlled Trial", "Meta-Analysis"],
  "dateFrom": "2026-04-01",
  "fetchAbstracts": true,
  "fetchMeshTerms": true,
  "maxRecords": 100
}

KOL mapping by topic, with citation impact

{
  "pubmedQueries": ["CAR-T cell therapy"],
  "publicationTypes": ["Review", "Clinical Trial"],
  "dateFrom": "2024-01-01",
  "fetchAbstracts": true,
  "fetchMeshTerms": true,
  "fetchReferences": false,
  "maxRecords": 500
}

Direct NCT ID enrichment for a watchlist

{
  "nctIds": ["NCT05123456", "NCT04999111", "NCT05432109"],
  "fetchTrialResults": true
}

Build a reference pack from a list of PMIDs

{
  "pmids": ["38523054", "39122189", "37956789"],
  "fetchAbstracts": true,
  "fetchMeshTerms": true,
  "fetchReferences": true
}

Cross-domain pull: papers + trials in one run

{
  "pubmedQueries": ["lecanemab alzheimer"],
  "clinicalTrialsQueries": ["lecanemab"],
  "fetchAbstracts": true,
  "fetchTrialResults": true,
  "maxRecords": 250
}

Sample output

PubMed paper row:

{
  "type": "pubmed",
  "pmid": "38523054",
  "doi": "10.1056/NEJMoa2304146",
  "pmcid": "PMC10923512",
  "title": "Lecanemab in Early Alzheimer's Disease",
  "abstract": "BACKGROUND: The accumulation of soluble and insoluble aggregated amyloid-beta...",
  "authors": [
    {
      "name": "Christopher H van Dyck",
      "lastName": "van Dyck",
      "foreName": "Christopher H",
      "affiliations": ["Yale School of Medicine, New Haven, CT"],
      "orcid": "0000-0002-1234-5678"
    }
  ],
  "journal": "The New England Journal of Medicine",
  "journalIso": "N Engl J Med",
  "issn": "1533-4406",
  "volume": "388",
  "issue": "1",
  "pages": "9-21",
  "publicationYear": 2023,
  "publicationDate": "2023-Jan-5",
  "publicationTypes": ["Journal Article", "Randomized Controlled Trial"],
  "meshTerms": [
    { "term": "Alzheimer Disease", "ui": "D000544", "major": true, "qualifiers": ["drug therapy"] },
    { "term": "Amyloid beta-Peptides", "ui": "D016229", "major": false, "qualifiers": [] }
  ],
  "keywords": ["amyloid", "monoclonal antibody"],
  "grants": [
    { "grantId": "U01 AG006781", "agency": "NIA NIH HHS", "country": "United States" }
  ],
  "language": "eng",
  "url": "https://pubmed.ncbi.nlm.nih.gov/38523054/",
  "citationCount": 1842,
  "relativeCitationRatio": 24.3,
  "fieldCitationRate": 12.1,
  "scrapedAt": "2026-05-06T10:30:00.000Z"
}

Clinical trial row:

{
  "type": "clinical_trial",
  "nctId": "NCT03887455",
  "title": "A Study to Confirm Safety and Efficacy of Lecanemab in Participants With Early Alzheimer's Disease",
  "url": "https://clinicaltrials.gov/study/NCT03887455",
  "status": "ACTIVE_NOT_RECRUITING",
  "startDate": "2019-03-22",
  "primaryCompletionDate": "2022-09-29",
  "completionDate": "2027-10-15",
  "studyType": "INTERVENTIONAL",
  "phases": ["PHASE3"],
  "enrollment": 1795,
  "enrollmentType": "ACTUAL",
  "primaryPurpose": "TREATMENT",
  "leadSponsor": "Eisai Inc.",
  "leadSponsorClass": "INDUSTRY",
  "collaborators": ["Biogen"],
  "conditions": ["Alzheimer Disease", "Early Alzheimer's Disease"],
  "interventions": [
    { "type": "DRUG", "name": "Lecanemab", "description": "10 mg/kg biweekly IV", "otherNames": ["BAN2401"] }
  ],
  "primaryOutcomes": [
    { "measure": "Change from Baseline in CDR-SB at 18 Months", "timeFrame": "Baseline to 18 months" }
  ],
  "locations": [
    { "facility": "Yale School of Medicine", "city": "New Haven", "state": "Connecticut", "country": "United States", "status": "ACTIVE_NOT_RECRUITING" }
  ],
  "locationCount": 234,
  "hasResults": true,
  "scrapedAt": "2026-05-06T10:30:00.000Z"
}
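Because a mixed run returns both row shapes in one dataset, downstream code usually branches on `type`. As one illustration, the sketch below ranks authors by summed `relativeCitationRatio` across paper rows. It assumes rows shaped like the samples above; summing RCR per author is just one possible impact measure, and `rank_authors` is a hypothetical helper.

```python
from collections import defaultdict

def rank_authors(rows, top_n=50):
    """Rank authors by summed relativeCitationRatio across their papers."""
    totals = defaultdict(float)
    for row in rows:
        if row.get("type") != "pubmed":
            continue  # trial rows in the sample carry no author list
        rcr = row.get("relativeCitationRatio") or 0.0  # treat missing RCR as 0
        for author in row.get("authors", []):
            totals[author["name"]] += rcr
    # Highest total first, truncated to the requested number of authors.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

Author names are not disambiguated here; where an `orcid` is present, keying on it instead of `name` would avoid merging different people with the same name.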

Who uses this

Role	Use case
Pharma CI team	Daily feed of new trials in a therapeutic area, with sponsor and phase, mapped against your portfolio
Biotech analyst	Track when a competitor's trial moves from Phase 2 to Phase 3, or posts results
Regulatory affairs	Pull every paper citing a specific MeSH term in the last quarter for an FDA submission
Medical writer	Build a systematic review reference pack from a query, export with full abstracts and DOIs
KOL mapper	Find the top 50 authors by citation impact in a niche, cross-referenced to their trial sites
CRO BD	Identify active investigators by location and condition for site recruitment
Science journalist	Verify a viral health claim against the primary trial result and citing literature
AI / LLM team	Build biomedical training corpora with structured MeSH terms, abstracts, and outcome data
Grant writer	Pull recent funded papers in your topic, complete with NIH grant IDs and agency names
Patent attorney	Prior art sweep across PubMed papers and trial registrations on a drug candidate

Input reference

Field	Type	What it does
`pubmedQueries`	string[]	PubMed Entrez queries. Supports MeSH and field tags: `"breast cancer"[MeSH]`, `pembrolizumab[Title]`.
`clinicalTrialsQueries`	string[]	Free text queries against ClinicalTrials.gov. Matches title, conditions, interventions, sponsor.
`pmids`	string[]	Direct PubMed IDs to fetch. Skips search.
`nctIds`	string[]	Direct ClinicalTrials.gov NCT numbers to fetch.
`dateFrom` / `dateTo`	string	ISO 8601 date window (YYYY-MM-DD). PubMed: publication date. CT.gov: `lastUpdatePostDate`.
`publicationTypes`	string[]	PubMed publication type filter. Common: Clinical Trial, Meta-Analysis, Review.
`studyStatus`	enum[]	Trial recruitment status filter.
`phases`	enum[]	Trial phase filter.
`studyTypes`	enum[]	Interventional, observational, expanded access.
`fetchAbstracts`	boolean	Include full abstract text in PubMed rows. On by default.
`fetchMeshTerms`	boolean	Parse MeSH headings with UIs and qualifiers. On by default.
`fetchReferences`	boolean	Per paper, fetch reference list and citing PMID list via ELink. Off by default.
`fetchTrialResults`	boolean	Include posted results section for completed trials. On by default.
`maxRecords`	integer	Hard cap on rows per run. 0 means unlimited.
`maxPerQuery`	integer	Cap per individual query before moving to the next.
`ncbiApiKey`	string	NCBI API key for 10 req/s instead of 3 req/s. Recommended for runs over 500 records.
`email`	string	Identifying email for the User-Agent header. NCBI requests this.
`dedupe`	boolean	Skip PMIDs and NCT IDs already pushed in previous runs.
`navigationDelayMs`	integer	Pause between API calls. Default 350 ms keeps you under the 3 req/s limit.

API call

curl -X POST \
  "https://api.apify.com/v2/acts/YOUR_USER~pubmed-clinical-trials-intelligence/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "pubmedQueries": ["semaglutide cardiovascular"],
    "clinicalTrialsQueries": ["semaglutide"],
    "studyStatus": ["RECRUITING", "ACTIVE_NOT_RECRUITING"],
    "phases": ["PHASE3", "PHASE4"],
    "dateFrom": "2026-01-01",
    "fetchAbstracts": true,
    "maxRecords": 100
  }'
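The same call can be made from Python with only the standard library. This is a sketch that mirrors the curl command above (same endpoint, same placeholder user and token); `build_run_request` is an illustrative name, and the request is constructed without being sent.

```python
import json
from urllib.request import Request

def build_run_request(actor_id, token, actor_input):
    """Build (but do not send) the POST request that starts an actor run."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/runs?token={token}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_run_request(
    "YOUR_USER~pubmed-clinical-trials-intelligence",
    "YOUR_TOKEN",
    {
        "pubmedQueries": ["semaglutide cardiovascular"],
        "clinicalTrialsQueries": ["semaglutide"],
        "dateFrom": "2026-01-01",
        "maxRecords": 100,
    },
)
# To start the run, pass `request` to urllib.request.urlopen().
```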

Pricing

The first 10 rows per run are free so you can validate the schema before paying. After that, $0.005 per row pushed. PubMed papers and clinical trial rows are charged at the same rate. iCite citation counts, MeSH terms, references, and posted trial results are included at no extra per row charge.
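The arithmetic behind a run's charge can be sketched as follows. The per-row rate and free allowance are parameters defaulting to the figures quoted in this section; treat them as assumptions and check the listing's current price before budgeting. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

def estimate_cost(rows, rate_per_row="0.005", free_rows=10):
    """Estimated USD charge for one run: rows beyond the free allowance times the rate."""
    billable = max(0, rows - free_rows)
    # Decimal avoids binary floating-point rounding in currency arithmetic.
    return billable * Decimal(rate_per_row)
```

At the rate quoted in this section, a 100-row run bills 90 rows and comes to $0.45, while a run of 10 rows or fewer costs nothing.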

FAQ

Do I need an NCBI API key?

Optional but recommended for runs over 500 records. Without a key, NCBI throttles at 3 requests per second. With a free key from your NCBI account, you get 10 per second. The actor handles backoff either way.

Will this hit rate limits?

The default `navigationDelayMs` of 350 ms paces requests under NCBI's keyless limit of 3 requests per second. ClinicalTrials.gov v2 has no published rate limit and accepts 100 records per page. If you see 429 errors, raise `navigationDelayMs` to 700 ms or add an API key.
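The relationship between a rate limit and the pause between calls is simple arithmetic. A small sketch, assuming strictly sequential requests (network latency only adds further slack); `min_delay_ms` is an illustrative helper.

```python
import math

def min_delay_ms(requests_per_second):
    """Smallest whole-millisecond pause that stays at or under the given rate."""
    return math.ceil(1000 / requests_per_second)
```

`min_delay_ms(3)` gives 334, so the 350 ms default leaves a small margin under the keyless limit. With an API key, `min_delay_ms(10)` gives 100.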

Why not use BioPython or Entrez Direct?

Both are excellent for one-off pulls on your laptop. This actor adds three things: ClinicalTrials.gov in the same row schema, iCite citation counts joined automatically, and dedupe across daily runs. Run it on a cron and you get an incremental feed instead of a one-shot dump.

How current is the data?

PubMed indexes new papers within hours of journal publication. ClinicalTrials.gov updates as sponsors post changes (sometimes daily, sometimes monthly per study). Both APIs return the live record at request time.

Can I track when a trial changes phase or status?

Yes. Schedule the actor on a daily cron with the same query and `dedupe: false`. Each row carries `scrapedAt`, `lastUpdatePostedDate`, and `status`. Diff between snapshots to catch phase transitions, status flips, and enrollment changes.
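Diffing two snapshots can be as simple as keying trial rows by `nctId` and comparing a few fields. A minimal sketch, assuming each snapshot is a list of rows shaped like the sample trial row; `diff_snapshots` and `WATCHED_FIELDS` are hypothetical names.

```python
WATCHED_FIELDS = ("status", "phases", "enrollment")

def diff_snapshots(previous, current, fields=WATCHED_FIELDS):
    """Return {nctId: {field: (old, new)}} for trials whose watched fields changed."""
    old_by_id = {
        row["nctId"]: row for row in previous if row.get("type") == "clinical_trial"
    }
    changes = {}
    for row in current:
        if row.get("type") != "clinical_trial":
            continue  # paper rows are not compared
        old = old_by_id.get(row["nctId"])
        if old is None:
            continue  # trial is new in this snapshot, nothing to compare against
        changed = {
            f: (old.get(f), row.get(f)) for f in fields if old.get(f) != row.get(f)
        }
        if changed:
            changes[row["nctId"]] = changed
    return changes
```

A trial that moved from `["PHASE2"]` to `["PHASE3"]` between runs would appear in the result with a `phases` entry holding the old and new values.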

What is iCite RCR?

Relative Citation Ratio, NIH's field-normalized citation impact metric. An RCR of 1.0 is average for the paper's field and year. An RCR of 5.0 means the paper is cited five times as often as its average peer. It is a better basis than raw citation count for cross-field comparisons.

Can I get the full text of a paper?

The actor returns metadata and the structured abstract. Full text lives behind the publisher or in PubMed Central. For PMC papers, the row includes a `pmcid`. Pipe `pmcid` into Apify's Website Content Crawler against `https://www.ncbi.nlm.nih.gov/pmc/articles/{pmcid}/` for the full body.
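Turning a dataset into a list of PMC article URLs takes only a filter and a format. A short sketch using the URL pattern given above; `pmc_urls` is an illustrative helper, and rows without a `pmcid` are skipped.

```python
def pmc_urls(rows):
    """Return PMC article URLs for rows that carry a non-empty pmcid."""
    base = "https://www.ncbi.nlm.nih.gov/pmc/articles/{pmcid}/"
    return [base.format(pmcid=row["pmcid"]) for row in rows if row.get("pmcid")]
```

The resulting list can then be supplied to a crawler as its start URLs.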

Does fetchReferences work for every paper?

Only papers indexed with a structured reference list in PubMed have references via ELink. Coverage is strongest in PMC open-access journals and weaker in older or non-English titles. An empty references array means PubMed does not have the reference list, not that the paper has no references.

How does this dedupe?

Two key-value store keys: `seen-pmids` and `seen-nct-ids`. Every successful push adds the ID. The next run skips IDs already in the set. Turn `dedupe` off to refresh stale rows or rebuild the dataset from scratch.
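The logic described here amounts to two persistent sets, one per identifier type. The sketch below models them as in-memory Python sets; how the actor actually loads and saves them in the key-value store is not shown in this document, and `push_new` is a hypothetical name.

```python
def push_new(rows, seen_pmids, seen_nct_ids):
    """Return rows not seen before and record their IDs in the seen sets."""
    fresh = []
    for row in rows:
        # Pick the identifier and the matching seen-set for this row type.
        if row.get("type") == "pubmed":
            key, seen = row["pmid"], seen_pmids
        else:
            key, seen = row["nctId"], seen_nct_ids
        if key in seen:
            continue  # already pushed in an earlier run
        seen.add(key)
        fresh.append(row)
    return fresh
```

Running the same batch twice against the same sets returns every row the first time and an empty list the second, which is the behavior a daily cron relies on.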

Will this scrape PubMed Central full text?

No. PMC full text is XML behind a separate API and the licensing varies per article. Use Website Content Crawler against the pmcid URL when full text is needed.

Related actors

Google Scholar Scraper. Broader academic coverage including humanities, social sciences, and working papers. Pair when your topic spans biomedical and adjacent fields.

Google Patents Scraper. Same temporal and prior art shape applied to patent literature. Pairs naturally for IP teams covering pharma assets.

SEC 8-K Event Tracker. Catch material events from public biotech sponsors. Pair with this actor to align trial readouts to investor disclosures.

SEC Form 4 Insider Tracker. Insider trading signal around clinical milestones.

Website Content Crawler. Pipe pmcid URLs or trial NCT URLs into the crawler for full text and supplementary documents.

HN Lead Monitor. Catch new mentions of a trial sponsor or drug name on Hacker News.

Reddit Lead Monitor. Same applied to patient and clinician subreddits, useful for KOL discovery and patient sentiment.
