Pricing

from $10.00 / 1,000 results

ClinicalTrials.gov Sponsor Pipeline Scraper

Scrape ClinicalTrials.gov API v2 by sponsor, condition, phase, and recruitment status. Returns one digest row per saved query with study-level evidence — for clinical landscape research and sponsor pipeline analytics. No email or contact fields emitted (Terms of Use compliant).

Developer

naoki anzai

Last modified

2 months ago

Changelog

v0.2 compliance update — Removed public messaging around personal-contact extraction and direct messaging. Output is positioned for research and pipeline analysis only.

Store Quickstart

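Run this actor with a watchlist of the queries you want to monitor. Results are written to the Apify dataset by default; set delivery to webhook to POST the digest JSON to webhookUrl. Set dryRun to true to validate the input and fetch studies without persisting state or posting webhooks before you commit to a schedule.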

Key Features

📈 Sponsor pipeline tracking — Group public trial records by sponsor, condition, phase, and status
📊 Recruitment change detection — Flag newly recruiting studies and status changes between scheduled runs
🎯 Watchlist queries — Monitor condition, sponsor, institution, phase, and geography filters
📡 Webhook delivery — Send research digests to analytics or operations systems

Use Cases

Who	Why
Developers	Automate recurring data fetches without building custom scrapers
Data teams	Pipe structured output into analytics warehouses
Ops teams	Monitor changes via webhook alerts
Product managers	Track competitor/market signals without engineering time

Input

Field	Type	Default	Description
`watchlist`	array	required	One entry per monitored query. At minimum set `id`, `name`, and `condition`. Add `recruitmentStatus`, `phase`, `intervention`, or `sponsor` to narrow.
`watchTerms`	string	—	Comma-separated sponsor / PI / institution names to flag in study digests. Any matching study receives a `watch_term_hit` signal tag.
`maxStudiesPerQuery`	integer	`50`	Upper bound on studies fetched per query per run. Increase for one-off discovery; keep low for recurring digest runs.
`delivery`	string	`"dataset"`	`dataset` stores results in the Apify dataset. `webhook` posts the digest JSON to `webhookUrl`.
`webhookUrl`	string	—	POST target for trial digest payload. Leave empty for dataset delivery.
`datasetMode`	string	`"all"`	`all` emits every query digest row. `action_needed` emits only queries with watch-term hits or new recruiting studies. `new_only` emits only queries with studies not seen in the previous run.
`snapshotKey`	string	`"clinical-trials-monitor-state"`	Stable key used to persist seen NCT IDs across recurring runs. Use the same key across scheduled runs.
`clinicalTrialsApiUrl`	string	`"https://clinicaltrials.gov/api/v2/studies"`	ClinicalTrials.gov API v2 studies endpoint. No API key required.
`requestTimeoutSeconds`	integer	`30`	HTTP request timeout.
`notifyOnNoNew`	boolean	`true`	When true, every query produces a digest row even if no new studies were found.
`dryRun`	boolean	`false`	Validate and fetch without persisting state or posting webhooks.
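
The three `datasetMode` values can be read as a filter over digest rows. The sketch below is one possible reading, not the actor's actual implementation: it assumes that the `actionNeeded` and `newStudyCount` fields listed in the Output table are what drive `action_needed` and `new_only`, and `select_digests` is a hypothetical helper name.

```python
from typing import Any


def select_digests(digests: list[dict[str, Any]], dataset_mode: str = "all") -> list[dict[str, Any]]:
    """Return the digest rows a given datasetMode would keep (illustrative only)."""
    if dataset_mode == "all":
        return list(digests)
    if dataset_mode == "action_needed":
        # Assumption: actionNeeded is true for watch-term hits or new recruiting studies.
        return [d for d in digests if d.get("actionNeeded")]
    if dataset_mode == "new_only":
        # Assumption: newStudyCount counts studies not seen in the previous run.
        return [d for d in digests if d.get("newStudyCount", 0) > 0]
    raise ValueError(f"unknown datasetMode: {dataset_mode!r}")


# Made-up digest rows, reduced to the fields the filter looks at.
sample = [
    {"queryId": "a", "actionNeeded": True, "newStudyCount": 0},
    {"queryId": "b", "actionNeeded": False, "newStudyCount": 3},
    {"queryId": "c", "actionNeeded": False, "newStudyCount": 0},
]
print([d["queryId"] for d in select_digests(sample, "action_needed")])
print([d["queryId"] for d in select_digests(sample, "new_only")])
```

The same function can be applied client-side to a dataset produced with `datasetMode` set to `all` if you want to compare the modes before changing a schedule.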

Example 1: single condition query, dataset delivery (all digests)

{
  "watchlist": [
    {
      "id": "nsclc-phase3-recruiting",
      "name": "NSCLC Phase 3 — Recruiting",
      "condition": "non-small cell lung cancer",
      "recruitmentStatus": "RECRUITING",
      "phase": "PHASE3,PHASE4"
    }
  ],
  "watchTerms": "Pfizer, AstraZeneca, Novo Nordisk",
  "maxStudiesPerQuery": 50,
  "delivery": "dataset",
  "datasetMode": "all"
}

Example 2: sponsor pipeline watch with two queries (action-needed digests only)

{
  "watchlist": [
    {
      "id": "merck-onc-recruiting",
      "name": "Merck — Oncology Recruiting",
      "condition": "cancer",
      "sponsor": "Merck",
      "recruitmentStatus": "RECRUITING"
    },
    {
      "id": "merck-vax-active",
      "name": "Merck — Vaccines Active",
      "condition": "vaccine",
      "sponsor": "Merck",
      "recruitmentStatus": "ACTIVE_NOT_RECRUITING,RECRUITING"
    }
  ],
  "watchTerms": "Merck, MSD, Merck Sharp & Dohme",
  "maxStudiesPerQuery": 100,
  "delivery": "dataset",
  "datasetMode": "action_needed"
}

Example 3 — webhook delivery to a research-team listener (new studies only)

{
  "watchlist": [
    {
      "id": "obesity-glp1",
      "name": "Obesity GLP-1",
      "condition": "obesity",
      "intervention": "GLP-1",
      "recruitmentStatus": "RECRUITING"
    }
  ],
  "watchTerms": "Novo Nordisk, Eli Lilly",
  "maxStudiesPerQuery": 80,
  "delivery": "webhook",
  "webhookUrl": "https://your-listener.example.com/clinical-trials",
  "datasetMode": "new_only"
}
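
Before scheduling a run, it can help to check an input object locally. The following is a minimal pre-flight sketch based only on the Input table: it checks that every `watchlist` entry sets `id`, `name`, and `condition`. Treating an empty `watchlist` as invalid and requiring `webhookUrl` for webhook delivery are assumptions, and `validate_input` is a hypothetical helper, not part of the actor.

```python
REQUIRED_KEYS = ("id", "name", "condition")


def validate_input(run_input: dict) -> list[str]:
    """Collect human-readable problems instead of raising on the first one."""
    problems = []
    watchlist = run_input.get("watchlist")
    if not isinstance(watchlist, list) or not watchlist:
        # Assumption: an empty watchlist is not a useful input.
        problems.append("watchlist must be a non-empty array")
        watchlist = []
    for index, entry in enumerate(watchlist):
        if not isinstance(entry, dict):
            problems.append(f"watchlist[{index}] must be an object")
            continue
        for key in REQUIRED_KEYS:
            value = entry.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"watchlist[{index}] is missing '{key}'")
    # Assumption: webhook delivery only makes sense when webhookUrl is set.
    if run_input.get("delivery") == "webhook" and not run_input.get("webhookUrl"):
        problems.append("delivery is 'webhook' but webhookUrl is empty")
    return problems


example = {
    "watchlist": [{"id": "obesity-glp1", "name": "Obesity GLP-1"}],
    "delivery": "webhook",
}
for problem in validate_input(example):
    print(problem)
```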

Output

Field	Type
`meta`	object
`errors`	array
`digests`	array
`digests[].queryId`	string
`digests[].queryName`	string
`digests[].condition`	string
`digests[].recruitmentStatusFilter`	array
`digests[].checkedAt`	timestamp
`digests[].status`	string
`digests[].newStudyCount`	number
`digests[].totalStudyCount`	number
`digests[].recruitingCount`	number
`digests[].changedSinceLastRun`	boolean
`digests[].actionNeeded`	boolean
`digests[].recommendedAction`	string
`digests[].topSponsors`	array
`digests[].watchTermHits`	array
`digests[].signalTags`	array
`digests[].studies`	array
`digests[].error`	null

Output Example

{
  "meta": {
    "generatedAt": "2026-04-15T09:00:00.000Z",
    "now": "2026-04-15T09:00:00.000Z",
    "queryCount": 2,
    "totalStudies": 7,
    "newStudies": 4,
    "watchTermHitCount": 2,
    "actionNeededCount": 1,
    "snapshot": {
      "key": "clinical-trials-monitor-sample",
      "loadedFrom": "local",
      "savedTo": "local"
    },
    "warnings": [],
    "executiveSummary": {
      "overallStatus": "action_needed",
      "brief": "1 query(s) have sponsor watch-term hits requiring review.",
      "topSponsors": [
        {
          "name": "Pfizer Inc",
          "studyCount": 2,
          "isWatchTermHit": true
        },
        {
          "name": "Novo Nordisk A/S",
          "studyCount": 1,
          "isWatchTermHit": true
        },
        {
          "name": "AstraZeneca",
          "studyCount": 2,
          "isWatchTermHit": false
        }
      ],
      "watchTermHits": [
        {
          "term": "Pfizer",
          "studyId": "NCT05001234",
          "sponsor": "Pfizer Inc",
          "title": "Study of [Drug] in Advanced NSCLC",
          "phase": "PHASE3"
        }
      ]
    }
  }
}

No email or contact-detail fields are emitted. This is intentional and aligned with ClinicalTrials.gov Terms; see docs/source-compliance.md.
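
As an illustration of consuming this output, the sketch below pulls the watched sponsors and watch-term hits out of `meta.executiveSummary`. It assumes the payload has exactly the shape shown in the Output Example; `watched_sponsors` and `hit_lines` are hypothetical helper names.

```python
def watched_sponsors(payload: dict) -> list[str]:
    """Names of top sponsors that are flagged as watch-term hits."""
    summary = payload.get("meta", {}).get("executiveSummary", {})
    return [s["name"] for s in summary.get("topSponsors", []) if s.get("isWatchTermHit")]


def hit_lines(payload: dict) -> list[str]:
    """One 'studyId | sponsor | phase' line per watch-term hit."""
    summary = payload.get("meta", {}).get("executiveSummary", {})
    return [
        f'{hit["studyId"]} | {hit["sponsor"]} | {hit["phase"]}'
        for hit in summary.get("watchTermHits", [])
    ]


# Trimmed copy of the Output Example above.
payload = {
    "meta": {
        "executiveSummary": {
            "topSponsors": [
                {"name": "Pfizer Inc", "studyCount": 2, "isWatchTermHit": True},
                {"name": "Novo Nordisk A/S", "studyCount": 1, "isWatchTermHit": True},
                {"name": "AstraZeneca", "studyCount": 2, "isWatchTermHit": False},
            ],
            "watchTermHits": [
                {
                    "term": "Pfizer",
                    "studyId": "NCT05001234",
                    "sponsor": "Pfizer Inc",
                    "title": "Study of [Drug] in Advanced NSCLC",
                    "phase": "PHASE3",
                }
            ],
        }
    }
}
print(watched_sponsors(payload))
print(hit_lines(payload))
```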

API Usage

Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.

cURL

curl -X POST "https://api.apify.com/v2/acts/taroyamada~clinical-trials-pipeline-monitor/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "watchlist": [{ "id": "demo", "name": "Diabetes — Recruiting", "condition": "diabetes", "recruitmentStatus": "RECRUITING" }], "maxStudiesPerQuery": 50, "delivery": "dataset" }'

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("taroyamada/clinical-trials-pipeline-monitor").call(run_input={
    "watchlist": [{
        "id": "demo",
        "name": "Diabetes — Recruiting",
        "condition": "diabetes",
        "recruitmentStatus": "RECRUITING"
    }],
    "maxStudiesPerQuery": 50,
    "delivery": "dataset"
})

# Read the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish (top-level await needs an ES module).
const run = await client.actor('taroyamada/clinical-trials-pipeline-monitor').call({
  watchlist: [{ id: 'demo', name: 'Diabetes — Recruiting', condition: 'diabetes', recruitmentStatus: 'RECRUITING' }],
  maxStudiesPerQuery: 50,
  delivery: 'dataset',
});

// Read the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Tips

Run weekly for trend tracking; daily for catalyst-event monitoring.
Use webhook delivery to push digests into research-team channels (Slack, Teams) for review, not for unsolicited contact (see the compliance note above).
Archive results in the Apify Dataset for your own historical trend analysis.
Start with a small watchlist; iterate on condition and recruitmentStatus precision before scaling.
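
For the webhook tip above, a receiver could look roughly like the standard-library sketch below. It assumes the webhook body is the same JSON object described in the Output section (with a `digests` array); `digest_headlines`, `DigestHandler`, and `make_server` are hypothetical names, and the `print` call stands in for whatever posts to your team channel.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def digest_headlines(payload) -> list[str]:
    """One short line per digest that is flagged actionNeeded."""
    if not isinstance(payload, dict):
        return []
    lines = []
    for digest in payload.get("digests", []):
        if digest.get("actionNeeded"):
            name = digest.get("queryName", digest.get("queryId", "?"))
            new_count = digest.get("newStudyCount", 0)
            hit_count = len(digest.get("watchTermHits", []))
            lines.append(f"{name}: {new_count} new, {hit_count} watch-term hit(s)")
    return lines


class DigestHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the sender.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_response(400)
            self.end_headers()
            return
        for line in digest_headlines(payload):
            print(line)  # placeholder: forward to your review channel here
        self.send_response(204)
        self.end_headers()


def make_server(port: int = 8080) -> HTTPServer:
    """Build (but do not start) a server; call serve_forever() on the result."""
    return HTTPServer(("", port), DigestHandler)
```

Calling `make_server().serve_forever()` starts the listener; point `webhookUrl` at its public address.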

FAQ

Does this scrape the ClinicalTrials.gov website HTML?

No. It uses the official clinicaltrials.gov/api/v2/studies JSON API. No API key is required.
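
To make this concrete, the sketch below builds a request URL for that endpoint from one watchlist-style query. The parameter names `query.cond`, `query.spons`, `query.intr`, `filter.overallStatus`, and `pageSize` belong to the ClinicalTrials.gov API v2; how the actor itself maps watchlist fields onto them is an assumption, and `build_studies_url` is a hypothetical helper. No request is sent.

```python
from urllib.parse import urlencode

API_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_studies_url(condition, sponsor=None, intervention=None,
                      recruitment_status=None, page_size=50):
    """Build a studies query URL; the field mapping is illustrative only."""
    params = {"query.cond": condition, "pageSize": page_size}
    if sponsor:
        params["query.spons"] = sponsor
    if intervention:
        params["query.intr"] = intervention
    if recruitment_status:
        # Comma-separated statuses, e.g. "ACTIVE_NOT_RECRUITING,RECRUITING".
        params["filter.overallStatus"] = recruitment_status
    return f"{API_URL}?{urlencode(params)}"


print(build_studies_url("diabetes", recruitment_status="RECRUITING"))
```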

How is data deduplicated across runs?

The actor persists seen NCT IDs by snapshotKey. Use the same key across scheduled runs to make new_only and action_needed modes meaningful.
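
The idea can be sketched as a set difference against a persisted list of IDs. This is a simplified stand-in, not the actor's code: it assumes a local JSON file plays the role of the snapshot stored under snapshotKey, and the function names and NCT IDs are placeholders.

```python
import json
from pathlib import Path


def load_seen(path: Path) -> set[str]:
    """Load previously seen NCT IDs, or an empty set on the first run."""
    if path.exists():
        return set(json.loads(path.read_text()))
    return set()


def split_new(seen: set[str], fetched_ids: list[str]) -> tuple[list[str], set[str]]:
    """Return (IDs not seen before, updated seen set), keeping fetch order."""
    unique_ids = list(dict.fromkeys(fetched_ids))  # drop duplicates within one fetch
    new_ids = [nct_id for nct_id in unique_ids if nct_id not in seen]
    return new_ids, seen | set(unique_ids)


def save_seen(path: Path, seen: set[str]) -> None:
    """Persist the seen set so the next run can diff against it."""
    path.write_text(json.dumps(sorted(seen)))


# Placeholder IDs: the first was seen in an earlier run, the second is new.
new_ids, seen = split_new({"NCT00000001"}, ["NCT00000001", "NCT00000002"])
print(new_ids)
```

Changing the key between runs starts from an empty set, so every study would look new again.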

Why are there no email fields?

ClinicalTrials.gov Terms prohibit using email addresses from study records for marketing or promotional purposes. To stay compliant by design, this actor emits no email field. See docs/source-compliance.md for the full source-compliance record.

Can I get sponsor name normalisation?

Sponsor canonicalisation (e.g., "Merck Sharp & Dohme" / "MSD" / "Merck & Co." reconciled) is on the v0.3 roadmap.

Can I run this on a schedule?

Yes — use Apify's scheduling UI, or trigger via the API on your own cron. The actor is designed to deduplicate against snapshotKey so recurring runs only highlight new or changed studies.

Public-data B2B research cluster — adjacent Apify scrapers from this account:

TED, SAM.gov & Grants Monitor — Public-sector procurement / tender monitoring for teams that already work with public data sources.

Cost

Pay Per Event:

actor-start: $0.01 (flat fee per run)
dataset-item: $0.003 per output item

Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01

No subscription required — you only pay for what you use.
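
The arithmetic above generalises to any run count and item volume. A small estimator, using only the two event prices listed in this section (`estimate_cost` is a hypothetical helper name):

```python
ACTOR_START_USD = 0.01    # flat fee per run
DATASET_ITEM_USD = 0.003  # per output item


def estimate_cost(items: int, runs: int = 1) -> float:
    """Estimated charge in USD for `items` output items spread over `runs` runs."""
    return round(runs * ACTOR_START_USD + items * DATASET_ITEM_USD, 2)


print(estimate_cost(1000))  # the example above: one run, 1,000 items -> 3.01
```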

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.

Bug report or feature request? Open an issue on the Issues tab of this actor.
