ClinicalTrials.gov Sponsor Pipeline Scraper
Pricing
from $10.00 / 1,000 results
Scrape ClinicalTrials.gov API v2 by sponsor, condition, phase, and recruitment status. Returns one digest row per saved query with study-level evidence — for clinical landscape research and sponsor pipeline analytics. No email or contact fields emitted (Terms of Use compliant).
Developer
太郎 山田
2 days ago
Last modified
This actor is intended for research and analysis purposes only. Data must not be used for unrequested direct messaging to sponsors, investigators, or individual contacts. Comply with ClinicalTrials.gov Terms of Use and any applicable institutional policies.
Track sponsor pipelines, recruiting status changes, trial phases, and condition-specific activity directly from ClinicalTrials.gov. Use recurring runs to monitor public study records for research, investment diligence, CRO planning, competitive intelligence, or clinical landscape analysis.
When you run the scraper, you get clean digest rows with NCT IDs, official study titles, sponsor names, conditions, trial phase, recruitment status, new-study flags, watch-term hits, and action-needed summaries. The actor does not extract or promote personal contact fields.
Changelog
- v0.2 compliance update — Removed public messaging around personal-contact extraction and direct messaging. Output is positioned for research and pipeline analysis only.
Store Quickstart
Run this actor with your target input. Results appear in the Apify Dataset and can be piped to webhooks for real-time delivery. Use dryRun to validate before committing to a schedule.
Key Features
- 📈 Sponsor pipeline tracking — Group public trial records by sponsor, condition, phase, and status
- 📊 Recruitment change detection — Flag newly recruiting studies and status changes between scheduled runs
- 🎯 Watchlist queries — Monitor condition, sponsor, institution, phase, and geography filters
- 📡 Webhook delivery — Send research digests to analytics or operations systems
Use Cases
| Who | Why |
|---|---|
| Developers | Automate recurring data fetches without building custom scrapers |
| Data teams | Pipe structured output into analytics warehouses |
| Ops teams | Monitor changes via webhook alerts |
| Product managers | Track competitor/market signals without engineering time |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| watchlist | array | required | One entry per monitored query. At minimum set id, name, and condition. Add recruitmentStatus, phase, intervention, or sponsor to narrow. |
| watchTerms | string | — | Comma-separated sponsor / PI / institution names to flag in study digests. Any matching study receives a watch_term_hit signal tag. |
| maxStudiesPerQuery | integer | 50 | Upper bound on studies fetched per query per run. Increase for one-off discovery; keep low for recurring digest runs. |
| delivery | string | "dataset" | dataset stores results in the Apify dataset. webhook posts the digest JSON to webhookUrl. |
| webhookUrl | string | — | POST target for trial digest payload. Leave empty for dataset delivery. |
| datasetMode | string | "all" | all emits every query digest row. action_needed emits only queries with watch-term hits or new recruiting studies. new_only emits only queries with studies not seen in the previous run. |
| snapshotKey | string | "clinical-trials-monitor-state" | Stable key used to persist seen NCT IDs across recurring runs. Use the same key across scheduled runs. |
| clinicalTrialsApiUrl | string | "https://clinicaltrials.gov/api/v2/studies" | ClinicalTrials.gov API v2 studies endpoint. No API key required. |
| requestTimeoutSeconds | integer | 30 | HTTP request timeout. |
| notifyOnNoNew | boolean | true | When true, every query produces a digest row even if no new studies were found. |
| dryRun | boolean | false | Validate and fetch without persisting state or posting webhooks. |
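Before committing an input to a schedule, it can help to sanity-check it locally against the table above. The following is a minimal sketch, not part of the actor: the helper name `validate_input` and its messages are hypothetical, and the rule that `webhookUrl` must be set when `delivery` is `webhook` is inferred from the field descriptions rather than stated outright.

```python
ALLOWED_DELIVERY = {"dataset", "webhook"}
ALLOWED_DATASET_MODES = {"all", "action_needed", "new_only"}
REQUIRED_QUERY_KEYS = ("id", "name", "condition")


def validate_input(actor_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []

    watchlist = actor_input.get("watchlist")
    if not isinstance(watchlist, list) or not watchlist:
        problems.append("watchlist must be a non-empty array")
        watchlist = []

    for index, query in enumerate(watchlist):
        if not isinstance(query, dict):
            problems.append(f"watchlist[{index}] must be an object")
            continue
        for key in REQUIRED_QUERY_KEYS:
            if not query.get(key):
                problems.append(f"watchlist[{index}] is missing '{key}'")

    # Defaults below mirror the Default column of the input table.
    delivery = actor_input.get("delivery", "dataset")
    if delivery not in ALLOWED_DELIVERY:
        problems.append(f"delivery must be one of {sorted(ALLOWED_DELIVERY)}")
    if delivery == "webhook" and not actor_input.get("webhookUrl"):
        # Assumption: a webhook run without a target URL is treated as an error.
        problems.append("webhookUrl is required when delivery is 'webhook'")

    mode = actor_input.get("datasetMode", "all")
    if mode not in ALLOWED_DATASET_MODES:
        problems.append(f"datasetMode must be one of {sorted(ALLOWED_DATASET_MODES)}")

    max_studies = actor_input.get("maxStudiesPerQuery", 50)
    if not isinstance(max_studies, int) or max_studies < 1:
        problems.append("maxStudiesPerQuery must be a positive integer")

    return problems
```

A check like this complements dryRun, which still exercises the real fetch path, by catching obvious input mistakes before any run is started.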
Example 1 — single oncology watchlist with sponsor watch terms
{
  "watchlist": [
    {
      "id": "nsclc-phase3-recruiting",
      "name": "NSCLC Phase 3 — Recruiting",
      "condition": "non-small cell lung cancer",
      "recruitmentStatus": "RECRUITING",
      "phase": "PHASE3,PHASE4"
    }
  ],
  "watchTerms": "Pfizer, AstraZeneca, Novo Nordisk",
  "maxStudiesPerQuery": 50,
  "delivery": "dataset",
  "datasetMode": "all"
}
Example 2 — sponsor portfolio across two indications, action-needed only
{
  "watchlist": [
    {
      "id": "merck-onc-recruiting",
      "name": "Merck — Oncology Recruiting",
      "condition": "cancer",
      "sponsor": "Merck",
      "recruitmentStatus": "RECRUITING"
    },
    {
      "id": "merck-vax-active",
      "name": "Merck — Vaccines Active",
      "condition": "vaccine",
      "sponsor": "Merck",
      "recruitmentStatus": "ACTIVE_NOT_RECRUITING,RECRUITING"
    }
  ],
  "watchTerms": "Merck, MSD, Merck Sharp & Dohme",
  "maxStudiesPerQuery": 100,
  "delivery": "dataset",
  "datasetMode": "action_needed"
}
Example 3 — webhook delivery to a research-team listener (new studies only)
{
  "watchlist": [
    {
      "id": "obesity-glp1",
      "name": "Obesity GLP-1",
      "condition": "obesity",
      "intervention": "GLP-1",
      "recruitmentStatus": "RECRUITING"
    }
  ],
  "watchTerms": "Novo Nordisk, Eli Lilly",
  "maxStudiesPerQuery": 80,
  "delivery": "webhook",
  "webhookUrl": "https://your-listener.example.com/clinical-trials",
  "datasetMode": "new_only"
}
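The examples above pass watchTerms as a single comma-separated string. The actor's actual matching rules are not documented here, so the sketch below only illustrates one plausible interpretation: split on commas, trim whitespace, and flag a sponsor when a term appears in its name, ignoring case. Both function names are hypothetical.

```python
def parse_watch_terms(raw: str) -> list[str]:
    """Split the comma-separated watchTerms string into trimmed, non-empty terms."""
    return [term.strip() for term in raw.split(",") if term.strip()]


def matching_terms(sponsor_name: str, terms: list[str]) -> list[str]:
    """Return the terms found in sponsor_name (assumed rule: case-insensitive substring)."""
    haystack = sponsor_name.casefold()
    return [term for term in terms if term.casefold() in haystack]


terms = parse_watch_terms("Merck, MSD, Merck Sharp & Dohme")
hits = matching_terms("Merck Sharp & Dohme LLC", terms)
```

Under this assumed rule a short term such as "Merck" would also match "Merck KGaA", which is a separate company, so longer and more specific terms tend to give cleaner hits.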
Output
| Field | Type | Description |
|---|---|---|
| meta | object | |
| errors | array | |
| digests | array | |
| digests[].queryId | string | |
| digests[].queryName | string | |
| digests[].condition | string | |
| digests[].recruitmentStatusFilter | array | |
| digests[].checkedAt | timestamp | |
| digests[].status | string | |
| digests[].newStudyCount | number | |
| digests[].totalStudyCount | number | |
| digests[].recruitingCount | number | |
| digests[].changedSinceLastRun | boolean | |
| digests[].actionNeeded | boolean | |
| digests[].recommendedAction | string | |
| digests[].topSponsors | array | |
| digests[].watchTermHits | array | |
| digests[].signalTags | array | |
| digests[].studies | array | |
| digests[].error | null | |
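The three datasetMode values can be read as filters over the digests array. The sketch below is an illustration under assumptions, not the actor's code: it treats `actionNeeded` as the flag behind action_needed and `newStudyCount > 0` as the test behind new_only, which matches the input table's wording but is not confirmed by the source. The function name `filter_digests` is hypothetical.

```python
def filter_digests(digests: list[dict], dataset_mode: str = "all") -> list[dict]:
    """Select digest rows the way each datasetMode is described in the input table."""
    if dataset_mode == "all":
        return list(digests)
    if dataset_mode == "action_needed":
        # Assumption: actionNeeded is true for watch-term hits or new recruiting studies.
        return [d for d in digests if d.get("actionNeeded")]
    if dataset_mode == "new_only":
        # Assumption: a query counts as "new" when newStudyCount is above zero.
        return [d for d in digests if d.get("newStudyCount", 0) > 0]
    raise ValueError(f"unknown datasetMode: {dataset_mode!r}")
```

The same function can be applied client-side to a run made with datasetMode set to all, which keeps the full history in the dataset while still producing a short review list.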
Output Example
{
  "meta": {
    "generatedAt": "2026-04-15T09:00:00.000Z",
    "now": "2026-04-15T09:00:00.000Z",
    "queryCount": 2,
    "totalStudies": 7,
    "newStudies": 4,
    "watchTermHitCount": 2,
    "actionNeededCount": 1,
    "snapshot": {
      "key": "clinical-trials-monitor-sample",
      "loadedFrom": "local",
      "savedTo": "local"
    },
    "warnings": [],
    "executiveSummary": {
      "overallStatus": "action_needed",
      "brief": "1 query(s) have sponsor watch-term hits requiring review.",
      "topSponsors": [
        {"name": "Pfizer Inc", "studyCount": 2, "isWatchTermHit": true},
        {"name": "Novo Nordisk A/S", "studyCount": 1, "isWatchTermHit": true},
        {"name": "AstraZeneca", "studyCount": 2, "isWatchTermHit": false}
      ],
      "watchTermHits": [
        {
          "term": "Pfizer",
          "studyId": "NCT05001234",
          "sponsor": "Pfizer Inc",
          "title": "Study of [Drug] in Advanced NSCLC",
          "phase": "PHASE3"
        }
      ]
    }
  }
}
No email or contact-detail fields are emitted. This is intentional and aligned with ClinicalTrials.gov Terms; see docs/source-compliance.md.
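For analytics use, the nested watch-term hits in the output example can be flattened into tabular rows. The sketch below assumes the `meta.executiveSummary.watchTermHits` shape shown in the example; the helper names and the column order are hypothetical choices, and missing keys are filled with empty strings rather than raising.

```python
import csv
import io

HIT_COLUMNS = ["term", "studyId", "sponsor", "phase", "title"]


def watch_term_hit_rows(output: dict) -> list[dict]:
    """Extract watch-term hits as flat rows with a fixed set of columns."""
    summary = output.get("meta", {}).get("executiveSummary", {})
    return [
        {column: hit.get(column, "") for column in HIT_COLUMNS}
        for hit in summary.get("watchTermHits", [])
    ]


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HIT_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Only study-level fields (NCT ID, sponsor, phase, title) are carried over, in line with the actor's no-contact-fields output.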
API Usage
Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.
cURL
curl -X POST "https://api.apify.com/v2/acts/taroyamada~clinical-trials-pipeline-monitor/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "watchlist": [
      { "id": "demo", "name": "Diabetes — Recruiting", "condition": "diabetes", "recruitmentStatus": "RECRUITING" }
    ],
    "maxStudiesPerQuery": 50,
    "delivery": "dataset"
  }'
Python
from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("taroyamada/clinical-trials-pipeline-monitor").call(
    run_input={
        "watchlist": [
            {
                "id": "demo",
                "name": "Diabetes — Recruiting",
                "condition": "diabetes",
                "recruitmentStatus": "RECRUITING",
            }
        ],
        "maxStudiesPerQuery": 50,
        "delivery": "dataset",
    }
)

# Read the digest rows from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
JavaScript / Node.js
import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('taroyamada/clinical-trials-pipeline-monitor').call({
  watchlist: [
    { id: 'demo', name: 'Diabetes — Recruiting', condition: 'diabetes', recruitmentStatus: 'RECRUITING' },
  ],
  maxStudiesPerQuery: 50,
  delivery: 'dataset',
});

// Read the digest rows from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
Tips
- Run weekly for trend tracking; daily for catalyst-event monitoring.
- Use webhook delivery to push digests into research-team channels (Slack, Teams) for review, not for unsolicited contact; see the compliance note above.
- Archive results in the Apify Dataset for your own historical trend analysis.
- Start with a small watchlist; iterate on condition and recruitmentStatus precision before scaling.
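For the webhook tip, a listener usually has to turn the digest payload into a short message before posting it to a team channel. The sketch below covers only that formatting step and assumes the webhook body has the same shape as the Output table (a `digests` array with `queryName`, `actionNeeded`, `newStudyCount`, `totalStudyCount`, and `recommendedAction`); that assumption and the function name `format_digest_message` are not confirmed by the source.

```python
def format_digest_message(payload: dict) -> str:
    """Build a plain-text summary of the queries that need review."""
    lines = []
    for digest in payload.get("digests", []):
        if not digest.get("actionNeeded"):
            continue  # Skip queries with nothing to review.
        label = digest.get("queryName") or digest.get("queryId") or "unnamed query"
        line = (
            f"[{label}] {digest.get('newStudyCount', 0)} new of "
            f"{digest.get('totalStudyCount', 0)} studies. "
            f"{digest.get('recommendedAction') or ''}"
        )
        lines.append(line.strip())
    return "\n".join(lines) if lines else "No queries need action."
```

Keeping the formatter free of any HTTP or chat-client code makes it easy to test against saved payloads before wiring it into a real listener.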
FAQ
Does this scrape the ClinicalTrials.gov website HTML?
No. It uses the official clinicaltrials.gov/api/v2/studies JSON API. No API key is required.
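To make the API-based approach concrete, the sketch below builds a request URL for the v2 studies endpoint from one watchlist entry. The parameter names `query.cond`, `query.intr`, `query.spons`, `filter.overallStatus`, and `pageSize` are ClinicalTrials.gov API v2 parameters, but the mapping from watchlist keys to those parameters is an assumption about how such a query could be built, not the actor's confirmed behaviour. The phase key is left out because its mapping is not described in this document.

```python
from urllib.parse import urlencode

BASE_URL = "https://clinicaltrials.gov/api/v2/studies"

# Assumed mapping from watchlist keys to API v2 query parameters.
PARAM_MAP = {
    "condition": "query.cond",
    "intervention": "query.intr",
    "sponsor": "query.spons",
    "recruitmentStatus": "filter.overallStatus",
}


def build_studies_url(query: dict, page_size: int = 50, base_url: str = BASE_URL) -> str:
    """Translate one watchlist entry into a studies-endpoint URL (no request is sent)."""
    params = {
        PARAM_MAP[key]: value
        for key, value in query.items()
        if key in PARAM_MAP and value
    }
    params["pageSize"] = page_size
    return f"{base_url}?{urlencode(params)}"


url = build_studies_url(
    {"id": "demo", "condition": "diabetes", "recruitmentStatus": "RECRUITING"}
)
```

Because the function only builds a string, the resulting URL can be inspected or opened in a browser to preview what a query would return before adding it to a watchlist.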
How is data deduplicated across runs?
The actor persists seen NCT IDs by snapshotKey. Use the same key across scheduled runs to make new_only and action_needed modes meaningful.
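The idea behind snapshot-based deduplication can be sketched as a set difference over NCT IDs. This is an illustration only: how the actor actually stores state under snapshotKey is not described here, so the function below works on an in-memory set and leaves persistence out. The name `diff_against_snapshot` is hypothetical.

```python
def diff_against_snapshot(seen_ids: set[str], current_ids: list[str]) -> tuple[list[str], set[str]]:
    """Return (new NCT IDs in first-seen order, updated snapshot of all seen IDs)."""
    # Drop duplicates within the current fetch while keeping order.
    unique_current = list(dict.fromkeys(current_ids))
    new_ids = [nct_id for nct_id in unique_current if nct_id not in seen_ids]
    updated_snapshot = set(seen_ids) | set(unique_current)
    return new_ids, updated_snapshot


# First run: one ID was already known, one is new.
new_ids, snapshot = diff_against_snapshot(
    {"NCT00000001"}, ["NCT00000001", "NCT00000002", "NCT00000002"]
)
# Second run with the same results: nothing is new any more.
new_again, snapshot = diff_against_snapshot(snapshot, ["NCT00000001", "NCT00000002"])
```

This also shows why changing the key between runs defeats new_only: an empty snapshot makes every fetched study look new.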
Why are there no email fields?
ClinicalTrials.gov Terms prohibit using email addresses from study records for marketing or promotional purposes. To stay compliant by design, this actor emits no email field. See docs/source-compliance.md for the full source-compliance record.
Can I get sponsor name normalisation?
Sponsor canonicalisation (e.g., "Merck Sharp & Dohme" / "MSD" / "Merck & Co." reconciled) is on the v0.3 roadmap.
Can I run this on a schedule?
Yes — use Apify's scheduling UI, or trigger via the API on your own cron. The actor is designed to deduplicate against snapshotKey so recurring runs only highlight new or changed studies.
Related actors
Public-data B2B research cluster — adjacent Apify scrapers from this account:
- TED, SAM.gov & Grants Monitor — Public-sector procurement / tender monitoring for teams that already work with public data sources.
Cost
Pay Per Event:
- actor-start: $0.01 (flat fee per run)
- dataset-item: $0.003 per output item
Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
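The same arithmetic can be wrapped in a small estimator for budgeting recurring runs. The two fees come from the list above; the function name and the `runs` parameter are hypothetical, and `Decimal` is used so the cents do not pick up floating-point noise.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")    # flat fee per run
DATASET_ITEM_FEE = Decimal("0.003")  # fee per output item


def estimate_cost(items_per_run: int, runs: int = 1) -> Decimal:
    """Estimate the Pay Per Event cost in USD for a number of identical runs."""
    return runs * (ACTOR_START_FEE + items_per_run * DATASET_ITEM_FEE)


single_run = estimate_cost(1000)  # matches the $3.01 example above
```

Passing `runs` makes it easy to compare, for example, a weekly schedule against a daily one for the same watchlist size.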
No subscription required — you only pay for what you use.
⭐ Was this helpful?
If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.
Bug report or feature request? Open an issue on the Issues tab of this actor.