ClinicalTrials.gov Study Scraper

Pricing

Pay per event

Monitor ClinicalTrials.gov studies by condition, sponsor, location, status, date window, or NCT ID using the official public API.

Developer

Stas Persiianenko

Maintained by Community

Extract structured clinical trial study records from the official ClinicalTrials.gov v2 API.

This actor is built for monitoring public trial data by condition, sponsor, location, recruitment status, update window, or NCT ID. It returns normalized study rows that are ready for BI tools, alerts, competitive intelligence workflows, and research databases.

What does ClinicalTrials.gov Study Scraper do?

ClinicalTrials.gov Study Scraper searches the public ClinicalTrials.gov API and saves one dataset row per study.

It can:

  • 🔎 Search trials by keyword and medical condition
  • 🏢 Filter by sponsor, collaborator, city, country, or facility text
  • 📌 Fetch specific trials by NCT ID
  • 📅 Monitor records updated in a date window
  • ✅ Filter by recruitment status such as RECRUITING or COMPLETED
  • 🧾 Include the raw source JSON when you need full auditability

Who is it for?

This scraper is useful for teams that need repeatable access to public clinical trial records.

  • Pharma competitive intelligence teams tracking rival pipelines
  • Biotech business development teams watching new studies
  • CRO and recruitment teams monitoring active recruiting studies
  • Healthcare market researchers building disease-area datasets
  • Investors and analysts following sponsor activity
  • Patient advocacy and nonprofit teams watching trial availability
  • Data engineers enriching internal trial databases

Why use this actor?

ClinicalTrials.gov is public, but its nested API response is not always convenient for spreadsheet or BI workflows.

This actor normalizes core modules into flat fields and preserves arrays for sponsors, conditions, interventions, countries, and locations.

It also runs on Apify, so you can schedule it, export results, connect webhooks, or call it from code.

Data source

The actor uses the official ClinicalTrials.gov v2 API.

  • Source: https://clinicaltrials.gov/api/v2/studies
  • Detail endpoint: https://clinicaltrials.gov/api/v2/studies/{NCT_ID}
  • Authentication: not required for public records
  • Browser automation: not used
  • Proxy: not required
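For readers who want to see what a request to the list endpoint looks like, here is a minimal sketch that only builds the URL and performs no network call. The helper name `build_studies_url` is hypothetical, and the parameter names (`query.cond`, `query.spons`, `query.locn`, `filter.overallStatus`, `pageSize`, `pageToken`) reflect the public v2 API as I understand it. How the actor maps its own inputs to these parameters is not stated here, so treat the mapping as an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_studies_url(condition=None, sponsor=None, location=None,
                      statuses=None, page_size=100, page_token=None):
    """Build a list-endpoint URL from optional filters (hypothetical helper)."""
    params = {}
    if condition:
        params["query.cond"] = condition
    if sponsor:
        params["query.spons"] = sponsor
    if location:
        params["query.locn"] = location
    if statuses:
        # Assumption: several statuses are sent as one comma-separated value.
        params["filter.overallStatus"] = ",".join(statuses)
    params["pageSize"] = page_size
    if page_token:
        # Token returned by the previous page, used to request the next one.
        params["pageToken"] = page_token
    return f"{BASE_URL}?{urlencode(params)}"


print(build_studies_url(condition="breast cancer", statuses=["RECRUITING"], page_size=10))
```

Keeping URL construction separate from the HTTP call makes the filter logic easy to check without touching the network.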

Data fields

Field                  Description
nctId                  ClinicalTrials.gov identifier
url                    Public study URL
briefTitle             Brief study title
officialTitle          Official study title
overallStatus          Recruitment / study status
studyType              Interventional, observational, etc.
phases                 Trial phase values
conditions             Conditions or diseases
interventions          Intervention type and name
leadSponsor            Lead sponsor name
collaborators          Collaborator names
enrollmentCount        Enrollment count when available
startDate              Start date
primaryCompletionDate  Primary completion date
completionDate         Completion date
lastUpdatePostDate     Last update posted date
countries              Countries from facility records
locations              Facility, city, state, country strings
eligibilityCriteria    Eligibility text
scrapedAt              Extraction timestamp
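To illustrate what normalizing a nested study into flat fields can look like, the sketch below maps a v2 study object onto a subset of the fields listed above. It is not the actor's actual code. The function name `normalize_study` is hypothetical, and the module paths (`protocolSection.identificationModule` and so on) are my reading of the v2 response layout.

```python
def normalize_study(study):
    """Flatten a nested v2 study object into a subset of the row fields."""
    protocol = study.get("protocolSection", {})
    ident = protocol.get("identificationModule", {})
    status = protocol.get("statusModule", {})
    sponsors = protocol.get("sponsorCollaboratorsModule", {})
    locations = protocol.get("contactsLocationsModule", {}).get("locations", [])

    nct_id = ident.get("nctId")
    # De-duplicate countries while keeping first-seen order.
    countries = list(dict.fromkeys(
        loc["country"] for loc in locations if loc.get("country")
    ))
    return {
        "nctId": nct_id,
        "url": f"https://clinicaltrials.gov/study/{nct_id}" if nct_id else None,
        "briefTitle": ident.get("briefTitle"),
        "overallStatus": status.get("overallStatus"),
        "conditions": protocol.get("conditionsModule", {}).get("conditions", []),
        "leadSponsor": sponsors.get("leadSponsor", {}).get("name"),
        "collaborators": [c.get("name") for c in sponsors.get("collaborators", [])],
        "lastUpdatePostDate": status.get("lastUpdatePostDateStruct", {}).get("date"),
        "countries": countries,
    }
```

Every lookup falls back to an empty value, so a study that lacks a module still produces a complete row rather than raising an error.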

How much does it cost to scrape ClinicalTrials.gov studies?

The actor uses pay-per-event pricing.

You pay a small start fee for each run and a per-study fee for each dataset row saved.
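The pricing rule above reduces to one start fee plus one per-study fee per saved row. The sketch below expresses that arithmetic. The function name `estimate_run_cost` is hypothetical, and the fees are parameters because this page does not state the actual prices. Look them up on the actor's pricing tab before relying on an estimate.

```python
from decimal import Decimal


def estimate_run_cost(rows_saved, start_fee, per_study_fee):
    """Return start_fee + rows_saved * per_study_fee as a Decimal.

    Fees are passed as strings (for example "0.01") to avoid float rounding.
    """
    if rows_saved < 0:
        raise ValueError("rows_saved must be non-negative")
    return Decimal(start_fee) + rows_saved * Decimal(per_study_fee)
```

Because the per-study fee is charged per dataset row saved, maxItems effectively caps the variable part of the cost.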

The default input is intentionally small so first runs are inexpensive.

How to use

  1. Open the actor on Apify.
  2. Enter one or more search filters.
  3. Set maxItems to the number of studies you need.
  4. Run the actor.
  5. Download the dataset as JSON, CSV, Excel, XML, or via API.

Example searches

Use cases include:

  • Recruiting breast cancer trials in the United States
  • Studies sponsored by Pfizer or Novartis
  • Trials updated this month for Alzheimer disease
  • A fixed watchlist of NCT IDs
  • Completed Phase 3 studies in a therapeutic area

Input options

Search terms

Use searchTerms for free-text search terms. Multiple terms are combined with AND.

Condition

Use condition for disease or condition filtering.

Sponsor

Use sponsor to find studies connected to a company, university, hospital, or agency.

Location

Use location for country, state, city, or facility text.

Recruitment statuses

Use overallStatuses with ClinicalTrials.gov status values such as:

  • RECRUITING
  • NOT_YET_RECRUITING
  • ACTIVE_NOT_RECRUITING
  • COMPLETED
  • TERMINATED
  • WITHDRAWN

NCT IDs

Use nctIds when you already know the studies you want to enrich.
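A watchlist copied from spreadsheets or emails often contains stray whitespace, mixed case, or duplicates. The sketch below cleans such a list before it is passed as nctIds. It relies on the standard identifier shape of "NCT" followed by eight digits. The helper name `clean_nct_ids` is hypothetical and not part of the actor.

```python
import re

# An NCT ID is the literal prefix "NCT" followed by exactly eight digits.
NCT_ID_PATTERN = re.compile(r"NCT[0-9]{8}")


def clean_nct_ids(raw_ids):
    """Trim, uppercase, de-duplicate, and drop strings that are not NCT IDs."""
    cleaned = []
    for raw in raw_ids:
        candidate = raw.strip().upper()
        if NCT_ID_PATTERN.fullmatch(candidate) and candidate not in cleaned:
            cleaned.append(candidate)
    return cleaned
```

For example, `clean_nct_ids([" nct05162118 ", "NCT05162118", "NCT123"])` keeps a single `NCT05162118` and discards the malformed entry.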

Last update dates

Use lastUpdateFrom and lastUpdateTo to monitor newly updated records.
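For scheduled runs, the date window usually needs to move with the calendar. The sketch below computes a rolling window for the two inputs. The helper name `last_update_window` is hypothetical, and the YYYY-MM-DD date format is an assumption that should be checked against the actor's input schema.

```python
from datetime import date, timedelta


def last_update_window(days_back, today=None):
    """Return lastUpdateFrom / lastUpdateTo covering the last `days_back` days."""
    today = today or date.today()
    start = today - timedelta(days=days_back)
    return {
        "lastUpdateFrom": start.isoformat(),  # Assumed format: YYYY-MM-DD
        "lastUpdateTo": today.isoformat(),
    }
```

The returned dictionary can be merged into the run input alongside other filters such as condition or sponsor.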

Advanced query

Use advancedQuery for ClinicalTrials.gov expressions such as date ranges or field-specific clauses.
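As an illustration of a field-specific date-range clause, the sketch below builds an expression in the `AREA[...]RANGE[...]` form used by the ClinicalTrials.gov search syntax, as I understand it. The helper names are hypothetical. Which API parameter the actor forwards advancedQuery to is not stated here, so verify the expression on a small run first.

```python
def date_range_clause(field, start="MIN", end="MAX"):
    """Build a clause such as AREA[LastUpdatePostDate]RANGE[2024-01-01,MAX]."""
    return f"AREA[{field}]RANGE[{start},{end}]"


def and_clauses(*clauses):
    """Join several clauses so that all of them must match."""
    return " AND ".join(clauses)


expression = and_clauses(
    date_range_clause("LastUpdatePostDate", "2024-01-01"),
    date_range_clause("StartDate", end="2023-12-31"),
)
print(expression)
```

MIN and MAX act as open-ended bounds, so either side of a range can be left unconstrained.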

Output example

{
  "nctId": "NCT05162118",
  "url": "https://clinicaltrials.gov/study/NCT05162118",
  "briefTitle": "Example Study Title",
  "overallStatus": "RECRUITING",
  "conditions": ["Breast Cancer"],
  "leadSponsor": "Example Sponsor",
  "countries": ["United States"],
  "scrapedAt": "2026-06-26T00:00:00.000Z"
}
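Rows shaped like the example above are easy to aggregate once downloaded. As a small sketch, the hypothetical helper `count_by_sponsor` below tallies studies per lead sponsor from a list of row dictionaries. It assumes only the leadSponsor field shown in the output example.

```python
from collections import Counter


def count_by_sponsor(rows):
    """Count dataset rows per lead sponsor, grouping missing names as 'Unknown'."""
    return Counter(row.get("leadSponsor") or "Unknown" for row in rows)
```

Calling `count_by_sponsor(rows).most_common(10)` then yields the ten most active sponsors in the export.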

Tips for best results

  • Start with a specific condition and status.
  • Use sponsor filtering for competitive intelligence.
  • Use direct NCT ID mode for enrichment pipelines.
  • Use a date window for scheduled monitoring.
  • Keep includeRaw off unless you need the full source object.

Scheduling and monitoring

You can schedule this actor to run daily, weekly, or monthly.

Common monitoring workflows:

  • Daily newly updated recruiting trials
  • Weekly sponsor pipeline changes
  • Monthly condition landscape exports
  • NCT watchlist enrichment
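A monitoring workflow typically compares the latest run against the previous one. The sketch below shows one way to detect new studies and status changes between two lists of dataset rows, keyed by nctId. The function name `diff_status` is hypothetical, and the code assumes each row carries the nctId and overallStatus fields listed under Data fields.

```python
def diff_status(previous_rows, current_rows):
    """Return (nctId, old_status, new_status) for new or changed studies.

    old_status is None when the study was absent from the previous run.
    """
    previous = {row["nctId"]: row.get("overallStatus") for row in previous_rows}
    changes = []
    for row in current_rows:
        nct_id = row["nctId"]
        new_status = row.get("overallStatus")
        if nct_id not in previous:
            changes.append((nct_id, None, new_status))
        elif previous[nct_id] != new_status:
            changes.append((nct_id, previous[nct_id], new_status))
    return changes
```

The resulting list can feed a Slack alert or email digest so that only changes, not the full export, reach the team.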

Integrations

Apify datasets and webhooks make the actor useful in automated workflows.

You can send results to:

  • Google Sheets
  • Snowflake
  • BigQuery
  • Airtable
  • Slack alerts
  • Email digests
  • Internal dashboards
  • Research databases

API usage

Node.js

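import { ApifyClient } from 'apify-client';

// Authenticate with the token stored in the APIFY_TOKEN environment variable.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/clinicaltrials-gov-study-scraper').call({
    condition: 'breast cancer',
    overallStatuses: ['RECRUITING'],
    location: 'United States',
    maxItems: 100,
});

// The saved study rows are stored in the run's default dataset.
console.log(run.defaultDatasetId);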

Python

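from apify_client import ApifyClient

# Replace the placeholder with your own Apify API token.
client = ApifyClient('MY-APIFY-TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/clinicaltrials-gov-study-scraper').call(run_input={
    'condition': 'breast cancer',
    'overallStatuses': ['RECRUITING'],
    'location': 'United States',
    'maxItems': 100,
})

# The saved study rows are stored in the run's default dataset.
print(run['defaultDatasetId'])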

cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~clinicaltrials-gov-study-scraper/runs?token=MY-APIFY-TOKEN' \
-H 'Content-Type: application/json' \
-d '{"condition":"breast cancer","overallStatuses":["RECRUITING"],"location":"United States","maxItems":100}'

MCP usage

Use this actor from Claude Desktop, Claude Code, or other MCP-compatible tools through Apify MCP.

Setup for Claude Code:

claude mcp add --transport http apify "https://mcp.apify.com"

Setup for Claude Desktop, Cursor, or VS Code:

{
  "mcpServers": {
    "apify": {
      "url": "https://mcp.apify.com"
    }
  }
}

Example prompts:

  • "Find recruiting breast cancer studies in the United States and summarize sponsors."
  • "Monitor new or updated Alzheimer trials this week."
  • "Enrich these NCT IDs and return sponsor, status, and locations."

Legality and responsible use

ClinicalTrials.gov provides public study records through an official API.

You should still use the data responsibly, comply with ClinicalTrials.gov terms, respect applicable privacy and research rules, and avoid representing the output as medical advice.

Troubleshooting

I got no results

Try broadening the condition, removing status filters, or checking spelling. Some filter combinations are too narrow to match any study.

If you already know which studies you need, use the nctIds input for direct lookup by NCT ID.

I need every source field

Turn on includeRaw to attach the full API study object to each row.

FAQ

Does this actor use a browser?

No. It uses the official ClinicalTrials.gov API.

Does it need a proxy?

No proxy is required for normal public API use.

Can I scrape by sponsor?

Yes. Use the sponsor input.

Can I monitor new records?

Yes. Schedule the actor and use the last-update date window.

Can I fetch exact NCT IDs?

Yes. Add them to nctIds.

Changelog

  • 0.1 Initial API-backed ClinicalTrials.gov study extraction.