Clinical Trials API - Normalized ClinicalTrials.gov Data

Pricing

from $3.00 / 1,000 trial records

[💵 $4.00 / 1K] Clean, normalized clinical trials from ClinicalTrials.gov: parsed eligibility (inclusion/exclusion + age), phase, status, sponsors, interventions, outcomes, and flattened study sites with geo. Not a raw study dump.

Developer

WebData Labs

Maintained by Community

Clean, normalized clinical trials straight from ClinicalTrials.gov. Search by condition, drug, sponsor, location, status, or phase - or pull specific NCT IDs - and get analysis-ready trial records: parsed eligibility, normalized phase and status, sponsors, interventions, outcomes, and flattened study sites with geo coordinates.

ClinicalTrials.gov's raw v2 API hands back a deeply nested study object - 13 protocol modules, eligibility stored as one free-text blob, and a locations array that can exceed 1,000 sites for a single trial. Most "clinical trial scrapers" dump that raw mess on you. This Actor does the hard part: it parses the eligibility free text into inclusion and exclusion lists, converts age strings to numeric years, flattens and dedupes study sites and countries, and pulls phase, status, sponsor, enrollment, and outcomes into one tidy schema - so you get usable rows, not a JSON swamp.
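To give a feel for the eligibility step described above, here is a minimal sketch of splitting a free-text criteria blob into inclusion and exclusion lists. It is an illustration only and not the Actor's actual parser: the function name `split_eligibility`, the header and bullet patterns, and the sample text are all assumptions made for this example.

```python
import re

# Assumed bullet markers: "*", "-", "•", or "1." / "1)" style numbering.
BULLET = re.compile(r"^\s*(?:[*\-•]|\d+[.)])\s+")
# Assumed section headers: "Inclusion Criteria:" / "Exclusion Criteria:".
HEADER = re.compile(r"^\s*(inclusion|exclusion)\s+criteria\s*:?\s*$", re.IGNORECASE)


def split_eligibility(text: str) -> dict[str, list[str]]:
    """Split a free-text eligibility blob into inclusion and exclusion lists."""
    result = {"inclusion": [], "exclusion": []}
    section = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            section = header.group(1).lower()
            continue
        if section is None or not line.strip():
            continue  # skip blank lines and text before the first header
        if BULLET.match(line):
            result[section].append(BULLET.sub("", line).strip())
        elif result[section]:
            # A wrapped line continues the previous bullet.
            result[section][-1] += " " + line.strip()
        else:
            result[section].append(line.strip())
    return result


# Hypothetical sample text, written for this example.
sample = """Inclusion Criteria:

* Age 18 years or older
* Able to provide
  informed consent

Exclusion Criteria:

* Patients undergoing treatment for recurrent breast cancer
"""
parsed = split_eligibility(sample)
print(parsed["inclusion"])
print(parsed["exclusion"])
```

Text that uses other headers or no bullets at all would need extra handling, which is why this kind of parsing can only be best-effort.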

✅ What you get / ❌ what this isn't

| This Actor gives you | This Actor is not |
| --- | --- |
| One normalized row per trial: status, phase, sponsor, enrollment, dates | Not a raw nested study JSON dump |
| Eligibility parsed into inclusion / exclusion lists + numeric age in years | Not medical advice or trial recommendations |
| Flattened, deduped study sites with city, country, status, and lat/lng | Not affiliated with or endorsed by ClinicalTrials.gov or the NIH |
| Search by condition, drug, sponsor, location, status, and phase | Not a source of patient-level or unpublished data |
| Direct fetch by NCT ID for known trials | Not a guarantee every field is populated by every sponsor |
| Reliable official-API reads (no anti-bot, no proxy needed) | Not a substitute for reading the full protocol record |

🔎 Why use this Actor

  • Pull every recruiting trial for a condition or drug in seconds, already normalized.
  • Build site-selection and competitive-intelligence tables across sponsors and phases.
  • Power patient-recruitment search by condition plus geography with clean site rows.
  • Skip the plumbing - eligibility parsing, age conversion, and site flattening are done.
  • Feed clean trial rows into dashboards, newsletters, models, or LLM pipelines.

🗂️ What data you get

One row per trial:

| Field | Type | Description |
| --- | --- | --- |
| nctId, url | string | Trial identifier and its ClinicalTrials.gov page |
| briefTitle, officialTitle | string | Public and official study titles |
| overallStatus | string | Recruitment status (RECRUITING, COMPLETED...) |
| phase, phases | string / array | Normalized trial phase |
| studyType | string | INTERVENTIONAL or OBSERVATIONAL |
| conditions, keywords | array | Conditions studied and indexing keywords |
| interventions, interventionNames | array | Drug/device/procedure arms |
| leadSponsor, leadSponsorClass, collaborators | string / array | Who runs and funds the trial |
| enrollmentCount, enrollmentType | number / string | Target or actual enrollment |
| startDate, primaryCompletionDate, completionDate | string | Key milestone dates |
| firstPostedDate, lastUpdatePostedDate | string | Registry posting dates |
| sex, minimumAge, maximumAge | string | Raw eligibility fields |
| minimumAgeYears, maximumAgeYears | number | Age parsed to numeric years |
| healthyVolunteers, stdAges | string / array | Volunteer policy and age groups |
| inclusionCriteria, exclusionCriteria | array | Eligibility parsed from free text |
| primaryOutcomes, secondaryOutcomeCount | array / number | Outcome measures and time frames |
| briefSummary | string | Plain-language study summary |
| numLocations, numCountries, countries | number / array | Site footprint |
| locations | array | Flattened sites: facility, city, state, country, status, lat/lng |

👥 Who it's for

  • Pharma and biotech competitive-intelligence and portfolio teams.
  • CROs and site-selection teams mapping where trials run and enroll.
  • Patient-recruitment companies searching by condition and geography.
  • Biotech investors and equity researchers tracking pipelines.
  • Academic researchers, systematic reviewers, and data teams feeding clean trial rows downstream.

⚙️ How to get clinical trial data

  1. Open the Actor on Apify.
  2. Enter a condition, searchTerm, intervention, sponsor, or location - or paste specific nctIds.
  3. Optionally filter by recruitment status and phases.
  4. Set how many trials to return (maxItems) and the per-trial site cap (maxLocationsPerTrial).
  5. Run the Actor.
  6. Open the Clinical trials dataset view.
  7. Export JSON, CSV, Excel, HTML, or XML, or call the Actor through the Apify API.

📥 Input

{
  "condition": "breast cancer",
  "status": ["RECRUITING"],
  "phases": ["2", "3"],
  "maxItems": 200,
  "maxLocationsPerTrial": 50
}
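
The same input can be sent programmatically. The sketch below assumes the official `apify-client` Python package and an `APIFY_TOKEN` environment variable. `ACTOR_ID` is a hypothetical placeholder (this page does not state the Actor's ID), and `build_input` / `run_actor` are helper names invented for this example.

```python
import os

# Hypothetical placeholder: replace with the Actor's real ID from its Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def build_input(condition, status=("RECRUITING",), phases=("2", "3"),
                max_items=200, max_locations_per_trial=50):
    """Build a run input using the field names from the example above."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return {
        "condition": condition,
        "status": list(status),
        "phases": list(phases),
        "maxItems": max_items,
        "maxLocationsPerTrial": max_locations_per_trial,
    }


def run_actor(run_input):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so the rest of this sketch runs without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_input("breast cancer")
print(run_input)
# trials = run_actor(run_input)  # needs APIFY_TOKEN and a real ACTOR_ID
```

The network call is left commented out so the snippet can be run as-is to inspect the input it would send.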

📤 Output

{
  "recordType": "trial",
  "nctId": "NCT05774678",
  "url": "https://clinicaltrials.gov/study/NCT05774678",
  "briefTitle": "Trial Of PreoperAtive Radiation (TOPAz)",
  "overallStatus": "RECRUITING",
  "phase": "PHASE3",
  "studyType": "INTERVENTIONAL",
  "conditions": ["Breast Cancer"],
  "leadSponsor": "M.D. Anderson Cancer Center",
  "enrollmentCount": 126,
  "startDate": "2023-08-01",
  "primaryCompletionDate": "2026-12-31",
  "sex": "FEMALE",
  "minimumAgeYears": 18.0,
  "inclusionCriteria": ["Age 18 years or older"],
  "exclusionCriteria": ["Patients undergoing treatment for recurrent breast cancer"],
  "numLocations": 1,
  "numCountries": 1,
  "countries": ["United States"],
  "locations": [
    {"facility": "M D Anderson Cancer Center", "city": "Houston", "state": "Texas", "country": "United States", "status": "RECRUITING", "lat": 29.76328, "lng": -95.36327}
  ]
}
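
For site-level analysis it is often handy to turn each trial's `locations` array into one row per site. The sketch below does that for records shaped like the example above and flags trials whose embedded site list is shorter than `numLocations`. The helper names (`site_rows`, `sites_truncated`, `to_csv`) are made up for this example, and the sample record is a trimmed copy of the output shown above.

```python
import csv
import io

SITE_COLUMNS = ["nctId", "facility", "city", "state", "country", "status", "lat", "lng"]


def site_rows(trial):
    """Yield one flat row per embedded site, tagged with the trial's nctId."""
    for site in trial.get("locations") or []:
        row = {"nctId": trial.get("nctId")}
        for column in SITE_COLUMNS[1:]:
            row[column] = site.get(column)  # missing fields stay None
        yield row


def sites_truncated(trial):
    """True when the trial has more registered sites than embedded rows."""
    return (trial.get("numLocations") or 0) > len(trial.get("locations") or [])


def to_csv(trials):
    """Render the site rows of several trials as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SITE_COLUMNS)
    writer.writeheader()
    for trial in trials:
        writer.writerows(site_rows(trial))
    return buffer.getvalue()


trial = {
    "nctId": "NCT05774678",
    "numLocations": 1,
    "locations": [
        {"facility": "M D Anderson Cancer Center", "city": "Houston", "state": "Texas",
         "country": "United States", "status": "RECRUITING", "lat": 29.76328, "lng": -95.36327}
    ],
}
print(to_csv([trial]))
print(sites_truncated(trial))
```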

💵 How much does it cost?

The launch price is about $4.00 / 1,000 trials, tier-discounted for higher Apify plans. One trial record is one charged result. A search returning 200 trials is 200 results.
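
As a rough worked example of that arithmetic, the sketch below estimates the charge for a run from the number of trial records returned. The function name `estimate_cost` is invented here, and the real charge depends on your Apify plan's tier, so treat the result as an approximation.

```python
from decimal import Decimal


def estimate_cost(trial_count, price_per_thousand="4.00"):
    """Approximate charge in USD: one trial record is one charged result."""
    cost = Decimal(trial_count) * Decimal(price_per_thousand) / Decimal(1000)
    return cost.quantize(Decimal("0.01"))


print(estimate_cost(200))          # 0.80 at the $4.00 launch price
print(estimate_cost(200, "3.00"))  # 0.60 at the listed "from" price
```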

🔁 Run it on the Apify platform

Schedule recurring refreshes to track new and updated trials, call it from the Apify API, export to CSV/JSON/Excel, or connect the dataset to Make, Zapier, webhooks, a warehouse, or an LLM pipeline.

⚠️ Limits and caveats

  • This Actor reads ClinicalTrials.gov's public v2 API. It is not affiliated with ClinicalTrials.gov or the NIH and provides no medical advice.
  • Eligibility parsing is best-effort. Inclusion/exclusion lists are split from a free-text blob; unusual sponsor formatting may merge or miss a bullet. The raw fields are always preserved.
  • Site lists can be huge. Some trials register 1,000+ sites; numLocations always reports the full count, while locations embeds up to maxLocationsPerTrial rows.
  • Fields a sponsor did not provide are returned as null or empty arrays rather than guessed.
  • Data reflects what sponsors have registered, which can lag actual trial activity.

Related Actors

  • SEC Financials API - normalized income, balance sheet, and cash flow from EDGAR.
  • Prediction Markets Scraper - odds, prices, and volume across Polymarket, Kalshi, and Manifold.
  • Website Tech Stack Detector - find the tools a company's site runs.
  • Greenhouse, Lever & Ashby Jobs Scraper - open roles across the top startup ATS.

❓ FAQ

How do I search for trials?

Fill any of condition, searchTerm, intervention, sponsor, or location, and optionally filter by status and phases. To pull known trials directly, paste their nctIds instead.

Where does the data come from?

ClinicalTrials.gov's official v2 REST API - the same registry the NIH publishes. It is free, public, and authoritative.

Why are some fields null or empty?

Because the sponsor did not register that field for that trial. Observational studies, expanded-access records, and older entries populate fewer fields than a large interventional trial.

How is age normalized?

The registry stores age as text like 18 Years or 6 Months. This Actor keeps the raw strings and adds minimumAgeYears / maximumAgeYears as numeric years (6 months becomes 0.5) so you can filter and compare.
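
A conversion along these lines can be sketched as follows. This is an illustration under assumptions, not the Actor's code: the function name `age_to_years`, the supported units, the divisors for weeks and days, and the rounding to two decimals are all choices made for this example.

```python
import re

# Approximate unit sizes in years; the week and day divisors are assumptions.
UNIT_IN_YEARS = {
    "year": 1.0,
    "month": 1 / 12,
    "week": 1 / 52,
    "day": 1 / 365,
}
# Matches strings such as "18 Years" or "6 Months" (optional plural "s").
AGE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+?)s?\s*$")


def age_to_years(raw):
    """Convert a registry age string to numeric years, or None if unparseable."""
    if not raw:
        return None
    match = AGE.match(raw)
    if not match:
        return None
    value, unit = float(match.group(1)), match.group(2).lower()
    factor = UNIT_IN_YEARS.get(unit)
    return None if factor is None else round(value * factor, 2)


print(age_to_years("18 Years"))  # 18.0
print(age_to_years("6 Months"))  # 0.5
print(age_to_years("N/A"))       # None
```

Returning `None` for anything unrecognized keeps the numeric column safe to filter on while the raw string remains available for inspection.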

Does it need a proxy?

No. ClinicalTrials.gov is a public API served from a CDN and works without a proxy. Enable Apify Proxy only on very large batches.

🛠️ Support

For bugs or missing fields, open an Actor issue with the run URL, the search or NCT ID, and the field or behavior you expected.

⭐ Rate this Actor

If this Actor saved you time, please take 30 seconds to leave a review on the Reviews tab of Clinical Trials API - Normalized ClinicalTrials.gov Data - reviews are the main trust signal other users see, and they directly decide which features get built next. If something is broken or a field is missing, please open an issue first - we typically respond within a day and would love the chance to fix it before you rate.