ClinicalTrials.gov Studies Extractor

Under maintenance

Pricing: from $1.00 / 1,000 results

Extract clinical-trial records from ClinicalTrials.gov — one study per row. 588k+ studies, filter by condition, term, or status. Public data, no login.

Developer: xtractoo (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user. Last modified 3 days ago.

Extract clinical-trial records from ClinicalTrials.gov (API v2) — one study per row — filtered by condition, intervention/term, or trial status. Ideal for pipeline tracking, catalyst calendars, and competitive intelligence.

Built for biotech/pharma investors and analysts, CROs, and life-science researchers.


Why use this actor

  • Whole corpus or a slice — 588k+ studies; pull everything via cursor pagination, or filter to your therapeutic area.
  • One study per row, with a flat header (nct_id, brief_title, overall_status, phase, lead_sponsor, start_date) plus the full raw study object (protocol, derived, results) preserved.
  • No login, no key. Clean JSON API.

Input

Field           Type          Description
conditionQuery  text          Disease/condition, e.g. breast cancer.
termQuery       text          Other terms (drug, sponsor, keyword).
status          multi-select  Filter by overall status (Recruiting, Completed, Terminated, …).
pageSize        int           Studies per API page, 1–1000 (default 1000).
maxItems        int           Maximum studies to return; 0 = all matching.

status is a dropdown of the valid ClinicalTrials statuses — no need to know the API enum strings.
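
As an illustration of the input fields above, here is a minimal Python sketch that assembles an input object and checks the documented ranges. The helper name make_input is hypothetical, and the exact values the status dropdown stores are not stated on this page, so the labels used here ("Recruiting") are an assumption.

```python
import json


def make_input(condition_query="", term_query="", status=(), page_size=1000, max_items=0):
    """Build an actor input dict (hypothetical helper, not part of the actor)."""
    if not 1 <= page_size <= 1000:
        raise ValueError("pageSize must be between 1 and 1000")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (all matching) or a positive number")
    return {
        "conditionQuery": condition_query,
        "termQuery": term_query,
        # Assumption: the dropdown labels are passed through as-is.
        "status": list(status),
        "pageSize": page_size,
        "maxItems": max_items,
    }


if __name__ == "__main__":
    actor_input = make_input("breast cancer", status=["Recruiting"], max_items=500)
    print(json.dumps(actor_input, indent=2))
```

Leaving maxItems at 0 requests every matching study, so setting a cap is a sensible default while testing a new filter.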

Output — CLINICAL_TRIAL

Each record carries an envelope (_input, _source, _scrapedAt), recordType: "CLINICAL_TRIAL", and the flat header fields, followed by the raw study object verbatim:

{
  "_input": "cond=breast cancer; status=RECRUITING",
  "_source": "S1-ctgov-v2",
  "_scrapedAt": "2026-06-03T10:00:00Z",
  "recordType": "CLINICAL_TRIAL",
  "nct_id": "NCT01234567",
  "brief_title": "...",
  "overall_status": "RECRUITING",
  "phase": "PHASE2",
  "lead_sponsor": "...",
  "start_date": "2026-01-01",
  "protocolSection": { "...": "..." },
  "derivedSection": { "...": "..." },
  "hasResults": false
}
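
To show how the flat header relates to the raw study object, the sketch below derives the six header fields from protocolSection. The module paths follow the ClinicalTrials.gov v2 study structure as I understand it; the function name flatten_study and the "/" separator for multi-phase trials are assumptions, not the actor's confirmed behaviour.

```python
def flatten_study(study):
    """Derive the flat header fields from a raw v2 study object (illustrative)."""
    protocol = study.get("protocolSection", {})
    ident = protocol.get("identificationModule", {})
    status = protocol.get("statusModule", {})
    design = protocol.get("designModule", {})
    sponsor = protocol.get("sponsorCollaboratorsModule", {})
    return {
        "nct_id": ident.get("nctId"),
        "brief_title": ident.get("briefTitle"),
        "overall_status": status.get("overallStatus"),
        # Assumption: multiple phases are joined with "/"; missing phases give None.
        "phase": "/".join(design.get("phases", [])) or None,
        "lead_sponsor": sponsor.get("leadSponsor", {}).get("name"),
        "start_date": status.get("startDateStruct", {}).get("date"),
    }
```

Every lookup uses .get with a default, so studies that lack a module (for example, observational studies without phases) yield None for that field instead of raising.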

How it works

  1. Your filters become v2 query params (query.cond, query.term, filter.overallStatus).
  2. The actor pages through results with the nextPageToken cursor (pageSize up to 1000).
  3. Each study streams into the dataset.
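
The three steps above can be sketched roughly as follows. This is not the actor's source code: the endpoint URL, the pageToken request parameter, and the comma-separated encoding of multiple statuses reflect my understanding of the public v2 API, and the function names are hypothetical. The page fetcher is injectable so the cursor loop can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://clinicaltrials.gov/api/v2/studies"  # assumed public v2 endpoint


def build_params(condition=None, term=None, statuses=None, page_size=1000):
    """Translate actor-style filters into v2 query parameters."""
    if not 1 <= page_size <= 1000:
        raise ValueError("pageSize must be between 1 and 1000")
    params = {"pageSize": str(page_size)}
    if condition:
        params["query.cond"] = condition
    if term:
        params["query.term"] = term
    if statuses:
        # Assumption: several statuses are sent as one comma-separated value.
        params["filter.overallStatus"] = ",".join(statuses)
    return params


def fetch_page(params):
    """Fetch one page of results over HTTP and decode the JSON body."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_studies(params, fetch=fetch_page, max_items=0):
    """Yield studies page by page, following nextPageToken until it is absent."""
    emitted = 0
    page_params = dict(params)
    while True:
        page = fetch(page_params)
        for study in page.get("studies", []):
            yield study
            emitted += 1
            if max_items and emitted >= max_items:
                return
        token = page.get("nextPageToken")
        if not token:
            return
        # Carry the original filters forward and add the cursor.
        page_params = dict(params, pageToken=token)
```

Because iter_studies is a generator, each study can be written to the dataset as soon as it arrives rather than after the whole result set has been collected.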

Known limits

  • Public API, no WAF; datacenter IP is fine. Backs off on HTTP 429.
  • Verified live 2026-06-03: totalCount 588,273; cursor pagination via nextPageToken confirmed.
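
The page does not say how the actor backs off on HTTP 429, so the following is only one plausible shape: exponential backoff around a caller-supplied request function. The names RateLimited and call_with_backoff, the retry count, and the delays are all assumptions for illustration.

```python
import time


class RateLimited(Exception):
    """Raised by the caller-supplied request function when it sees HTTP 429."""


def call_with_backoff(request, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call request(), retrying on RateLimited with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Delays grow as base_delay, 2x, 4x, ... (assumed schedule).
            sleep(base_delay * 2 ** attempt)
```

The sleep function is a parameter so that the retry schedule can be checked in tests without actually waiting.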