ClinicalTrials.gov Studies Extractor avatar

ClinicalTrials.gov Studies Extractor

Pricing

from $1.00 / 1,000 results

Go to Apify Store
ClinicalTrials.gov Studies Extractor

ClinicalTrials.gov Studies Extractor

Extract clinical-trial records from ClinicalTrials.gov — one study per row. 588k+ studies, filter by condition, term, or status. Public data, no login.

Pricing

from $1.00 / 1,000 results

Rating

0.0

(0)

Developer

xtractoo

xtractoo

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

4 days ago

Last modified

Share

Extract clinical-trial records from ClinicalTrials.gov (API v2) — one study per row — filtered by condition, intervention/term, or trial status. Ideal for pipeline tracking, catalyst calendars, and competitive intelligence.

Built for biotech/pharma investors and analysts, CROs, and life-science researchers.


Why use this actor

  • Whole corpus or a slice — 588k+ studies; pull everything via cursor pagination, or filter to your therapeutic area.
  • One study per row, with a flat header (nct_id, brief_title, overall_status, phase, lead_sponsor, start_date) plus the full raw study object (protocol, derived, results) preserved.
  • No login, no key. Clean JSON API.

Input

FieldTypeDescription
conditionQuerytextDisease/condition, e.g. breast cancer.
termQuerytextOther terms (drug, sponsor, keyword).
statusmulti-selectFilter by overall status (Recruiting, Completed, Terminated, …).
pageSizeint1–1000 (default 1000).
maxItemsint0 = all matching.

status is a dropdown of the valid ClinicalTrials statuses — no need to know the API enum strings.

Output — CLINICAL_TRIAL

Envelope (_input, _source, _scrapedAt) + recordType: "CLINICAL_TRIAL" + flat header fields, then the raw study object verbatim:

{
"_input": "cond=breast cancer; status=RECRUITING",
"_source": "S1-ctgov-v2",
"_scrapedAt": "2026-06-03T10:00:00Z",
"recordType": "CLINICAL_TRIAL",
"nct_id": "NCT01234567",
"brief_title": "...",
"overall_status": "RECRUITING",
"phase": "PHASE2",
"lead_sponsor": "...",
"start_date": "2026-01-01",
"protocolSection": { "...": "..." },
"derivedSection": { "...": "..." },
"hasResults": false
}

How it works

  1. Your condition, term, and status filters are applied to the search.
  2. The actor automatically pages through all matching results (up to 1000 per request).
  3. Each study streams into the dataset.

Known limits

  • Public data — no account needed, runs from any connection. Backs off on HTTP 429.
  • Verified live 2026-06-03: totalCount 588,273; cursor pagination via nextPageToken confirmed.