Clinical Trial Investigator and Site Intelligence
Pricing
Pay per usage
Find enriched clinical trial investigators and deterministic site-fit scores from ClinicalTrials.gov, NPI, OpenPayments, and PubMed data.
Developer
George Kioko
CROs pay six figures for investigator + site fit feeds. The raw data is public. The work is the join.
This actor turns ClinicalTrials.gov study records into enriched investigator profiles and scored trial site rosters. It joins CT.gov study, location, sponsor, phase, and condition data with NPI registry matches, OpenPayments payment summaries, and PubMed publication counts. The output is built for CRO feasibility teams, sponsor diligence, patient recruitment planning, and business development teams that need a clean feed instead of raw trial JSON.
Quick start
Find enriched investigators for a condition:
$ curl "https://<standby-url>/investigators?condition=glioblastoma&phase=phase2&limit=3"
Score United States sites for a condition:
$ curl "https://<standby-url>/sites?condition=breast+cancer&country=United+States&limit=3"
Batch mode also works from an Apify run input:
{"mode": "investigators", "condition": "glioblastoma", "phase": "phase2", "limit": 25}
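The same query strings can be assembled from Python rather than typed by hand. This is a minimal sketch: `build_url` is a hypothetical helper, not part of the actor, and `BASE_URL` is a placeholder standing in for `<standby-url>`, which this page does not fill in.

```python
from urllib.parse import urlencode


def build_url(base_url: str, endpoint: str, **params) -> str:
    """Build a Standby request URL, skipping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}"
    return f"{url}?{query}" if query else url


# Placeholder host (assumption): replace with the actor's real Standby URL.
BASE_URL = "https://standby.example.invalid"

investigators_url = build_url(
    BASE_URL, "investigators", condition="glioblastoma", phase="phase2", limit=3
)
sites_url = build_url(
    BASE_URL, "sites", condition="breast cancer", country="United States", limit=3
)
print(investigators_url)
print(sites_url)
```

`urlencode` encodes spaces as `+`, which matches the `breast+cancer` and `United+States` values in the curl examples.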
Standby endpoints
| Endpoint | What it returns |
|---|---|
| GET / and GET /health | Service info and endpoint list |
| GET /investigators?condition=&phase=&status=&limit= | Enriched investigator profiles across matching studies |
| GET /investigator?npi= | One NPI-based investigator profile |
| GET /investigator?name= | One name-based investigator profile with trial history |
| GET /sites?condition=&country=&state=&limit= | Scored facility roster |
| GET /study?nct= | One expanded study with investigators and scored sites |
| POST /investigators/bulk | Up to 100 NPI-based profiles |
Health probes that use values such as test, ping, example.com, or URLs on known test hostnames return a mocked, clinical-trial-shaped response and are not charged.
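Because POST /investigators/bulk accepts at most 100 NPIs, a longer list has to be split before sending. The sketch below covers only the batching step; `chunk_npis` is a hypothetical helper, and the request body format for the bulk endpoint is not documented here, so it is deliberately left out.

```python
from typing import Iterator, List

MAX_BULK_NPIS = 100  # limit stated in the endpoint table


def chunk_npis(npis: List[str], size: int = MAX_BULK_NPIS) -> Iterator[List[str]]:
    """Yield de-duplicated NPIs, in first-seen order, in batches of at most `size`."""
    unique = list(dict.fromkeys(npis))  # drop repeats, keep first-seen order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


# 250 made-up placeholder identifiers, not real NPIs.
batches = list(chunk_npis([f"{n:010d}" for n in range(250)]))
print([len(b) for b in batches])  # [100, 100, 50]
```

Filling each batch keeps the number of requests down, which matters if each request carries its own start charge.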
Investigator schema
| Field | Meaning |
|---|---|
| investigator_id | NPI when found, otherwise hash of name and affiliation |
| name, first_name, last_name, credentials | Public investigator identity from CT.gov and NPI |
| npi, primary_taxonomy | NPI registry match and primary specialty |
| affiliations | Trial facilities and sponsors seen in CT.gov |
| city, state, country | Best available NPI or CT.gov location |
| active_trial_count, completed_trial_count, total_trial_count | Trial experience counters |
| phase_breakdown | Counts for phase 1 through phase 4 |
| therapeutic_areas | Top condition terms from matched trials |
| open_payments_total_usd | Latest OpenPayments general payment total when available |
| open_payments_top_companies | Top manufacturers or GPOs by payment amount |
| publications_pubmed_count | PubMed author search count |
| first_trial_date, last_trial_date | Earliest and latest observed trial dates |
| trial_history | NCT level history used to build the profile |
| fetched_at | ISO timestamp |
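The investigator_id rule (NPI when found, otherwise a hash of name and affiliation) could look roughly like the sketch below. The hash function, the normalization, and the 16-character truncation are all assumptions made for illustration; the actor's real ID format may differ.

```python
import hashlib
from typing import Optional


def investigator_id(npi: Optional[str], name: str, affiliation: str) -> str:
    """Return the NPI when present, else a stable hash of name and affiliation."""
    if npi:
        return npi
    # Assumption: lower-case and collapse whitespace so trivial variants match.
    key = "|".join(" ".join(part.lower().split()) for part in (name, affiliation))
    # Assumption: SHA-256 truncated to 16 hex characters.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


# Made-up example values.
print(investigator_id("1234567890", "Jane Doe", "Example Hospital"))
print(investigator_id(None, "Jane Doe", "Example Hospital"))
```

A deterministic fallback like this lets the same non-US investigator, who usually has no NPI, resolve to the same ID across runs.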
Site schema
| Field | Meaning |
|---|---|
| facility_id | Hash of facility, city, state, and country |
| facility_name, city, state, country | CT.gov site location |
| trial_count_3y | Recent trial proxy from CT.gov dates |
| active_trial_count | Active, recruiting, or enrolling studies |
| condition_match_count | Studies matching the requested condition |
| phase_3_4_share | Share of trials in phase 3 or phase 4 |
| investigators_count_unique | Unique public investigator names linked to the site |
| principal_investigators | Top three names by trial count |
| site_fit_score | Deterministic score from 0 to 100 |
| score_band | low, medium, high, or elite |
| score_rationale | Short explanation of the score |
| fetched_at | ISO timestamp |
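Three of the site fields are simple aggregates over a site's trials, and the sketch below shows one way they might be derived. The input shape (one dict per trial with `phase` and `investigators` keys), the phase labels, and the names are made up for illustration; the actor's internal records may look different.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical input: one dict per trial observed at a single site.
trials = [
    {"phase": "phase3", "investigators": ["A. Rivera", "B. Chen"]},
    {"phase": "phase2", "investigators": ["A. Rivera"]},
    {"phase": "phase4", "investigators": ["A. Rivera", "C. Okafor", "D. Silva"]},
    {"phase": "phase1", "investigators": ["B. Chen", "C. Okafor"]},
]


def summarize_site(trials: List[Dict]) -> Dict:
    """Derive share, unique-investigator count, and top names for one site."""
    late = sum(1 for t in trials if t["phase"] in {"phase3", "phase4"})
    names = Counter(n for t in trials for n in t["investigators"])
    return {
        "phase_3_4_share": late / len(trials) if trials else 0.0,
        "investigators_count_unique": len(names),
        # Top three names by number of trials they appear in.
        "principal_investigators": [n for n, _ in names.most_common(3)],
    }


summary = summarize_site(trials)
print(summary)
```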
Data flow
flowchart LR
A[Input condition, phase, status, NPI, or NCT] --> B[ClinicalTrials.gov search]
B --> C[Extract investigators and facilities]
C --> D[NPI registry match]
C --> E[OpenPayments summary]
C --> F[PubMed paper count]
D --> G[Normalize investigator profiles]
E --> G
F --> G
C --> H[Score trial sites]
G --> I[Dataset and API response]
H --> I
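In the flow, the NPI, OpenPayments, and PubMed lookups are independent branches that merge into one profile. A minimal sketch of that join follows, assuming a failed lookup should leave a null field rather than abort the profile. `safe`, `enrich`, and the stand-in lookups are hypothetical names for illustration, not the actor's code.

```python
from typing import Callable, Dict, Optional


def safe(fn: Callable[[str], object], key: str) -> Optional[object]:
    """Run one enrichment lookup; return None instead of raising on failure."""
    try:
        return fn(key)
    except Exception:
        return None


def enrich(name: str, lookups: Dict[str, Callable[[str], object]]) -> Dict[str, object]:
    """Join independent lookups onto one investigator record."""
    profile: Dict[str, object] = {"name": name}
    for field, fn in lookups.items():
        profile[field] = safe(fn, name)
    return profile


def failing_lookup(_: str) -> object:
    """Simulate an upstream source that is unavailable."""
    raise TimeoutError("upstream unavailable")


# Stand-in lookups with made-up values; real ones would query the public sources.
profile = enrich(
    "Jane Doe",
    {
        "npi": lambda n: "0000000000",
        "open_payments_total_usd": failing_lookup,
        "publications_pubmed_count": lambda n: 12,
    },
)
print(profile)
```

Here the simulated OpenPayments failure yields `None` for that field while the other two fields are still filled.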
Scoring
Site scoring is deterministic. Every site starts at 30 points. It receives 20 points for at least three active trials, 15 points for at least two condition matched trials, 15 points when phase 3 or phase 4 share is at least 0.4, 10 points for at least three unique investigators, and 10 points for at least 10 recent trials. Scores are capped at 100. Bands are low from 0 to 30, medium from 31 to 55, high from 56 to 80, and elite at 81 or above.
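The rules above translate directly into a few comparisons. The sketch below restates them in Python using the site schema field names as parameters; it is a reading of the published rules, not the actor's source code.

```python
def site_fit_score(
    active_trial_count: int,
    condition_match_count: int,
    phase_3_4_share: float,
    investigators_count_unique: int,
    trial_count_3y: int,
) -> int:
    """Apply the published point rules and cap the result at 100."""
    score = 30  # every site starts here
    if active_trial_count >= 3:
        score += 20
    if condition_match_count >= 2:
        score += 15
    if phase_3_4_share >= 0.4:
        score += 15
    if investigators_count_unique >= 3:
        score += 10
    if trial_count_3y >= 10:
        score += 10
    return min(score, 100)


def score_band(score: int) -> str:
    """Map a score to its band: low, medium, high, or elite."""
    if score <= 30:
        return "low"
    if score <= 55:
        return "medium"
    if score <= 80:
        return "high"
    return "elite"


# Made-up site: active trials, late-phase share, and recent volume qualify.
example = site_fit_score(4, 1, 0.5, 2, 12)
print(example, score_band(example))  # 75 high
```

Since the five bonuses sum to 70, a site that meets every criterion reaches exactly 100, and a site that meets none stays at 30 in the low band.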
Pricing
| Event | Price | Charged when |
|---|---|---|
| Actor start | $1.00 | Once per paid Standby request or batch run |
| Investigator profile | $0.10 | Per enriched investigator profile returned |
| Site fit score | $0.50 | Per scored site row returned |
Charges fire only after data work succeeds and rows are pushed to the dataset.
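A rough cost estimate follows from the three event prices. This sketch assumes one actor-start event per request or run and that every returned row is billed at the listed rate. `estimate_cost` is a hypothetical helper, and actual billing is determined by the platform.

```python
from decimal import Decimal

ACTOR_START = Decimal("1.00")       # per paid Standby request or batch run
PER_INVESTIGATOR = Decimal("0.10")  # per enriched investigator profile
PER_SITE = Decimal("0.50")          # per scored site row


def estimate_cost(investigator_rows: int = 0, site_rows: int = 0, requests: int = 1) -> Decimal:
    """Estimate the charge in USD for a given number of rows and requests."""
    return (
        requests * ACTOR_START
        + investigator_rows * PER_INVESTIGATOR
        + site_rows * PER_SITE
    )


print(estimate_cost(investigator_rows=25))  # 3.50
print(estimate_cost(site_rows=3))           # 2.50
```

`Decimal` avoids the floating-point rounding noise that would appear when summing values like 0.10 as floats.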
Comparison
| Option | Best for | Tradeoff |
|---|---|---|
| ClinicalTrials.gov direct | Raw study and location data | No NPI join, no OpenPayments summary, no scoring |
| Veeva or Medidata | Enterprise feasibility programs | SaaS contracts, sales process, and less flexible API use |
| This API | Fast investigator and site feeds | Public data only, deterministic scoring, no private contact scraping |
Use cases
- CRO RFP response: build a quick evidence base for proposed investigators and sites.
- Patient recruitment site targeting: rank facilities before outreach spend.
- BD outreach to investigators: identify public trial experience and publication depth.
- Sponsor diligence: check whether a target investigator has relevant trial history.
- KOL mapping: combine trial count, therapeutic areas, and PubMed footprint.
FAQ
How complete is NPI coverage? NPI coverage is strongest for United States physicians. Non-US investigators usually return npi: null.
Why can OpenPayments be null? CMS payment data has publication lag and applies to covered US recipients. No match is returned as null.
Are there rate limits? CT.gov is public fair use. NPI and PubMed are low rate public APIs, so the actor uses bounded requests and accepts partial enrichment.
Can I get a refund for mock probes? Health check payloads return mocked data and are not charged, so there is nothing to refund.
Does this include private emails? No. It uses public CT.gov, NPI, OpenPayments, and PubMed data only.
Who do I contact for custom fields? Use the Apify actor issue tab or contact the actor owner through the Apify Store profile.