# Clinical Trial Investigator and Site Intelligence (`george.the.developer/clinical-trial-investigator-intel`) Actor

Find enriched clinical trial investigators and deterministic site-fit scores from ClinicalTrials.gov, NPI, OpenPayments, and PubMed data.

- **URL**: https://apify.com/george.the.developer/clinical-trial-investigator-intel.md
- **Developed by:** [George Kioko](https://apify.com/george.the.developer) (community)
- **Categories:** Business, Lead generation
- **Stats:** 1 total user, 1 monthly user, 0.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Clinical Trial Investigator and Site Intelligence

CROs pay six figures for investigator + site fit feeds. The raw data is public. The work is the join.

This Actor turns ClinicalTrials.gov study records into enriched investigator profiles and scored trial site rosters. It joins CT.gov study, location, sponsor, phase, and condition data with NPI registry matches, OpenPayments payment summaries, and PubMed publication counts. The output is built for CRO feasibility teams, sponsor diligence, patient recruitment planning, and business development teams that need a clean feed instead of raw trial JSON.

### Quick start

Find enriched investigators for a condition:

```bash
curl "https://<standby-url>/investigators?condition=glioblastoma&phase=phase2&limit=3"
````

Score United States sites for a condition:

```bash
curl "https://<standby-url>/sites?condition=breast+cancer&country=United+States&limit=3"
```

Batch mode also works from an Apify run input:

```json
{
  "mode": "investigators",
  "condition": "glioblastoma",
  "phase": "phase2",
  "limit": 25
}
```

### Standby endpoints

| Endpoint | What it returns |
| --- | --- |
| `GET /` and `GET /health` | Service info and endpoint list |
| `GET /investigators?condition=&phase=&status=&limit=` | Enriched investigator profiles across matching studies |
| `GET /investigator?npi=` | One NPI-based investigator profile |
| `GET /investigator?name=` | One name-based investigator profile with trial history |
| `GET /sites?condition=&country=&state=&limit=` | Scored facility roster |
| `GET /study?nct=` | One expanded study with investigators and scored sites |
| `POST /investigators/bulk` | Up to 100 NPI-based profiles |

Health probes that use values such as `test`, `ping`, `example.com`, or URLs on known test hostnames return a mocked response shaped like real clinical trial data and are not charged.
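The endpoint table above can be turned into a small URL builder. This is a minimal sketch, not part of the Actor: `build_standby_url` and `ENDPOINT_PARAMS` are hypothetical names, the base URL is whatever Standby URL your deployment exposes, and authentication is left out on purpose.

```python
from urllib.parse import urlencode

# Endpoint paths and query parameters copied from the table above.
ENDPOINT_PARAMS = {
    "/investigators": {"condition", "phase", "status", "limit"},
    "/investigator": {"npi", "name"},
    "/sites": {"condition", "country", "state", "limit"},
    "/study": {"nct"},
}


def build_standby_url(base_url: str, path: str, **params) -> str:
    """Return a full Standby URL, dropping parameters that are None."""
    allowed = ENDPOINT_PARAMS.get(path)
    if allowed is None:
        raise ValueError(f"unknown endpoint: {path}")
    unknown = set(params) - allowed
    if unknown:
        raise ValueError(f"unsupported parameters for {path}: {sorted(unknown)}")
    # urlencode escapes spaces as '+', matching the curl examples.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = base_url.rstrip("/") + path
    return f"{url}?{query}" if query else url
```

Rejecting unknown parameters early keeps typos such as `conditon=` from silently producing an unfiltered request.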

### Investigator schema

| Field | Meaning |
| --- | --- |
| `investigator_id` | NPI when found, otherwise hash of name and affiliation |
| `name`, `first_name`, `last_name`, `credentials` | Public investigator identity from CT.gov and NPI |
| `npi`, `primary_taxonomy` | NPI registry match and primary specialty |
| `affiliations` | Trial facilities and sponsors seen in CT.gov |
| `city`, `state`, `country` | Best available NPI or CT.gov location |
| `active_trial_count`, `completed_trial_count`, `total_trial_count` | Trial experience counters |
| `phase_breakdown` | Counts for phase 1 through phase 4 |
| `therapeutic_areas` | Top condition terms from matched trials |
| `open_payments_total_usd` | Latest OpenPayments general payment total when available |
| `open_payments_top_companies` | Top manufacturers or GPOs by payment amount |
| `publications_pubmed_count` | PubMed author search count |
| `first_trial_date`, `last_trial_date` | Earliest and latest observed trial dates |
| `trial_history` | NCT level history used to build the profile |
| `fetched_at` | ISO timestamp |

### Site schema

| Field | Meaning |
| --- | --- |
| `facility_id` | Hash of facility, city, state, and country |
| `facility_name`, `city`, `state`, `country` | CT.gov site location |
| `trial_count_3y` | Recent trial proxy from CT.gov dates |
| `active_trial_count` | Active, recruiting, or enrolling studies |
| `condition_match_count` | Studies matching the requested condition |
| `phase_3_4_share` | Share of trials in phase 3 or phase 4 |
| `investigators_count_unique` | Unique public investigator names linked to the site |
| `principal_investigators` | Top three names by trial count |
| `site_fit_score` | Deterministic score from 0 to 100 |
| `score_band` | `low`, `medium`, `high`, or `elite` |
| `score_rationale` | Short explanation of the score |
| `fetched_at` | ISO timestamp |

### Data flow

```mermaid
flowchart LR
  A[Input condition, phase, status, NPI, or NCT] --> B[ClinicalTrials.gov search]
  B --> C[Extract investigators and facilities]
  C --> D[NPI registry match]
  C --> E[OpenPayments summary]
  C --> F[PubMed paper count]
  D --> G[Normalize investigator profiles]
  E --> G
  F --> G
  C --> H[Score trial sites]
  G --> I[Dataset and API response]
  H --> I
```
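The fan-out in the diagram, where the three enrichment lookups run independently and feed one normalized profile, could look roughly like the sketch below. It is an illustration under assumptions: the lookup callables are hypothetical stand-ins for the NPI, OpenPayments, and PubMed calls, and the Actor's real implementation is not shown in this README. Only the output field names come from the investigator schema.

```python
from typing import Callable, Optional

# A lookup takes the raw investigator record and returns a value or None.
Lookup = Callable[[dict], Optional[object]]


def safe_lookup(lookup: Lookup, investigator: dict) -> Optional[object]:
    """Run one enrichment lookup and treat any failure as 'no data'."""
    try:
        return lookup(investigator)
    except Exception:
        return None


def enrich(
    investigator: dict,
    npi_lookup: Lookup,
    payments_lookup: Lookup,
    pubmed_lookup: Lookup,
) -> dict:
    """Merge the three enrichment results into a copy of the record."""
    profile = dict(investigator)
    profile["npi"] = safe_lookup(npi_lookup, investigator)
    profile["open_payments_total_usd"] = safe_lookup(payments_lookup, investigator)
    profile["publications_pubmed_count"] = safe_lookup(pubmed_lookup, investigator)
    return profile
```

Because each lookup fails independently, one unavailable source leaves a `None` field instead of dropping the whole profile.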

### Scoring

Site scoring is deterministic. Every site starts at 30 points. It receives 20 points for at least three active trials, 15 points for at least two condition matched trials, 15 points when phase 3 or phase 4 share is at least 0.4, 10 points for at least three unique investigators, and 10 points for at least 10 recent trials. Scores are capped at 100. Bands are low from 0 to 30, medium from 31 to 55, high from 56 to 80, and elite at 81 or above.
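The rules above translate directly into a few comparisons. The following is a minimal sketch rather than the Actor's source: the function names are hypothetical, and mapping "recent trials" to `trial_count_3y` and "unique investigators" to `investigators_count_unique` is an assumption based on the site schema.

```python
def site_fit_score(
    active_trial_count: int,
    condition_match_count: int,
    phase_3_4_share: float,
    investigators_count_unique: int,
    trial_count_3y: int,
) -> int:
    """Apply the documented point rules and cap the result at 100."""
    score = 30  # every site starts at 30 points
    if active_trial_count >= 3:
        score += 20
    if condition_match_count >= 2:
        score += 15
    if phase_3_4_share >= 0.4:
        score += 15
    if investigators_count_unique >= 3:
        score += 10
    if trial_count_3y >= 10:
        score += 10
    return min(score, 100)


def score_band(score: int) -> str:
    """Map a score to the documented band names."""
    if score <= 30:
        return "low"
    if score <= 55:
        return "medium"
    if score <= 80:
        return "high"
    return "elite"
```

Under these rules the lowest possible score is 30 and the bonuses sum to exactly 70, so a site that meets every threshold lands on 100 and the cap never cuts anything off.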

### Pricing

| Event | Price | Charged when |
| --- | ---: | --- |
| Actor start | $1.00 | Once per paid Standby request or batch run |
| Investigator profile | $0.10 | Per enriched investigator profile returned |
| Site fit score | $0.50 | Per scored site row returned |

Charges fire only after data work succeeds and rows are pushed to the dataset.
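For budgeting, the event prices in the table can be combined into a rough estimate. This sketch uses a hypothetical helper name, works in integer cents to avoid rounding drift, and covers only the three listed events; it should be read as an upper bound, since rows are charged only when they are actually returned.

```python
# Event prices from the table above, in cents.
ACTOR_START_CENTS = 100
INVESTIGATOR_PROFILE_CENTS = 10
SITE_FIT_SCORE_CENTS = 50


def estimate_cost_usd(
    investigator_profiles: int = 0,
    site_rows: int = 0,
    requests: int = 1,
) -> float:
    """Estimate event charges in USD for a number of requests or runs."""
    cents = (
        requests * ACTOR_START_CENTS
        + investigator_profiles * INVESTIGATOR_PROFILE_CENTS
        + site_rows * SITE_FIT_SCORE_CENTS
    )
    return cents / 100
```

For example, one batch run that returns 25 investigator profiles comes to $1.00 + 25 × $0.10 = $3.50, and one request that returns 3 scored sites comes to $1.00 + 3 × $0.50 = $2.50.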

### Comparison

| Option | Best for | Tradeoff |
| --- | --- | --- |
| ClinicalTrials.gov direct | Raw study and location data | No NPI join, no OpenPayments summary, no scoring |
| Veeva or Medidata | Enterprise feasibility programs | SaaS contracts, sales process, and less flexible API use |
| This API | Fast investigator and site feeds | Public data only, deterministic scoring, no private contact scraping |

### Use cases

1. CRO RFP response: build a quick evidence base for proposed investigators and sites.
2. Patient recruitment site targeting: rank facilities before outreach spend.
3. BD outreach to investigators: identify public trial experience and publication depth.
4. Sponsor diligence: check whether a target investigator has relevant trial history.
5. KOL mapping: combine trial count, therapeutic areas, and PubMed footprint.
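As an illustration of the KOL mapping use case, dataset items can be ranked on the consumer side with the fields from the investigator schema. This is a sketch under assumptions: `rank_investigators` is a hypothetical helper, the two counters are assumed to be numbers or null, and the ordering (trial count first, PubMed count as a tie-breaker) is one possible choice rather than anything the Actor prescribes.

```python
def rank_investigators(items: list, top_n: int = 10) -> list:
    """Sort profiles by trial count, then PubMed count, highest first."""

    def sort_key(item: dict) -> tuple:
        # Missing or null counters are treated as zero.
        return (
            item.get("total_trial_count") or 0,
            item.get("publications_pubmed_count") or 0,
        )

    return sorted(items, key=sort_key, reverse=True)[:top_n]
```

Treating null counters as zero keeps non-US investigators, who often lack enrichment data, in the list instead of raising on comparison.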

### FAQ

**How complete is NPI coverage?** NPI coverage is strongest for United States physicians. Non-US investigators usually return `npi: null`.

**Why can OpenPayments be null?** CMS payment data has publication lag and applies to covered US recipients. No match is returned as null.

**Are there rate limits?** CT.gov is public fair use. NPI and PubMed are low-rate public APIs, so the Actor uses bounded requests and accepts partial enrichment.

**Can I get a refund for mock probes?** Health check payloads return mocked data and are not charged.

**Does this include private emails?** No. It uses public CT.gov, NPI, OpenPayments, and PubMed data only.

**Who do I contact for custom fields?** Use the Apify Actor issue tab or contact the Actor owner through their Apify Store profile.

# Actor input Schema

## `mode` (type: `string`):

Batch mode to run outside Standby.

## `condition` (type: `string`):

Clinical condition to search on ClinicalTrials.gov.

## `phase` (type: `string`):

Optional phase filter, such as phase2 or phase3.

## `status` (type: `string`):

Optional status filter.

## `country` (type: `string`):

Optional site country filter.

## `state` (type: `string`):

Optional site state filter.

## `limit` (type: `integer`):

Maximum rows to return.

## `nct` (type: `string`):

NCT identifier for study mode.

## `npi` (type: `string`):

NPI number for investigator mode.

## `name` (type: `string`):

Investigator name for investigator mode.

## `npis` (type: `array`):

Bulk NPI list, maximum 100.

## Actor input object example

```json
{
  "mode": "investigators",
  "condition": "glioblastoma",
  "limit": 25
}
```
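Before starting a paid run, it can help to check an input object locally. The sketch below is a hypothetical client-side helper, not something the Actor ships; the allowed `mode` and `status` values, the `limit` range of 1 to 100, and the 100-item cap on `npis` are taken from this document's input schema and OpenAPI specification.

```python
MODES = {"investigators", "investigator", "sites", "study", "bulk"}
STATUSES = {"active", "completed"}


def validate_input(run_input: dict) -> list:
    """Return a list of human-readable problems; empty means valid."""
    errors = []

    mode = run_input.get("mode", "investigators")  # schema default
    if mode not in MODES:
        errors.append(f"mode must be one of {sorted(MODES)}")

    status = run_input.get("status")
    if status is not None and status not in STATUSES:
        errors.append(f"status must be one of {sorted(STATUSES)}")

    limit = run_input.get("limit", 25)  # schema default
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 100:
        errors.append("limit must be an integer from 1 to 100")

    npis = run_input.get("npis", [])
    if not isinstance(npis, list) or not all(isinstance(n, str) for n in npis):
        errors.append("npis must be a list of strings")
    elif len(npis) > 100:
        errors.append("npis accepts at most 100 entries")

    return errors
```

Calling `validate_input` on the example object above returns an empty list.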

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("george.the.developer/clinical-trial-investigator-intel").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("george.the.developer/clinical-trial-investigator-intel").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{}' |
apify call george.the.developer/clinical-trial-investigator-intel --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=george.the.developer/clinical-trial-investigator-intel",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Clinical Trial Investigator and Site Intelligence",
        "description": "Find enriched clinical trial investigators and deterministic site-fit scores from ClinicalTrials.gov, NPI, OpenPayments, and PubMed data.",
        "version": "1.0",
        "x-build-id": "utOz0fqPcMPRvza4w"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/george.the.developer~clinical-trial-investigator-intel/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-george.the.developer-clinical-trial-investigator-intel",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/george.the.developer~clinical-trial-investigator-intel/runs": {
            "post": {
                "operationId": "runs-sync-george.the.developer-clinical-trial-investigator-intel",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/george.the.developer~clinical-trial-investigator-intel/run-sync": {
            "post": {
                "operationId": "run-sync-george.the.developer-clinical-trial-investigator-intel",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "investigators",
                            "investigator",
                            "sites",
                            "study",
                            "bulk"
                        ],
                        "type": "string",
                        "description": "Batch mode to run outside Standby.",
                        "default": "investigators"
                    },
                    "condition": {
                        "title": "Condition",
                        "type": "string",
                        "description": "Clinical condition to search on ClinicalTrials.gov.",
                        "default": "glioblastoma"
                    },
                    "phase": {
                        "title": "Phase",
                        "type": "string",
                        "description": "Optional phase filter, such as phase2 or phase3."
                    },
                    "status": {
                        "title": "Status",
                        "enum": [
                            "active",
                            "completed"
                        ],
                        "type": "string",
                        "description": "Optional status filter."
                    },
                    "country": {
                        "title": "Country",
                        "type": "string",
                        "description": "Optional site country filter."
                    },
                    "state": {
                        "title": "State",
                        "type": "string",
                        "description": "Optional site state filter."
                    },
                    "limit": {
                        "title": "Limit",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum rows to return.",
                        "default": 25
                    },
                    "nct": {
                        "title": "NCT ID",
                        "type": "string",
                        "description": "NCT identifier for study mode."
                    },
                    "npi": {
                        "title": "NPI",
                        "type": "string",
                        "description": "NPI number for investigator mode."
                    },
                    "name": {
                        "title": "Investigator name",
                        "type": "string",
                        "description": "Investigator name for investigator mode."
                    },
                    "npis": {
                        "title": "NPIs",
                        "type": "array",
                        "description": "Bulk NPI list, maximum 100.",
                        "items": {
                            "type": "string"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
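For projects that cannot use the client libraries, the `run-sync-get-dataset-items` path in the specification above can be called with the Python standard library alone. This is a minimal sketch with hypothetical function names; it assumes the endpoint returns the dataset items as a JSON array, which the specification itself does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "george.the.developer~clinical-trial-investigator-intel"


def build_run_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request described by the OpenAPI path, without sending it."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync_get_items(token: str, run_input: dict, timeout: float = 300.0):
    """Send the request and parse the response body as JSON."""
    request = build_run_sync_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the URL and body easy to inspect or log before any charge is incurred.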