# NIH RePORTER Scraper (`crawlerbros/nih-reporter-scraper`) Actor

Scrape NIH research projects, publications, and clinical studies from the NIH Research Portfolio Online Reporting Tools (RePORTER). Search by keyword, PI, organization, fiscal year, activity code, and agency.

- **URL**: https://apify.com/crawlerbros/nih-reporter-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Automation, Developer tools, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 4 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event and usage. You are charged a fixed price for specific events plus the cost of Apify platform usage.
This Actor supports Apify Store discounts, so higher subscription plans pay a lower price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools that run on the Apify platform and cover all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## NIH RePORTER Scraper

Extract research projects, publications, and clinical studies from the **NIH Research Portfolio Online Reporting Tools (RePORTER)** — the official database of federally-funded biomedical research in the United States.

Whether you are tracking funded cancer research, finding principal investigators at specific institutions, or monitoring NIH grant activity by agency, this Actor gives you structured, export-ready data from NIH's public API.

---

### What You Can Scrape

- **Research Projects** — Full grant details including award amounts, PI names, organization, fiscal year, activity code, and project abstracts.
- **Publications** — NIH-linked PubMed publications with PMIDs and associated grant numbers.
- **Clinical Studies** — NIH-tracked clinical research studies.

---

### Key Features

- Search by keyword across project titles and abstracts
- Filter by NIH Institute or Center (NCI, NHLBI, NIAID, NIMH, and 13 more)
- Filter by grant activity code (R01, R21, T32, K-awards, F-awards, and more)
- Filter by fiscal year (single or multiple years)
- Configure how many records to retrieve (up to 10,000)
- All three modes: research projects, publications, clinical studies
- Direct links to NIH RePORTER and PubMed for every record
- No authentication or API key required

---

### Input Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `mode` | Select | What to search: `searchProjects`, `searchPublications`, `searchClinicalStudies` | `searchProjects` |
| `searchText` | Text | Keywords to search in titles, abstracts, and terms | `cancer immunotherapy` |
| `fiscalYears` | List | Fiscal years to filter by, e.g. `[2022, 2023]`. Leave empty for all years. | `[]` |
| `agencyIcAdmin` | Select | NIH Institute or Center abbreviation (e.g. `NCI`, `NIMH`, `NIAID`) | Any |
| `activityCode` | Select | NIH grant activity code (e.g. `R01`, `R21`, `T32`) | Any |
| `maxItems` | Integer | Maximum records to return (1–10,000) | `50` |

#### Supported Institutes & Centers

NCI · NHLBI · NIAID · NIMH · NIDDK · NICHD · NIA · NINDS · NIGMS · NEI · NIDCR · NIAAA · NIDA · NIMHD · NHGRI · NIBIB · NCATS

#### Supported Activity Codes

R01 · R03 · R21 · R15 · U01 · P01 · P30 · P50 · T32 · K01 · K08 · K23 · K24 · F31 · F32
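
The parameters and supported values above can be checked before a run is started. The following is a minimal sketch, not part of the Actor: `build_input` is a hypothetical helper, and the allowed values are simply copied from the lists in this section. Note that the parameter table shows fiscal years as integers while the Actor's OpenAPI schema declares string items, so the sketch passes the values through unchanged; check which form the Actor accepts.

```python
from typing import Any, Optional, Sequence

# Allowed values copied from this README; the helper itself is hypothetical.
MODES = {"searchProjects", "searchPublications", "searchClinicalStudies"}
INSTITUTES = {
    "NCI", "NHLBI", "NIAID", "NIMH", "NIDDK", "NICHD", "NIA", "NINDS", "NIGMS",
    "NEI", "NIDCR", "NIAAA", "NIDA", "NIMHD", "NHGRI", "NIBIB", "NCATS",
}
ACTIVITY_CODES = {
    "R01", "R03", "R21", "R15", "U01", "P01", "P30", "P50",
    "T32", "K01", "K08", "K23", "K24", "F31", "F32",
}


def build_input(
    mode: str = "searchProjects",
    search_text: Optional[str] = None,
    fiscal_years: Optional[Sequence[Any]] = None,
    agency: Optional[str] = None,
    activity_code: Optional[str] = None,
    max_items: int = 50,
) -> dict[str, Any]:
    """Validate the arguments and return an Actor input object."""
    if mode not in MODES:
        raise ValueError(f"Unsupported mode: {mode!r}")
    if agency is not None and agency not in INSTITUTES:
        raise ValueError(f"Unsupported institute or center: {agency!r}")
    if activity_code is not None and activity_code not in ACTIVITY_CODES:
        raise ValueError(f"Unsupported activity code: {activity_code!r}")
    if not 1 <= max_items <= 10_000:
        raise ValueError("maxItems must be between 1 and 10,000")

    run_input: dict[str, Any] = {"mode": mode, "maxItems": max_items}
    # Optional filters are only included when they are set.
    if search_text:
        run_input["searchText"] = search_text
    if fiscal_years:
        run_input["fiscalYears"] = list(fiscal_years)
    if agency:
        run_input["agencyIcAdmin"] = agency
    if activity_code:
        run_input["activityCode"] = activity_code
    return run_input
```

For example, `build_input(search_text="immunotherapy", fiscal_years=[2023], agency="NCI", activity_code="R01")` produces an input for the first use case listed further down, and an unsupported value such as `agency="XYZ"` raises `ValueError` before any run is started.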

---

### Output Fields

#### Research Projects (`searchProjects`)

| Field | Type | Description |
|-------|------|-------------|
| `applId` | String | NIH application ID |
| `coreProjectNum` | String | Core project number (e.g. `R01CA123456`) |
| `projectTitle` | String | Full project title |
| `abstractText` | String | Project abstract |
| `fiscalYear` | Integer | Fiscal year of the award |
| `awardAmount` | Integer | Total award amount in USD |
| `directCostAmt` | Integer | Direct cost amount in USD |
| `indirectCostAmt` | Integer | Indirect cost amount in USD |
| `activityCode` | String | Grant activity code (e.g. `R01`) |
| `organizationName` | String | Awardee institution name |
| `organizationCity` | String | City of the institution |
| `organizationState` | String | State code (e.g. `MD`) |
| `organizationCountry` | String | Country of the institution |
| `piNames` | Array | List of principal investigator full names |
| `piEmails` | Array | PI email addresses (when available) |
| `contactPiName` | String | Name of the contact PI |
| `agencyIcAdmin` | String | NIH Institute or Center code |
| `terms` | Array | Keywords/terms extracted from the project |
| `projectStartDate` | String | Project start date (ISO format) |
| `projectEndDate` | String | Project end date (ISO format) |
| `sourceUrl` | String | Direct link to NIH RePORTER project page |
| `recordType` | String | Always `project` |
| `scrapedAt` | String | ISO timestamp when the record was scraped |
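
As an illustration of working with these fields, the sketch below totals `awardAmount` per `organizationName`. It assumes the records are plain dictionaries as returned from the run's dataset; the function name and the sample records are made up for the example, and records without cost data are skipped.

```python
from collections import defaultdict


def total_awards_by_organization(records):
    """Sum awardAmount per organizationName, largest total first."""
    totals = defaultdict(int)
    for record in records:
        amount = record.get("awardAmount")
        if amount is None:
            # Cost data may be absent for some records; skip those.
            continue
        name = record.get("organizationName") or "(unknown)"
        totals[name] += amount
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


# Made-up sample records, for illustration only.
sample = [
    {"organizationName": "Example University", "awardAmount": 500000},
    {"organizationName": "Example Institute", "awardAmount": 250000},
    {"organizationName": "Example University", "awardAmount": 100000},
    {"organizationName": "Example Institute", "awardAmount": None},
]

print(total_awards_by_organization(sample))
# [('Example University', 600000), ('Example Institute', 250000)]
```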

#### Publications (`searchPublications`)

| Field | Type | Description |
|-------|------|-------------|
| `pmid` | String | PubMed ID |
| `applId` | String | Associated NIH application ID |
| `coreProjectNum` | String | Associated grant number |
| `sourceUrl` | String | Direct link to the PubMed article |
| `recordType` | String | Always `publication` |
| `scrapedAt` | String | ISO timestamp |
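
Publication records can be grouped by grant with the `coreProjectNum` field. This is a minimal sketch that assumes each record is a dictionary with the fields listed above; `pmids_by_project` is a hypothetical helper and the PMIDs in the sample are invented.

```python
from collections import defaultdict


def pmids_by_project(publications):
    """Map each coreProjectNum to its PMIDs, without duplicates."""
    grouped = defaultdict(list)
    for publication in publications:
        project = publication.get("coreProjectNum")
        pmid = publication.get("pmid")
        # Skip records that lack either identifier.
        if project and pmid and pmid not in grouped[project]:
            grouped[project].append(pmid)
    return dict(grouped)


# Made-up sample records, for illustration only.
sample_publications = [
    {"pmid": "10000001", "coreProjectNum": "R01CA123456"},
    {"pmid": "10000002", "coreProjectNum": "R01CA123456"},
    {"pmid": "10000001", "coreProjectNum": "R01CA123456"},
    {"pmid": "10000003", "coreProjectNum": None},
]

print(pmids_by_project(sample_publications))
# {'R01CA123456': ['10000001', '10000002']}
```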

---

### Example Use Cases

1. **Competitive intelligence** — Find all R01 grants awarded by NCI in 2023 on immunotherapy topics.
2. **Literature review** — Retrieve PubMed IDs for publications linked to NIH grants on a specific topic.
3. **Grant landscape analysis** — Identify top-funded institutions and PIs in a research area.
4. **Policy research** — Track NIH funding trends across fiscal years and institutes.
5. **Academic prospecting** — Find active researchers and their contact information for collaboration.

---

### Frequently Asked Questions

**Do I need an API key or account?**
No. NIH RePORTER is a public API and does not require authentication.

**How many records can I retrieve?**
Up to 10,000 records per run using the `maxItems` parameter.

**Can I filter by multiple fiscal years at once?**
Yes. Set `fiscalYears` to a list like `[2021, 2022, 2023]`.

**What is an activity code?**
Activity codes classify grant mechanisms. For example, `R01` is the standard research project grant, `T32` is an institutional training grant, and `K01` is a career development award.

**What does the `agencyIcAdmin` filter do?**
It limits results to grants administered by a specific NIH Institute or Center. For example, setting it to `NCI` returns only National Cancer Institute grants.

**Are award amounts always available?**
Award amounts are included when reported by NIH. Some records may not include cost data.

**Can I get the full abstract text?**
Yes. The `abstractText` field contains the complete project abstract for each research project.

**What is the `coreProjectNum` field?**
It is the permanent project identifier (e.g. `R01CA123456`) that groups all annual supplements and renewals of the same project together.

**How current is the data?**
The NIH RePORTER API is updated regularly by NIH. The `scrapedAt` timestamp shows when each record was retrieved.

**Is this Actor rate-limited?**
The Actor includes automatic retry logic with exponential backoff to handle rate limits gracefully.
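
For readers unfamiliar with the term, exponential backoff means waiting longer after each failed attempt, typically doubling the delay. The sketch below shows the general technique only; it is not the Actor's own code, and the function name, delays, and retry count are assumptions chosen for the example.

```python
import time


def call_with_backoff(func, max_retries=4, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call func(), retrying on failure with delays of 1s, 2s, 4s, ...

    The delay doubles after each failed attempt and is capped at max_delay.
    The last failure is re-raised once max_retries retries are used up.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            # Real code should catch only retryable errors (for example,
            # HTTP 429 responses) rather than every exception.
            if attempt == max_retries:
                raise
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

The `sleep` parameter exists so the waiting behaviour can be replaced in tests, for example with `list.append` to record the delays instead of pausing.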

# Actor input Schema

## `mode` (type: `string`):

Select what type of NIH data to search: research projects, publications, or clinical studies.

## `searchText` (type: `string`):

Keywords to search for in project titles, abstracts, and terms.

## `fiscalYears` (type: `array`):

Filter to specific fiscal years (e.g. [2022, 2023]). Leave empty for all years.

## `agencyIcAdmin` (type: `string`):

Filter by NIH Institute or Center.

## `activityCode` (type: `string`):

Filter by NIH grant activity code.

## `maxItems` (type: `integer`):

Maximum number of records to return.

## Actor input object example

```json
{
  "mode": "searchProjects",
  "searchText": "cancer immunotherapy",
  "maxItems": 50
}
````

# Actor output Schema

## `items` (type: `string`):

Dataset containing all scraped NIH records.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "searchProjects",
    "searchText": "cancer immunotherapy",
    "maxItems": 50
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/nih-reporter-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "searchProjects",
    "searchText": "cancer immunotherapy",
    "maxItems": 50,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/nih-reporter-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "searchProjects",
  "searchText": "cancer immunotherapy",
  "maxItems": 50
}' |
apify call crawlerbros/nih-reporter-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/nih-reporter-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "NIH RePORTER Scraper",
        "description": "Scrape NIH research projects, publications, and clinical studies from the NIH Research Portfolio Online Reporting Tools (RePORTER). Search by keyword, PI, organization, fiscal year, activity code, and agency.",
        "version": "1.0",
        "x-build-id": "zervTzbuRRzwOedxr"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~nih-reporter-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-nih-reporter-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~nih-reporter-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-nih-reporter-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~nih-reporter-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-nih-reporter-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "searchProjects",
                            "searchPublications",
                            "searchClinicalStudies"
                        ],
                        "type": "string",
                        "description": "Select what type of NIH data to search: research projects, publications, or clinical studies.",
                        "default": "searchProjects"
                    },
                    "searchText": {
                        "title": "Search text",
                        "type": "string",
                        "description": "Keywords to search for in project titles, abstracts, and terms."
                    },
                    "fiscalYears": {
                        "title": "Fiscal years",
                        "type": "array",
                        "description": "Filter to specific fiscal years (e.g. [2022, 2023]). Leave empty for all years.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "agencyIcAdmin": {
                        "title": "Agency / Institute",
                        "enum": [
                            "",
                            "NCI",
                            "NHLBI",
                            "NIAID",
                            "NIMH",
                            "NIDDK",
                            "NICHD",
                            "NIA",
                            "NINDS",
                            "NIGMS",
                            "NEI",
                            "NIDCR",
                            "NIAAA",
                            "NIDA",
                            "NIMHD",
                            "NHGRI",
                            "NIBIB",
                            "NCATS"
                        ],
                        "type": "string",
                        "description": "Filter by NIH Institute or Center."
                    },
                    "activityCode": {
                        "title": "Activity code",
                        "enum": [
                            "",
                            "R01",
                            "R03",
                            "R21",
                            "R15",
                            "U01",
                            "P01",
                            "P30",
                            "P50",
                            "T32",
                            "K01",
                            "K08",
                            "K23",
                            "K24",
                            "F31",
                            "F32"
                        ],
                        "type": "string",
                        "description": "Filter by NIH grant activity code."
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of records to return.",
                        "default": 50
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
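
The `run-sync-get-dataset-items` path in the specification above can also be called without a client library. The following sketch uses only the Python standard library. The function names are hypothetical, and the code assumes the endpoint returns the dataset items as JSON, which the specification does not spell out (its `200` response has no schema).

```python
import json
import urllib.parse
import urllib.request

# Server URL and path are taken from the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/crawlerbros~nih-reporter-scraper/run-sync-get-dataset-items"


def build_sync_request(token, run_input):
    """Build the POST request: token in the query string, input as JSON body."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token, run_input, timeout=300):
    """Send the request and decode the response body as JSON.

    Assumption: the response body is a JSON document with the dataset items.
    """
    request = build_sync_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_sync("<YOUR_API_TOKEN>", {"mode": "searchProjects", "searchText": "cancer immunotherapy", "maxItems": 50})` blocks until the run finishes, so the timeout should be sized for the number of records requested.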
