# FDA Orange Book Scraper (`automation-lab/fda-orange-book-scraper`) Actor

Search public FDA Orange Book / Drugs@FDA records by brand, generic, ingredient, sponsor, or application number for pharma research.

- **URL**: https://apify.com/automation-lab/fda-orange-book-scraper.md
- **Developed by:** [Stas Persiianenko](https://apify.com/automation-lab) (community)
- **Categories:** Other
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## FDA Orange Book Scraper

Export public FDA Orange Book and Drugs@FDA application records by brand name, generic name, active ingredient, sponsor, or application number.

Use this scraper when you need repeatable, structured FDA drug approval data for regulatory research, portfolio monitoring, generic-drug analysis, or internal pharma intelligence workflows.

### What does FDA Orange Book Scraper do?

FDA Orange Book Scraper queries the public openFDA Drugs@FDA API and saves normalized application-level records to an Apify dataset.

It turns FDA application JSON into export-ready rows with application numbers, sponsor names, product summaries, active ingredients, dosage forms, routes, strengths, marketing statuses, submissions, and openFDA identifiers.

The Actor is API-first, so it does not need a browser, login, cookies, or a private FDA account.

### Who is it for?

- 🧪 Regulatory affairs teams checking approved drug applications.
- 💊 Generic-drug portfolio analysts comparing brand and ingredient coverage.
- ⚖️ Pharma IP and market-access teams building patent-cliff research datasets.
- 📊 Competitive-intelligence teams monitoring sponsors and application families.
- 🔬 Healthcare data teams joining FDA application records with internal databases.

### Why use it?

- It provides a simple Apify interface around openFDA Drugs@FDA search.
- It supports buyer-friendly inputs instead of requiring users to remember API field names.
- It saves one normalized dataset row per application record.
- It includes nested products and submissions for downstream auditing.
- It can include the raw openFDA record when your compliance workflow needs source evidence.

### Data source

The Actor uses:

- `https://api.fda.gov/drug/drugsfda.json`

This is a public FDA/openFDA endpoint.

No FDA API token is required for normal use.

### What data can you extract?

| Field | Description |
| --- | --- |
| `applicationNumber` | NDA, ANDA, or BLA application number from Drugs@FDA. |
| `sponsorName` | Application sponsor / applicant. |
| `brandNames` | Brand names found in openFDA and product data. |
| `genericNames` | Generic names from openFDA. |
| `activeIngredients` | Active ingredient names from product records. |
| `dosageForms` | Dosage forms across products. |
| `routes` | Administration routes. |
| `strengths` | Product strengths. |
| `marketingStatuses` | Product marketing statuses where provided. |
| `products` | Nested product summaries. |
| `submissions` | Nested submission summaries. |
| `openfda` | Original openFDA identifiers and classification fields. |
| `patentDataAvailable` | Whether patent records were available from the source. |
| `exclusivityDataAvailable` | Whether exclusivity records were available from the source. |

### Search modes

You can search by:

- Brand name.
- Generic name.
- Active ingredient.
- Sponsor / applicant.
- Exact application number.
- Raw openFDA query syntax.

### Input example

```json
{
  "queries": [
    "aspirin",
    { "term": "ibuprofen", "field": "ingredient" },
    { "term": "PFIZER", "field": "sponsor" }
  ],
  "applicationNumbers": ["NDA020639"],
  "searchField": "brand",
  "maxItems": 100,
  "includeRawRecord": false
}
````

### Output example

```json
{
  "searchTerm": "aspirin",
  "searchField": "brand",
  "applicationNumber": "NDA020639",
  "sponsorName": "BAYER HEALTHCARE LLC",
  "brandNames": ["ASPIRIN"],
  "activeIngredients": ["ASPIRIN"],
  "dosageForms": ["TABLET"],
  "routes": ["ORAL"],
  "products": [],
  "submissions": [],
  "patentDataAvailable": false,
  "exclusivityDataAvailable": false
}
```
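
Nested arrays such as `products` and `submissions` do not map cleanly to spreadsheet columns. The following is a minimal Python sketch, not part of the Actor, that flattens a row shaped like the example above before a CSV export. The helper names `flatten_row` and `rows_to_csv`, the `; ` separator, and the `...Count` column names are illustrative assumptions.

```python
import csv
import io

# Keys that hold nested objects in the output example above.
NESTED_KEYS = {"products", "submissions"}

# Sample row copied from the output example above.
row = {
    "searchTerm": "aspirin",
    "searchField": "brand",
    "applicationNumber": "NDA020639",
    "sponsorName": "BAYER HEALTHCARE LLC",
    "brandNames": ["ASPIRIN"],
    "activeIngredients": ["ASPIRIN"],
    "dosageForms": ["TABLET"],
    "routes": ["ORAL"],
    "products": [],
    "submissions": [],
    "patentDataAvailable": False,
    "exclusivityDataAvailable": False,
}


def flatten_row(item, sep="; "):
    """Join plain lists into one cell and replace nested lists with a count."""
    flat = {}
    for key, value in item.items():
        if key in NESTED_KEYS:
            flat[f"{key}Count"] = len(value)
        elif isinstance(value, list):
            flat[key] = sep.join(str(v) for v in value)
        else:
            flat[key] = value
    return flat


def rows_to_csv(items):
    """Serialize flattened rows to CSV text; columns follow the first row."""
    flat_rows = [flatten_row(item) for item in items]
    if not flat_rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(flat_rows[0].keys()))
    writer.writeheader()
    writer.writerows(flat_rows)
    return buffer.getvalue()


print(rows_to_csv([row]))
```

Counting nested entries instead of serializing them keeps one row per application; if you need product-level detail, a separate table keyed by `applicationNumber` is probably a better fit.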

### How much does it cost to scrape FDA Orange Book data?

This Actor uses pay-per-event pricing.

- A small start fee is charged once per run.
- A per-record fee is charged for each FDA application record saved.
- Your final cost depends on the number of matching FDA application records and your Apify plan tier.

For most targeted application-number or brand-name lookups, runs are small and inexpensive.
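
The pricing model above reduces to one start fee plus a per-record fee for each saved record, with `maxItems` capping the record count. A rough Python sketch of that arithmetic follows. The function name and the fee values are placeholders chosen for illustration, not the Actor's real prices; check the Actor's pricing page for current rates.

```python
def estimate_run_cost(matching_records, max_items, start_fee_usd, per_record_fee_usd):
    """Estimate one run: start fee once, plus a fee per saved record.

    The number of saved records is capped by max_items.
    """
    if matching_records < 0 or max_items < 1:
        raise ValueError("matching_records must be >= 0 and max_items >= 1")
    saved_records = min(matching_records, max_items)
    return start_fee_usd + saved_records * per_record_fee_usd


# Placeholder fees for illustration only (NOT the Actor's actual prices).
estimate = estimate_run_cost(
    matching_records=250,
    max_items=100,
    start_fee_usd=0.01,
    per_record_fee_usd=0.002,
)
print(f"Estimated cost: ${estimate:.2f}")
```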

### How to run it

1. Open the Actor on Apify.
2. Add one or more search terms.
3. Choose the default search field.
4. Optionally add exact application numbers.
5. Set `maxItems` to cap the export size.
6. Start the run.
7. Download the dataset as JSON, CSV, Excel, or via API.

### Tips for best results

- Use exact application numbers when you know them.
- Use `ingredient` for portfolio research by active ingredient.
- Use `sponsor` for applicant-level monitoring.
- Use `raw` only when you already know openFDA query syntax.
- Keep `maxItems` low for quick smoke tests.
- Enable `includeRawRecord` for compliance audits or custom transformations.
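
For readers unfamiliar with the openFDA query syntax that `raw` mode expects, the sketch below builds a `field:"value"` search URL for the Drugs@FDA endpoint named in the Data source section. The `FIELD_MAP` is an assumption about how each search mode could map to openFDA field names; this README does not document the Actor's actual internal mapping, and `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

ENDPOINT = "https://api.fda.gov/drug/drugsfda.json"

# ASSUMPTION: a plausible mapping from the Actor's search modes to
# openFDA Drugs@FDA fields. The Actor's real mapping is not documented here.
FIELD_MAP = {
    "brand": "openfda.brand_name",
    "generic": "openfda.generic_name",
    "ingredient": "products.active_ingredients.name",
    "sponsor": "sponsor_name",
    "application_number": "application_number",
}


def build_search_url(term, field="brand", limit=20):
    """Build an openFDA query URL; 'raw' passes the term through unchanged."""
    if field == "raw":
        search = term
    elif field in FIELD_MAP:
        search = f'{FIELD_MAP[field]}:"{term}"'
    else:
        raise ValueError(f"Unknown search field: {field}")
    return f"{ENDPOINT}?{urlencode({'search': search, 'limit': limit})}"


print(build_search_url("ibuprofen", field="ingredient", limit=5))
print(build_search_url('sponsor_name:"PFIZER"', field="raw", limit=5))
```

Opening one of the printed URLs in a browser is a quick way to check a raw query before passing it to the Actor.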

### Patent and exclusivity fields

The dataset includes patent and exclusivity compatibility fields.

In this version, the reliable public API source is openFDA Drugs@FDA. If patent or exclusivity data is not present in that source, the Actor sets:

- `patentDataAvailable: false`
- `patents: []`
- `exclusivityDataAvailable: false`
- `exclusivities: []`

This makes downstream schemas stable while avoiding unreliable scraping of blocked FDA download pages.
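
Downstream code should check the availability flags before interpreting the arrays, because an empty `patents` array with `patentDataAvailable: false` means "not available from the source", not "no patents exist". A minimal sketch of that guard, using a hypothetical `patent_summary` helper and a sample row built from the fields listed above:

```python
def patent_summary(row):
    """Return counts only when the source provided data; otherwise None."""
    has_patents = row.get("patentDataAvailable", False)
    has_exclusivity = row.get("exclusivityDataAvailable", False)
    return {
        "applicationNumber": row.get("applicationNumber"),
        # None signals "unknown", which is different from a real count of 0.
        "patentCount": len(row.get("patents", [])) if has_patents else None,
        "exclusivityCount": (
            len(row.get("exclusivities", [])) if has_exclusivity else None
        ),
    }


# Sample row using the default values described above.
sample_row = {
    "applicationNumber": "NDA020639",
    "patentDataAvailable": False,
    "patents": [],
    "exclusivityDataAvailable": False,
    "exclusivities": [],
}

print(patent_summary(sample_row))
```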

### Integrations

You can connect the dataset to:

- Google Sheets for regulatory watchlists.
- Snowflake or BigQuery for pharma analytics.
- CRM enrichment pipelines for sponsor intelligence.
- Internal dashboards that monitor generic-entry opportunities.
- Apify webhooks for scheduled portfolio updates.

### API usage with Node.js

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('automation-lab/fda-orange-book-scraper').call({
  queries: ['aspirin'],
  searchField: 'brand',
  maxItems: 100
});
console.log(run.defaultDatasetId);
```

### API usage with Python

```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')
run = client.actor('automation-lab/fda-orange-book-scraper').call(run_input={
    'queries': ['aspirin'],
    'searchField': 'brand',
    'maxItems': 100,
})
print(run['defaultDatasetId'])
```

### API usage with cURL

```bash
curl -X POST 'https://api.apify.com/v2/acts/automation-lab~fda-orange-book-scraper/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"queries":["aspirin"],"searchField":"brand","maxItems":100}'
```

### MCP integration

Use Apify MCP to call this scraper from Claude Desktop, Claude Code, or other MCP clients.

MCP URL:

```text
https://mcp.apify.com/?tools=automation-lab/fda-orange-book-scraper
```

Claude Code setup:

```bash
claude mcp add apify-fda-orange-book "https://mcp.apify.com/?tools=automation-lab/fda-orange-book-scraper"
```

Claude Desktop JSON config:

```json
{
  "mcpServers": {
    "apify-fda-orange-book": {
      "url": "https://mcp.apify.com/?tools=automation-lab/fda-orange-book-scraper"
    }
  }
}
```

Example prompts:

- "Export FDA Orange Book records for ibuprofen and summarize the sponsors."
- "Find Drugs@FDA applications for sponsor PFIZER and group by active ingredient."
- "Run an application-number lookup for NDA020639 and return the product strengths."

### Scheduling

For monitoring workflows, schedule the Actor daily, weekly, or monthly.

Common schedules include:

- Weekly sponsor monitoring.
- Monthly ingredient portfolio exports.
- Quarterly regulatory database refreshes.
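
A scheduled run is most useful when each new dataset is compared with the previous one. The sketch below shows one way to do that in Python by keying rows on `applicationNumber`. The `diff_snapshots` helper and the sample `marketingStatuses` values are illustrative assumptions, not output from a real run.

```python
def diff_snapshots(previous_rows, current_rows):
    """Compare two dataset exports keyed by applicationNumber."""
    previous = {row["applicationNumber"]: row for row in previous_rows}
    current = {row["applicationNumber"]: row for row in current_rows}
    return {
        "added": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        "changed": sorted(
            key
            for key in current.keys() & previous.keys()
            if current[key] != previous[key]
        ),
    }


# Illustrative rows only; the status values are made up for this example.
last_week = [
    {"applicationNumber": "NDA020639", "marketingStatuses": ["Prescription"]},
]
this_week = [
    {"applicationNumber": "NDA020639", "marketingStatuses": ["Discontinued"]},
    {"applicationNumber": "ANDA040446", "marketingStatuses": ["Prescription"]},
]

print(diff_snapshots(last_week, this_week))
```

Comparing whole rows is strict: a change in any field, including `searchTerm`, marks the application as changed, so you may prefer to compare only the fields you monitor.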

### Data quality notes

The Actor reports the data returned by openFDA Drugs@FDA.

It does not provide medical advice.

Always verify regulatory decisions against official FDA systems and primary records.

### Legality and responsible use

This Actor uses public FDA/openFDA data.

You are responsible for how you use exported data, including compliance with your organization’s regulatory, medical, and legal review processes.

### FAQ and troubleshooting

#### Why did my search return no rows?

Try a different search mode. For example, use `ingredient` for active ingredients and `application_number` for NDA/ANDA/BLA identifiers.

#### Why are patent arrays empty?

This initial version relies on the openFDA Drugs@FDA API. Patent/exclusivity download pages may be unavailable or blocked from automated environments, so the Actor marks those fields unavailable when the source does not provide them.

#### How do I get the original FDA JSON?

Set `includeRawRecord` to `true`.

### Related scrapers

Other Automation Lab Actors that can support healthcare and regulatory workflows:

- https://apify.com/automation-lab/fda-warning-letters-scraper
- https://apify.com/automation-lab/clinicaltrials-gov-scraper
- https://apify.com/automation-lab/healthcare-contact-finder

### Changelog

Initial version:

- Public openFDA Drugs@FDA search.
- Brand, generic, ingredient, sponsor, application-number, and raw query modes.
- Application, product, submission, and openFDA identifier fields.

### Support

If you need a missing field, include an example application number and describe the workflow you are trying to automate.

### Final note

FDA Orange Book Scraper is designed for practical, repeatable exports, not one-off manual lookups.

Use it whenever your team needs FDA drug application data in a dataset, scheduled job, or API pipeline.

# Actor input Schema

## `queries` (type: `array`):

Drug names, ingredients, sponsors, or raw openFDA search terms. All terms use the selected default search field.

## `searchField` (type: `string`):

How plain string search terms should be interpreted.

## `applicationNumbers` (type: `array`):

Optional NDA/ANDA/BLA application numbers to fetch exactly, for example NDA020639 or ANDA040446.

## `maxItems` (type: `integer`):

Maximum number of application-level records to save across all searches.

## `includeRawRecord` (type: `boolean`):

Attach the original Drugs@FDA JSON object for audits and custom downstream mapping.

## Actor input object example

```json
{
  "queries": [
    "aspirin",
    "ibuprofen"
  ],
  "searchField": "brand",
  "applicationNumbers": [
    "NDA020639"
  ],
  "maxItems": 20,
  "includeRawRecord": false
}
```
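
Validating the input locally before starting a run can catch mistakes early. The sketch below builds an input object and checks it against the `searchField` values and `maxItems` bounds listed in the OpenAPI specification in this document. The `build_input` helper is hypothetical, and the rule that at least one query or application number must be present is an assumption, not something the schema states.

```python
# Allowed values and bounds taken from the OpenAPI specification in this document.
ALLOWED_SEARCH_FIELDS = {
    "brand", "generic", "ingredient", "sponsor", "application_number", "raw",
}
MAX_ITEMS_MIN = 1
MAX_ITEMS_MAX = 10000


def build_input(queries=(), application_numbers=(), search_field="brand",
                max_items=20, include_raw_record=False):
    """Return an Actor input dict, raising ValueError on invalid values."""
    if search_field not in ALLOWED_SEARCH_FIELDS:
        raise ValueError(f"searchField must be one of {sorted(ALLOWED_SEARCH_FIELDS)}")
    if not MAX_ITEMS_MIN <= max_items <= MAX_ITEMS_MAX:
        raise ValueError(f"maxItems must be between {MAX_ITEMS_MIN} and {MAX_ITEMS_MAX}")
    # ASSUMPTION: a run with nothing to search for is treated as a mistake.
    if not queries and not application_numbers:
        raise ValueError("Provide at least one query or application number")
    return {
        "queries": list(queries),
        "searchField": search_field,
        "applicationNumbers": list(application_numbers),
        "maxItems": max_items,
        "includeRawRecord": include_raw_record,
    }


run_input = build_input(
    queries=["aspirin", "ibuprofen"],
    application_numbers=["NDA020639"],
)
print(run_input)
```

The returned dict can be passed as `run_input` to the Python client call shown in the API section.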

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "queries": [
        "aspirin",
        "ibuprofen"
    ],
    "searchField": "brand",
    "applicationNumbers": [
        "NDA020639"
    ],
    "maxItems": 20,
    "includeRawRecord": false
};

// Run the Actor and wait for it to finish
const run = await client.actor("automation-lab/fda-orange-book-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "queries": [
        "aspirin",
        "ibuprofen",
    ],
    "searchField": "brand",
    "applicationNumbers": ["NDA020639"],
    "maxItems": 20,
    "includeRawRecord": False,
}

# Run the Actor and wait for it to finish
run = client.actor("automation-lab/fda-orange-book-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "queries": [
    "aspirin",
    "ibuprofen"
  ],
  "searchField": "brand",
  "applicationNumbers": [
    "NDA020639"
  ],
  "maxItems": 20,
  "includeRawRecord": false
}' |
apify call automation-lab/fda-orange-book-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=automation-lab/fda-orange-book-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "FDA Orange Book Scraper",
        "description": "Search public FDA Orange Book / Drugs@FDA records by brand, generic, ingredient, sponsor, or application number for pharma research.",
        "version": "0.1",
        "x-build-id": "BJH0ZIaFzzalDVJ0Y"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/automation-lab~fda-orange-book-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-automation-lab-fda-orange-book-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/automation-lab~fda-orange-book-scraper/runs": {
            "post": {
                "operationId": "runs-sync-automation-lab-fda-orange-book-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/automation-lab~fda-orange-book-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-automation-lab-fda-orange-book-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "queries": {
                        "title": "Search terms",
                        "type": "array",
                        "description": "Drug names, ingredients, sponsors, or raw openFDA search terms. All terms use the selected default search field.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchField": {
                        "title": "Default search field",
                        "enum": [
                            "brand",
                            "generic",
                            "ingredient",
                            "sponsor",
                            "application_number",
                            "raw"
                        ],
                        "type": "string",
                        "description": "How plain string search terms should be interpreted.",
                        "default": "brand"
                    },
                    "applicationNumbers": {
                        "title": "Exact application numbers",
                        "type": "array",
                        "description": "Optional NDA/ANDA/BLA application numbers to fetch exactly, for example NDA020639 or ANDA040446.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Maximum application records",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of application-level records to save across all searches.",
                        "default": 20
                    },
                    "includeRawRecord": {
                        "title": "Include raw openFDA record",
                        "type": "boolean",
                        "description": "Attach the original Drugs@FDA JSON object for audits and custom downstream mapping.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
