FDA Orange Book Scraper

Pricing

Pay per event

Search public FDA Orange Book / Drugs@FDA records by brand, generic, ingredient, sponsor, or application number for pharma research.

Developer

Stas Persiianenko

Maintained by Community


Export public FDA Orange Book and Drugs@FDA application records by brand name, generic name, active ingredient, sponsor, or application number.

Use this scraper when you need repeatable, structured FDA drug approval data for regulatory research, portfolio monitoring, generic-drug analysis, or internal pharma intelligence workflows.

What does FDA Orange Book Scraper do?

FDA Orange Book Scraper queries the public openFDA Drugs@FDA API and saves normalized application-level records to an Apify dataset.

It turns FDA application JSON into export-ready rows with application numbers, sponsor names, product summaries, active ingredients, dosage forms, routes, strengths, marketing statuses, submissions, and openFDA identifiers.

The actor is API-first, so it does not need a browser, login, cookies, or a private FDA account.
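
As a rough illustration of what "normalized application-level records" can mean, the sketch below flattens one Drugs@FDA record into a row with the field names listed on this page. The input keys (`application_number`, `sponsor_name`, `openfda`, `products`, `active_ingredients`) follow the public openFDA response shape; the helper names and the sample record are made up for this example, and the actor's real implementation may differ.

```python
from typing import Any


def unique(values):
    """Return values de-duplicated in first-seen order, skipping empty ones."""
    seen = []
    for value in values:
        if value and value not in seen:
            seen.append(value)
    return seen


def normalize_application(record: dict[str, Any]) -> dict[str, Any]:
    """Flatten one openFDA Drugs@FDA record into a dataset-style row (sketch)."""
    products = record.get("products", [])
    openfda = record.get("openfda", {})
    ingredients = [
        ing for product in products for ing in product.get("active_ingredients", [])
    ]
    return {
        "applicationNumber": record.get("application_number"),
        "sponsorName": record.get("sponsor_name"),
        "brandNames": unique(
            openfda.get("brand_name", []) + [p.get("brand_name") for p in products]
        ),
        "genericNames": unique(openfda.get("generic_name", [])),
        "activeIngredients": unique(i.get("name") for i in ingredients),
        "dosageForms": unique(p.get("dosage_form") for p in products),
        "routes": unique(p.get("route") for p in products),
        "strengths": unique(i.get("strength") for i in ingredients),
        "marketingStatuses": unique(p.get("marketing_status") for p in products),
    }


# Made-up record for illustration only; not real FDA data.
sample = {
    "application_number": "NDA000001",
    "sponsor_name": "EXAMPLE PHARMA",
    "openfda": {"brand_name": ["EXAMPLEX"], "generic_name": ["EXAMPLINE"]},
    "products": [
        {
            "brand_name": "EXAMPLEX",
            "dosage_form": "TABLET",
            "route": "ORAL",
            "marketing_status": "Prescription",
            "active_ingredients": [{"name": "EXAMPLINE", "strength": "10MG"}],
        },
        {
            "brand_name": "EXAMPLEX",
            "dosage_form": "TABLET",
            "route": "ORAL",
            "marketing_status": "Prescription",
            "active_ingredients": [{"name": "EXAMPLINE", "strength": "20MG"}],
        },
    ],
}
row = normalize_application(sample)
```

De-duplicating in first-seen order keeps the arrays short and stable across runs, which makes the rows easier to compare.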

Who is it for?

  • 🧪 Regulatory affairs teams checking approved drug applications.
  • 💊 Generic-drug portfolio analysts comparing brand and ingredient coverage.
  • ⚖️ Pharma IP and market-access teams building patent-cliff research datasets.
  • 📊 Competitive-intelligence teams monitoring sponsors and application families.
  • 🔬 Healthcare data teams joining FDA application records with internal databases.

Why use it?

  • It provides a simple Apify interface around openFDA Drugs@FDA search.
  • It accepts plain search terms and friendly field names, so users do not need to remember openFDA API field names.
  • It saves one normalized dataset row per application record.
  • It includes nested products and submissions for downstream auditing.
  • It can include the raw openFDA record when your compliance workflow needs source evidence.

Data source

The actor uses:

  • https://api.fda.gov/drug/drugsfda.json

This is a public FDA/openFDA endpoint.

No FDA API token is required for normal use.
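
For readers who want to see the endpoint directly, here is a minimal standard-library sketch of an unauthenticated request. It uses openFDA's `search`, `limit`, and `skip` query parameters and treats an HTTP 404 as "no matches", which is how openFDA answers an empty search. The function names are assumptions for this example and do not describe the actor's internals.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.fda.gov/drug/drugsfda.json"


def build_url(search: str, limit: int = 10, skip: int = 0) -> str:
    """Build a Drugs@FDA request URL from search, limit, and skip."""
    query = urllib.parse.urlencode({"search": search, "limit": limit, "skip": skip})
    return f"{BASE_URL}?{query}"


def fetch_page(search: str, limit: int = 10, skip: int = 0) -> list:
    """Fetch one page of records; an empty list means nothing matched."""
    try:
        with urllib.request.urlopen(build_url(search, limit, skip), timeout=30) as resp:
            return json.load(resp).get("results", [])
    except urllib.error.HTTPError as err:
        if err.code == 404:  # openFDA reports "no matches" as 404
            return []
        raise


url = build_url('sponsor_name:"PFIZER"', limit=5)
```

Calling `fetch_page('sponsor_name:"PFIZER"', limit=5)` would perform the actual request; it is left out of the example so the snippet runs without network access.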

What data can you extract?

| Field | Description |
| --- | --- |
| applicationNumber | NDA, ANDA, or BLA application number from Drugs@FDA. |
| sponsorName | Application sponsor / applicant. |
| brandNames | Brand names found in openFDA and product data. |
| genericNames | Generic names from openFDA. |
| activeIngredients | Active ingredient names from product records. |
| dosageForms | Dosage forms across products. |
| routes | Administration routes. |
| strengths | Product strengths. |
| marketingStatuses | Product marketing statuses where provided. |
| products | Nested product summaries. |
| submissions | Nested submission summaries. |
| openfda | Original openFDA identifiers and classification fields. |
| patentDataAvailable | Whether patent records were available from the source. |
| exclusivityDataAvailable | Whether exclusivity records were available from the source. |

Search modes

You can search by:

  • Brand name.
  • Generic name.
  • Active ingredient.
  • Sponsor / applicant.
  • Exact application number.
  • Raw openFDA query syntax.
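
One plausible way to translate these modes into openFDA query syntax is sketched below. The openFDA field paths are real Drugs@FDA fields, but the mapping itself, the mode name `generic`, and the function name are assumptions for illustration; the actor may map modes differently.

```python
# Assumed mapping from friendly search modes to openFDA Drugs@FDA fields.
FIELD_MAP = {
    "brand": "openfda.brand_name",
    "generic": "openfda.generic_name",
    "ingredient": "products.active_ingredients.name",
    "sponsor": "sponsor_name",
    "application_number": "application_number",
}


def build_search(term: str, field: str = "brand") -> str:
    """Turn a term and a search mode into an openFDA search expression."""
    if field == "raw":
        return term  # raw mode: the caller already wrote openFDA syntax
    if field not in FIELD_MAP:
        raise ValueError(f"Unknown search field: {field}")
    escaped = term.replace('"', '\\"')  # keep embedded quotes from ending the phrase
    return f'{FIELD_MAP[field]}:"{escaped}"'


ingredient_query = build_search("ibuprofen", "ingredient")
raw_query = build_search("sponsor_name:PFIZER AND products.route:ORAL", "raw")
```

Quoting the term makes openFDA match it as a phrase, which matters for multi-word sponsor names.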

Input example

{
  "queries": [
    "aspirin",
    { "term": "ibuprofen", "field": "ingredient" },
    { "term": "PFIZER", "field": "sponsor" }
  ],
  "applicationNumbers": ["NDA020639"],
  "searchField": "brand",
  "maxItems": 100,
  "includeRawRecord": false
}
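
The input above mixes plain strings with `{term, field}` objects. A reasonable reading is that plain strings fall back to `searchField`, while objects carry their own field. The sketch below expands the input into `(term, field)` pairs under that assumption; the function name and the `application_number` mode label for `applicationNumbers` entries are illustrative, not confirmed actor behavior.

```python
def expand_queries(actor_input: dict) -> list:
    """Expand actor input into (term, field) pairs (assumed semantics)."""
    default_field = actor_input.get("searchField", "brand")
    pairs = []
    for query in actor_input.get("queries", []):
        if isinstance(query, str):
            # Plain strings use the run-level default search field.
            pairs.append((query, default_field))
        else:
            # Objects may override the field per query.
            pairs.append((query["term"], query.get("field", default_field)))
    for number in actor_input.get("applicationNumbers", []):
        pairs.append((number, "application_number"))
    return pairs


example_input = {
    "queries": [
        "aspirin",
        {"term": "ibuprofen", "field": "ingredient"},
        {"term": "PFIZER", "field": "sponsor"},
    ],
    "applicationNumbers": ["NDA020639"],
    "searchField": "brand",
    "maxItems": 100,
    "includeRawRecord": False,
}
pairs = expand_queries(example_input)
```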

Output example

{
  "searchTerm": "aspirin",
  "searchField": "brand",
  "applicationNumber": "NDA020639",
  "sponsorName": "BAYER HEALTHCARE LLC",
  "brandNames": ["ASPIRIN"],
  "activeIngredients": ["ASPIRIN"],
  "dosageForms": ["TABLET"],
  "routes": ["ORAL"],
  "products": [],
  "submissions": [],
  "patentDataAvailable": false,
  "exclusivityDataAvailable": false
}
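
Because several output fields are arrays, you may want one cell per field when loading rows into a spreadsheet yourself. The sketch below joins list values with "; " and writes a small CSV in memory. The column choice, the separator, and the function name are assumptions for this example, separate from Apify's built-in export formats.

```python
import csv
import io

LIST_FIELDS = ["brandNames", "activeIngredients", "dosageForms", "routes"]


def to_csv(rows) -> str:
    """Write rows to CSV text, joining list fields into single cells."""
    columns = ["applicationNumber", "sponsorName"] + LIST_FIELDS
    buffer = io.StringIO()
    # extrasaction="ignore" drops nested fields such as products and submissions.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        flat = dict(row)
        for name in LIST_FIELDS:
            flat[name] = "; ".join(row.get(name, []))
        writer.writerow(flat)
    return buffer.getvalue()


output_row = {
    "searchTerm": "aspirin",
    "applicationNumber": "NDA020639",
    "sponsorName": "BAYER HEALTHCARE LLC",
    "brandNames": ["ASPIRIN"],
    "activeIngredients": ["ASPIRIN"],
    "dosageForms": ["TABLET"],
    "routes": ["ORAL"],
    "products": [],
}
csv_lines = to_csv([output_row]).splitlines()
```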

How much does it cost to scrape FDA Orange Book data?

This actor uses pay-per-event pricing.

  • A small start fee is charged once per run.
  • A per-record fee is charged for each FDA application record saved.
  • Your final cost depends on the number of matching FDA application records and your Apify plan tier.

For most targeted application-number or brand-name lookups, runs are small and inexpensive.
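
The pricing model above reduces to a start fee plus a per-record fee times the number of saved records. The sketch below makes that arithmetic explicit. The fee values in the example call are placeholders, not the actor's actual prices; check the pricing panel on the actor page for the current numbers for your plan.

```python
def estimate_cost(records: int, start_fee: float, per_record_fee: float) -> float:
    """Estimate run cost: one start fee plus a fee per saved record."""
    return round(start_fee + records * per_record_fee, 6)


# Placeholder fees for illustration only.
example_cost = estimate_cost(records=100, start_fee=0.01, per_record_fee=0.002)
```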

How to run it

  1. Open the actor on Apify.
  2. Add one or more search terms.
  3. Choose the default search field.
  4. Optionally add exact application numbers.
  5. Set maxItems to cap the export size.
  6. Start the run.
  7. Download the dataset as JSON, CSV, Excel, or via API.

Tips for best results

  • Use exact application numbers when you know them.
  • Use ingredient for portfolio research by active ingredient.
  • Use sponsor for applicant-level monitoring.
  • Use raw only when you already know openFDA query syntax.
  • Keep maxItems low for quick smoke tests.
  • Enable includeRawRecord for compliance audits or custom transformations.
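
To see why a low maxItems keeps test runs quick, it helps to think of the export as a series of paged requests that stop once the cap is reached. The sketch below plans `(limit, skip)` pairs for openFDA's paging parameters. The page size of 100 and the function name are assumptions for this example; the actor's real paging strategy is not documented here.

```python
def plan_pages(max_items: int, page_size: int = 100) -> list:
    """Return (limit, skip) pairs that together cover at most max_items records."""
    pages = []
    skip = 0
    while skip < max_items:
        limit = min(page_size, max_items - skip)  # the last page may be shorter
        pages.append((limit, skip))
        skip += limit
    return pages


smoke_test = plan_pages(5)
full_run = plan_pages(250)
```

With a cap of 5 the plan is a single small request, while a cap of 250 needs three.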

Patent and exclusivity fields

The dataset includes patent and exclusivity fields so that its schema stays compatible with workflows that expect them.

This version relies only on the public openFDA Drugs@FDA API. If patent or exclusivity data is not present in that source, the actor sets:

  • patentDataAvailable: false
  • patents: []
  • exclusivityDataAvailable: false
  • exclusivities: []

This makes downstream schemas stable while avoiding unreliable scraping of blocked FDA download pages.
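
A minimal sketch of this stable-schema idea follows: every row gets the four patent and exclusivity keys, and the availability flags are derived from whether any records were found. The `patents` and `exclusivities` keys read from the source record are hypothetical, since the page does not say where such data would come from, and the function name is an assumption.

```python
def attach_patent_fields(row: dict, source: dict) -> dict:
    """Add patent/exclusivity fields with stable defaults (illustrative sketch)."""
    # Hypothetical source keys; missing or None values become empty lists.
    patents = list(source.get("patents") or [])
    exclusivities = list(source.get("exclusivities") or [])
    return {
        **row,
        "patentDataAvailable": bool(patents),
        "patents": patents,
        "exclusivityDataAvailable": bool(exclusivities),
        "exclusivities": exclusivities,
    }


stable_row = attach_patent_fields({"applicationNumber": "NDA020639"}, {})
```

Downstream code can then read `row["patents"]` without first checking whether the key exists.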

Integrations

You can connect the dataset to:

  • Google Sheets for regulatory watchlists.
  • Snowflake or BigQuery for pharma analytics.
  • CRM enrichment pipelines for sponsor intelligence.
  • Internal dashboards that monitor generic-entry opportunities.
  • Apify webhooks for scheduled portfolio updates.

API usage with Node.js

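import { ApifyClient } from 'apify-client';

// Read the API token from the environment rather than hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/fda-orange-book-scraper').call({
    queries: ['aspirin'],
    searchField: 'brand',
    maxItems: 100,
});

// The run's default dataset holds the exported rows.
console.log(run.defaultDatasetId);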

API usage with Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/fda-orange-book-scraper').call(run_input={
    'queries': ['aspirin'],
    'searchField': 'brand',
    'maxItems': 100,
})

# The run's default dataset holds the exported rows.
print(run['defaultDatasetId'])
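
Once a run has finished, the rows can be read back from its default dataset. The sketch below counts rows per sponsor using the Python client's `dataset(...).iterate_items()` iterator. It takes the client as a parameter, so the snippet itself does not import `apify_client`; the function name and the "UNKNOWN" fallback label are choices made for this example.

```python
from collections import Counter


def count_sponsors(client, dataset_id: str) -> Counter:
    """Count dataset rows per sponsorName for a finished run."""
    items = client.dataset(dataset_id).iterate_items()
    return Counter(item.get("sponsorName", "UNKNOWN") for item in items)
```

With an `ApifyClient` instance and a finished run, `count_sponsors(client, run['defaultDatasetId']).most_common(10)` would list the ten sponsors with the most rows.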

API usage with cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~fda-orange-book-scraper/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"queries":["aspirin"],"searchField":"brand","maxItems":100}'

MCP integration

Use Apify MCP to call this scraper from Claude Desktop, Claude Code, or other MCP clients.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/fda-orange-book-scraper

Claude Code setup:

claude mcp add --transport http apify-fda-orange-book "https://mcp.apify.com/?tools=automation-lab/fda-orange-book-scraper"

Claude Desktop JSON config:

{
  "mcpServers": {
    "apify-fda-orange-book": {
      "url": "https://mcp.apify.com/?tools=automation-lab/fda-orange-book-scraper"
    }
  }
}

Example prompts:

  • "Export FDA Orange Book records for ibuprofen and summarize the sponsors."
  • "Find Drugs@FDA applications for sponsor PFIZER and group by active ingredient."
  • "Run an application-number lookup for NDA020639 and return the product strengths."

Scheduling

For monitoring workflows, schedule the actor daily, weekly, or monthly.

Common schedules include:

  • Weekly sponsor monitoring.
  • Monthly ingredient portfolio exports.
  • Quarterly regulatory database refreshes.
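
For scheduled monitoring, the useful result is often the difference between two runs. The sketch below compares the rows of a previous export with the current one and returns applications that are new. It assumes you have stored the earlier rows yourself; the function name is illustrative.

```python
def new_applications(previous_rows, current_rows) -> list:
    """Return rows whose applicationNumber did not appear in the previous export."""
    known = {row["applicationNumber"] for row in previous_rows}
    return [row for row in current_rows if row["applicationNumber"] not in known]


# Made-up application numbers for illustration only.
last_week = [{"applicationNumber": "ANDA000001"}, {"applicationNumber": "ANDA000002"}]
this_week = [{"applicationNumber": "ANDA000002"}, {"applicationNumber": "ANDA000003"}]
added = new_applications(last_week, this_week)
```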

Data quality notes

The actor reports the data returned by openFDA Drugs@FDA.

It does not provide medical advice.

Always verify regulatory decisions against official FDA systems and primary records.

Legality and responsible use

This actor uses public FDA/openFDA data.

You are responsible for how you use exported data, including compliance with your organization’s regulatory, medical, and legal review processes.

FAQ and troubleshooting

Why did my search return no rows?

Try a different search mode. For example, use ingredient for active ingredients and application_number for NDA/ANDA/BLA identifiers.

Why are patent arrays empty?

The MVP uses the reliable openFDA Drugs@FDA API. Patent/exclusivity download pages may be unavailable or blocked from automated environments, so the actor marks those fields unavailable when the source does not provide them.

How do I get the original FDA JSON?

Set includeRawRecord to true.


Changelog

Initial version:

  • Public openFDA Drugs@FDA search.
  • Brand, generic, ingredient, sponsor, application-number, and raw query modes.
  • Application, product, submission, and openFDA identifier fields.

Support

If you need a missing field, include an example application number and describe the workflow you are trying to automate.

Final note

FDA Orange Book Scraper is designed for practical, repeatable exports, not one-off manual lookups.

Use it whenever your team needs FDA drug application data in a dataset, scheduled job, or API pipeline.