EMA Medicines Scraper 💊

Pricing

Pay per usage

Scrape European Medicines Agency data for drug approvals, clinical trials & pharmaceutical information. Extract EMA medicines, regulatory documents & authorization data at scale. Perfect for pharma research & compliance.

Developer

Shahid Irfan

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

2 days ago

Last modified

European Medicines Agency Medicines Scraper

Extract structured medicines data from the European Medicines Agency in a format ready for analysis and monitoring. Collect medicine status, authorisation timelines, therapeutic information, product numbers, and official medicine URLs in one run. Built for regulatory research, portfolio tracking, and automated reporting workflows.

Features

  • EMA medicines coverage: Collect records for human and veterinary medicines published by EMA.
  • Keyword filtering: Narrow output using medicine names, INN/common names, status terms, or therapeutic keywords.
  • URL-aware input: Accept EMA search URLs, medicine detail URLs, or direct JSON report URLs.
  • Pagination control: Limit extraction with results_wanted and max_pages for predictable run sizes.
  • Clean dataset output: Excludes null and empty values from each dataset item.
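
Because null and empty values are excluded, downstream code should not expect every field on every item. The sketch below illustrates what such cleaning could look like. The helper names `is_empty` and `drop_empty` are hypothetical, and the exact rules the actor applies (for example, whether whitespace-only strings count as empty) are an assumption.

```python
def is_empty(value) -> bool:
    """Treat None, blank strings, and empty containers as empty (assumed rules)."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, dict, tuple, set)):
        return len(value) == 0
    return False


def drop_empty(record: dict) -> dict:
    """Return a copy of the record without empty values."""
    return {key: value for key, value in record.items() if not is_empty(value)}


raw = {
    "name_of_medicine": "Nuwiq",
    "category": "Human",
    "therapeutic_area_mesh": "",           # blank: dropped
    "marketing_authorisation_date": None,  # null: dropped
}
cleaned = drop_empty(raw)
print(cleaned)
```

When reading items, prefer `item.get("field")` over `item["field"]` so that a missing field does not raise an error.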

Use Cases

Regulatory Intelligence

Track medicine status changes, approval timelines, and updates for compliance and policy teams.

Portfolio Monitoring

Monitor medicine records by product name, INN, and therapeutic area for internal reporting.

Research Pipelines

Feed downstream BI tools and custom analytics with structured, machine-readable medicine records.

Competitive Benchmarking

Compare categories, authorisation timelines, and medicine status patterns across products.

Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | String | No | https://www.ema.europa.eu/en/search?search_api_fulltext=nuwiq | Optional EMA URL (search URL, medicine page URL, or JSON report URL). |
| keyword | String | No | nuwiq | Optional keyword filter. If provided, this takes priority over URL-derived keyword values. |
| results_wanted | Integer | No | 20 | Maximum number of records to return. |
| max_pages | Integer | No | 5 | Page cap used for slicing (20 records per page). |
| proxyConfiguration | Object | No | { "useApifyProxy": false } | Proxy settings for restricted environments. |
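
The two limits can be reasoned about together. The sketch below assumes that both caps apply and the smaller one wins; this interaction is an assumption, not something the parameter descriptions confirm. The 20-records-per-page figure comes from the max_pages description, and `effective_limit` is a hypothetical helper name.

```python
RECORDS_PER_PAGE = 20  # taken from the max_pages description


def effective_limit(results_wanted: int = 20, max_pages: int = 5) -> int:
    """Estimate the largest number of records a run could return.

    Assumption: both caps apply and the smaller one wins.
    """
    return max(0, min(results_wanted, max_pages * RECORDS_PER_PAGE))


print(effective_limit())        # defaults: 20
print(effective_limit(30, 3))   # 3 pages allow 60 records, so 30
print(effective_limit(500, 5))  # 5 pages allow only 100 records, so 100
```

Under this assumption, raising results_wanted above max_pages × 20 has no effect unless max_pages is raised as well.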

Output Data

Each item in the dataset can include the following fields:

| Field | Type | Description |
| --- | --- | --- |
| name_of_medicine | String | Medicine name |
| category | String | Human or Veterinary |
| medicine_status | String | Current medicine status |
| international_non_proprietary_name_common_name | String | INN/common name |
| therapeutic_area_mesh | String | Therapeutic area |
| marketing_authorisation_date | String | Marketing authorisation date |
| last_updated_date | String | Last update date |
| ema_product_number | String | EMA product number |
| medicine_url | String | Official EMA medicine page |
| source_api_url | String | Source feed URL used during extraction |

Usage Examples

Basic run with defaults

{}

Keyword-based extraction

{
  "keyword": "breast neoplasms",
  "results_wanted": 30,
  "max_pages": 3
}

Start from an EMA search URL

{
  "url": "https://www.ema.europa.eu/en/search?search_api_fulltext=nuwiq&page=0",
  "results_wanted": 10
}

Specific medicine page URL

{
  "url": "https://www.ema.europa.eu/en/medicines/human/EPAR/nuwiq"
}

Sample Output

{
  "category": "Human",
  "name_of_medicine": "Nuwiq",
  "ema_product_number": "EMEA/H/C/002813",
  "medicine_status": "Authorised",
  "international_non_proprietary_name_common_name": "simoctocog alfa",
  "therapeutic_area_mesh": "Hemophilia A",
  "marketing_authorisation_date": "22/07/2014",
  "last_updated_date": "21/05/2026",
  "medicine_url": "https://www.ema.europa.eu/en/medicines/human/EPAR/nuwiq",
  "source_api_url": "https://www.ema.europa.eu/en/documents/report/medicines-output-medicines_json-report_en.json"
}
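
Dates in the sample are strings in day/month/year order (22/07/2014 can only be read day-first). For sorting or timeline analysis it may help to convert them to real dates. The sketch below assumes every date field uses the same DD/MM/YYYY format as the sample; `parse_ema_date` and `with_parsed_dates` are hypothetical helper names.

```python
from datetime import date, datetime

# Date fields listed in the output table.
DATE_FIELDS = ("marketing_authorisation_date", "last_updated_date")


def parse_ema_date(text: str) -> date:
    """Parse a DD/MM/YYYY string (format assumed from the sample output)."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def with_parsed_dates(item: dict) -> dict:
    """Return a copy of the item with any present date fields converted."""
    parsed = dict(item)
    for field in DATE_FIELDS:
        if field in parsed:  # fields may be absent from an item
            parsed[field] = parse_ema_date(parsed[field])
    return parsed


item = {"name_of_medicine": "Nuwiq", "marketing_authorisation_date": "22/07/2014"}
result = with_parsed_dates(item)
print(result["marketing_authorisation_date"].isoformat())  # 2014-07-22
```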

Tips for Best Results

Use keyword for focused extraction

Use specific medicine or therapeutic terms to reduce dataset size and improve relevance.

Control run size with limits

Start with results_wanted: 20 to validate output quickly, then increase for production runs.

Prefer explicit URLs when needed

Use a medicine page URL to target one medicine or a search URL to carry query context.

Integrations

Connect extracted data with:

  • Google Sheets: Build tracking sheets
  • Airtable: Create searchable medicine databases
  • Make: Automate medicine monitoring workflows
  • Zapier: Trigger downstream notifications and actions
  • Webhooks: Push results to internal systems

Export formats available from dataset:

  • JSON
  • CSV
  • Excel
  • XML
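
If you post-process the JSON items yourself instead of using a built-in export, keep in mind that items may carry different sets of fields, since empty values are excluded. The following sketch uses only Python's standard library to write such items to CSV with a header built from the union of all keys. `items_to_csv` is a hypothetical helper, and the second record is made up for illustration.

```python
import csv
import io


def items_to_csv(items: list) -> str:
    """Serialise dataset items to CSV, filling missing fields with blanks."""
    # Build the header from the union of keys, keeping first-seen order.
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


items = [
    {"name_of_medicine": "Nuwiq", "medicine_status": "Authorised"},
    {"name_of_medicine": "ExampleVet", "category": "Veterinary"},  # made-up record
]
csv_text = items_to_csv(items)
print(csv_text)
```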

Frequently Asked Questions

Does user input override defaults?

Yes. Values provided in the run input always override schema prefills and local defaults.

What if both url and keyword are provided?

keyword is used as the primary text filter. URL values still help with page offset and URL-specific targeting.
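
The precedence described above can be pictured roughly as follows. This is a sketch only: the query parameter names `search_api_fulltext` and `page` are taken from the example search URL in this document, while the function `resolve_query` and its exact fallback logic are assumptions about how the actor might behave, not its actual implementation.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def resolve_query(url: Optional[str] = None, keyword: Optional[str] = None) -> dict:
    """Pick the text filter and page offset from the run input (assumed logic)."""
    params = parse_qs(urlparse(url).query) if url else {}

    # Keyword found in the URL, if any.
    url_keyword = params.get("search_api_fulltext", [None])[0]

    # Page offset from the URL; fall back to 0 when missing or non-numeric.
    raw_page = params.get("page", ["0"])[0]
    page = int(raw_page) if raw_page.isdigit() else 0

    # An explicit keyword takes priority over the URL-derived one.
    return {"keyword": keyword or url_keyword, "page": page}


both = resolve_query(
    url="https://www.ema.europa.eu/en/search?search_api_fulltext=nuwiq&page=2",
    keyword="breast neoplasms",
)
print(both)  # {'keyword': 'breast neoplasms', 'page': 2}
```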

Can I run without any input?

Yes. Defaults are provided for QA and quick start, and produce non-empty output.