EMA Medicines Scraper - European Drug Authorisation Register

Pricing: Pay per event

Extract EU drug authorisation data from the European Medicines Agency (EMA) register. Human and veterinary medicines: active substance, ATC code, MAH, authorisation status, orphan/biosimilar/generic flags, and product URLs. Filter by category, status, therapeutic area, or ATC code.

Developer: BowTiedRaccoon

Maintained by Community

Extract the complete European Medicines Agency (EMA) centralised medicines authorisation register. Covers all human and veterinary medicines that have received, or applied for, a centralised EU marketing authorisation.

Data source: EMA nightly XLSX bulk export at ema.europa.eu. Updated every night by the EMA.


What data does it extract?

Each record corresponds to one medicine and includes:

| Field | Description |
| --- | --- |
| medicine_name | Brand name |
| category | Human or Veterinary |
| ema_product_number | EMA product number (e.g. EMEA/H/C/004781) |
| authorisation_status | Authorised, Withdrawn, Refused, or Suspended |
| inn | International Non-proprietary Name / common name |
| active_substance | Active substance(s) |
| therapeutic_area | Therapeutic area (MeSH terms) |
| atc_code | ATC code (human) or ATCvet code (veterinary) |
| pharmacotherapeutic_group | Pharmacotherapeutic group |
| marketing_authorisation_holder | MAH company name |
| first_authorised_date | First EU marketing authorisation date (DD/MM/YYYY) |
| orphan_designation | Orphan medicine designation flag |
| biosimilar | Biosimilar flag |
| generic_or_hybrid | Generic or hybrid application flag |
| conditional_marketing_authorisation | Conditional approval flag |
| additional_monitoring | Additional monitoring (black triangle) flag |
| accelerated_assessment | Accelerated assessment flag |
| exceptional_circumstances | Exceptional circumstances flag |
| product_url | EMA product page URL |

Input options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| medicineCategory | String | human | Filter: human, veterinary, or leave blank for all |
| authorisationStatus | String | Authorised | Filter: Authorised, Withdrawn, Refused, Suspended, or blank for all |
| therapeuticArea | String | (blank) | Filter by therapeutic area substring, case-insensitive (e.g. Diabetes) |
| atcCode | String | (blank) | Filter by ATC code prefix (e.g. L01 for antineoplastics) |
| authorisationDateFrom | String | (blank) | Include only medicines authorised on or after this date (YYYY-MM-DD or DD/MM/YYYY) |
| authorisationDateTo | String | (blank) | Include only medicines authorised on or before this date (YYYY-MM-DD or DD/MM/YYYY) |
| maxItems | Integer | 15 | Maximum number of records to return (0 = all) |
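To make the filter semantics concrete, here is a minimal Python sketch of how the options above could be applied to already-parsed records. It is not the actor's own code (the actor runs on Node.js). The helper names `parse_date` and `apply_filters` are hypothetical, and the handling of records with a missing date is an assumption rather than documented behaviour.

```python
from datetime import date, datetime
from typing import Iterable, Iterator, Optional


def parse_date(text: str) -> Optional[date]:
    """Parse YYYY-MM-DD or DD/MM/YYYY; return None for blank or unrecognised input."""
    text = (text or "").strip()
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def apply_filters(records: Iterable[dict], options: dict) -> Iterator[dict]:
    """Yield records that pass every non-blank filter, stopping at maxItems."""
    category = (options.get("medicineCategory") or "").lower()
    status = (options.get("authorisationStatus") or "").lower()
    area = (options.get("therapeuticArea") or "").lower()
    atc = (options.get("atcCode") or "").upper()
    date_from = parse_date(options.get("authorisationDateFrom", ""))
    date_to = parse_date(options.get("authorisationDateTo", ""))
    max_items = options.get("maxItems", 15)  # 0 means "no limit"

    emitted = 0
    for rec in records:
        if category and rec.get("category", "").lower() != category:
            continue
        if status and rec.get("authorisation_status", "").lower() != status:
            continue
        # Therapeutic area is a case-insensitive substring match.
        if area and area not in rec.get("therapeutic_area", "").lower():
            continue
        # ATC code is a prefix match, e.g. "L01" matches "L01FF02".
        if atc and not rec.get("atc_code", "").upper().startswith(atc):
            continue
        if date_from or date_to:
            authorised = parse_date(rec.get("first_authorised_date", ""))
            # Assumption: a record without a parseable date fails any date filter.
            if authorised is None:
                continue
            if date_from and authorised < date_from:
                continue
            if date_to and authorised > date_to:
                continue
        yield rec
        emitted += 1
        if max_items and emitted >= max_items:
            return
```

Because the function is a generator, it stops reading records as soon as `maxItems` is reached, so a small limit does not require scanning the full register.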

How it works

The actor downloads EMA's nightly XLSX bulk export (approximately 885 KB, ~2,700 records) using a single HTTP request. No browser automation, no pagination, no proxy required. The XLSX is parsed in-memory using Node.js built-in modules, then filtered and saved to the Apify dataset.

Performance: Typically completes in under 10 seconds.

Memory: 256 MB is sufficient. The actor is configured for 512 MB to be safe.
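An XLSX file is a ZIP archive of XML parts, which is why it can be read in memory without a spreadsheet library. The sketch below illustrates that idea in Python with only the standard library. It is not the actor's Node.js implementation. The function names are hypothetical, and it assumes the data sits in the first worksheet part (`xl/worksheets/sheet1.xml`). Locating the header row in the EMA export is left to the caller.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}
T_TAG = f"{{{NS['m']}}}t"  # fully qualified name of the <t> text element


def column_index(cell_ref: str) -> int:
    """Convert a cell reference such as 'C7' to a zero-based column index (2)."""
    letters = re.match(r"[A-Z]+", cell_ref).group(0)
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def read_first_sheet(xlsx_bytes: bytes) -> list[list[str]]:
    """Return the first worksheet as rows of strings, entirely in memory."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        shared = []
        if "xl/sharedStrings.xml" in archive.namelist():
            root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
            for si in root.findall("m:si", NS):
                # A shared string may be split into several rich-text runs.
                shared.append("".join(t.text or "" for t in si.iter(T_TAG)))
        # Assumption: the data lives in the first worksheet part.
        sheet = ET.fromstring(archive.read("xl/worksheets/sheet1.xml"))

    rows = []
    for row in sheet.findall("m:sheetData/m:row", NS):
        values: list[str] = []
        for cell in row.findall("m:c", NS):
            ref = cell.get("r")
            col = column_index(ref) if ref else len(values)
            # Empty cells are omitted from the XML, so pad up to this column.
            values.extend([""] * (col - len(values) + 1))
            kind = cell.get("t")
            v = cell.find("m:v", NS)
            if kind == "s" and v is not None:
                values[col] = shared[int(v.text)]  # index into shared strings
            elif kind == "inlineStr":
                values[col] = "".join(t.text or "" for t in cell.iter(T_TAG))
            elif v is not None:
                values[col] = v.text or ""  # numbers and dates stay as raw text
        rows.append(values)
    return rows
```

The bytes passed to `read_first_sheet` would come from the single HTTP download. Date cells stored as spreadsheet serial numbers are returned as raw text here and would need a separate conversion step.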


Example run

Input:

{
  "medicineCategory": "human",
  "authorisationStatus": "Authorised",
  "atcCode": "L01",
  "maxItems": 5
}

Sample output record:

{
  "medicine_name": "Keytruda",
  "category": "Human",
  "ema_product_number": "EMEA/H/C/003820",
  "authorisation_status": "Authorised",
  "inn": "pembrolizumab",
  "active_substance": "pembrolizumab",
  "therapeutic_area": "Melanoma; Carcinoma, Non-Small-Cell Lung; ...",
  "atc_code": "L01FF02",
  "pharmacotherapeutic_group": "Antineoplastic agents, monoclonal antibodies",
  "marketing_authorisation_holder": "Merck Sharp & Dohme B.V.",
  "first_authorised_date": "17/07/2015",
  "orphan_designation": false,
  "biosimilar": false,
  "generic_or_hybrid": false,
  "conditional_marketing_authorisation": false,
  "additional_monitoring": true,
  "accelerated_assessment": false,
  "exceptional_circumstances": false,
  "product_url": "https://www.ema.europa.eu/en/medicines/human/EPAR/keytruda"
}
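Downstream code often needs the date in a sortable form and the ATC code at a coarser level. The following Python sketch shows one way to derive both from a record shaped like the sample above. The function name and the added keys `first_authorised_iso` and `atc_level2` are hypothetical and are not fields the actor emits. The treatment of ATCvet codes (a leading `Q`, so one extra character is kept) is an assumption about the veterinary records.

```python
from datetime import datetime


def normalise_record(record: dict) -> dict:
    """Return a copy of the record with an ISO date and ATC second-level group added."""
    out = dict(record)

    raw = record.get("first_authorised_date") or ""
    try:
        # The actor reports dates as DD/MM/YYYY; ISO 8601 sorts lexicographically.
        out["first_authorised_iso"] = datetime.strptime(raw, "%d/%m/%Y").date().isoformat()
    except ValueError:
        out["first_authorised_iso"] = None

    code = (record.get("atc_code") or "").strip()
    # Human ATC: the first three characters are the second-level group (e.g. L01).
    # Assumption: ATCvet codes carry a leading "Q", so keep four characters.
    width = 4 if code.startswith("Q") else 3
    out["atc_level2"] = code[:width] if len(code) >= width else None
    return out


sample = {"first_authorised_date": "17/07/2015", "atc_code": "L01FF02"}
print(normalise_record(sample)["first_authorised_iso"])  # 2015-07-17
print(normalise_record(sample)["atc_level2"])            # L01
```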

Use cases

  • Pharma intelligence: Monitor which medicines have EU authorisation and track MAH portfolios
  • Biotech business development: Identify orphan, biosimilar, or conditionally approved medicines
  • Regulatory consulting: Track EU status of medicines by active substance or therapeutic area
  • Academic research: Build datasets of authorised medicines by ATC code or indication
  • Generics manufacturers: Identify authorised generic/hybrid medicines
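As an illustration of the portfolio and flag-based use cases, the sketch below counts medicines per marketing authorisation holder and per regulatory flag from a list of output records. It is a hypothetical post-processing helper written in Python, not part of the actor, and it assumes the flag fields are booleans as in the sample output record.

```python
from collections import Counter

# Flag fields taken from the output schema; extend the tuple as needed.
FLAGS = (
    "orphan_designation",
    "biosimilar",
    "generic_or_hybrid",
    "conditional_marketing_authorisation",
)


def portfolio_summary(records):
    """Return (medicines per MAH, medicines per flag) as two Counters."""
    by_holder = Counter()
    by_flag = Counter()
    for rec in records:
        holder = rec.get("marketing_authorisation_holder") or "(unknown)"
        by_holder[holder] += 1
        for flag in FLAGS:
            if rec.get(flag) is True:
                by_flag[flag] += 1
    return by_holder, by_flag
```

Calling `by_holder.most_common(10)` on the first result would list the ten holders with the most medicines in the filtered dataset.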

Notes

  • Data is updated nightly by the EMA. Each actor run downloads the latest version.
  • The dataset covers approximately 2,700 medicines in the centralised authorisation procedure. Nationally authorised medicines are not included.
  • Withdrawn medicines remain in the dataset with status Withdrawn.