EMA Medicines Scraper - European Drug Authorisation Register
Pricing
Pay per event
Extract EU drug authorisation data from the European Medicines Agency (EMA) register. Human and veterinary medicines: active substance, ATC code, MAH, authorisation status, orphan/biosimilar/generic flags, and product URLs. Filter by category, status, therapeutic area, or ATC code.
Developer
BowTiedRaccoon
EMA Medicines Scraper — European Drug Authorisation Register
Extract the complete European Medicines Agency (EMA) centralised medicines authorisation register. Covers all human and veterinary medicines that have received, or applied for, a centralised EU marketing authorisation.
Data source: the EMA's nightly XLSX bulk export at ema.europa.eu, which the EMA refreshes every night.
What data does it extract?
Each record corresponds to one medicine and includes:
| Field | Description |
|---|---|
| medicine_name | Brand name |
| category | Human or Veterinary |
| ema_product_number | EMA product number (e.g. EMEA/H/C/004781) |
| authorisation_status | Authorised, Withdrawn, Refused, or Suspended |
| inn | International Non-proprietary Name / common name |
| active_substance | Active substance(s) |
| therapeutic_area | Therapeutic area (MeSH terms) |
| atc_code | ATC code (human) or ATCvet code (veterinary) |
| pharmacotherapeutic_group | Pharmacotherapeutic group |
| marketing_authorisation_holder | MAH company name |
| first_authorised_date | First EU marketing authorisation date (DD/MM/YYYY) |
| orphan_designation | Orphan medicine designation flag |
| biosimilar | Biosimilar flag |
| generic_or_hybrid | Generic or hybrid application flag |
| conditional_marketing_authorisation | Conditional approval flag |
| additional_monitoring | Additional monitoring (black triangle) flag |
| accelerated_assessment | Accelerated assessment flag |
| exceptional_circumstances | Exceptional circumstances flag |
| product_url | EMA product page URL |
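Because first_authorised_date arrives as a DD/MM/YYYY string, sorting or comparing records usually means converting it first. The following is a minimal Python sketch of that conversion. The helper name `parse_first_authorised` is hypothetical, and treating blank or malformed values as "no date" is an assumption, not documented actor behaviour.

```python
from datetime import date, datetime

def parse_first_authorised(value):
    """Parse a DD/MM/YYYY string into a date; return None if blank or malformed."""
    if not value:
        return None
    try:
        return datetime.strptime(value.strip(), "%d/%m/%Y").date()
    except ValueError:
        # Assumption: anything that is not DD/MM/YYYY is treated as missing.
        return None

# Subset of the sample output record shown in this README.
record = {"medicine_name": "Keytruda", "first_authorised_date": "17/07/2015"}
authorised_on = parse_first_authorised(record["first_authorised_date"])
print(authorised_on.isoformat())  # 2015-07-17
```

Once converted, the values compare and sort as ordinary `date` objects.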
Input options
| Parameter | Type | Default | Description |
|---|---|---|---|
| medicineCategory | String | human | Filter: human, veterinary, or leave blank for all |
| authorisationStatus | String | Authorised | Filter: Authorised, Withdrawn, Refused, Suspended, or blank for all |
| therapeuticArea | String | (blank) | Filter by therapeutic area substring, case-insensitive (e.g. Diabetes) |
| atcCode | String | (blank) | Filter by ATC code prefix (e.g. L01 for antineoplastics) |
| authorisationDateFrom | String | (blank) | Include only medicines authorised on or after this date (YYYY-MM-DD or DD/MM/YYYY) |
| authorisationDateTo | String | (blank) | Include only medicines authorised on or before this date (YYYY-MM-DD or DD/MM/YYYY) |
| maxItems | Integer | 15 | Maximum number of records to return (0 = all) |
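To make the filter semantics concrete, here is a rough Python sketch of how the parameters above could combine. It is not the actor's source code. The function names (`parse_date`, `matches`, `select`) are hypothetical, and several details are assumptions: category, status, and ATC prefix are compared case-insensitively, and records without a parseable date are excluded whenever a date filter is set.

```python
from datetime import datetime

def parse_date(value):
    """Accept YYYY-MM-DD or DD/MM/YYYY; return None for blank or unparseable input."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except (ValueError, AttributeError):
            continue
    return None

def matches(record, filters):
    """Hypothetical predicate mirroring the documented filters; blank means no filter."""
    category = filters.get("medicineCategory")
    if category and record["category"].lower() != category.lower():
        return False
    status = filters.get("authorisationStatus")
    if status and record["authorisation_status"].lower() != status.lower():
        return False
    area = filters.get("therapeuticArea")
    if area and area.lower() not in record["therapeutic_area"].lower():
        return False  # substring match, case-insensitive
    prefix = filters.get("atcCode")
    if prefix and not record["atc_code"].upper().startswith(prefix.upper()):
        return False  # prefix match
    authorised = parse_date(record["first_authorised_date"])
    date_from = parse_date(filters.get("authorisationDateFrom") or "")
    date_to = parse_date(filters.get("authorisationDateTo") or "")
    if date_from and (authorised is None or authorised < date_from):
        return False
    if date_to and (authorised is None or authorised > date_to):
        return False
    return True

def select(records, filters):
    """Apply the filters, then cap the result at maxItems (0 = all, default 15)."""
    kept = [r for r in records if matches(r, filters)]
    limit = filters.get("maxItems", 15)
    return kept if limit == 0 else kept[:limit]

# Subset of the sample output record shown in this README.
keytruda = {
    "category": "Human",
    "authorisation_status": "Authorised",
    "therapeutic_area": "Melanoma; Carcinoma, Non-Small-Cell Lung",
    "atc_code": "L01FF02",
    "first_authorised_date": "17/07/2015",
}
filters = {"medicineCategory": "human", "authorisationStatus": "Authorised", "atcCode": "L01"}
print(matches(keytruda, filters))  # True
```

All filters are combined with AND, so each additional parameter can only narrow the result set.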
How it works
The actor downloads the EMA's nightly XLSX bulk export (approximately 885 KB, about 2,700 records) in a single HTTP request. It needs no browser automation, pagination, or proxy. The XLSX is parsed in memory with Node.js built-in modules, and the records are then filtered and saved to the Apify dataset.
Performance: Typically completes in under 10 seconds.
Memory: 256 MB is sufficient. The actor is configured for 512 MB to be safe.
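In-memory parsing with only built-in modules is feasible because an XLSX file is a ZIP archive of XML parts. The sketch below illustrates the idea in Python with the standard library. It is an analogue for illustration, not the actor's Node.js code. The function names are hypothetical, and it assumes the data sits in `xl/worksheets/sheet1.xml` with text stored in the shared strings table. It ignores inline strings, number formats, and empty cells, which XLSX omits from a row.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN}
TEXT_TAG = "{" + MAIN + "}t"

def read_rows(xlsx_bytes, sheet_path="xl/worksheets/sheet1.xml"):
    """Return worksheet rows as lists of cell strings, resolving shared strings."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        shared = []
        if "xl/sharedStrings.xml" in archive.namelist():
            root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
            for item in root.findall("m:si", NS):
                # An <si> entry may hold several <t> runs; join them into one string.
                shared.append("".join(t.text or "" for t in item.iter(TEXT_TAG)))
        sheet = ET.fromstring(archive.read(sheet_path))
    rows = []
    for row in sheet.iterfind(".//m:row", NS):
        cells = []
        for cell in row.findall("m:c", NS):
            value = cell.find("m:v", NS)
            text = value.text if value is not None else ""
            if cell.get("t") == "s":  # "s" marks an index into the shared strings table
                text = shared[int(text)]
            cells.append(text)
        rows.append(cells)
    return rows

def tiny_workbook():
    """Build only the two parts read_rows touches, so the sketch runs offline."""
    shared = ('<sst xmlns="' + MAIN + '"><si><t>medicine_name</t></si>'
              '<si><t>Keytruda</t></si></sst>')
    sheet = ('<worksheet xmlns="' + MAIN + '"><sheetData>'
             '<row r="1"><c r="A1" t="s"><v>0</v></c></row>'
             '<row r="2"><c r="A2" t="s"><v>1</v></c></row>'
             '</sheetData></worksheet>')
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/sharedStrings.xml", shared)
        archive.writestr("xl/worksheets/sheet1.xml", sheet)
    return buffer.getvalue()

print(read_rows(tiny_workbook()))  # [['medicine_name'], ['Keytruda']]
```

The whole file is held in memory as bytes and never written to disk, which is practical for an export of under 1 MB.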
Example run
Input:
    {
        "medicineCategory": "human",
        "authorisationStatus": "Authorised",
        "atcCode": "L01",
        "maxItems": 5
    }
Sample output record:
    {
        "medicine_name": "Keytruda",
        "category": "Human",
        "ema_product_number": "EMEA/H/C/003820",
        "authorisation_status": "Authorised",
        "inn": "pembrolizumab",
        "active_substance": "pembrolizumab",
        "therapeutic_area": "Melanoma; Carcinoma, Non-Small-Cell Lung; ...",
        "atc_code": "L01FF02",
        "pharmacotherapeutic_group": "Antineoplastic agents, monoclonal antibodies",
        "marketing_authorisation_holder": "Merck Sharp & Dohme B.V.",
        "first_authorised_date": "17/07/2015",
        "orphan_designation": false,
        "biosimilar": false,
        "generic_or_hybrid": false,
        "conditional_marketing_authorisation": false,
        "additional_monitoring": true,
        "accelerated_assessment": false,
        "exceptional_circumstances": false,
        "product_url": "https://www.ema.europa.eu/en/medicines/human/EPAR/keytruda"
    }
Use cases
- Pharma intelligence: Monitor which medicines have EU authorisation and track MAH portfolios
- Biotech business development: Identify orphan, biosimilar, or conditionally approved medicines
- Regulatory consulting: Track EU status of medicines by active substance or therapeutic area
- Academic research: Build datasets of authorised medicines by ATC code or indication
- Generics manufacturers: Identify authorised generic/hybrid medicines
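Several of these use cases come down to grouping or flag-filtering the exported records. The Python sketch below shows one possible approach. The function names are hypothetical, and the three records are made-up placeholders that use the output field names, not real EMA data.

```python
from collections import Counter

def portfolio_sizes(records):
    """Count currently authorised medicines per marketing authorisation holder."""
    return Counter(
        r["marketing_authorisation_holder"]
        for r in records
        if r["authorisation_status"] == "Authorised"
    )

def flagged(records, flag):
    """Return names of medicines whose boolean flag (e.g. orphan_designation) is true."""
    return [r["medicine_name"] for r in records if r.get(flag) is True]

# Placeholder records for illustration only.
records = [
    {"medicine_name": "Example A", "marketing_authorisation_holder": "Example Pharma B.V.",
     "authorisation_status": "Authorised", "orphan_designation": True, "biosimilar": False},
    {"medicine_name": "Example B", "marketing_authorisation_holder": "Example Pharma B.V.",
     "authorisation_status": "Authorised", "orphan_designation": False, "biosimilar": True},
    {"medicine_name": "Example C", "marketing_authorisation_holder": "Sample Biotech GmbH",
     "authorisation_status": "Withdrawn", "orphan_designation": False, "biosimilar": False},
]

print(portfolio_sizes(records).most_common())  # [('Example Pharma B.V.', 2)]
print(flagged(records, "orphan_designation"))  # ['Example A']
```

Running the actor with maxItems set to 0 and authorisationStatus left blank would give such an analysis the full register to work from.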
Notes
- Data is updated nightly by the EMA. Each actor run downloads the latest version.
- The dataset covers approximately 2,700 medicines in the centralised authorisation procedure. Nationally authorised medicines are not included.
- Withdrawn medicines remain in the dataset with status Withdrawn.