EMA Medicines Scraper
Pricing
Pay per usage
Scrape European Medicines Agency data for drug approvals, clinical trials & pharmaceutical information. Extract EMA medicines, regulatory documents & authorization data at scale. Perfect for pharma research & compliance.
Developer
Shahid Irfan
Maintained by Community
European Medicines Agency Medicines Scraper
Extract structured medicines data from the European Medicines Agency in a format ready for analysis and monitoring. Collect medicine status, authorisation timelines, therapeutic information, product numbers, and official medicine URLs in one run. Built for regulatory research, portfolio tracking, and automated reporting workflows.
Features
- EMA medicines coverage: Collect records for human and veterinary medicines published by EMA.
- Keyword filtering: Narrow output using medicine names, INN/common names, status terms, or therapeutic keywords.
- URL-aware input: Accept EMA search URLs, medicine detail URLs, or direct JSON report URLs.
- Pagination control: Limit extraction with `results_wanted` and `max_pages` for predictable run sizes.
- Clean dataset output: Excludes null and empty values from each dataset item.
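The "clean dataset output" behaviour can be pictured with a minimal sketch. The helper name `drop_empty` and the exact definition of "empty" (here: `None`, blank strings, and empty lists or dicts) are assumptions for illustration, not the actor's actual code.

```python
from typing import Any


def drop_empty(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of record without None, blank-string, or empty-container values.

    Hypothetical helper: the actor's real cleaning rules may differ.
    """
    cleaned: dict[str, Any] = {}
    for key, value in record.items():
        if value is None:
            continue
        if isinstance(value, str) and not value.strip():
            continue
        if isinstance(value, (list, dict)) and not value:
            continue
        cleaned[key] = value
    return cleaned


raw = {
    "name_of_medicine": "Nuwiq",
    "category": "Human",
    "therapeutic_area_mesh": "",
    "ema_product_number": None,
}
print(drop_empty(raw))  # {'name_of_medicine': 'Nuwiq', 'category': 'Human'}
```

Falsy but meaningful values such as `0` or `False` are kept in this sketch, since they are neither null nor empty.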
Use Cases
Regulatory Intelligence
Track medicine status changes, approval timelines, and updates for compliance and policy teams.
Portfolio Monitoring
Monitor medicine records by product name, INN, and therapeutic area for internal reporting.
Research Pipelines
Feed downstream BI tools and custom analytics with structured, machine-readable medicine records.
Competitive Benchmarking
Compare categories, authorisation timelines, and medicine status patterns across products.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `url` | String | No | `https://www.ema.europa.eu/en/search?search_api_fulltext=nuwiq` | Optional EMA URL (search URL, medicine page URL, or JSON report URL). |
| `keyword` | String | No | `nuwiq` | Optional keyword filter. If provided, this takes priority over URL-derived keyword values. |
| `results_wanted` | Integer | No | `20` | Maximum number of records to return. |
| `max_pages` | Integer | No | `5` | Page cap used for slicing (20 records per page). |
| `proxyConfiguration` | Object | No | `{ "useApifyProxy": false }` | Proxy settings for restricted environments. |
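As a rough illustration of how `results_wanted` and `max_pages` might interact, the sketch below assumes the two limits combine by taking the smaller of `results_wanted` and `max_pages * 20`. That combination rule and the names `PAGE_SIZE`, `effective_limit`, and `slice_records` are assumptions; only the 20-records-per-page figure comes from the table above.

```python
PAGE_SIZE = 20  # records per page, as stated in the max_pages description


def effective_limit(results_wanted: int = 20, max_pages: int = 5) -> int:
    """Upper bound on returned records, assuming both limits apply together."""
    return max(0, min(results_wanted, max_pages * PAGE_SIZE))


def slice_records(records: list, results_wanted: int = 20, max_pages: int = 5) -> list:
    """Trim an already-fetched list of records to the effective limit."""
    return records[: effective_limit(results_wanted, max_pages)]


print(effective_limit(30, 3))   # 30: results_wanted is the tighter limit
print(effective_limit(100, 2))  # 40: the page cap is the tighter limit
```

Under this assumption, raising `results_wanted` alone has no effect once `max_pages * 20` is reached, so both values need to grow together for larger runs.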
Output Data
Each item in the dataset can include the following fields:
| Field | Type | Description |
|---|---|---|
| `name_of_medicine` | String | Medicine name |
| `category` | String | Human or Veterinary |
| `medicine_status` | String | Current medicine status |
| `international_non_proprietary_name_common_name` | String | INN/common name |
| `therapeutic_area_mesh` | String | Therapeutic area |
| `marketing_authorisation_date` | String | Marketing authorisation date |
| `last_updated_date` | String | Last update date |
| `ema_product_number` | String | EMA product number |
| `medicine_url` | String | Official EMA medicine page |
| `source_api_url` | String | Source feed URL used during extraction |
Usage Examples
Basic run with defaults
{}
Keyword-based extraction
{"keyword": "breast neoplasms", "results_wanted": 30, "max_pages": 3}
Start from an EMA search URL
{"url": "https://www.ema.europa.eu/en/search?search_api_fulltext=nuwiq&page=0", "results_wanted": 10}
Specific medicine page URL
{"url": "https://www.ema.europa.eu/en/medicines/human/EPAR/nuwiq"}
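The inputs above can also be submitted programmatically. The following is a minimal sketch using the `apify-client` Python package; the helper names `build_run_input` and `run_scraper` are hypothetical, and the actor ID and API token are placeholders you would need to supply, since neither appears in this README.

```python
from typing import Any, Optional


def build_run_input(
    url: Optional[str] = None,
    keyword: Optional[str] = None,
    results_wanted: int = 20,
    max_pages: int = 5,
) -> dict[str, Any]:
    """Assemble a run input using the parameter names from the table above."""
    run_input: dict[str, Any] = {
        "results_wanted": results_wanted,
        "max_pages": max_pages,
    }
    if url:
        run_input["url"] = url
    if keyword:
        run_input["keyword"] = keyword
    return run_input


def run_scraper(token: str, actor_id: str, run_input: dict[str, Any]) -> list[dict]:
    """Start the actor, wait for it to finish, and return its dataset items."""
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    example_input = build_run_input(keyword="breast neoplasms", results_wanted=30, max_pages=3)
    print(example_input)
    # Placeholders below are not real values; replace them before running:
    # items = run_scraper("<YOUR_APIFY_TOKEN>", "<username>/<actor-name>", example_input)
```

Keeping input construction separate from the network call makes it easy to validate the payload locally before spending platform usage on a run.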
Sample Output
{
  "category": "Human",
  "name_of_medicine": "Nuwiq",
  "ema_product_number": "EMEA/H/C/002813",
  "medicine_status": "Authorised",
  "international_non_proprietary_name_common_name": "simoctocog alfa",
  "therapeutic_area_mesh": "Hemophilia A",
  "marketing_authorisation_date": "22/07/2014",
  "last_updated_date": "21/05/2026",
  "medicine_url": "https://www.ema.europa.eu/en/medicines/human/EPAR/nuwiq",
  "source_api_url": "https://www.ema.europa.eu/en/documents/report/medicines-output-medicines_json-report_en.json"
}
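Date fields arrive as strings, so downstream analysis usually needs them parsed. The sketch below assumes every record uses the day/month/year layout seen in the sample (`22/07/2014`); that assumption and the helper name `parse_ema_date` are illustrative only, and unparseable values are returned as `None` rather than raising.

```python
from datetime import date, datetime
from typing import Optional

DATE_FORMAT = "%d/%m/%Y"  # matches the sample output; assumed for other records


def parse_ema_date(value: Optional[str]) -> Optional[date]:
    """Convert a DD/MM/YYYY string to a date, or None if missing or malformed."""
    if not value:
        return None
    try:
        return datetime.strptime(value, DATE_FORMAT).date()
    except ValueError:
        return None


item = {
    "name_of_medicine": "Nuwiq",
    "marketing_authorisation_date": "22/07/2014",
}
# .get() is used because empty fields are omitted from dataset items.
authorised = parse_ema_date(item.get("marketing_authorisation_date"))
updated = parse_ema_date(item.get("last_updated_date"))
print(authorised.isoformat())  # 2014-07-22
print(updated)                 # None (field absent in this item)
```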
Tips for Best Results
Use keyword for focused extraction
Use specific medicine or therapeutic terms to reduce dataset size and improve relevance.
Control run size with limits
Start with results_wanted: 20 to validate output quickly, then increase for production runs.
Prefer explicit URLs when needed
Use a medicine page URL to target one medicine or a search URL to carry query context.
Integrations
Connect extracted data with:
- Google Sheets: Build tracking sheets
- Airtable: Create searchable medicine databases
- Make: Automate medicine monitoring workflows
- Zapier: Trigger downstream notifications and actions
- Webhooks: Push results to internal systems
Export formats available from dataset:
- JSON
- CSV
- Excel
- XML
Frequently Asked Questions
Does user input override defaults?
Yes. Values provided in the run input always override schema prefills and local defaults.
What if both url and keyword are provided?
keyword is used as the primary text filter. URL values still help with page offset and URL-specific targeting.
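One way to picture this precedence is the sketch below. It assumes the URL-derived keyword comes from the `search_api_fulltext` query parameter and the page offset from `page`, as in the example search URLs; the function name `resolve_query` and the fallback to page 0 are hypothetical, not the actor's confirmed logic.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def resolve_query(url: Optional[str] = None, keyword: Optional[str] = None) -> dict:
    """Pick the text filter and page offset from an explicit keyword and/or a URL."""
    url_keyword = None
    page = 0
    if url:
        params = parse_qs(urlparse(url).query)
        url_keyword = params.get("search_api_fulltext", [None])[0]
        page_raw = params.get("page", ["0"])[0]
        page = int(page_raw) if page_raw.isdigit() else 0
    # An explicit keyword wins; the URL still contributes the page offset.
    return {"keyword": keyword or url_keyword, "page": page}


search_url = "https://www.ema.europa.eu/en/search?search_api_fulltext=nuwiq&page=2"
print(resolve_query(url=search_url, keyword="simoctocog alfa"))
# {'keyword': 'simoctocog alfa', 'page': 2}
print(resolve_query(url=search_url))
# {'keyword': 'nuwiq', 'page': 2}
```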
Can I run without any input?
Yes. Defaults are provided for QA and quick start, and produce non-empty output.