Disease Outbreak News Scraper
Pricing
from $0.01 / 1,000 scraped outbreaks
Scrape official disease outbreak news from WHO, ECDC, CDC, PAHO, and CAHEC. Export titles, dates, diseases, locations, severity, source URLs, WHO report sections, and attachment links.
Developer
Maxime Dupré
Maintained by Community
🦠 Disease outbreak news scraper for official health alerts
Disease Outbreak News Scraper collects official disease outbreak news, public-health alerts, and disease-surveillance bulletins from trusted public sources: WHO Disease Outbreak News, ECDC, CDC, PAHO, and CAHEC.
Use it when you need a clean dataset of source-backed outbreak items for monitoring, dashboards, research, news workflows, travel-risk review, or public-health analysis. Each item keeps the official source URL so you can audit the data against the publisher.
The actor does not need a source login, cookies, or a source API key. Choose source families, add optional official URLs, then narrow results by disease keywords, region, severity, and publication dates.
✅ What this actor does
This actor gathers one dataset item per accepted official outbreak news item or surveillance bulletin.
It covers:
- WHO Disease Outbreak News and WHO outbreak/news items
- ECDC epidemiological updates and outbreak-related update pages
- CDC Travel Health Notices and outbreak updates
- PAHO public-health news and alerts
- CAHEC animal disease surveillance items from the China Animal Health and Epidemiology Center
- Optional official outbreak, alert, feed, archive, or bulletin URLs that produce the same output shape
WHO items can include full source-native report sections such as overview, epidemiology, assessment, advice, response, and further information. Other sources return the official source title, source URL, categories, dates when available, and source-backed facts that can be extracted without inventing data.
📊 What data you get
Each dataset item can include:
- Official source name, publisher, source ID, and source URL
- Title or headline
- Publication and update dates when the source provides them
- Source summary or lead text
- Diseases or outbreak subjects
- Affected places and broad regions when source-backed
- Species for animal-health surveillance items when available
- Severity as High, Moderate, Low, or null
- Source-native categories such as Disease Outbreak News, Travel Health Notice, or Animal disease surveillance
- WHO report sections and general body text when available
- Source-hosted attachment links when relevant
The actor leaves fields empty or null when the official source does not provide a fact. It does not fabricate diseases, countries, regions, species, severity, dates, or summaries.
🔎 Input options
Start with the default source choices or select the official sources you want:
- sourceFamilies - WHO DON, WHO RSS/news, ECDC, CDC, PAHO, and CAHEC
- startUrls - optional official outbreak, alert, archive, feed, or bulletin URLs
- diseaseKeywords - terms such as mpox, cholera, Ebola, or country names
- regions - broad WHO or geographic regions when source-backed
- severity - keep only High, Moderate, or Low items
- publishedFrom and publishedTo - date range filters
- maxItems - maximum items to save in the run
Example input:
    {
      "sourceFamilies": ["who-don", "ecdc", "cdc", "paho", "cahec"],
      "diseaseKeywords": ["mpox", "cholera"],
      "regions": ["Africa", "Americas"],
      "severity": "Moderate",
      "maxItems": 100
    }
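If you generate the input from code, a small builder can catch typos before a run starts. The following is a minimal sketch, not part of the actor: `build_run_input` is a hypothetical helper, the field names come from the option list above, and the ISO `YYYY-MM-DD` date format for `publishedFrom` and `publishedTo` is an assumption you should check against the actor's input schema.

```python
from datetime import date

ALLOWED_SEVERITIES = {"High", "Moderate", "Low"}


def build_run_input(source_families, disease_keywords=(), regions=(),
                    severity=None, published_from=None, published_to=None,
                    max_items=100):
    """Assemble an input object using the option names listed above."""
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    if published_from and published_to and published_from > published_to:
        raise ValueError("publishedFrom must not be after publishedTo")
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")

    run_input = {"sourceFamilies": list(source_families), "maxItems": max_items}
    # Optional filters are only included when they are actually set.
    if disease_keywords:
        run_input["diseaseKeywords"] = list(disease_keywords)
    if regions:
        run_input["regions"] = list(regions)
    if severity:
        run_input["severity"] = severity
    # Assumption: dates are sent as ISO 8601 strings (YYYY-MM-DD).
    if published_from:
        run_input["publishedFrom"] = published_from.isoformat()
    if published_to:
        run_input["publishedTo"] = published_to.isoformat()
    return run_input
```

The returned dictionary can be serialized with `json.dumps` and pasted into the input editor or sent through the Apify API.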
📤 Output example
    {
      "source": {
        "name": "WHO Disease Outbreak News",
        "publisher": "World Health Organization",
        "id": "2026-DON609",
        "url": "https://www.who.int/emergencies/disease-outbreak-news/item/2026-DON609"
      },
      "title": "Nipah virus disease - India",
      "publishedAt": "2026-06-25T18:00:00.000Z",
      "updatedAt": "2026-06-25T16:23:05.000Z",
      "summary": "On 11 June 2026, the Kerala State Health Department confirmed one laboratory confirmed case of Nipah virus infection.",
      "diseases": ["Nipah virus disease"],
      "locations": ["India"],
      "regions": ["South-East Asia"],
      "species": [],
      "severity": "Moderate",
      "categories": ["Disease Outbreak News"],
      "details": {
        "overview": "Source-native overview text when available.",
        "epidemiology": "Source-native epidemiology text when available.",
        "assessment": "Source-native risk assessment text when available.",
        "advice": "Source-native public health advice text when available.",
        "response": null,
        "furtherInformation": null,
        "bodyText": null
      },
      "attachments": []
    }
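Because many fields may be null, downstream code should read items defensively. The sketch below assumes the item shape shown in the example above. `parse_timestamp` and `summarize_item` are hypothetical helper names, not part of the actor.

```python
from datetime import datetime

# Section keys taken from the "details" object in the example output.
SECTION_KEYS = ("overview", "epidemiology", "assessment", "advice",
                "response", "furtherInformation", "bodyText")


def parse_timestamp(value):
    """Parse a timestamp such as 2026-06-25T18:00:00.000Z, or return None."""
    if not value:
        return None
    # Replace the trailing Z so older Python versions can parse the offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_item(item):
    """Extract a few fields while tolerating null or missing values."""
    source = item.get("source") or {}
    details = item.get("details") or {}
    return {
        "id": source.get("id"),
        "title": item.get("title"),
        "published": parse_timestamp(item.get("publishedAt")),
        "updated": parse_timestamp(item.get("updatedAt")),
        # Only sections with actual text are listed.
        "available_sections": [key for key in SECTION_KEYS if details.get(key)],
    }
```

For the example item, `available_sections` would contain the four populated sections and skip the three that are null.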
💡 Use cases
- Monitor official outbreak news on a schedule
- Feed public-health dashboards with auditable source URLs
- Track disease outbreak alerts by disease, region, or severity
- Build research datasets from WHO Disease Outbreak News reports
- Watch CDC, ECDC, PAHO, and CAHEC surfaces without separate scrapers
- Compare human-health and animal-health surveillance signals in one dataset
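For scheduled monitoring, you usually want only the items that appeared since the previous run. A minimal sketch, assuming the source URL stays stable between runs and can serve as a unique key (`new_items` is a hypothetical helper, not an actor feature):

```python
def new_items(previous_items, current_items):
    """Return items from the current run whose source URL was not seen before."""
    seen = {item["source"]["url"] for item in previous_items}
    fresh = []
    for item in current_items:
        url = item["source"]["url"]
        if url not in seen:
            seen.add(url)  # also drops duplicates within the current run
            fresh.append(item)
    return fresh
```

Keying on the source URL rather than the title avoids false matches when a publisher reuses a headline for a later update.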
💰 Pricing
This actor uses pay-per-event pricing. You are charged once for each accepted official disease outbreak news item or disease-surveillance bulletin saved to the dataset.
Failed source requests, discarded unrelated pages, empty runs, and no-result outcomes are not charged as scraped outbreak items.
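To budget a run, multiply the number of saved items by the per-item rate. The sketch below uses the listed starting price of $0.01 per 1,000 items as its default. That default is an assumption, since the rate that applies to your account may differ, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal


def estimate_cost(saved_items, price_per_1000=Decimal("0.01")):
    """Estimate the charge in USD for a number of saved dataset items."""
    if saved_items < 0:
        raise ValueError("saved_items must not be negative")
    # Only saved items are counted; empty runs cost nothing under this model.
    return Decimal(saved_items) * price_per_1000 / Decimal(1000)
```

At the starting rate, a run that saves 2,500 items would come to $0.025, and a run with no accepted items would come to $0.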
⚠️ Limits and caveats
The actor only emits accepted official outbreak, public-health alert, or disease-surveillance items. It skips unrelated CAHEC procurement, administrative, rental, and generic center-news pages when they do not contain disease-surveillance content.
Some sources provide less detail than WHO Disease Outbreak News. For example, ECDC and CAHEC list pages may provide a trusted title and source URL while longer body sections remain null. This keeps the dataset source-backed instead of filling gaps with guesses.
Severity is source-backed or mechanically derived from source text and source categories. If there is not enough source evidence, severity is null.
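Since severity can be null, client-side filters should treat missing values explicitly instead of assuming a default. A minimal sketch, assuming the ordering Low < Moderate < High (the ordering and both helper names are illustrative, not defined by the actor):

```python
from collections import Counter

# Assumed ordering for comparisons; null severity ranks below every level.
SEVERITY_RANK = {"Low": 1, "Moderate": 2, "High": 3}


def severity_counts(items):
    """Count items per severity, grouping null values under 'Unrated'."""
    return Counter(item.get("severity") or "Unrated" for item in items)


def at_least(items, minimum):
    """Keep items whose severity is at or above the given level."""
    threshold = SEVERITY_RANK[minimum]
    return [item for item in items
            if SEVERITY_RANK.get(item.get("severity"), 0) >= threshold]
```

Items with null severity are excluded by `at_least` but still counted by `severity_counts`, so you can see how much of a dataset lacks a rating.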
❓ FAQ
✅ Does this use official sources only?
Yes. The actor is built for official WHO, ECDC, CDC, PAHO, and CAHEC sources, plus any official outbreak or news URLs you provide.
🔐 Do I need a source account or API key?
No. The supported sources are public. You do not need to provide source credentials, cookies, or API keys.
📁 Can I export the results?
Yes. Results are stored in the default Apify dataset, so you can export them as JSON, CSV, Excel, XML, or HTML, or use them through the Apify API and integrations.
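If you need a custom column layout rather than the built-in export, you can flatten items yourself after downloading them as JSON. The sketch below assumes the item shape from the output example. The column selection and the `to_row` and `items_to_csv` helpers are illustrative choices, not part of the actor.

```python
import csv
import io

COLUMNS = ["source_id", "source_url", "title", "publishedAt",
           "diseases", "locations", "regions", "severity"]


def to_row(item):
    """Flatten one dataset item into a single CSV row."""
    source = item.get("source") or {}
    return {
        "source_id": source.get("id"),
        "source_url": source.get("url"),
        "title": item.get("title"),
        "publishedAt": item.get("publishedAt"),
        # List fields are joined so each item stays on one row.
        "diseases": "; ".join(item.get("diseases") or []),
        "locations": "; ".join(item.get("locations") or []),
        "regions": "; ".join(item.get("regions") or []),
        "severity": item.get("severity"),
    }


def items_to_csv(items):
    """Return CSV text with a header row; null values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for item in items:
        writer.writerow(to_row(item))
    return buffer.getvalue()
```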
🧾 Why are some fields null?
Fields are null when the official source does not provide that fact or when the actor cannot extract it without guessing. This is intentional for source-backed public-health data.
🔗 Can I add custom URLs?
Yes. Add official outbreak, alert, archive, feed, or bulletin URLs in startUrls. The actor keeps only pages that match the same outbreak-news or surveillance output contract.
📝 Changelog
- 0.1: Initial release.
🆘 Support
For issues, questions, or feature requests, file a ticket and I'll fix or implement it in less than 24h 🫡
🔗 Other actors
- World Bank Projects Scraper - Scrape World Bank project and indicator data for public-sector research.
- Website URL Crawler - Crawl websites and extract URLs for monitoring, inventories, and audits.
- Sitemap Sniffer - Find public sitemap files and URL inventories from domains or websites.
- Schema Markup Validator - Check public pages for structured data, metadata, and rich-result readiness.
- GLEIF LEI Lookup - Look up official legal entity data for KYB, compliance, and research workflows.
Made with ❤️ by Maxime Dupré