Disease Outbreak News Scraper

Pricing

from $0.01 / 1,000 scraped outbreaks

Scrape official disease outbreak news from WHO, ECDC, CDC, PAHO, and CAHEC. Export titles, dates, diseases, locations, severity, source URLs, WHO report sections, and attachment links.

Developer

Maxime Dupré

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 3 days ago

🦠 Disease outbreak news scraper for official health alerts

Disease Outbreak News Scraper collects official disease outbreak news, public-health alerts, and disease-surveillance bulletins from trusted public sources: WHO Disease Outbreak News, ECDC, CDC, PAHO, and CAHEC.

Use it when you need a clean dataset of source-backed outbreak items for monitoring, dashboards, research, news workflows, travel-risk review, or public-health analysis. Each item keeps the official source URL so you can audit the data against the publisher.

The actor does not need a source login, cookies, or a source API key. Choose source families, add optional official URLs, then narrow results by disease keywords, region, severity, and publication dates.

✅ What this actor does

The actor saves one dataset item for each accepted official outbreak news item or surveillance bulletin.

It covers:

  • WHO Disease Outbreak News and WHO outbreak/news items
  • ECDC epidemiological updates and outbreak-related update pages
  • CDC Travel Health Notices and outbreak updates
  • PAHO public-health news and alerts
  • CAHEC animal disease surveillance items from the China Animal Health and Epidemiology Center
  • Optional official outbreak, alert, feed, archive, or bulletin URLs that produce the same output shape

WHO items can include full source-native report sections such as overview, epidemiology, assessment, advice, response, and further information. Other sources return the official source title, source URL, categories, dates when available, and source-backed facts that can be extracted without inventing data.

📊 What data you get

Each dataset item can include:

  • Official source name, publisher, source ID, and source URL
  • Title or headline
  • Publication and update dates when the source provides them
  • Source summary or lead text
  • Diseases or outbreak subjects
  • Affected places and broad regions when source-backed
  • Species for animal-health surveillance items when available
  • Severity as High, Moderate, Low, or null
  • Source-native categories such as Disease Outbreak News, Travel Health Notice, or Animal disease surveillance
  • WHO report sections and general body text when available
  • Source-hosted attachment links when relevant

The actor leaves fields empty or null when the official source does not provide a fact. It does not fabricate diseases, countries, regions, species, severity, dates, or summaries.
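Because any optional field can be empty or null, downstream code should read items defensively. The following is a minimal Python sketch, assuming the field names shown in the output example on this page; `first_or_none` and `describe_item` are hypothetical helper names, not part of the actor.

```python
from typing import Any, Optional


def first_or_none(values: Optional[list]) -> Optional[str]:
    """Return the first entry of a list field, or None if it is missing or empty."""
    return values[0] if values else None


def describe_item(item: dict[str, Any]) -> str:
    """Build a one-line label without assuming any optional field is present."""
    title = item.get("title") or "(untitled)"
    disease = first_or_none(item.get("diseases")) or "unknown disease"
    place = first_or_none(item.get("locations")) or "unknown location"
    # Severity is documented as High, Moderate, Low, or null.
    severity = item.get("severity") or "unrated"
    return f"{title} [{disease}, {place}, severity: {severity}]"
```

The fallback labels are illustrative placeholders for display only. They are not values the actor emits.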

🔎 Input options

Start with the default source choices or select the official sources you want:

  • sourceFamilies - WHO DON, WHO RSS/news, ECDC, CDC, PAHO, and CAHEC
  • startUrls - optional official outbreak, alert, archive, feed, or bulletin URLs
  • diseaseKeywords - terms such as mpox, cholera, Ebola, or country names
  • regions - broad WHO or geographic regions when source-backed
  • severity - keep only High, Moderate, or Low items
  • publishedFrom and publishedTo - date range filters
  • maxItems - maximum items to save in the run

Example input:

{
  "sourceFamilies": ["who-don", "ecdc", "cdc", "paho", "cahec"],
  "diseaseKeywords": ["mpox", "cholera"],
  "regions": ["Africa", "Americas"],
  "severity": "Moderate",
  "maxItems": 100
}
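If you generate the input programmatically, a small builder can catch obvious mistakes before a run starts. This is a sketch under stated assumptions: the key names come from the option list above, but `build_input` is a hypothetical helper, and the `YYYY-MM-DD` format for `publishedFrom` and `publishedTo` is an assumption that should be checked against the actor's input schema.

```python
from datetime import date
from typing import Optional

ALLOWED_SEVERITIES = {"High", "Moderate", "Low"}


def build_input(
    source_families: list[str],
    disease_keywords: Optional[list[str]] = None,
    regions: Optional[list[str]] = None,
    severity: Optional[str] = None,
    published_from: Optional[date] = None,
    published_to: Optional[date] = None,
    max_items: int = 100,
) -> dict:
    """Assemble an input object, rejecting values the option list rules out."""
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    if published_from and published_to and published_from > published_to:
        raise ValueError("published_from must not be later than published_to")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")

    run_input: dict = {"sourceFamilies": list(source_families), "maxItems": max_items}
    # Optional filters are only added when the caller set them.
    if disease_keywords:
        run_input["diseaseKeywords"] = list(disease_keywords)
    if regions:
        run_input["regions"] = list(regions)
    if severity:
        run_input["severity"] = severity
    # Assumption: dates are passed as ISO calendar dates (YYYY-MM-DD).
    if published_from:
        run_input["publishedFrom"] = published_from.isoformat()
    if published_to:
        run_input["publishedTo"] = published_to.isoformat()
    return run_input
```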

📤 Output example

{
  "source": {
    "name": "WHO Disease Outbreak News",
    "publisher": "World Health Organization",
    "id": "2026-DON609",
    "url": "https://www.who.int/emergencies/disease-outbreak-news/item/2026-DON609"
  },
  "title": "Nipah virus disease - India",
  "publishedAt": "2026-06-25T18:00:00.000Z",
  "updatedAt": "2026-06-25T16:23:05.000Z",
  "summary": "On 11 June 2026, the Kerala State Health Department confirmed one laboratory confirmed case of Nipah virus infection.",
  "diseases": ["Nipah virus disease"],
  "locations": ["India"],
  "regions": ["South-East Asia"],
  "species": [],
  "severity": "Moderate",
  "categories": ["Disease Outbreak News"],
  "details": {
    "overview": "Source-native overview text when available.",
    "epidemiology": "Source-native epidemiology text when available.",
    "assessment": "Source-native risk assessment text when available.",
    "advice": "Source-native public health advice text when available.",
    "response": null,
    "furtherInformation": null,
    "bodyText": null
  },
  "attachments": []
}
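The nested `source` object and the list fields do not map directly onto spreadsheet columns. One way to flatten an item like the one above into a single row is sketched here in Python; the column names and the `"; "` separator are illustrative choices, not something the actor prescribes.

```python
import csv
import io

COLUMNS = [
    "source_id", "source_url", "title", "published_at",
    "diseases", "locations", "regions", "severity",
]


def flatten_item(item: dict) -> dict:
    """Turn one nested dataset item into a flat row keyed by COLUMNS."""
    source = item.get("source") or {}

    def joined(key: str) -> str:
        # List fields may be missing or null, so fall back to an empty list.
        return "; ".join(item.get(key) or [])

    return {
        "source_id": source.get("id"),
        "source_url": source.get("url"),
        "title": item.get("title"),
        "published_at": item.get("publishedAt"),
        "diseases": joined("diseases"),
        "locations": joined("locations"),
        "regions": joined("regions"),
        "severity": item.get("severity"),  # stays None when the source gives no evidence
    }


def to_csv(items: list[dict]) -> str:
    """Serialize items to CSV text; None values are written as empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(flatten_item(item))
    return buffer.getvalue()
```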

💡 Use cases

  • Monitor official outbreak news on a schedule
  • Feed public-health dashboards with auditable source URLs
  • Track disease outbreak alerts by disease, region, or severity
  • Build research datasets from WHO Disease Outbreak News reports
  • Watch CDC, ECDC, PAHO, and CAHEC surfaces without separate scrapers
  • Compare human-health and animal-health surveillance signals in one dataset
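
For scheduled monitoring, you usually want only the items that did not appear in earlier runs. A minimal Python sketch, assuming the `source.url` and `source.id` fields from the output example identify an item; `item_key` and `split_new_items` are hypothetical helpers, and persisting the set of seen keys between runs is left to the caller.

```python
from typing import Iterable, Optional


def item_key(item: dict) -> Optional[str]:
    """Prefer the official source URL as identity, falling back to the source ID."""
    source = item.get("source") or {}
    return source.get("url") or source.get("id")


def split_new_items(items: Iterable[dict], seen_keys: set[str]) -> list[dict]:
    """Return items not seen before and record their keys in seen_keys."""
    new_items = []
    for item in items:
        key = item_key(item)
        # Items without any identity cannot be deduplicated, so they are skipped here.
        if key is None or key in seen_keys:
            continue
        seen_keys.add(key)
        new_items.append(item)
    return new_items
```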

💰 Pricing

This actor uses pay-per-event pricing. You are charged once for each accepted official disease outbreak news item or disease-surveillance bulletin saved to the dataset.

Failed source requests, discarded unrelated pages, empty runs, and no-result outcomes are not charged as scraped outbreak items.
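
As a rough worked example of the per-item charge, the listed starting price of $0.01 per 1,000 scraped outbreaks can be turned into an estimate. This Python sketch covers only the per-item event described above and assumes the listed "from" rate applies; the rate on your plan may be different, and `estimate_item_charges` is a hypothetical helper.

```python
from decimal import Decimal

# Listed starting price on this page; treat it as an assumption for your own plan.
PRICE_PER_1000_ITEMS_USD = Decimal("0.01")


def estimate_item_charges(
    saved_items: int,
    price_per_1000: Decimal = PRICE_PER_1000_ITEMS_USD,
) -> Decimal:
    """Estimate the charge in USD for a number of items saved to the dataset."""
    if saved_items < 0:
        raise ValueError("saved_items must not be negative")
    # Only saved items count; failed requests and empty runs add nothing.
    return price_per_1000 * saved_items / 1000
```

At this rate, a run that saves 100 items comes to $0.001, and 250,000 saved items come to $2.50.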

⚠️ Limits and caveats

The actor only emits accepted official outbreak, public-health alert, or disease-surveillance items. It skips unrelated CAHEC procurement, administrative, rental, and generic center-news pages when they do not contain disease-surveillance content.

Some sources provide less detail than WHO Disease Outbreak News. For example, ECDC and CAHEC list pages may provide a trusted title and source URL while longer body sections remain null. This keeps the dataset source-backed instead of filling gaps with guesses.

Severity is source-backed or mechanically derived from source text and source categories. If there is not enough source evidence, severity is null.

❓ FAQ

✅ Does this use official sources only?

Yes. The actor is built for the official WHO, ECDC, CDC, PAHO, and CAHEC sources, plus any official outbreak or news URLs you provide.

🔐 Do I need a source account or API key?

No. The supported sources are public. You do not need to provide source credentials, cookies, or API keys.

📁 Can I export the results?

Yes. Results are stored in the default Apify dataset, so you can export them as JSON, CSV, Excel, XML, or HTML, or use them through the Apify API and integrations.
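
To pull results through the API, you can request the dataset items in the format you need. The Python sketch below builds a URL for the Apify API's dataset items endpoint as I understand it (`/v2/datasets/{datasetId}/items` with a `format` query parameter); verify the details against the Apify API reference. `dataset_items_url` is a hypothetical helper, and the dataset ID is whatever your run reports.

```python
import urllib.parse
from typing import Optional

API_BASE = "https://api.apify.com/v2"
# Formats named on this page, written as the API's format values (Excel is "xlsx").
EXPORT_FORMATS = {"json", "csv", "xlsx", "xml", "html"}


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the URL for downloading a dataset's items in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"fmt must be one of {sorted(EXPORT_FORMATS)}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = str(limit)
    # Quote the ID so unexpected characters cannot alter the path.
    safe_id = urllib.parse.quote(dataset_id, safe="")
    return f"{API_BASE}/datasets/{safe_id}/items?{urllib.parse.urlencode(params)}"
```

Datasets that are not public also need your API token, which this sketch deliberately leaves out.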

🧾 Why are some fields null?

Fields are null when the official source does not provide that fact or when the actor cannot extract it without guessing. This is intentional for source-backed public-health data.

🔗 Can I add custom URLs?

Yes. Add official outbreak, alert, archive, feed, or bulletin URLs in startUrls. The actor keeps only pages that match the same outbreak-news or surveillance output contract.

📝 Changelog

  • 0.1: Initial release.

🆘 Support

For issues, questions, or feature requests, file a ticket and I'll fix or implement it in less than 24h 🫡

Made with ❤️ by Maxime Dupré