WHO Health Intelligence Scraper

Pricing

from $0.01 / 1,000 results

Aggregates global health data from multiple APIs: WHO Publications (18K+ documents with PDF text extraction), WHO GHO Statistics, and ClinicalTrials.gov. Features NLP location detection and alert keyword matching.

Developer: The_Rook (maintained by Community)

πŸ₯ WHO Health Intelligence Gateway

The all-in-one intelligence tool for Global Health Monitoring.

This Actor unifies the three pillars of health data into a single, automated feed. It does the work of a data analyst in minutes:

  1. 📊 The Past: Fetches official statistics (GHO API).
  2. 📚 The Present: Aggregates WHO Publications & Reports via API (with optional PDF text extraction).
  3. 🔬 The Future: Tracks ongoing Clinical Trials for cures (R&D).

🎯 Perfect For

  • πŸ›οΈ NGOs & Humanitarian Aid – Map active outbreaks to deploy resources efficiently
  • πŸ’Š Pharma & Biotech – Track disease trends + competitor R&D pipelines
  • πŸ“° Data Journalists – Scan 100+ PDFs for keywords instantly
  • πŸ”¬ Health Researchers – Access structured WHO data without API complexity
  • πŸ€– AI Developers – RAG-ready health data for LLM systems
  • πŸ“Š Risk Analysis: Monitor stability and health risks in specific regions for corporate intelligence

βš™οΈ Key Features

📚 WHO Publications API + PDF Text Extraction

Access 18,000+ WHO publications via the official API. The Actor can optionally download PDFs and extract their full text.

  • Fetches publications by keyword and date range
  • Downloads and parses PDF documents
  • Extracts text from complex PDF layouts

🧠 Free NLP Location Detection (No API Key Required)

Uses embedded Natural Language Processing (compromise.js) to analyze PDF text at zero extra cost.

  • Auto-Location Detection: Finds cities and countries (e.g., "Yemen", "Sudan", "Kenya") inside extracted PDF text automatically
  • Smart Cleaning: Removes duplicates, filters generic regions, and strips quotes/formatting noise
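
The "Smart Cleaning" step can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the Actor's actual code (which, per the description above, relies on compromise.js): the function name `clean_locations` and the contents of `GENERIC_REGIONS` are hypothetical.

```python
# Hypothetical set of names treated as too generic to keep (an assumption).
GENERIC_REGIONS = {"africa", "asia", "europe", "region", "world", "global"}

# Quote and punctuation characters that often cling to names in PDF text.
NOISE_CHARS = " \t\n\"'“”‘’()[].,;:"


def clean_locations(raw_candidates):
    """Strip formatting noise, drop generic regions, and de-duplicate.

    Order of first appearance is preserved; comparison is case-insensitive.
    """
    cleaned = []
    seen = set()
    for candidate in raw_candidates:
        name = candidate.strip(NOISE_CHARS)
        # Collapse internal whitespace runs left over from PDF line breaks.
        name = " ".join(name.split())
        key = name.lower()
        if not name or key in GENERIC_REGIONS or key in seen:
            continue
        seen.add(key)
        cleaned.append(name)
    return cleaned


print(clean_locations(['"Yemen"', "Sudan,", "yemen", "Africa", " Kenya "]))
# → ['Yemen', 'Sudan', 'Kenya']
```

Preserving first-appearance order keeps the output stable between runs, which makes the resulting location lists easier to diff.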

🔔 Watchlist Alerts

Define keywords (e.g., Cholera, Level 3, Outbreak). The Actor scans report titles and full PDF text.

  • If a match is found, the ⚠️ ALERTS column in your CSV is flagged
  • Optional: Send instant notifications to Slack or Discord
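
As a rough idea of how watchlist matching could work, here is a minimal Python sketch. The function name `match_keywords` and the choice of case-insensitive, whole-word matching are assumptions made for illustration; the Actor's real matching rules are not documented here. The field names `alertTriggered` and `matchedKeywords` follow the output example in this README.

```python
import re


def match_keywords(title, pdf_text, keywords):
    """Return the watchlist keywords found in the title or the PDF text."""
    haystack = f"{title}\n{pdf_text or ''}"
    matched = []
    for keyword in keywords:
        # Case-insensitive, whole-word match (an assumption): "Cholera"
        # matches "cholera", but "Level 3" does not match "Level 30".
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, haystack, flags=re.IGNORECASE):
            matched.append(keyword)
    return matched


record = {
    "title": "Multi-country outbreak of cholera, Report #32",
    "text": "Cases were reported in several districts.",
}
hits = match_keywords(record["title"], record["text"],
                      ["Cholera", "Level 3", "Outbreak"])
record["alertTriggered"] = bool(hits)
record["matchedKeywords"] = hits
print(record["alertTriggered"], record["matchedKeywords"])
# → True ['Cholera', 'Outbreak']
```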

📊 Organized Multi-Dataset Output

Data is available in multiple formats for maximum flexibility:

  • Default Dataset: All combined data visible immediately in Output tab
  • 3 Separate Datasets: Publications, GHO Statistics, Clinical Trials (in Storage tab for focused analysis)
  • CSV Export: Optional CSV file combining all data
  • HTML Dashboard: Interactive visualization with emoji icons
  • Key-Value Store: Source-specific JSON files (DATA_GHO, DATA_PUBLICATIONS, DATA_TRIALS)
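
If you only download the combined default dataset, the per-source split can be reproduced locally. The sketch below is a hedged example: the key names come from the list above, but the mapping from each record's `type` value to a key, and the helper name `split_by_source`, are assumptions.

```python
from collections import defaultdict

# Key names are from this README; the type-to-key mapping is assumed.
STORE_KEYS = {
    "Publication": "DATA_PUBLICATIONS",
    "GHO Indicator": "DATA_GHO",
    "Clinical Trial": "DATA_TRIALS",
}


def split_by_source(records):
    """Group combined records into one list per source-specific key."""
    grouped = defaultdict(list)
    for record in records:
        key = STORE_KEYS.get(record.get("type"))
        if key is not None:  # records of unknown type are ignored
            grouped[key].append(record)
    return dict(grouped)


combined = [
    {"type": "Publication", "title": "Multi-country outbreak of cholera, Report #32"},
    {"type": "GHO Indicator", "title": "WHOSIS_000001 - IND"},
    {"type": "Clinical Trial", "nctId": "NCT05790889"},
]
for key, items in split_by_source(combined).items():
    print(key, len(items))
```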

πŸŽ›οΈ Input Configuration

| Section | Setting | Description |
| --- | --- | --- |
| 📚 Publications | Keyword Filter | Search term (e.g., "cholera", "mpox") |
| 📚 Publications | Extract PDF Text | Enable this to download PDFs and extract text (includes NLP location detection) |
| 📊 GHO Stats | Indicators | Codes like WHOSIS_000001 (Life Expectancy), MORT_100 (Mortality) |
| 📊 GHO Stats | Years/Date Range | Filter by specific years or year range (startYear/endYear) |
| 🔔 Alerts | Alert Keywords | Keywords to flag documents (e.g., "outbreak", "emergency") |
| 🔬 Trials | Target Disease | Disease to search for (e.g., "Malaria", "Cancer") |
| 🔬 Trials | Status/Phase | Filter by recruitment status and trial phase |
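
To make the Years/Date Range setting concrete, the following Python sketch shows one plausible way a year filter could be resolved. Only `startYear` and `endYear` are named in this README; the explicit `years` list, the precedence between the two options, and the function name `resolve_years` are assumptions for illustration.

```python
def resolve_years(years=None, start_year=None, end_year=None):
    """Return the sorted list of years to request.

    Assumed precedence: an explicit list of years wins; otherwise the
    inclusive range start_year..end_year is expanded. An empty list
    means "no year filter".
    """
    if years:
        return sorted({int(y) for y in years})
    if start_year is None and end_year is None:
        return []
    # A single bound is treated as a one-year range.
    if start_year is None:
        start_year = end_year
    if end_year is None:
        end_year = start_year
    if start_year > end_year:
        raise ValueError("startYear must not be later than endYear")
    return list(range(start_year, end_year + 1))


print(resolve_years(start_year=2018, end_year=2020))  # → [2018, 2019, 2020]
print(resolve_years(years=["2020", 2015, 2020]))      # → [2015, 2020]
```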

📤 Output Example

The Actor produces a comprehensive dataset. Here is a simplified preview:

[
  {
    "type": "Publication",
    "title": "Multi-country outbreak of cholera, Report #32",
    "date": "11/26/2024",
    "url": "https://www.who.int/publications/m/item/...",
    "extractedLocations": ["Zimbabwe", "Mozambique", "Ethiopia", "Kenya"],
    "alertTriggered": true,
    "matchedKeywords": ["cholera", "outbreak"]
  },
  {
    "type": "GHO Indicator",
    "title": "WHOSIS_000001 - IND",
    "date": "2020",
    "value": 69.17521966
  },
  {
    "type": "Clinical Trial",
    "title": "Phase 3 Malaria Vaccine Study in Children",
    "nctId": "NCT05790889",
    "status": "RECRUITING",
    "phase": "PHASE2",
    "conditions": "Malaria",
    "url": "https://clinicaltrials.gov/study/NCT05790889"
  }
]
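
Once exported, records shaped like the preview above can be post-processed with a few lines of Python. This sketch pulls out the publications that triggered an alert; the helper name `triggered_alerts` is hypothetical, and it assumes publication dates always use the MM/DD/YYYY form seen in the preview.

```python
import json
from datetime import datetime

# Trimmed copy of the preview above (the elided URL is left out).
records = json.loads("""
[
  {"type": "Publication",
   "title": "Multi-country outbreak of cholera, Report #32",
   "date": "11/26/2024",
   "extractedLocations": ["Zimbabwe", "Mozambique", "Ethiopia", "Kenya"],
   "alertTriggered": true,
   "matchedKeywords": ["cholera", "outbreak"]},
  {"type": "GHO Indicator", "title": "WHOSIS_000001 - IND",
   "date": "2020", "value": 69.17521966}
]
""")


def triggered_alerts(items):
    """Summarize publications whose alertTriggered flag is set."""
    alerts = []
    for item in items:
        if item.get("type") != "Publication" or not item.get("alertTriggered"):
            continue
        # Assumes MM/DD/YYYY; converted to ISO 8601 for easy sorting.
        parsed = datetime.strptime(item["date"], "%m/%d/%Y").date()
        alerts.append({
            "title": item["title"],
            "date": parsed.isoformat(),
            "locations": item.get("extractedLocations", []),
            "keywords": item.get("matchedKeywords", []),
        })
    return alerts


for alert in triggered_alerts(records):
    print(alert["date"], alert["title"], alert["locations"])
```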

This Actor respects WHO and ClinicalTrials.gov terms of service.


Built with ❤️ and </> by The_Rook • Try it now 🚀