WHO Health Intelligence Scraper
Pricing
from $0.01 / 1,000 results
Aggregates global health data from three public APIs: WHO Publications (18K+ documents, with optional PDF text extraction), WHO GHO statistics, and ClinicalTrials.gov. Includes NLP location detection and alert keyword matching.
WHO Health Intelligence Gateway
The all-in-one intelligence tool for Global Health Monitoring.
This Actor unifies the three pillars of health data into a single, automated feed. It does the work of a data analyst in minutes:
- The Past: Fetches official statistics (GHO API).
- The Present: Aggregates WHO Publications & Reports via API (with optional PDF text extraction).
- The Future: Tracks ongoing Clinical Trials for cures (R&D).
Perfect For
- NGOs & Humanitarian Aid: Map active outbreaks to deploy resources efficiently
- Pharma & Biotech: Track disease trends and competitor R&D pipelines
- Data Journalists: Scan 100+ PDFs for keywords instantly
- Health Researchers: Access structured WHO data without API complexity
- AI Developers: RAG-ready health data for LLM systems
- Risk Analysis: Monitor stability and health risks in specific regions for corporate intelligence
Key Features
WHO Publications API + PDF Text Extraction
Access 18,000+ WHO publications via the official API. Optionally download the PDFs and extract their full text.
- Fetches publications by keyword and date range
- Downloads and parses PDF documents
- Extracts text from complex PDF layouts
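As a rough illustration of filtering by keyword and date range, the sketch below applies both filters to records that have already been fetched. It is not the Actor's implementation: the function name `filter_publications` is hypothetical, the second sample record is made up, and the `MM/DD/YYYY` date format is an assumption based on the output example in this README.

```python
from datetime import date, datetime


def filter_publications(records, keyword, start, end):
    """Keep records whose title contains `keyword` and whose date is in [start, end]."""
    needle = keyword.casefold()
    kept = []
    for rec in records:
        # Assumption: dates are formatted as MM/DD/YYYY, as in the output example.
        published = datetime.strptime(rec["date"], "%m/%d/%Y").date()
        if needle in rec["title"].casefold() and start <= published <= end:
            kept.append(rec)
    return kept


sample = [
    {"title": "Multi-country outbreak of cholera, Report #32", "date": "11/26/2024"},
    {"title": "Hypothetical mpox situation update", "date": "03/02/2023"},  # made-up record
]
recent_cholera = filter_publications(
    sample, "Cholera", date(2024, 1, 1), date(2024, 12, 31)
)
print([rec["title"] for rec in recent_cholera])
```

Matching is case-insensitive, so "Cholera" and "cholera" behave the same.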
Free NLP Location Detection (No API Key Required)
Uses embedded Natural Language Processing (compromise.js) to analyze PDF text at zero extra cost.
- Auto-Location Detection: Finds cities and countries (e.g., "Yemen", "Sudan", "Kenya") inside extracted PDF text automatically
- Smart Cleaning: Removes duplicates, filters generic regions, and strips quotes/formatting noise
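The cleaning step can be pictured with a small sketch. The Actor itself relies on compromise.js for detection, so this Python version covers only the post-processing idea (deduplicate, drop generic regions, strip quotes and punctuation). The `GENERIC_REGIONS` set and the `clean_locations` name are assumptions made for illustration.

```python
# Hypothetical list of names considered too generic to be useful as locations.
GENERIC_REGIONS = {"africa", "asia", "europe", "region", "worldwide", "global"}


def clean_locations(raw_names):
    """Return location names with noise stripped, generic regions and duplicates removed."""
    seen = set()
    cleaned = []
    for name in raw_names:
        # Strip whitespace, then surrounding quotes and punctuation.
        text = name.strip().strip("\"'.,;:()").strip()
        key = text.casefold()
        if not text or key in GENERIC_REGIONS or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)  # keep the first spelling encountered
    return cleaned


print(clean_locations(['"Yemen"', "Sudan,", "yemen", "Africa", " Kenya "]))
# ['Yemen', 'Sudan', 'Kenya']
```

Comparing on a case-folded key while keeping the first spelling preserves the original order and capitalization.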
Watchlist Alerts
Define keywords (e.g., Cholera, Level 3, Outbreak). The Actor scans report titles and full PDF text.
- If a match is found, the ⚠️ ALERTS column in your CSV is flagged
- Optional: Send instant notifications to Slack or Discord
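A minimal sketch of how watchlist matching and a notification payload might look is shown below. The whole-word, case-insensitive matching rule is an assumption (the Actor's actual matching logic is not documented here), and the function names are hypothetical. The payload shape follows the common webhook conventions: Slack incoming webhooks accept a JSON body with a `text` field, while Discord webhooks expect `content`.

```python
import json
import re


def match_keywords(title, text, keywords):
    """Return the watchlist keywords found in the title or the extracted text."""
    haystack = f"{title}\n{text}"
    matched = []
    for keyword in keywords:
        # Assumption: case-insensitive, whole-word (or whole-phrase) matching.
        pattern = rf"\b{re.escape(keyword)}\b"
        if re.search(pattern, haystack, flags=re.IGNORECASE):
            matched.append(keyword)
    return matched


def build_slack_payload(title, matched):
    """Build a JSON body for a Slack incoming webhook (use "content" for Discord)."""
    message = f"ALERT: {title} (matched: {', '.join(matched)})"
    return json.dumps({"text": message})


hits = match_keywords(
    "Multi-country outbreak of cholera",
    "A Level 3 emergency was declared.",
    ["Cholera", "Level 3", "Ebola"],
)
print(hits)
print(build_slack_payload("Multi-country outbreak of cholera", hits))
```

With whole-word matching, "outbreak" would not match "outbreaks"; a substring test is the simpler alternative if plural forms should also trigger alerts.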
Organized Multi-Dataset Output
Data is available in multiple formats for maximum flexibility:
- Default Dataset: All combined data visible immediately in Output tab
- 3 Separate Datasets: Publications, GHO Statistics, Clinical Trials (in Storage tab for focused analysis)
- CSV Export: Optional CSV file combining all data
- HTML Dashboard: Interactive visualization with emoji icons
- Key-Value Store: Source-specific JSON files (DATA_GHO, DATA_PUBLICATIONS, DATA_TRIALS)
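If you export only the combined default dataset, the per-source split can be reproduced on your side. The sketch below groups records by their `type` field; the mapping from `type` values to the Key-Value Store key names is an assumption inferred from this README, and `DATA_OTHER` is a hypothetical fallback bucket.

```python
from collections import defaultdict

# Assumed mapping from a record's "type" to the Key-Value Store key names listed above.
TYPE_TO_KEY = {
    "Publication": "DATA_PUBLICATIONS",
    "GHO Indicator": "DATA_GHO",
    "Clinical Trial": "DATA_TRIALS",
}


def split_by_source(records):
    """Group combined records into one list per data source."""
    buckets = defaultdict(list)
    for rec in records:
        # DATA_OTHER is a hypothetical bucket for unrecognized types.
        key = TYPE_TO_KEY.get(rec.get("type"), "DATA_OTHER")
        buckets[key].append(rec)
    return dict(buckets)


combined = [
    {"type": "Publication", "title": "Multi-country outbreak of cholera, Report #32"},
    {"type": "GHO Indicator", "title": "WHOSIS_000001 - IND"},
    {"type": "Clinical Trial", "nctId": "NCT05790889"},
]
for key, items in split_by_source(combined).items():
    print(key, len(items))
```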
Input Configuration
| Section | Setting | Description |
|---|---|---|
| Publications | Keyword Filter | Search term (e.g., "cholera", "mpox") |
| Publications | Extract PDF Text | Enable this to download PDFs and extract text (includes NLP location detection) |
| GHO Stats | Indicators | Codes like WHOSIS_000001 (Life Expectancy), MORT_100 (Mortality) |
| GHO Stats | Years/Date Range | Filter by specific years or year range (startYear/endYear) |
| Alerts | Alert Keywords | Keywords to flag documents (e.g., "outbreak", "emergency") |
| Trials | Target Disease | Disease to search for (e.g., "Malaria", "Cancer") |
| Trials | Status/Phase | Filter by recruitment status and trial phase |
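To give a sense of how these settings could translate into upstream requests, the sketch below builds request URLs for the GHO OData API and the ClinicalTrials.gov API v2. It only constructs URLs and makes no network calls. It assumes that the GHO `TimeDim` field can be used in an OData `$filter`, and it uses the ClinicalTrials.gov parameters `query.cond`, `filter.overallStatus`, and `pageSize`. The function names are hypothetical, and the Actor may build its requests differently.

```python
from urllib.parse import quote, urlencode

GHO_BASE = "https://ghoapi.azureedge.net/api"
TRIALS_BASE = "https://clinicaltrials.gov/api/v2/studies"


def gho_url(indicator, start_year, end_year):
    """URL for one GHO indicator, limited to a year range."""
    # Assumption: TimeDim holds the year and is filterable via OData $filter.
    odata_filter = f"TimeDim ge {start_year} and TimeDim le {end_year}"
    return f"{GHO_BASE}/{indicator}?$filter={quote(odata_filter)}"


def trials_url(condition, status, page_size=20):
    """URL for a ClinicalTrials.gov v2 search by condition and recruitment status."""
    params = {
        "query.cond": condition,
        "filter.overallStatus": status,
        "pageSize": page_size,
    }
    return f"{TRIALS_BASE}?{urlencode(params)}"


print(gho_url("WHOSIS_000001", 2015, 2020))
print(trials_url("Malaria", "RECRUITING"))
```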
Output Example
The Actor produces a comprehensive dataset. Here is a simplified preview:
    [
      {
        "type": "Publication",
        "title": "Multi-country outbreak of cholera, Report #32",
        "date": "11/26/2024",
        "url": "https://www.who.int/publications/m/item/...",
        "extractedLocations": ["Zimbabwe", "Mozambique", "Ethiopia", "Kenya"],
        "alertTriggered": true,
        "matchedKeywords": ["cholera", "outbreak"]
      },
      {
        "type": "GHO Indicator",
        "title": "WHOSIS_000001 - IND",
        "date": "2020",
        "value": 69.17521966
      },
      {
        "type": "Clinical Trial",
        "title": "Phase 3 Malaria Vaccine Study in Children",
        "nctId": "NCT05790889",
        "status": "RECRUITING",
        "phase": "PHASE2",
        "conditions": "Malaria",
        "url": "https://clinicaltrials.gov/study/NCT05790889"
      }
    ]
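Records of this shape can be flattened into CSV rows with the standard library. The sketch below is one possible approach, not the Actor's own CSV export: the column selection and the `records_to_csv` name are assumptions, and list fields are joined with `; ` so each record stays on a single row.

```python
import csv
import io


def records_to_csv(records, columns):
    """Serialize records to CSV text, joining list values and leaving gaps empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        row = {}
        for col in columns:
            value = rec.get(col, "")  # fields missing for this source stay empty
            row[col] = "; ".join(value) if isinstance(value, list) else value
        writer.writerow(row)
    return buffer.getvalue()


records = [
    {
        "type": "Publication",
        "title": "Multi-country outbreak of cholera, Report #32",
        "alertTriggered": True,
        "matchedKeywords": ["cholera", "outbreak"],
    },
    {"type": "GHO Indicator", "title": "WHOSIS_000001 - IND"},
]
csv_text = records_to_csv(
    records, ["type", "title", "alertTriggered", "matchedKeywords"]
)
print(csv_text)
```

Titles that contain commas are quoted automatically by the `csv` module, so they do not break the column layout.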
Legal & Compliance
This Actor respects WHO and ClinicalTrials.gov terms of service:
- Uses official WHO Publications API (https://www.who.int/api/hubs/meetingreports)
- Uses official WHO GHO OData API (https://ghoapi.azureedge.net/api)
- Uses official ClinicalTrials.gov API v2 (https://clinicaltrials.gov/api/v2/studies)
- No web scraping: all data is accessed via public APIs
- PDF downloads respect standard HTTP protocols
Built with ❤️ and </> by The_Rook • Try it now