UNESCO World Heritage Sites List Scraper
Pricing
Pay per event
Scrapes the complete UNESCO World Heritage List — all inscribed and tentative sites with geo-coordinates, cultural/natural category, inscription criteria, danger status, area, and states parties. Data sourced from the official UNESCO World Heritage Centre.
Developer
BowTiedRaccoon
Maintained by Community
Scrape the complete UNESCO World Heritage List — all ~1,250 inscribed sites with geo-coordinates, cultural/natural/mixed category, inscription criteria, danger status, and states parties. Data sourced directly from the official UNESCO World Heritage Centre XML feed.
UNESCO World Heritage Sites List Scraper Features
- Returns all ~1,250 inscribed sites from the official UNESCO XML feed in a single run
- Optional: scrape the tentative list (~1,700 candidate sites) or both lists combined
- Extracts 19 fields per site: name, category, region, lat/lon, inscription year, criteria codes, danger status, area, and more
- Criteria codes returned in standard UNESCO format, e.g. `(i),(iii),(vi)`, ready for filtering or display
- Danger status and year included, which distinguishes currently at-risk heritage from safe inscriptions
- Direct URLs to each site's UNESCO detail page and primary image
- States parties returned as a comma-separated list — easy to filter by country
- Pay-per-record pricing: $0.10 per run + $0.002 per record
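As a rough illustration of the pricing line above, the following sketch estimates the cost of a run from the two stated rates ($0.10 per run, $0.002 per record). The function name `estimate_cost` is hypothetical, and the record count is the approximate figure quoted in this README, not a live number.

```python
from decimal import Decimal

RUN_FEE = Decimal("0.10")      # flat charge per run (from the pricing line)
RECORD_FEE = Decimal("0.002")  # charge per returned record (from the pricing line)


def estimate_cost(record_count: int) -> Decimal:
    """Estimate the USD cost of one run returning record_count records."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    return RUN_FEE + RECORD_FEE * record_count


# A full inscribed-list run at roughly 1,250 records.
print(estimate_cost(1250))  # 2.600
```

Setting `maxItems` caps the record count, so it also caps the per-record part of the charge.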
Who Uses UNESCO World Heritage Data?
- Travel content creators — Build destination guides, itineraries, and "top heritage sites by region" lists with verified UNESCO data
- Tourism boards and DMOs — Access authoritative site metadata, categories, and danger status for official travel marketing
- Education platforms — Populate lesson plans, quizzes, and encyclopedias with up-to-date heritage site information
- LLM and RAG knowledge bases — Ingest the full World Heritage List as a structured dataset for AI assistants and retrieval systems
- Researchers and NGOs — Analyze distribution of heritage sites by country, region, category, or danger status
- App developers — Build interactive heritage maps, country-by-country explorers, or cultural travel apps
How UNESCO World Heritage Sites List Scraper Works
- Select a list type: `inscribed` (default), `tentative`, or `all`.
- The actor obtains a valid session for the UNESCO World Heritage Centre website, which is protected by Cloudflare.
- The official XML feed is fetched: a single 2.4 MB document containing all site records.
- Each `<row>` element is parsed into a structured record with all 19 fields.
- Records stream into the Apify dataset. A full inscribed-list run (~1,250 sites) finishes in under two minutes.
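The parsing step can be sketched with Python's standard library. This is not the actor's own code: the child tag names inside `<row>` (`id_number`, `site`, `criteria_txt`, and so on) and the `<query>` wrapper are assumptions made for illustration, and only a few of the 19 output fields are mapped.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for the feed. The tag names below are assumed, not confirmed.
SAMPLE_XML = """<query>
  <row>
    <id_number>90</id_number>
    <site>Abu Mena</site>
    <category>Cultural</category>
    <states>Egypt</states>
    <latitude>30.8358333333</latitude>
    <longitude>29.66666667</longitude>
    <date_inscribed>1979</date_inscribed>
    <criteria_txt>(iv)</criteria_txt>
  </row>
</query>"""


def child_text(row, tag):
    """Return the stripped text of a child element, or None if absent or empty."""
    value = row.findtext(tag)
    if value is None:
        return None
    value = value.strip()
    return value or None


def cast_or_none(value, cast):
    """Apply cast (int or float) unless the value is missing."""
    return cast(value) if value is not None else None


def parse_rows(xml_text):
    """Turn every <row> element into a dict using the output field names."""
    root = ET.fromstring(xml_text)
    records = []
    for row in root.iter("row"):
        records.append({
            "site_id": cast_or_none(child_text(row, "id_number"), int),
            "name": child_text(row, "site"),
            "category": child_text(row, "category"),
            "states_parties": child_text(row, "states"),
            "latitude": cast_or_none(child_text(row, "latitude"), float),
            "longitude": cast_or_none(child_text(row, "longitude"), float),
            "date_inscribed": cast_or_none(child_text(row, "date_inscribed"), int),
            "criteria": child_text(row, "criteria_txt"),
        })
    return records


records = parse_rows(SAMPLE_XML)
print(records[0]["name"], records[0]["date_inscribed"])  # Abu Mena 1979
```

Missing or empty elements become `None`, which matches how the output section describes unpublished fields.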
Input
{"maxItems": 100,"listType": "inscribed"}
| Field | Type | Default | Description |
|---|---|---|---|
| `maxItems` | integer | `0` (all) | Maximum number of records to return. Set to 0 or omit for all sites. |
| `listType` | string | `inscribed` | Which list to scrape: `inscribed` (~1,250 sites), `tentative` (~1,700 candidate sites), or `all` (both combined). |
| `proxyConfiguration` | object | Apify residential | Proxy settings. The actor requires residential proxies to pass the site's Cloudflare protection. Leave as default unless you have a specific proxy requirement. |
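If you assemble the input programmatically, a small client-side check can catch typos before a run is charged. The sketch below is an assumption about how one might do that, not the actor's own validation; `build_input` is a hypothetical helper, and `proxyConfiguration` is deliberately omitted so the default applies.

```python
import json

ALLOWED_LIST_TYPES = ("inscribed", "tentative", "all")


def build_input(max_items: int = 0, list_type: str = "inscribed") -> dict:
    """Build an input object using the field names from the table above."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer (0 means all)")
    if list_type not in ALLOWED_LIST_TYPES:
        raise ValueError(f"listType must be one of {ALLOWED_LIST_TYPES}")
    return {"maxItems": max_items, "listType": list_type}


print(json.dumps(build_input(100, "inscribed")))
# {"maxItems": 100, "listType": "inscribed"}
```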
Output
Each record in the dataset represents one UNESCO World Heritage Site.
{"site_id": 90,"name": "Abu Mena","name_local": null,"category": "Cultural","short_description": "<p>The church, baptistry, basilicas, public buildings, streets...</p>","states_parties": "Egypt","region": "Arab States","latitude": 30.8358333333,"longitude": 29.66666667,"date_inscribed": 1979,"criteria": "(iv)","in_danger": true,"danger_listed_year": 2001,"area_hectares": null,"extension": false,"transboundary": false,"detail_url": "https://whc.unesco.org/en/list/90","image_url": "https://whc.unesco.org/uploads/sites/site_90.jpg","source_id": "90"}
| Field | Type | Description |
|---|---|---|
| `site_id` | integer | Unique numeric site identifier assigned by UNESCO |
| `name` | string | Site name in English |
| `name_local` | string | Site name in the local/official language (null if not published in XML) |
| `category` | string | Site category: Cultural, Natural, or Mixed |
| `short_description` | string | Official UNESCO short description (may contain HTML) |
| `states_parties` | string | Comma-separated list of countries (full names, e.g. `France,Spain`) |
| `region` | string | UNESCO world region (e.g. Europe and North America, Asia and the Pacific) |
| `latitude` | number | Site latitude in decimal degrees (first component for transnational sites) |
| `longitude` | number | Site longitude in decimal degrees (first component for transnational sites) |
| `date_inscribed` | integer | Year the site was inscribed on the World Heritage List |
| `criteria` | string | Inscription criteria as a comma-separated string, e.g. `(i),(iii),(vi)` |
| `in_danger` | boolean | `true` if the site is on the List of World Heritage in Danger |
| `danger_listed_year` | integer | Year the site was added to the danger list (null if not in danger) |
| `area_hectares` | number | Total inscribed area in hectares (always null: not published in the XML feed) |
| `extension` | boolean | `true` if this inscription was an extension of an earlier site |
| `transboundary` | boolean | `true` if the site spans multiple countries |
| `detail_url` | string | Full URL to the site's page on the UNESCO World Heritage Centre website |
| `image_url` | string | URL of the primary site image on the UNESCO website |
| `source_id` | string | Raw source identifier from the XML feed |
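Several fields arrive as delimited or HTML-bearing strings, so a little post-processing helps before filtering. The sketch below works on records shaped like the sample above; the helper names are hypothetical, the second record is a made-up placeholder, and the naive comma split assumes no country name itself contains a comma.

```python
import html
import re


def split_list(value):
    """Split a comma-separated field such as criteria or states_parties."""
    if not value:
        return []
    return [part.strip() for part in value.split(",") if part.strip()]


def strip_html(text):
    """Drop tags and decode entities in short_description (simple regex approach)."""
    if not text:
        return ""
    return html.unescape(re.sub(r"<[^>]+>", "", text)).strip()


records = [
    {  # trimmed copy of the sample record shown above
        "name": "Abu Mena",
        "short_description": "<p>The church, baptistry, basilicas, public buildings, streets...</p>",
        "states_parties": "Egypt",
        "criteria": "(iv)",
        "in_danger": True,
    },
    {  # placeholder record, invented purely to exercise the helpers
        "name": "Example Site",
        "short_description": None,
        "states_parties": "France,Spain",
        "criteria": "(i),(iii),(vi)",
        "in_danger": False,
    },
]

# Sites currently on the danger list.
in_danger = [r["name"] for r in records if r["in_danger"]]

# Sites that list a given country among their states parties.
in_spain = [r["name"] for r in records if "Spain" in split_list(r["states_parties"])]

print(in_danger)  # ['Abu Mena']
print(in_spain)   # ['Example Site']
print(split_list(records[1]["criteria"]))  # ['(i)', '(iii)', '(vi)']
```

Matching on the split list rather than on the raw string avoids false hits where one country name is a substring of another.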
Notes
- `area_hectares` is `null` for all records: the official XML feed does not include area data. Area figures are available on per-site detail pages.
- `name_local` is `null` for all inscribed-list records: the inscribed XML does not include local-language names.
- For tentative-list sites, `date_inscribed` holds the submission year, not an inscription year (these sites are not yet inscribed).
- Latitude/longitude values for transnational sites with multiple geographic components reflect the first component's coordinates.