UNESCO World Heritage Sites List Scraper

Pricing

Pay per event

Scrapes the complete UNESCO World Heritage List: all inscribed sites and, optionally, tentative-list candidates, with geo-coordinates, cultural/natural/mixed category, inscription criteria, danger status, and states parties. An area field is included but is always null, because the feed does not publish area data. Data is sourced from the official UNESCO World Heritage Centre.

Developer

BowTiedRaccoon

Maintained by Community

Scrape the complete UNESCO World Heritage List — all ~1,250 inscribed sites with geo-coordinates, cultural/natural/mixed category, inscription criteria, danger status, and states parties. Data sourced directly from the official UNESCO World Heritage Centre XML feed.


UNESCO World Heritage Sites List Scraper Features

  • Returns all ~1,250 inscribed sites from the official UNESCO XML feed in a single run
  • Optional: scrape the tentative list (~1,700 candidate sites) or both lists combined
  • Extracts 19 fields per site: name, category, region, lat/lon, inscription year, criteria codes, danger status, area (always null; see Notes), and more
  • Criteria codes returned in standard UNESCO format: (i),(iii),(vi) — ready for filtering or display
  • Danger status and year included — distinguishes currently-at-risk heritage from safe inscriptions
  • Direct URLs to each site's UNESCO detail page and primary image
  • States parties returned as a comma-separated list — easy to filter by country
  • Pay-per-record pricing: $0.10 per run + $0.002 per record
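
The pricing bullet above translates into simple arithmetic. The sketch below assumes the two quoted rates are the only charges and that they apply exactly as written; `estimate_cost` is a hypothetical helper name, not part of the actor.

```python
from decimal import Decimal

# Rates as quoted in the feature list; treat them as assumptions and
# check the actor's pricing tab before budgeting against them.
RUN_FEE = Decimal("0.10")      # charged once per run
RECORD_FEE = Decimal("0.002")  # charged per returned record


def estimate_cost(record_count: int) -> Decimal:
    """Estimate the USD cost of one run that returns record_count records."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    return RUN_FEE + RECORD_FEE * record_count


# Full inscribed list, using the approximate count quoted on this page.
print(estimate_cost(1250))
```

With these rates, a full inscribed-list run of about 1,250 records comes to roughly $2.60, and a capped run with maxItems set to 100 comes to $0.30.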

Who Uses UNESCO World Heritage Data?

  • Travel content creators — Build destination guides, itineraries, and "top heritage sites by region" lists with verified UNESCO data
  • Tourism boards and DMOs — Access authoritative site metadata, categories, and danger status for official travel marketing
  • Education platforms — Populate lesson plans, quizzes, and encyclopedias with up-to-date heritage site information
  • LLM and RAG knowledge bases — Ingest the full World Heritage List as a structured dataset for AI assistants and retrieval systems
  • Researchers and NGOs — Analyze distribution of heritage sites by country, region, category, or danger status
  • App developers — Build interactive heritage maps, country-by-country explorers, or cultural travel apps

How UNESCO World Heritage Sites List Scraper Works

  1. Select a list type: inscribed (default), tentative, or all.
  2. The actor obtains a valid session for the UNESCO World Heritage Centre website, which is protected by Cloudflare.
  3. The official XML feed is fetched — a single 2.4 MB document containing all site records.
  4. Each <row> element is parsed into a structured record with all 19 fields.
  5. Records stream into the Apify dataset. A full inscribed-list run (~1,250 sites) finishes in under two minutes.
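
Step 4 can be sketched with the Python standard library. This is an illustration, not the actor's code: the child tag names (`id_number`, `site`, `states`, `criteria_txt`, `danger`) and the rule "non-empty `danger` means in danger" are assumptions, and the sample below is a tiny stand-in for the real feed.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for the feed. Tag names are assumptions for illustration.
SAMPLE_XML = """<query>
  <row>
    <id_number>90</id_number>
    <site>Abu Mena</site>
    <category>Cultural</category>
    <states>Egypt</states>
    <latitude>30.8358333333</latitude>
    <longitude>29.66666667</longitude>
    <date_inscribed>1979</date_inscribed>
    <criteria_txt>(iv)</criteria_txt>
    <danger>2001</danger>
  </row>
</query>"""


def text_of(row, tag):
    """Return the stripped text of a child element, or None if missing or empty."""
    value = row.findtext(tag)
    if value is None:
        return None
    value = value.strip()
    return value or None


def to_number(value, cast):
    """Cast a string with int or float, passing None through unchanged."""
    return cast(value) if value is not None else None


def parse_rows(xml_text, max_items=0):
    """Parse every <row> into a dict; max_items=0 means no limit."""
    records = []
    for row in ET.fromstring(xml_text).iter("row"):
        site_id = text_of(row, "id_number")
        records.append({
            "site_id": to_number(site_id, int),
            "name": text_of(row, "site"),
            "category": text_of(row, "category"),
            "states_parties": text_of(row, "states"),
            "latitude": to_number(text_of(row, "latitude"), float),
            "longitude": to_number(text_of(row, "longitude"), float),
            "date_inscribed": to_number(text_of(row, "date_inscribed"), int),
            "criteria": text_of(row, "criteria_txt"),
            # Assumption: any non-empty <danger> value marks the site as in danger.
            "in_danger": text_of(row, "danger") is not None,
            "source_id": site_id,
        })
        if max_items and len(records) >= max_items:
            break
    return records


records = parse_rows(SAMPLE_XML)
print(records[0]["name"], records[0]["latitude"], records[0]["in_danger"])
```

Missing or empty elements become None rather than empty strings, which matches how null fields appear in the output records.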

Input

{
  "maxItems": 100,
  "listType": "inscribed"
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 0 (all) | Maximum number of records to return. Set to 0 or omit for all sites. |
| listType | string | inscribed | Which list to scrape: inscribed (~1,250 sites), tentative (~1,700 candidate sites), or all (both combined). |
| proxyConfiguration | object | Apify residential | Proxy settings. The actor requires residential proxies to pass the site's Cloudflare protection. Leave as default unless you have a specific proxy requirement. |
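
When the input is built programmatically, it can help to validate it before starting a run. The sketch below is a hypothetical client-side helper (`build_run_input` is not part of the actor); it only encodes the field names, allowed values, and defaults listed in the table above, and it omits proxyConfiguration on the assumption that the actor then falls back to its default.

```python
import json

# Allowed values for listType, as documented in the input table.
ALLOWED_LIST_TYPES = {"inscribed", "tentative", "all"}


def build_run_input(list_type="inscribed", max_items=0):
    """Return a validated input dict; max_items=0 requests all records."""
    if list_type not in ALLOWED_LIST_TYPES:
        raise ValueError(f"listType must be one of {sorted(ALLOWED_LIST_TYPES)}")
    if not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")
    # proxyConfiguration is left out so the default setting applies.
    return {"listType": list_type, "maxItems": max_items}


print(json.dumps(build_run_input("tentative", 50), indent=2))
```

The resulting dict can be pasted into the actor's JSON input editor or passed as the run input through whichever client is used to start the run.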

Output

Each record in the dataset represents one site: an inscribed UNESCO World Heritage Site or, when listType is tentative or all, a tentative-list candidate.

{
  "site_id": 90,
  "name": "Abu Mena",
  "name_local": null,
  "category": "Cultural",
  "short_description": "<p>The church, baptistry, basilicas, public buildings, streets...</p>",
  "states_parties": "Egypt",
  "region": "Arab States",
  "latitude": 30.8358333333,
  "longitude": 29.66666667,
  "date_inscribed": 1979,
  "criteria": "(iv)",
  "in_danger": true,
  "danger_listed_year": 2001,
  "area_hectares": null,
  "extension": false,
  "transboundary": false,
  "detail_url": "https://whc.unesco.org/en/list/90",
  "image_url": "https://whc.unesco.org/uploads/sites/site_90.jpg",
  "source_id": "90"
}

| Field | Type | Description |
| --- | --- | --- |
| site_id | integer | Unique numeric site identifier assigned by UNESCO |
| name | string | Site name in English |
| name_local | string | Site name in the local/official language (null if not published in XML) |
| category | string | Site category: Cultural, Natural, or Mixed |
| short_description | string | Official UNESCO short description (may contain HTML) |
| states_parties | string | Comma-separated list of countries (full names, e.g. France,Spain) |
| region | string | UNESCO world region (e.g. Europe and North America, Asia and the Pacific) |
| latitude | number | Site latitude in decimal degrees (primary component for transnational sites) |
| longitude | number | Site longitude in decimal degrees |
| date_inscribed | integer | Year the site was inscribed on the World Heritage List |
| criteria | string | Inscription criteria as a comma-separated string, e.g. (i),(iii),(vi) |
| in_danger | boolean | true if the site is on the List of World Heritage in Danger |
| danger_listed_year | integer | Year the site was added to the danger list (null if not in danger) |
| area_hectares | number | Total inscribed area in hectares (always null; not published in the XML feed) |
| extension | boolean | true if this inscription was an extension of an earlier site |
| transboundary | boolean | true if the site spans multiple countries |
| detail_url | string | Full URL to the site's page on the UNESCO World Heritage Centre website |
| image_url | string | URL of the primary site image on the UNESCO website |
| source_id | string | Raw source identifier from the XML feed |
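
Several fields arrive as delimited or HTML-bearing strings, so downstream code usually normalizes them first. The helpers below are a minimal sketch with hypothetical names. Splitting states_parties on commas is naive and would break on any country name that itself contains a comma, and the regex tag stripper is only suitable for simple markup like the sample above.

```python
import re
from html import unescape


def split_csv(value):
    """Split a comma-separated string into trimmed parts; None or '' gives []."""
    if not value:
        return []
    return [part.strip() for part in value.split(",") if part.strip()]


def parse_criteria(value):
    """Turn '(i),(iii),(vi)' into ['i', 'iii', 'vi']."""
    return [code.strip("()") for code in split_csv(value)]


def strip_html(value):
    """Drop tags and decode entities in a short description."""
    if not value:
        return ""
    return unescape(re.sub(r"<[^>]+>", "", value)).strip()


record = {
    "states_parties": "France,Spain",
    "criteria": "(i),(iii),(vi)",
    "short_description": "<p>The church, baptistry, basilicas...</p>",
}
print(split_csv(record["states_parties"]))
print(parse_criteria(record["criteria"]))
print(strip_html(record["short_description"]))
```

Keeping criteria as a list makes filters such as "all sites inscribed under criterion vi" a simple membership test.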

Notes

  • area_hectares is null for all records — the official XML feed does not include area data. Area figures are available on per-site detail pages.
  • name_local is null for all inscribed-list records — the inscribed XML does not include local-language names.
  • For tentative-list sites, date_inscribed holds the submission year, not an inscription year (these sites are not yet inscribed).
  • Latitude/longitude values for transnational sites with multiple geographic components reflect the first component's coordinates.
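
For mapping use cases, records can be converted to GeoJSON points. This is a sketch under stated assumptions: it treats a missing latitude or longitude as a reason to skip the record (the notes above do not say whether coordinates can be null), and the second sample record is hypothetical, included only to exercise that path. Per the note above, a transnational site is represented by a single point for its first component.

```python
import json


def to_feature(record):
    """Build a GeoJSON Point feature, or return None if coordinates are missing."""
    lat, lon = record.get("latitude"), record.get("longitude")
    if lat is None or lon is None:
        return None
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "site_id": record.get("site_id"),
            "name": record.get("name"),
            "category": record.get("category"),
            "in_danger": record.get("in_danger"),
            "detail_url": record.get("detail_url"),
        },
    }


def to_feature_collection(records):
    """Wrap all mappable records in a FeatureCollection."""
    features = [f for f in map(to_feature, records) if f is not None]
    return {"type": "FeatureCollection", "features": features}


sample_records = [
    {
        "site_id": 90, "name": "Abu Mena", "category": "Cultural",
        "latitude": 30.8358333333, "longitude": 29.66666667,
        "in_danger": True, "detail_url": "https://whc.unesco.org/en/list/90",
    },
    # Hypothetical record without coordinates, to show the skip behavior.
    {"site_id": -1, "name": "Example without coordinates",
     "latitude": None, "longitude": None},
]

collection = to_feature_collection(sample_records)
print(json.dumps(collection, indent=2))
```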