Pricing

Pay per event

Viator & GetYourGuide Tours Scraper

Scrape tour and activity listings from Viator and GetYourGuide into one normalized schema (price, duration, rating, review count, and booking URL per activity) and export to JSON or CSV. A Viator / GetYourGuide API alternative for tour operators and OTA analysts.

Developer

DevilScrapes

Viator & GetYourGuide Scraper

We do the dirty work so your dataset stays clean. 😈

$3.05 / 1,000 activity rows — pay only for results that land. No credit card to try.

Viator and GetYourGuide list overlapping inventory with different SKUs and prices. Running two single-source scrapers and writing your own normalization layer costs 1–2 dev-weeks and still leaves you with mismatched schemas. This Actor hits both platforms in one run, absorbs the blocks and retries, and emits one flat ResultRow per activity — platform, price, rating, duration, booking URL — straight into your Apify dataset.

One run. One schema. Ready for your spreadsheet, BI tool, or warehouse.

🎯 What this scrapes

Two major tours-and-activities platforms, unified into one Pydantic-validated schema:

Viator — viator.com/searchResults/all?text=<query> (server-rendered HTML, 24 cards per page, data-automation test attributes)
GetYourGuide — getyourguide.com/s/?q=<query> (Vue.js shell with SSR card content, 24 cards per page)
Klook (v2 upgrade — currently returns 0 rows) — klook.com/search/?keyword=<query> is gated by a JS challenge that requires full browser execution. Documented as a future upgrade behind Camoufox; v1 returns [] with a WARNING.

Output rows carry every field needed for cross-platform comparison:

| Field | Type | Description |
| --- | --- | --- |
| `platform` | string | Platform literal (`viator`, `getyourguide`, `klook`) |
| `activity_id` | string | Platform-canonical activity ID |
| `activity_title` | string | Card title text |
| `location_query` | string | Echo of the user input; useful when batching cities |
| `location_city` | string \| null | Best-effort city parsed from URL or query |
| `location_country` | string \| null | Best-effort country |
| `price_usd` | number \| null | USD price when the platform itself displays USD |
| `currency_original` | string \| null | ISO 4217 code parsed from the price symbol |
| `price_original` | number \| null | Numeric price in the displayed currency |
| `duration_hours` | number \| null | Activity duration in hours; midpoint for ranges |
| `rating` | number \| null | Star rating, 0–5 scale |
| `review_count` | integer \| null | Number of reviews |
| `operator_name` | string \| null | Tour operator/supplier when surfaced |
| `category` | string \| null | Card tag (`tour`, `experience`, ...) |
| `booking_url` | string | Absolute URL to the activity page |
| `image_url` | string \| null | Absolute URL to the primary thumbnail |
| `scraped_at` | string | ISO 8601 UTC timestamp |

🔥 Features

Two platforms, one schema — drop the dataset straight into a spreadsheet or BI tool; no per-platform normalization required.
We rotate browser fingerprints — curl-cffi impersonates Chrome 131 / Chrome 124 / Firefox 147 at the TLS+HTTP/2 layer, so both platforms see real-browser traffic, not Python.
We retry with exponential backoff — 408 / 429 / 503 responses trigger up to 5 attempts with doubling delays; Retry-After headers are honoured.
We rotate residential proxies — BUYPROXIES94952 routing is on by default; a fresh session_id and fresh exit IP are issued on every block.
Per-platform isolation — one platform's failure does not abort the run; surviving platforms still produce data.
Currency-aware — symbol-to-ISO mapping (€/$/£/¥ → EUR/USD/GBP/JPY); price_usd is populated only when the platform itself displays USD.
Duration parser handles ranges — "5 to 9 hours" → 7.0 (midpoint); "30 minutes" → 0.5; "1 day" → 24.0.
Pydantic v2 validation — input and every output row are model-validated; invalid input fails fast with a clear error before any network call.
Clean dataset rows — ISO-8601 timestamps, stable platform IDs, no half-parsed strings.
Configurable cap — maxPerPlatform lets you cap each platform at 1–100 rows per run.
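
The retry behaviour described in the feature list (408 / 429 / 503, up to 5 attempts, doubling delays, Retry-After honoured) could be sketched roughly as below. This is an illustration, not the Actor's actual code: the 1-second base delay, the `backoff_delay` and `fetch_with_retries` names, and the `send` callable returning `(status, headers, body)` are all assumptions made for the example.

```python
import time
from typing import Callable, Optional

RETRYABLE_STATUSES = {408, 429, 503}  # statuses named in the feature list
MAX_ATTEMPTS = 5


def backoff_delay(attempt: int, retry_after: Optional[str] = None, base: float = 1.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    if retry_after is not None:
        try:
            # Only the delta-seconds form of Retry-After is handled in this sketch
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form: fall back to the doubling schedule
    return base * 2 ** (attempt - 1)  # 1s, 2s, 4s, 8s (assumed base of 1s)


def fetch_with_retries(send: Callable, sleep: Callable = time.sleep):
    """Call the hypothetical `send()` until it returns a non-retryable status."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, headers, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < MAX_ATTEMPTS:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, body  # still failing after the last attempt
```

Passing `sleep` as a parameter keeps the schedule testable without real waiting; a production version would likely also add jitter and rotate the proxy session between attempts.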

💡 Use cases

Tour operator competitive intelligence — find every activity your competitors list in your destination, compare prices, ratings, and durations side-by-side.
OTA cross-platform analyst dashboards — feed a BI tool with snapshots of how Viator and GetYourGuide each rank the same destination.
Dynamic pricing strategy — track how the same activity type is priced on each platform over time and adjust your own listings accordingly.
Destination intelligence reports — schedule weekly runs for "Paris" or "Tokyo" into a named dataset and chart price drift.
Travel-blogger affiliate research — surface high-rating, high-review-count activities for destination guides without manual browsing.
Inbound-tour-builder market research — discover which experiences dominate the first-page results when entering a new destination.
Travel-tech investor diligence — benchmark the top-of-funnel pricing across the experience-booking layer of the travel stack.

⚙️ How to use it

Open the Actor input form.
Type a Destination (Paris, New York, Tokyo, Bali, …).
(Optional) Pick Platforms — leave empty to scrape all three, or list a subset like ["viator"] or ["getyourguide"]. Klook returns 0 rows in v1 (documented limitation).
(Optional) Set Max rows per platform — default 20, max 100.
Leave Use Apify Proxy on (default) for cleaner exit IPs when platforms throttle datacenter traffic.
Click Start. Results stream into the default dataset.

Quick examples

Both supported platforms, default cap:

{
  "locationQuery": "Paris"
}

GetYourGuide only, 5 rows (fastest path to confirm output shape):

{
  "locationQuery": "Paris",
  "platforms": ["getyourguide"],
  "maxPerPlatform": 5,
  "useProxy": false
}

Viator only, cap of 50 rows (v1 reads only the first results page, so expect at most about 24 rows):

{
  "locationQuery": "Tokyo",
  "platforms": ["viator"],
  "maxPerPlatform": 50,
  "useProxy": false
}

📥 Input

| JSON key | Type | Default | Description |
| --- | --- | --- | --- |
| `locationQuery` | string | (required) | Destination text query (e.g. `"Paris"`) |
| `platforms` | array of literal | `[]` (= all 3) | Subset of `viator` / `getyourguide` / `klook` |
| `maxPerPlatform` | integer | `20` | Cap on rows per platform (1–100) |
| `useProxy` | boolean | `true` | Route via Apify Proxy `BUYPROXIES94952` |

locationQuery is the only required field. Whitespace is stripped; blank values are rejected up-front by Pydantic before any network call is made.
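
To check an input locally before starting a run, the rules in the table above can be mirrored in a few lines. The sketch below deliberately uses a plain dataclass instead of Pydantic so it runs without dependencies; the `ActorInput` class name and the error messages are assumptions, and the Actor's real model may differ.

```python
from dataclasses import dataclass, field
from typing import List

ALLOWED_PLATFORMS = ("viator", "getyourguide", "klook")


@dataclass
class ActorInput:
    """Hypothetical stand-in for the Actor's input model."""
    locationQuery: str
    platforms: List[str] = field(default_factory=list)
    maxPerPlatform: int = 20
    useProxy: bool = True

    def __post_init__(self) -> None:
        # Whitespace is stripped; blank values are rejected up front
        self.locationQuery = self.locationQuery.strip()
        if not self.locationQuery:
            raise ValueError("locationQuery must not be blank")
        unknown = [p for p in self.platforms if p not in ALLOWED_PLATFORMS]
        if unknown:
            raise ValueError(f"unknown platforms: {unknown}")
        if not 1 <= self.maxPerPlatform <= 100:
            raise ValueError("maxPerPlatform must be between 1 and 100")
        # An empty list means "all platforms"
        if not self.platforms:
            self.platforms = list(ALLOWED_PLATFORMS)
```

For example, `ActorInput(locationQuery="  Paris ")` normalizes the query to `"Paris"` and expands `platforms` to all three literals, while `ActorInput(locationQuery="   ")` raises `ValueError`.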

📤 Output

One row per activity. See the What this scrapes table above for the full schema.

{
  "platform": "getyourguide",
  "activity_id": "508441",
  "activity_title": "Paris: Le Marais Guided Food Tour with Tastings",
  "location_query": "Paris",
  "location_city": "Paris",
  "location_country": null,
  "price_usd": null,
  "currency_original": "EUR",
  "price_original": 69.0,
  "duration_hours": 3.0,
  "rating": 4.9,
  "review_count": 506,
  "operator_name": null,
  "category": "experience",
  "booking_url": "https://www.getyourguide.com/paris-l16/no-diet-club-unique-local-food-tour-in-paris-le-marais-t508441/",
  "image_url": "https://cdn.getyourguide.com/image/.../tour_img/7b9edf635985a601.jpeg",
  "scraped_at": "2026-05-16T22:00:00.000Z"
}
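
Once rows like the one above are exported, a first cross-platform comparison needs very little code. The following is a minimal sketch, assuming the rows have been loaded into a list of dicts; the `median_price_by_platform` helper is hypothetical. Prices are grouped by platform and display currency because the Actor does not convert currencies.

```python
from collections import defaultdict
from statistics import median


def median_price_by_platform(rows):
    """Median displayed price per (platform, currency) pair."""
    groups = defaultdict(list)
    for row in rows:
        price = row.get("price_original")
        currency = row.get("currency_original")
        if price is None or currency is None:
            continue  # nullable fields: skip rows without a usable price
        groups[(row["platform"], currency)].append(price)
    return {key: median(prices) for key, prices in groups.items()}
```

Keeping the currency in the grouping key avoids averaging EUR and USD amounts together when the two platforms display different currencies.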

💰 Pricing

Pay-Per-Event (PPE) — you pay only for results that land:

| Event | Rate | Trigger |
| --- | --- | --- |
| `actor-start` | $0.05 | Once per run at Actor boot |
| `result-row` | $0.003 | Per activity row emitted |

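Typical run cost (default maxPerPlatform=20, 2 working platforms, ~40 rows): $0.05 + 40 × $0.003 = ~$0.17. Per 1,000 rows: 1,000 × $0.003 plus one $0.05 start event = ~$3.05; every additional run adds its own $0.05 start event.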

No results → no charge beyond the $0.05 start event. No subscription, no seat fee.
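
The arithmetic behind these figures can be reproduced with the two rates from the table. The helper names below are made up for the example; the rates are the ones listed above.

```python
import math

START_FEE_USD = 0.05   # actor-start, charged once per run
ROW_FEE_USD = 0.003    # result-row, charged per emitted row


def run_cost(rows: int) -> float:
    """Cost of a single run that emits `rows` rows."""
    return round(START_FEE_USD + rows * ROW_FEE_USD, 2)


def cost_for_rows(total_rows: int, rows_per_run: int) -> float:
    """Cost of collecting `total_rows` when each run yields `rows_per_run`."""
    runs = max(1, math.ceil(total_rows / rows_per_run))
    return round(runs * START_FEE_USD + total_rows * ROW_FEE_USD, 2)
```

`run_cost(40)` gives 0.17 and `run_cost(0)` gives 0.05, matching the figures above. Because the start fee is charged per run, `cost_for_rows(1000, 40)` comes to 4.25, whereas the $3.05 figure corresponds to a single start event.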

🚧 Limitations

Klook returns 0 rows in v1 — the search endpoint is gated by a JS challenge that curl-cffi cannot clear without a full browser. We document this up front and ship without it rather than over-promise; v2 will add a Camoufox path behind a feature flag.
First page only — no pagination across multiple result pages. Each platform returns ~20–24 cards on the first page; the default cap is 20.
No detail-page scraping — we scrape the search-results surface only. Itineraries, photo galleries, availability calendars, and meeting points are out of scope for v1.
Currency follows the platform's display — price_usd is populated only when the platform itself displays USD. We do not run our own FX conversion.
Search relevance is the platform's — "Paris" can include cards for Versailles or nearby destinations, depending on each platform's relevance engine.
Apify free-tier residential proxy is limited: BUYPROXIES94952 is the only proxy group provisioned on this account, which is sufficient at this Actor's scale.
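
The currency rule in the list above (symbol mapped to an ISO code, `price_usd` filled only when the displayed currency is USD, no FX conversion) might look like the sketch below. The `parse_price` function, its regular expression, and the assumption that commas are thousands separators are illustrative choices, not the Actor's confirmed parser.

```python
import re

# Symbol-to-ISO mapping as listed under Features
SYMBOL_TO_ISO = {"€": "EUR", "$": "USD", "£": "GBP", "¥": "JPY"}

# Assumes "1,250.50" style numbers (comma = thousands separator)
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text: str):
    """Return (currency_original, price_original, price_usd) for a price label."""
    currency = next((iso for sym, iso in SYMBOL_TO_ISO.items() if sym in text), None)
    match = _NUMBER.search(text)
    amount = float(match.group().replace(",", "")) if match else None
    # No FX conversion: price_usd is set only when the label itself is in USD
    price_usd = amount if currency == "USD" else None
    return currency, amount, price_usd
```

With this sketch, `parse_price("€69")` returns `("EUR", 69.0, None)` and `parse_price("$1,250.50")` returns `("USD", 1250.5, 1250.5)`. A label written with a decimal comma, such as "69,00 €", would need extra handling.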

❓ FAQ

Q: Does Viator or GetYourGuide offer an official API I can use instead? A: Both publish partner APIs, but access requires approved partner status and a commercial agreement, which most independent developers and analysts cannot obtain. This Actor scrapes the public search-results pages, so no partner relationship is needed. The output schema is compatible with what a partner API would return for the same fields.

Q: Why is Klook in the schema but returns 0 rows? A: Klook gates every meaningful endpoint behind a JS challenge that curl-cffi cannot clear without a full browser. Adding Klook v2 requires Camoufox, which costs roughly 10× the compute of HTTP scraping. We kept the platform literal in the schema so v2 can land without breaking the dataset shape — but for v1, every Klook call returns [] with a WARNING. Use platforms: ["viator", "getyourguide"] to skip the wasted call entirely.

Q: How is duration_hours parsed? A: Viator and GetYourGuide write durations in several formats: "3 hours", "1 hour", "30 minutes", "5 to 9 hours", "1 day", "2.5 hours". We parse all of them. For ranges, we use the midpoint ("5 to 9 hours" → 7.0). Anything unparseable stays null rather than crashing the row.
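
As an illustration of the parsing rules in this answer, here is a minimal sketch that handles the listed formats. The regular expression and the `parse_duration_hours` name are assumptions; the Actor's own parser may cover more cases (this one reads only the first quantity, so "1 hour 30 minutes" would come out as 1.0).

```python
import re
from typing import Optional

MINUTES_PER_UNIT = {"minute": 1, "hour": 60, "day": 1440}

# number, optional "to"/"-" second number, then a unit (singular or plural)
_DURATION = re.compile(
    r"(\d+(?:\.\d+)?)(?:\s*(?:to|-)\s*(\d+(?:\.\d+)?))?\s*(minute|hour|day)s?",
    re.IGNORECASE,
)


def parse_duration_hours(text: str) -> Optional[float]:
    """Return the duration in hours, the midpoint for ranges, or None."""
    match = _DURATION.search(text)
    if not match:
        return None  # unparseable text stays null instead of raising
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else low
    midpoint = (low + high) / 2
    return midpoint * MINUTES_PER_UNIT[match.group(3).lower()] / 60
```

Under these assumptions, "5 to 9 hours" yields 7.0, "30 minutes" yields 0.5, "1 day" yields 24.0, and text with no recognizable duration yields `None`.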

Q: Can I track an activity's price across multiple runs? A: Yes, but each run is an independent snapshot, so the comparison happens downstream. Schedule periodic runs and write to a named dataset (Actor.open_dataset(name=...)) or export to your warehouse, then match rows on platform and activity_id. The Apify default dataset retention is 7 days; a named dataset persists until you delete it.
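
Comparing two snapshots can be done on the exported rows alone. The sketch below assumes each snapshot is a list of row dicts and that `platform` plus `activity_id` identifies an activity across runs; `price_changes` is a hypothetical helper, not part of the Actor.

```python
def price_changes(previous_rows, current_rows):
    """List activities whose displayed price differs between two snapshots."""

    def index(rows):
        # (platform, activity_id) is used as the cross-run key
        return {
            (row["platform"], row["activity_id"]): row
            for row in rows
            if row.get("price_original") is not None
        }

    before, after = index(previous_rows), index(current_rows)
    changes = []
    for key in sorted(before.keys() & after.keys()):
        old, new = before[key], after[key]
        # No FX conversion is available, so skip rows whose currency changed
        if old.get("currency_original") != new.get("currency_original"):
            continue
        delta = round(new["price_original"] - old["price_original"], 2)
        if delta != 0:
            changes.append({
                "platform": key[0],
                "activity_id": key[1],
                "currency": new.get("currency_original"),
                "old_price": old["price_original"],
                "new_price": new["price_original"],
                "delta": delta,
            })
    return changes
```

Activities present in only one snapshot are ignored here; since search ranking decides which cards appear on the first page, a missing row does not necessarily mean the activity was delisted.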

Q: Can I batch multiple cities in one run? A: Not in v1: each run takes one locationQuery. To batch, create one Actor task per city and schedule them; tasks run in parallel up to your plan's concurrent-run cap. Each result row carries location_query so downstream pivots stay correct.
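
Generating the per-city inputs is mechanical. A small sketch, assuming you want the same settings for every city and that Klook is skipped because it returns no rows in v1 (the `build_inputs` helper is hypothetical):

```python
def build_inputs(cities, platforms=("viator", "getyourguide"), max_per_platform=20):
    """Build one Actor input dict per city, dropping blanks and duplicates."""
    seen = set()
    inputs = []
    for city in cities:
        query = city.strip()
        if not query or query.lower() in seen:
            continue
        seen.add(query.lower())
        inputs.append({
            "locationQuery": query,
            "platforms": list(platforms),
            "maxPerPlatform": max_per_platform,
        })
    return inputs
```

Each dict can then be saved as the input of its own task; `useProxy` is omitted so the default of `true` applies.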

Q: Why default useProxy: true? A: Both platforms run behind edge protection and occasionally throttle datacenter IP ranges. The default-on posture trades a small latency overhead for materially higher first-page success rates. If you are running from a clean residential network, you can set it to false.

Q: Why no detail-page scraping? A: Each detail page is a heavier scrape (cancellation policy, photos, availability) and is behind additional edge protection. v1 ships the breadth-first surface-level price intel that 80% of buyers actually need; detail-page scraping is on the v2 roadmap.

💬 Your feedback

Found a parser glitch, a missing platform, or a field that's broken? Open an issue on the Actor's Apify Store page. We read every report — the QA fixture for Paris keeps regressions locked, but real-world destination edge cases always surface new patterns.
