All notable changes to the Booking Pro Host Scraper are documented in this file.
The format is based on Keep a Changelog.
Progressive dataset output — Results stream to the Dataset during Phase 2 instead of waiting until the end of the run.
crossCountrySeenKeys — Deduplication across countries when pushing progressively in Country mode.
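For readers wondering how cross-country deduplication can coexist with progressive pushing, the following is a minimal sketch, not the scraper's actual code. The `HostRecord` shape, the `dedupKey` rule (hotel URL without its query string) and the `filterNew` helper are all assumptions made for illustration; only the idea of one seen-keys set shared across countries comes from the entry above.

```typescript
// Hypothetical record shape; the real output fields are not listed in this changelog.
interface HostRecord {
  hotelUrl: string;
  hostName: string;
}

// Assumed key: the hotel URL without its query string, lowercased.
function dedupKey(record: HostRecord): string {
  const q = record.hotelUrl.indexOf("?");
  const base = q === -1 ? record.hotelUrl : record.hotelUrl.slice(0, q);
  return base.toLowerCase();
}

// Returns only the records whose key has not been seen yet and marks them as seen.
// Passing the same Set for every country is what makes the dedup cross-country.
function filterNew(batch: HostRecord[], seen: Set<string>): HostRecord[] {
  const fresh: HostRecord[] = [];
  for (const record of batch) {
    const key = dedupKey(record);
    if (seen.has(key)) continue;
    seen.add(key);
    fresh.push(record);
  }
  return fresh;
}
```

With this shape, each country's batch would be passed through `filterNew` before being pushed, so a property that appears in the results of two countries is written to the Dataset only once.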
Phase 2 speed — Merged extractRawPageData + extractHotelMetaFromPage into one page.evaluate (~25–40s saved per 500 hotels).
Phase 2 concurrency — Default 12 (was 10), cap 15. perBrowser 150 MB (was 180) for more tabs.
Phase 1 concurrency — 8 parallel city searches (was 5).
Navigation timeout — 15s (was 20s) to fail faster and free slots.
maxRequestRetries — 1 (was 2) for quicker failure on blocked requests.
WAIT_APOLLO_MS — 1000ms (was 1500ms).
antiBlockingDelay — 50ms (was 80ms).
Progress logs — log.info only at milestones (1, 50, 100…). setStatusMessage still every 3 hotels.
Input schema — multiSearch: "Maximum coverage — 18 variations (City & Search URL only)". Removed misleading "Maximum coverage" hint from countries field.
Dataset schema — Removed "Full contact details" view (Apify's "All fields" covers it).
README — Restructured with table of contents, performance table, progressive output note.
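The progress-logging change above (log lines at 1, 50, 100… and a status message every 3 hotels) can be pictured with the small sketch below. The function names and the default step values are assumptions for illustration; the actual implementation is not shown in this changelog.

```typescript
// True for the first hotel and then for every `step`-th hotel (1, 50, 100, ...).
function isLogMilestone(done: number, step: number = 50): boolean {
  return done === 1 || (done > 0 && done % step === 0);
}

// True every `every`-th hotel; assumed to gate the status message update.
function isStatusTick(done: number, every: number = 3): boolean {
  return done > 0 && done % every === 0;
}
```

A caller would check both predicates after each processed hotel and emit a log line or a status update only when the corresponding one returns `true`.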
maxConcurrency — Input parameter to control Phase 2 concurrency. 0 = auto (scales with memory).
Resource blocking — websocket, manifest, texttrack, eventsource.
Iframe document blocking — All documents from non-booking.com/bstatic.com (maps, ads, widgets).
Domain blocking — Criteo, Outbrain, Taboola, Trustpilot, TripAdvisor, Stripe, PayPal, YouTube, Vimeo, LiveChat, Zendesk, Fullstory, Drift, Intercom, Optimizely, etc.
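The three blocking rules above can be combined into a single decision function. The sketch below is illustrative only: `ObservedRequest`, `hostMatches` and `shouldBlock` are hypothetical names, the hostnames in `BLOCKED_DOMAINS` are assumed (the changelog lists vendor names, not hostnames), and the list is deliberately shortened.

```typescript
// Hypothetical, library-agnostic view of an intercepted request.
interface ObservedRequest {
  url: string;
  resourceType: string; // e.g. "document", "script", "websocket"
  isMainFrame: boolean;
}

// Resource types named in the changelog entry.
const BLOCKED_RESOURCE_TYPES = new Set(["websocket", "manifest", "texttrack", "eventsource"]);
// Documents are only allowed from these domains when loaded in an iframe.
const ALLOWED_DOCUMENT_DOMAINS = ["booking.com", "bstatic.com"];
// Assumed hostnames for a few of the listed vendors; not the full list.
const BLOCKED_DOMAINS = ["criteo.com", "outbrain.com", "taboola.com"];

// Matches the domain itself or any of its subdomains.
function hostMatches(hostname: string, domain: string): boolean {
  return hostname === domain || hostname.endsWith("." + domain);
}

function shouldBlock(req: ObservedRequest): boolean {
  if (BLOCKED_RESOURCE_TYPES.has(req.resourceType)) return true;

  let hostname: string;
  try {
    hostname = new URL(req.url).hostname;
  } catch {
    return false; // unparsable URL: leave the request alone
  }

  if (BLOCKED_DOMAINS.some((d) => hostMatches(hostname, d))) return true;

  // Iframe documents from anything other than the allowed domains are blocked.
  if (req.resourceType === "document" && !req.isMainFrame) {
    return !ALLOWED_DOCUMENT_DOMAINS.some((d) => hostMatches(hostname, d));
  }
  return false;
}
```

Keeping the decision in a pure function like this makes the rules easy to unit test independently of whichever browser automation library performs the interception.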
Phase 1 — 3 parallel searches (instead of 2) for multiSearch and multi-city.
Phase 2 — Default concurrency 8 (instead of 5), cap 10.
Default instance — 4 GB RAM, 1h timeout.
antiBlockingDelay — 150ms (instead of 300ms).
WAIT_APOLLO_MS — 2500ms (instead of 4000ms).
Progress log — Every 3 hotels (instead of 5).
maxConcurrency: 0 — Validation fixed: 0 now accepted for auto mode.
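To make the `maxConcurrency` semantics concrete (0 = auto, explicit values limited by the cap), here is a hedged sketch. The `resolveMaxConcurrency` function, the `ConcurrencyLimits` shape, the auto formula (available memory divided by a per-browser budget) and the choice to clamp rather than reject values above the cap are all assumptions; the changelog only states that 0 is accepted and that auto mode scales with memory.

```typescript
// Hypothetical limits; cap and per-browser budget differ between releases.
interface ConcurrencyLimits {
  cap: number;
  perBrowserMb: number;
}

function resolveMaxConcurrency(input: number, memoryMb: number, limits: ConcurrencyLimits): number {
  // 0 is a valid value (auto mode); only negatives and non-integers are rejected.
  if (!Number.isInteger(input) || input < 0) {
    throw new RangeError("maxConcurrency must be a non-negative integer");
  }
  if (input === 0) {
    // Assumed auto rule: as many browsers as fit in memory, at least 1, at most the cap.
    const fit = Math.floor(memoryMb / limits.perBrowserMb);
    return Math.min(limits.cap, Math.max(1, fit));
  }
  // Assumed behavior: explicit values above the cap are clamped.
  return Math.min(input, limits.cap);
}
```

For example, under these assumptions a 4096 MB instance with a cap of 10 and a 180 MB budget resolves `0` to the cap, while a 512 MB instance resolves it to 2.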
GraphQL extraction + DOM fallback.
Pro hosts filter, host deduplication.
Modes: city name, search URL, hotel URLs.
multiSearch (18 variations per city).
Session pool (proxy IP reuse).
CSV/JSON export.