Hotel Price Aggregator (Booking.com + Kayak)

Pricing

from $5.00 / 1,000 results

Rating: 0.0 (0)

Developer: Andy Page (Maintained by Community)

Actor stats: 1 bookmarked, 2 total users, 1 monthly active user, last modified 2 days ago

Hotel Price Aggregator (Booking + Kayak)

Compare hotel prices across Booking.com and Kayak in a single scrape. This hotel price comparison scraper uses smart deduplication to match the same hotel across both platforms, calculates potential savings, and exports results as JSON, CSV, or Excel — ideal for travel agencies, hospitality competitors, and price monitoring.

Why Use This Actor?

  • One tool, two platforms — No need to run separate hotel scrapers for Booking.com and Kayak
  • Smart deduplication — Same hotel on multiple platforms? Merged automatically with price comparison
  • Savings finder — Instantly see which platform has the lowest hotel price and how much you save
  • Travel-ready output — JSON, CSV, or Excel with formatted price comparison sheets
  • Residential proxies — Built-in support for Apify residential proxies to handle aggressive bot detection
  • Browser fingerprinting — Crawlee fingerprint-suite integration for realistic browser profiles

Features

| Feature | Description |
|---|---|
| Multi-platform hotel search | Scrape Booking.com and Kayak simultaneously |
| Smart deduplication | Fuzzy name matching + geo-proximity + star rating validation |
| Hotel price comparison | Side-by-side pricing across both platforms for each hotel |
| Flexible filters | Filter by star rating, max price, amenities, property type |
| Multiple output formats | JSON (default), CSV, Excel with formatted workbooks |
| Date range support | Search any check-in/check-out date combination |
| Guest configuration | Adults, children, and rooms are all configurable |
| Pagination | Automatically fetches multiple pages for large result sets |
| Proxy support | Auto-enables Apify residential proxies on the platform |

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| location | string | Yes | | City or destination (e.g., "Paris, France") |
| checkIn | string | Yes | | Check-in date (YYYY-MM-DD) |
| checkOut | string | Yes | | Check-out date (YYYY-MM-DD) |
| guests | object | No | {adults: 2, children: 0, rooms: 1} | Guest configuration |
| platforms | array | No | ["booking", "kayak"] | Platforms to scrape |
| maxResultsPerPlatform | integer | No | 50 | Max hotel listings per platform (up to 500) |
| minStars | integer | No | | Minimum star rating (1-5) |
| maxPrice | number | No | | Maximum total price in USD |
| filters | object | No | {} | Advanced filters (amenities, propertyTypes) |
| deduplication | boolean | No | true | Merge same hotels across platforms |
| outputFormat | string | No | "json" | Output format: json, csv, excel |
| proxyConfig | object | No | Residential | Proxy configuration |
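
The constraints in this table can be checked before a run is started. The following is a minimal sketch, not the actor's actual validation code: `validate_input` and `DEFAULTS` are hypothetical names, and the 30-night cap is taken from the Limitations section.

```python
from datetime import date

# Defaults copied from the Input Parameters table.
DEFAULTS = {
    "guests": {"adults": 2, "children": 0, "rooms": 1},
    "platforms": ["booking", "kayak"],
    "maxResultsPerPlatform": 50,
    "deduplication": True,
    "outputFormat": "json",
}


def validate_input(raw: dict) -> dict:
    """Apply documented defaults and reject invalid input (hypothetical helper)."""
    for key in ("location", "checkIn", "checkOut"):
        if not raw.get(key):
            raise ValueError(f"missing required field: {key}")

    merged = {**DEFAULTS, **raw}
    # Merge guests field by field so a partial object keeps the other defaults.
    merged["guests"] = {**DEFAULTS["guests"], **raw.get("guests", {})}

    # date.fromisoformat raises ValueError for strings that are not ISO dates.
    nights = (date.fromisoformat(merged["checkOut"])
              - date.fromisoformat(merged["checkIn"])).days
    if not 1 <= nights <= 30:
        raise ValueError("stay must be between 1 and 30 nights")
    if not 1 <= merged["maxResultsPerPlatform"] <= 500:
        raise ValueError("maxResultsPerPlatform must be between 1 and 500")
    if set(merged["platforms"]) - {"booking", "kayak"}:
        raise ValueError("platforms may only contain 'booking' and 'kayak'")
    if merged["outputFormat"] not in {"json", "csv", "excel"}:
        raise ValueError("outputFormat must be json, csv, or excel")
    return merged
```

Running a check like this locally catches malformed dates or out-of-range limits before any proxy traffic is spent.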

Advanced Filter Options

Use the filters JSON field for additional filtering:

{
  "amenities": ["wifi", "parking", "breakfast"],
  "propertyTypes": ["hotel", "apartment"]
}

Or combine with the top-level minStars and maxPrice fields for quick filtering.
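
How these filters combine is not spelled out here, so the sketch below rests on assumptions: every requested amenity must be present, any listed property type is accepted, and maxPrice is compared against the total price. `passes_filters` is a hypothetical helper that operates on records shaped like the Output Schema, not the actor's own code.

```python
def passes_filters(hotel: dict, min_stars=None, max_price=None, filters=None) -> bool:
    """Return True if a hotel record satisfies all supplied filters (assumed semantics)."""
    filters = filters or {}

    if min_stars is not None and hotel.get("stars", 0) < min_stars:
        return False
    if max_price is not None and hotel["pricing"]["totalPrice"] > max_price:
        return False

    # Assumption: the hotel must offer every requested amenity.
    wanted = set(filters.get("amenities", []))
    if not wanted <= set(hotel.get("amenities", [])):
        return False

    # Assumption: the hotel's type must be one of the requested types.
    types = filters.get("propertyTypes")
    if types and hotel.get("propertyType") not in types:
        return False
    return True
```

The same function can be applied to a downloaded dataset to narrow results further without re-running the actor.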

Output Schema

Each hotel listing includes:

{
  "id": "agg_hotel_paris_001",
  "name": "Hotel de la Paix",
  "location": {
    "address": "123 Rue de Rivoli, 75001 Paris",
    "coordinates": { "lat": 48.8606, "lng": 2.3376 },
    "neighborhood": "1st Arrondissement",
    "city": "Paris, France",
    "country": "France"
  },
  "pricing": {
    "totalPrice": 850,
    "pricePerNight": 170,
    "currency": "USD",
    "taxes": 45,
    "fees": 15
  },
  "platformData": {
    "booking": {
      "url": "https://booking.com/hotel/fr/de-la-paix.html",
      "price": 850,
      "available": true,
      "scrapedAt": "2026-02-15T14:30:00Z"
    },
    "kayak": {
      "url": "https://kayak.com/hotels/...",
      "price": 875,
      "available": true,
      "scrapedAt": "2026-02-15T14:32:00Z"
    }
  },
  "lowestPrice": {
    "platform": "booking",
    "price": 850
  },
  "stars": 4,
  "rating": 8.7,
  "reviewCount": 1234,
  "amenities": ["wifi", "parking", "breakfast", "air_conditioning"],
  "propertyType": "hotel",
  "roomType": "Standard Double Room",
  "cancellationPolicy": "Free cancellation until June 10",
  "lastUpdated": "2026-02-15T14:32:00Z",
  "_platform": "aggregated",
  "_matchConfidence": 92
}
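
As an illustration of how a record like this can be consumed, the sketch below recomputes the cheapest platform from platformData and derives the saving against the most expensive one. `price_summary` is a hypothetical helper, and the returned field names are assumptions rather than part of the actor's output.

```python
def price_summary(record: dict) -> dict:
    """Compare per-platform prices in one aggregated record (hypothetical helper)."""
    prices = {
        platform: data["price"]
        for platform, data in record.get("platformData", {}).items()
        if data.get("available") and data.get("price") is not None
    }
    if not prices:
        return {"lowest": None, "highest": None, "savings": 0, "savingsPercent": 0.0}

    lowest = min(prices, key=prices.get)
    highest = max(prices, key=prices.get)
    savings = prices[highest] - prices[lowest]
    # Saving expressed relative to the most expensive platform.
    percent = round(100 * savings / prices[highest], 1) if prices[highest] else 0.0
    return {"lowest": lowest, "highest": highest,
            "savings": savings, "savingsPercent": percent}
```

For the example record above, this reports Booking.com as the cheaper platform with a saving of 25 USD.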

Example Use Cases

1. Travel Agency Daily Price Check

Find the best hotel deals across platforms for client bookings:

{
  "location": "Rome, Italy",
  "checkIn": "2026-08-15",
  "checkOut": "2026-08-22",
  "guests": { "adults": 2, "rooms": 1 },
  "platforms": ["booking", "kayak"],
  "maxResultsPerPlatform": 50,
  "minStars": 4,
  "deduplication": true
}

2. Hospitality Competitor Monitoring

Track competitor hotel pricing in your market:

{
  "location": "Miami Beach, FL",
  "checkIn": "2026-07-04",
  "checkOut": "2026-07-07",
  "guests": { "adults": 2, "rooms": 1 },
  "platforms": ["booking", "kayak"],
  "minStars": 4,
  "filters": { "propertyTypes": ["hotel"] },
  "deduplication": true,
  "outputFormat": "excel"
}

3. Budget Search with Amenity Filters

Find affordable hotels with specific amenities:

{
  "location": "Barcelona, Spain",
  "checkIn": "2026-09-01",
  "checkOut": "2026-09-05",
  "guests": { "adults": 2, "children": 1, "rooms": 1 },
  "platforms": ["booking", "kayak"],
  "maxResultsPerPlatform": 100,
  "maxPrice": 150,
  "filters": { "amenities": ["wifi"] },
  "deduplication": true
}

4. Single-Platform Quick Search

Use just one platform for a fast hotel lookup:

{
  "location": "Tokyo, Japan",
  "checkIn": "2026-10-01",
  "checkOut": "2026-10-05",
  "platforms": ["booking"],
  "maxResultsPerPlatform": 25
}

5. Enterprise Data Feed

Large-scale hotel data extraction for price comparison apps:

{
  "location": "London, UK",
  "checkIn": "2026-06-01",
  "checkOut": "2026-06-04",
  "guests": { "adults": 2, "rooms": 1 },
  "platforms": ["booking", "kayak"],
  "maxResultsPerPlatform": 200,
  "deduplication": true,
  "outputFormat": "json"
}

Output Formats

JSON (Default)

Standard Apify dataset output. Each hotel is a JSON object in the dataset.

CSV

Flattened data with proper headers, suitable for Excel/Google Sheets import. Saved to key-value store as output-csv.
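
To make "flattened" concrete, here is one possible way to turn nested records into CSV rows. It is a sketch under assumptions: the dotted column names (for example `pricing.totalPrice`) and the `"; "` list separator are illustrative choices, and the actor's real column layout may differ.

```python
import csv
import io


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into a single level using dotted keys (assumed naming)."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = "; ".join(map(str, value))
        else:
            flat[name] = value
    return flat


def to_csv(records: list) -> str:
    """Render records as CSV text with one header row."""
    rows = [flatten(record) for record in records]
    headers = []
    for row in rows:
        for key in row:
            if key not in headers:
                headers.append(key)  # keep first-seen column order
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=headers, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Records that lack a column, such as a hotel found on only one platform, get an empty cell instead of raising an error.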

Excel

Formatted workbook with two sheets:

  • Hotel Listings: Complete hotel data with all platform prices
  • Price Comparison: Summary with best/worst prices and savings per hotel

Saved to key-value store as output-excel.

How Deduplication Works

When you search across both platforms, the same hotel appears on each one — often with different prices. Our deduplication engine:

  1. Fuzzy name matching — "Hotel de la Paix" matches "Hotel de la Paix Paris" (85%+ similarity threshold)
  2. Geo-proximity check — Hotels within 50 meters are considered the same property
  3. Star rating validation — Must match within +/-0.5 stars
  4. Smart merging — Combines data from both platforms into a single record with complete platformData

Each merged record includes a _matchConfidence score (0-100) so you can verify the accuracy.
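
The three checks above can be sketched as follows. This is an illustration only: the actor's actual similarity metric is not documented here, so `difflib.SequenceMatcher` is a stand-in, the haversine formula is one common way to measure geo-distance, and requiring all three checks to pass is an assumption. All function names are hypothetical.

```python
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in metric)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def distance_m(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in meters using the haversine formula."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlng = radians(lng2 - lng1)
    h = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(h))


def is_same_hotel(a: dict, b: dict, name_threshold: float = 0.85,
                  max_distance_m: float = 50.0, star_tolerance: float = 0.5) -> bool:
    """Apply the name, distance, and star checks; all must pass (assumption)."""
    if name_similarity(a["name"], b["name"]) < name_threshold:
        return False
    ca, cb = a["location"]["coordinates"], b["location"]["coordinates"]
    if distance_m(ca["lat"], ca["lng"], cb["lat"], cb["lng"]) > max_distance_m:
        return False
    return abs(a["stars"] - b["stars"]) <= star_tolerance
```

Scores depend on the metric chosen, so a pair that clears the 85% threshold in the actor may score differently with this stand-in.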

Troubleshooting

"0 hotels scraped" or empty results

  • Check your proxy config: Both Booking.com and Kayak require residential proxies for reliable scraping. The actor auto-enables them on the Apify platform, but if you disabled proxies or are running locally, results may be empty.
  • Verify dates: Ensure checkIn and checkOut are future dates in YYYY-MM-DD format.
  • Check the location: Use a well-known city name (e.g., "Paris, France" rather than "CDG Airport Area").
  • Review debug artifacts: The actor saves screenshots and HTML to the key-value store on failure. Check debug-booking-screenshot-*, debug-kayak-screenshot-*, and debug-*-html keys.

Bot detection / CAPTCHA

  • Kayak's bot detection is particularly aggressive. The actor retries up to 4 times with fresh proxy sessions. If all attempts fail, try again later or reduce maxResultsPerPlatform.
  • Booking.com occasionally shows CAPTCHAs. Residential proxies solve most cases.
  • If issues persist, check the Apify proxy dashboard to confirm your residential proxy quota hasn't been exceeded.
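
The retry behaviour described above follows a common pattern: each attempt gets a new session identifier so the proxy assigns a different exit IP. The sketch below illustrates that pattern only. `scrape_with_retries` and its `fetch` callback are hypothetical, and treating "up to 4 times" as four attempts in total is an assumption.

```python
import uuid


def scrape_with_retries(fetch, max_attempts: int = 4, new_session_id=None):
    """Call fetch(session_id) with a fresh session per attempt (illustrative pattern)."""
    new_session_id = new_session_id or (lambda: uuid.uuid4().hex)
    last_error = None
    for _ in range(max_attempts):
        session_id = new_session_id()  # a new ID means a new proxy session
        try:
            return fetch(session_id)
        except Exception as error:  # e.g. a CAPTCHA page or a blocked request
            last_error = error
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

A production version would normally also wait between attempts; the delay is left out here to keep the sketch short.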

Slow runs

  • Each platform takes 30-90 seconds per page due to realistic browser rendering and anti-bot delays.
  • Tip: Reduce maxResultsPerPlatform for faster runs. Requests for 25-50 results per platform finish fastest.
  • The actor requires at least 2048 MB of memory. For large result sets (200+), 4096 MB is recommended.

CSV/Excel output missing

  • CSV and Excel files are saved to the key-value store, not the dataset. Look for output-csv or output-excel keys in the run's key-value store.

Limitations

  • Hotels only: This actor searches for hotel accommodations. It does not support flights, car rentals, or vacation packages.
  • Two platforms: Currently supports Booking.com and Kayak. Expedia is excluded because its bot detection makes reliable scraping infeasible (see FAQ).
  • Maximum 500 results per platform: Higher limits increase run time and the risk of bot detection.
  • Maximum 30-night stays: Date ranges are capped at 30 nights per search.
  • USD pricing: Prices are requested in USD. Other currencies may appear depending on the platform's geo-detection.
  • No real-time availability: Results reflect pricing at scrape time. Prices and availability change frequently on travel sites.
  • Rate-limited: The actor applies per-platform rate limits (100 req/min for Booking.com, 30 req/min for Kayak) to avoid IP blocks.
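
To show what the per-platform limits imply in practice, the sketch below spaces requests evenly: 100 req/min works out to one request every 0.6 seconds, and 30 req/min to one every 2 seconds. Even spacing is an assumption, since the actor may use a different scheme such as a sliding window, and `MinIntervalLimiter` is a hypothetical class.

```python
import time

# Limits quoted in the Limitations list.
RATE_LIMITS_PER_MIN = {"booking": 100, "kayak": 30}


class MinIntervalLimiter:
    """Enforce a minimum gap between requests (assumed even spacing)."""

    def __init__(self, per_minute: int, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been made yet

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds waited."""
        now = self._clock()
        if self._next_allowed is None or now >= self._next_allowed:
            self._next_allowed = now + self.interval
            return 0.0
        delay = self._next_allowed - now
        self._sleep(delay)
        self._next_allowed += self.interval
        return delay
```

Calling `wait()` before each request to a platform keeps the request rate at or below the configured limit.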

Pricing

| Apify Plan | Price per 1,000 listings |
|---|---|
| Free | $5.00 |
| Starter | $3.00 |
| Scale / Team | $2.00 |

Compare: Running 2 separate scrapers costs $6-10 per 1,000 listings. This actor saves you 40-60%.
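
For budgeting, the per-plan rates can be turned into a rough estimate. The sketch assumes cost scales linearly with the number of listings in the dataset; how merged records are counted for billing is not stated here. `estimate_cost` and the plan keys are hypothetical.

```python
# USD per 1,000 listings, from the Pricing table.
PRICE_PER_1000 = {"free": 5.00, "starter": 3.00, "scale": 2.00}


def estimate_cost(listings: int, plan: str = "free") -> float:
    """Estimate the cost in USD for a number of listings (linear pricing assumed)."""
    return round(listings / 1000 * PRICE_PER_1000[plan], 2)
```

For example, 200 listings on the Free plan come to an estimated $1.00.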

Platform Notes

| Platform | Method | Proxy Required | Notes |
|---|---|---|---|
| Booking.com | Playwright (browser) | Recommended | Most reliable, DOM + API interception, pagination |
| Kayak | Playwright (browser) | Required (residential) | JS rendering required, API interception + DOM fallback |

Both scrapers use Apify's fingerprint-suite for realistic browser fingerprints (user-agent, viewport, WebGL, Canvas, navigator properties) to minimize bot detection.

Tip: For best results, always use Apify residential proxies. The actor auto-enables residential proxies when running on the Apify platform.

FAQ

Q: How long does a typical run take?
A: 50 hotels across 2 platforms typically takes 2-3 minutes. Larger result sets (200+) may take 5-10 minutes.

Q: What if one platform fails?
A: The actor is resilient: if Kayak is blocked, you still get results from Booking.com. Failed platforms show up as error records in the output.

Q: Can I use this without proxies?
A: Booking.com may work without proxies for small runs. Kayak requires residential proxies. On the Apify platform, proxies are auto-enabled.

Q: How accurate is the deduplication?
A: We use conservative thresholds (85% name match + location verification) to minimize false positives. Each merged record includes a confidence score.

Q: Can I search for flights or car rentals?
A: No. This actor is designed specifically for hotel price comparison. Flight and car rental support may be added in a future actor.

Q: Why isn't Expedia included?
A: Expedia employs extremely aggressive bot detection (TLS fingerprinting, behavioral analysis) that makes reliable automated scraping infeasible without proprietary solutions. Booking.com and Kayak together provide comprehensive hotel coverage and pricing data.

Q: How often should I update the prefill dates?
A: The default check-in/check-out dates in the input schema are set ~6 months in the future. If Apify's automated tests start failing, update the prefill dates in .actor/input_schema.json.

Changelog

v1.0.0 (February 2026)

  • Initial public release
  • Booking.com and Kayak hotel scraping with API interception + DOM fallback
  • Smart deduplication (fuzzy name matching, geo-proximity, star rating)
  • JSON, CSV, and Excel output formats
  • Pagination support for Booking.com (up to 5 pages)
  • Auto-proxy detection on Apify platform
  • Comprehensive input validation and error handling
  • 100 unit tests covering core logic

Support

  • Issues: Report bugs via GitHub issues or Apify community
  • Feature requests: We'd love to hear your ideas — contact us through Apify
  • Enterprise pricing: For 100K+ listings/month, reach out for custom pricing

Built by A Page Ventures | Apify Store