Booking Scraper

Pricing

$29.00/month + usage

This Apify actor scrapes hotel data from Booking.com. It supports robust navigation, proxy configuration, batch processing, and flexible extraction limits.

Rating

3.1

(3)

Developer

scraping automation

Maintained by Community

Actor stats

3

Bookmarked

54

Total users

7

Monthly active users

18 days ago

Last modified

🏨 Booking.com Hotel Scraper - Advanced Web Scraping Actor

Professional-grade Booking.com scraper built with Playwright and Apify SDK. Extract comprehensive hotel data including prices, ratings, addresses, coordinates, room information, and direct booking links with advanced anti-detection measures using Camoufox.

🚀 Features

Core Functionality

  • Comprehensive Hotel Data Extraction: Name, price, rating, address, coordinates, and direct booking links
  • Room Data Extraction: Extract detailed room information including room types, bed types, prices, capacity, and features (when getDetails: true)
  • Advanced Anti-Detection: Human-like scrolling, mouse movements, and browser fingerprinting with Camoufox
  • Flexible Extraction Limits: Configurable maximum hotels with global counter tracking
  • Batch Processing: Process hotels in configurable batches for optimal performance
  • Detailed Mode: Visit individual hotel pages for complete addresses, precise coordinates, and room details

Technical Capabilities

  • Robust Navigation: Multiple fallback selectors for dynamic Booking.com interface
  • Proxy Support: Apify Proxy integration with residential IPs and country selection
  • Error Handling: Comprehensive retry logic and graceful failure recovery
  • Real-time Logging: Detailed progress tracking and debugging information
  • Memory Efficient: Optimized for large-scale scraping operations

Data Quality

  • Address Validation: Clean and validate extracted addresses
  • Coordinate Extraction: Multiple methods to find precise GPS coordinates
  • Price Normalization: Consistent price formatting when available; gracefully handles unavailable prices
  • Rating Accuracy: Extract guest ratings, qualitative review labels, review counts, and location scores
  • Rich Snippet Capture: Captures Booking.com's headline description so you can display contextual property summaries
  • Room Information: Detailed room data including types, bed configurations, capacity, prices, and amenities

📋 Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| destination | string | "Paris" | City or location to search for hotels |
| maxHotels | number | 10 | Maximum hotels to extract (0 = unlimited) |
| batchSize | number | 10 | Hotels per batch for processing |
| getDetails | boolean | true | Visit individual hotel pages for detailed data and room information |
| timeout | number | 600 | Maximum time in seconds for request handler timeout (300-3600). Note: The Actor Run Timeout (in Settings) must be higher than this value. |
| startUrls | array | ["https://www.booking.com"] | Starting URLs for crawling |
| proxyConfiguration | object | See below | Proxy settings for anti-detection |
| newUrlFunction | string | - | Custom proxy URL function (advanced) |
| checkin | string | "2025-07-06" | Check-in date (YYYY-MM-DD format) |
| checkout | string | "2025-07-07" | Check-out date (YYYY-MM-DD format) |
| groupAdults | string | "2" | Number of adults |
| groupChildren | string | "0" | Number of children |
| noRooms | string | "1" | Number of rooms |
| priceMin | number | - | Minimum price per night (€) |
| priceMax | number | - | Maximum price per night (€) |
| stars | array | - | Star rating filter (1-5), e.g. ["4", "5"] |
| minReviewScore | number | - | Minimum review score (0-10) |
| freeCancellation | boolean | false | Filter for free cancellation |
| petsAllowed | boolean | false | Filter for pets allowed |
| adultsOnly | boolean | false | Filter for adults-only properties |
| districts | array | - | Filter by districts/neighborhoods |
| maxDistanceFromCenter | number | - | Max distance from center (km) |
| customFilters | array | - | Custom filters using Booking.com nflt codes |
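
Before starting a run, it can help to sanity-check an input object against the constraints listed in the table. The sketch below is not part of the actor: `validateInput` is a hypothetical helper that checks the YYYY-MM-DD date format, the 300-3600 timeout range, the `maxHotels` convention, and the string type of the guest counts. The rule that `checkout` must fall after `checkin` is an assumption added for illustration.

```javascript
// Hypothetical helper, not part of the actor: checks an input object
// against constraints listed in the parameter table.
function validateInput(input) {
    const errors = [];
    const datePattern = /^\d{4}-\d{2}-\d{2}$/;

    for (const key of ['checkin', 'checkout']) {
        if (input[key] !== undefined && !datePattern.test(input[key])) {
            errors.push(`${key} must use YYYY-MM-DD format`);
        }
    }
    // Assumption: checkout must be later than checkin.
    // Dates in YYYY-MM-DD format compare correctly as plain strings.
    if (datePattern.test(input.checkin ?? '') && datePattern.test(input.checkout ?? '')
        && input.checkout <= input.checkin) {
        errors.push('checkout must be after checkin');
    }
    if (input.timeout !== undefined && (input.timeout < 300 || input.timeout > 3600)) {
        errors.push('timeout must be between 300 and 3600 seconds');
    }
    if (input.maxHotels !== undefined && (!Number.isInteger(input.maxHotels) || input.maxHotels < 0)) {
        errors.push('maxHotels must be a non-negative integer (0 = unlimited)');
    }
    // The table declares the guest counts as strings, so numbers are flagged.
    for (const key of ['groupAdults', 'groupChildren', 'noRooms']) {
        if (input[key] !== undefined && typeof input[key] !== 'string') {
            errors.push(`${key} must be a string, e.g. "2"`);
        }
    }
    return errors;
}

console.log(validateInput({ destination: 'Paris', checkin: '2025-07-06', checkout: '2025-07-07', timeout: 600 }));
console.log(validateInput({ checkin: '2025-07-10', checkout: '2025-07-08', timeout: 100, groupAdults: 2 }));
```

An empty array means no problems were found; otherwise each entry names the offending parameter.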

Proxy Configuration

{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"],
  "apifyProxyCountry": "FR"
}

📥 Example Input

Basic Usage

{
  "destination": "New York",
  "maxHotels": 50,
  "batchSize": 10
}

Advanced Usage with Proxy and Filters

{
  "destination": "Tokyo",
  "maxHotels": 50,
  "batchSize": 5,
  "getDetails": true,
  "checkin": "2025-07-15",
  "checkout": "2025-07-20",
  "groupAdults": "2",
  "priceMin": 100,
  "priceMax": 300,
  "stars": ["4", "5"],
  "minReviewScore": 8,
  "freeCancellation": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "JP"
  }
}

Custom Start URLs

{
  "destination": "London",
  "maxHotels": 25,
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}

📤 Output Format

Standard Mode Output

{
  "hotelName": "Hôtel l'Inattendu",
  "price": null,
  "rating": "9.3",
  "reviewText": "Wonderful",
  "reviewCount": "81",
  "locationScore": "9.5",
  "description": "Hôtel l'Inattendu 6th arr., Paris 2.2 Subway AccessThe hotel Chaplain Rive Gauche is located in central Paris, 1148 feet from Jardin du Luxembourg and 1969 feet from Montparnasse. Scored 9.3 9.3Wonderful 81 reviewsLocation 9.5",
  "imageUrl": "https://cf.bstatic.com/xdata/images/hotel/square240/739999722.webp?k=dfdbee513e9c35d594db2ef2817546074d7737cdcff7d3f5ca4ee6be4ea7b3da&o=",
  "address": "6th arr., Paris",
  "latitude": null,
  "longitude": null,
  "hotelLink": "https://www.booking.com/hotel/fr/hotel-inattendu.html",
  "rooms": null,
  "scrapedAt": "2025-10-02T04:37:24.053Z",
  "pageType": "search_results",
  "detailRequested": false,
  "detailFetched": false
}

The description field preserves Booking.com's headline blurb (with light cleanup), so it may include location context, transport hints, and review snapshots exactly as visitors see them on the card.
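
As the sample shows, `rating`, `reviewCount`, and `locationScore` arrive as strings, and coordinates may be `null` in standard mode. A minimal post-processing sketch follows. `normalizeHotel` and `toNumberOrNull` are hypothetical helpers, not part of the actor, and the handling of thousands separators such as "1,234" is an assumption about how larger review counts might be formatted.

```javascript
// Hypothetical post-processing helpers, not part of the actor.
function toNumberOrNull(value) {
    if (value === null || value === undefined || value === '') return null;
    // Assumption: large counts may carry thousands separators such as "1,234".
    const parsed = Number(String(value).replace(/,/g, ''));
    return Number.isFinite(parsed) ? parsed : null;
}

function normalizeHotel(item) {
    return {
        ...item,
        rating: toNumberOrNull(item.rating),
        reviewCount: toNumberOrNull(item.reviewCount),
        locationScore: toNumberOrNull(item.locationScore),
        // Loose != null also treats a missing (undefined) field as absent.
        hasCoordinates: item.latitude != null && item.longitude != null,
    };
}

const sample = {
    hotelName: "Hôtel l'Inattendu",
    rating: '9.3',
    reviewCount: '81',
    locationScore: '9.5',
    latitude: null,
    longitude: null,
};
console.log(normalizeHotel(sample));
```

Converting once, right after download, keeps later sorting and filtering code free of string-to-number special cases.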

Detailed Mode Output (with getDetails: true)

{
  "hotelName": "The Ritz London",
  "price": "£450",
  "rating": "9.2",
  "address": "150 Piccadilly, St. James's, London W1J 9BR, United Kingdom",
  "latitude": 51.5074,
  "longitude": -0.1378,
  "hotelLink": "https://www.booking.com/hotel/gb/the-ritz-london.html",
  "rooms": [
    {
      "roomType": "Deluxe Double Room",
      "bedType": "1 double bed",
      "persons": 2,
      "price": 450,
      "currency": "£",
      "available": true,
      "features": ["WiFi", "TV", "Air conditioning", "Minibar"]
    },
    {
      "roomType": "Executive Suite",
      "bedType": "1 king bed",
      "persons": 2,
      "price": 750,
      "currency": "£",
      "available": true,
      "features": ["WiFi", "TV", "Air conditioning", "Minibar", "Balcony"]
    }
  ],
  "scrapedAt": "2025-01-15T10:30:00.000Z",
  "pageType": "search_results",
  "detailFetched": true,
  "detailAddress": "150 Piccadilly, St. James's, Westminster, London, W1J 9BR, United Kingdom",
  "detailLatitude": 51.5074,
  "detailLongitude": -0.1378,
  "detailScrapedAt": "2025-01-15T10:30:00.000Z",
  "detailBreadcrumbs": [],
  "detailQualityRating": "5 out of 5 stars",
  "detailIsPreferredPartner": false,
  "detailDescription": "Luxury hotel in the heart of London...",
  "detailPopularFacilities": ["WiFi", "Parking", "Restaurant", "Spa"],
  "detailPropertyHighlights": [],
  "detailSustainability": [],
  "detailReviewBreakdown": [],
  "detailFeaturedReviews": [],
  "detailNearby": [],
  "detailFaq": []
}

Note: The rooms field contains an array of room objects when getDetails: true is enabled. Each room object includes:

  • roomType: Type of room (e.g., "Double Room", "Suite")
  • bedType: Bed configuration (e.g., "1 double bed", "2 single beds")
  • persons: Maximum capacity in persons
  • price: Room price (numeric)
  • currency: Currency symbol (€, $, £, etc.)
  • available: Boolean indicating room availability
  • features: Array of room features/amenities
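
A common use of the room fields listed above is to pick the cheapest bookable room for a property. The sketch below assumes only the documented `rooms` shape; `cheapestAvailableRoom` is a hypothetical helper, not part of the actor.

```javascript
// Hypothetical helper: pick the cheapest available room from a detailed-mode item.
function cheapestAvailableRoom(hotel) {
    // rooms is null in standard mode, so fall back to an empty list.
    const rooms = Array.isArray(hotel.rooms) ? hotel.rooms : [];
    const candidates = rooms.filter((room) => room.available && typeof room.price === 'number');
    if (candidates.length === 0) return null;
    return candidates.reduce((best, room) => (room.price < best.price ? room : best));
}

const hotel = {
    hotelName: 'The Ritz London',
    rooms: [
        { roomType: 'Deluxe Double Room', persons: 2, price: 450, currency: '£', available: true },
        { roomType: 'Executive Suite', persons: 2, price: 750, currency: '£', available: true },
    ],
};
const room = cheapestAvailableRoom(hotel);
console.log(room ? `${room.roomType}: ${room.currency}${room.price}` : 'No available rooms');
```

Returning `null` rather than throwing lets the same helper run over mixed datasets that contain standard-mode items.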

πŸ› οΈ Usage Guide

1. Deploy on Apify Platform

  1. Upload this actor to your Apify account
  2. Configure input parameters in the web interface
  3. Run the actor and monitor progress
  4. Download results from the dataset

2. Local Development

# Install dependencies
npm install
# Run locally with Apify CLI
apify run
# Run with custom input
apify run --input='{"destination": "Paris", "maxHotels": 20}'

3. API Integration

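const { Actor } = require('apify');

// Actor.main() initializes the SDK, runs the callback, and exits cleanly.
Actor.main(async () => {
    const input = {
        destination: "Barcelona",
        maxHotels: 30,
        getDetails: true
    };
    // The input object is passed directly as the second argument.
    const run = await Actor.call('your-actor-id', input);
    console.log(`Run ${run.id} finished with status ${run.status}`);
});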
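
Starting a run is usually only half of an integration; the results still have to be read from the run's default dataset. The following sketch shows one way to do that from outside the Apify platform with the separate `apify-client` package. `runAndCollect` is a hypothetical helper, `'your-actor-id'` remains a placeholder, and the token is assumed to be available in an `APIFY_TOKEN` environment variable.

```javascript
// Sketch: start the actor through an ApifyClient instance and return its dataset items.
async function runAndCollect(client, actorId, input) {
    // call() starts the run and waits for it to finish.
    const run = await client.actor(actorId).call(input);
    if (run.status !== 'SUCCEEDED') {
        throw new Error(`Run ${run.id} ended with status ${run.status}`);
    }
    // Results are stored in the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    return items;
}

// Usage (requires `npm install apify-client` and an API token in APIFY_TOKEN):
// const { ApifyClient } = require('apify-client');
// const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
// runAndCollect(client, 'your-actor-id', { destination: 'Barcelona', maxHotels: 30 })
//     .then((items) => console.log(`Fetched ${items.length} hotels`));
```

Passing the client in as an argument keeps the helper easy to exercise with a stub during development, without spending platform credits.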

βš™οΈ Configuration Tips

Performance Optimization

  • Batch Size: Use 5-10 for a good balance between speed and reliability
  • Max Hotels: Set realistic limits to avoid timeouts
  • Proxy Groups: Use RESIDENTIAL for best success rates
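
To get a feel for how `maxHotels` and `batchSize` interact, the sketch below splits a list of hotel links into batches. It is an illustration only and does not reflect the actor's internal implementation; `planBatches` and the `hotel-N` placeholders are hypothetical.

```javascript
// Illustration only: how batchSize and maxHotels could interact.
// This is not the actor's internal implementation.
function planBatches(hotelLinks, { maxHotels = 10, batchSize = 10 } = {}) {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
        throw new RangeError('batchSize must be a positive integer');
    }
    // maxHotels = 0 means unlimited, matching the input table.
    const limited = maxHotels > 0 ? hotelLinks.slice(0, maxHotels) : hotelLinks;
    const batches = [];
    for (let i = 0; i < limited.length; i += batchSize) {
        batches.push(limited.slice(i, i + batchSize));
    }
    return batches;
}

const links = Array.from({ length: 23 }, (_, i) => `hotel-${i + 1}`);
const batches = planBatches(links, { maxHotels: 12, batchSize: 5 });
console.log(batches.map((batch) => batch.length));
```

Smaller batches mean more round trips but less work lost if a single batch fails, which is one reason a mid-range value is a sensible starting point.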

Data Quality

  • getDetails: Enable for precise addresses, coordinates, and room information
  • Destination: Use specific city names for better results
  • Proxy Country: Match destination country for local results
  • Room Data: Room information is only extracted when getDetails: true is enabled

Anti-Detection

  • Residential Proxies: Essential for reliable scraping
  • Batch Processing: Helps avoid rate limiting
  • Human-like Behavior: Built-in scrolling and mouse movements

🔧 Advanced Features

Custom Proxy Functions

// Custom proxy URL function
const newUrlFunction = `
return 'http://username:password@proxy.example.com:8080';
`;

Multiple Destinations

{
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=Paris",
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}
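
When the list of destinations is long or contains spaces and accented characters, the start URLs can be generated instead of typed by hand. This sketch relies only on the `ss` query parameter that appears in the examples above; `buildStartUrls` is a hypothetical helper, and no other Booking.com query parameters are assumed.

```javascript
// Sketch: build one search URL per destination using the `ss` query parameter.
function buildStartUrls(destinations) {
    return destinations.map((destination) => {
        const url = new URL('https://www.booking.com/searchresults.html');
        // searchParams handles encoding of spaces and accented characters.
        url.searchParams.set('ss', destination);
        return url.toString();
    });
}

const input = {
    maxHotels: 25,
    startUrls: buildStartUrls(['Paris', 'London', 'New York']),
};
console.log(JSON.stringify(input, null, 2));
```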

Advanced Filtering

{
  "destination": "Paris",
  "maxHotels": 50,
  "getDetails": true,
  "checkin": "2025-07-15",
  "checkout": "2025-07-20",
  "priceMin": 100,
  "priceMax": 300,
  "stars": ["4", "5"],
  "minReviewScore": 8,
  "freeCancellation": true,
  "petsAllowed": false,
  "districts": ["Le Marais", "5e arr."],
  "maxDistanceFromCenter": 3,
  "customFilters": ["ht_id=204", "ht_id=207"]
}

Filter Options:

  • priceMin / priceMax: Filter by price range (€ per night)
  • stars: Filter by star rating (1-5), can specify multiple: ["4", "5"]
  • minReviewScore: Minimum review score (0-10)
  • freeCancellation: Filter for properties with free cancellation
  • petsAllowed: Filter for properties that allow pets
  • adultsOnly: Filter for adults-only properties
  • districts: Filter by specific districts/neighborhoods
  • maxDistanceFromCenter: Maximum distance from city center (km)
  • customFilters: Custom filters using Booking.com nflt codes (advanced)
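
The ranges listed above can be checked before a run is started, so that an invalid filter fails fast instead of silently returning no hotels. The sketch below is a hypothetical helper, not part of the actor. It validates only the documented ranges, plus the assumption that `priceMin` should not exceed `priceMax`.

```javascript
// Hypothetical helper, not part of the actor: validates filter values against
// the documented ranges and returns only the filters that are set.
function buildFilters({ priceMin, priceMax, stars = [], minReviewScore, maxDistanceFromCenter } = {}) {
    // Assumption: a minimum above the maximum is always a mistake.
    if (priceMin !== undefined && priceMax !== undefined && priceMin > priceMax) {
        throw new RangeError('priceMin must not exceed priceMax');
    }
    if (minReviewScore !== undefined && (minReviewScore < 0 || minReviewScore > 10)) {
        throw new RangeError('minReviewScore must be between 0 and 10');
    }
    // The examples pass star ratings as strings, so normalize numbers to strings.
    const normalizedStars = stars.map(String);
    if (normalizedStars.some((star) => !['1', '2', '3', '4', '5'].includes(star))) {
        throw new RangeError('stars must contain values from 1 to 5');
    }
    const filters = { priceMin, priceMax, minReviewScore, maxDistanceFromCenter };
    if (normalizedStars.length > 0) filters.stars = normalizedStars;
    // Drop undefined entries so they are not sent to the actor.
    return Object.fromEntries(Object.entries(filters).filter(([, value]) => value !== undefined));
}

console.log(buildFilters({ priceMin: 100, priceMax: 300, stars: [4, 5], minReviewScore: 8 }));
```

The returned object can be spread into the actor input alongside `destination` and the date parameters.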

📊 Performance Metrics

  • Success Rate: >95% with proper proxy configuration
  • Speed: 10-50 hotels per minute (depending on settings)
  • Memory Usage: Optimized for large-scale operations
  • Reliability: Built-in retry logic and error recovery
  • Room Extraction: Available when getDetails: true is enabled
  • Default Actor Run Timeout: 3600 seconds (1 hour) for long-running extractions (separate from the timeout input parameter, which defaults to 600 seconds)

🚨 Important Notes

Rate Limiting

  • Booking.com has anti-bot measures
  • Always use residential proxies
  • Respect reasonable request rates
  • Monitor for IP blocks

Data Accuracy

  • Prices may vary based on availability
  • Ratings are real-time from Booking.com
  • Addresses are validated and cleaned
  • Coordinates are extracted from multiple sources

Legal Compliance

  • Respect Booking.com's Terms of Service
  • Use data responsibly and ethically
  • Consider rate limiting and delays
  • Monitor for policy changes

πŸ› Troubleshooting

Common Issues

"Destination input not found"

  • Booking.com interface changes frequently
  • Actor includes multiple fallback selectors
  • Try re-running the actor or using a different proxy

"No hotels extracted"

  • Check destination spelling
  • Verify proxy configuration
  • Increase timeout values if needed

"Incomplete data"

  • Enable getDetails for full addresses, coordinates, and room data
  • Check proxy country matches destination
  • Verify network connectivity

"Actor run timeout of 300 seconds"

  • This error occurs when the Actor Run Timeout is set to 300s instead of the default 3600s
  • Solution: Check Actor Settings → Default Run Options → Timeout
    • Ensure it's set to 3600 seconds (1 hour) or higher
    • If launching via API, ensure timeout in run options is ≥ 3600
  • Note: The timeout input parameter (300-3600s) is for request handler timeout, not the Actor Run Timeout
  • The Actor Run Timeout must always be higher than the request handler timeout
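
The relationship between the two timeouts can be captured in a small helper so they never drift apart. This is a sketch under stated assumptions: `buildRunOptions` is hypothetical, the 600-second safety margin is an arbitrary choice rather than a documented value, and the `timeout` run option is assumed to be given in seconds, as in `apify-client`.

```javascript
// Sketch: derive the Actor Run Timeout from the request handler timeout.
// The 600-second margin is an arbitrary assumption, not a documented value.
function buildRunOptions(handlerTimeoutSecs, marginSecs = 600) {
    if (handlerTimeoutSecs < 300 || handlerTimeoutSecs > 3600) {
        throw new RangeError('timeout input must be between 300 and 3600 seconds');
    }
    return {
        // Request handler timeout, passed as the actor's `timeout` input parameter.
        input: { timeout: handlerTimeoutSecs },
        // Keep the run timeout above the handler timeout and at least at 3600 s.
        options: { timeout: Math.max(3600, handlerTimeoutSecs + marginSecs) },
    };
}

const { input, options } = buildRunOptions(1800);
console.log(input, options);
// With apify-client this could be passed as:
// client.actor('your-actor-id').call({ destination: 'Paris', ...input }, options)
```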

"No rooms extracted"

  • Room extraction requires getDetails: true
  • Booking.com may not display room information for all properties
  • Room data structure may vary by hotel type and availability

Debug Mode

Enable detailed logging by setting environment variable:

export APIFY_LOG_LEVEL=DEBUG

📈 Use Cases

Market Research

  • Hotel price analysis and room pricing comparison
  • Competitive intelligence
  • Market trend monitoring
  • Room availability and capacity analysis

Travel Applications

  • Hotel comparison tools with room details
  • Travel planning platforms
  • Booking aggregators
  • Room search and filtering

Data Analysis

  • Geographic distribution analysis
  • Price correlation studies
  • Rating analysis
  • Room type and capacity analysis
  • Amenity and feature comparison

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Built with ❤️ using Apify and Playwright

For support, feature requests, or bug reports, please open an issue in the repository.