Booking Scraper
Pricing
$29.00/month + usage
This Apify actor scrapes hotel data from Booking.com. It supports robust navigation, proxy configuration, batch processing, and flexible extraction limits.
Booking.com Hotel Scraper - Advanced Web Scraping Actor
Professional-grade Booking.com scraper built with Playwright and Apify SDK. Extract comprehensive hotel data including prices, ratings, addresses, coordinates, room information, and direct booking links with advanced anti-detection measures using Camoufox.
Features
Core Functionality
- Comprehensive Hotel Data Extraction: Name, price, rating, address, coordinates, and direct booking links
- Room Data Extraction: Extract detailed room information including room types, bed types, prices, capacity, and features (when `getDetails: true`)
- Advanced Anti-Detection: Human-like scrolling, mouse movements, and browser fingerprinting with Camoufox
- Flexible Extraction Limits: Configurable maximum hotels with global counter tracking
- Batch Processing: Process hotels in configurable batches for optimal performance
- Detailed Mode: Visit individual hotel pages for complete addresses, precise coordinates, and room details
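The interaction between the extraction limit and batch processing can be pictured with a minimal sketch. This is not the Actor's actual code: `planBatches` is a hypothetical helper, and it only assumes that `maxHotels: 0` means unlimited, as the input parameters state.

```javascript
// Hypothetical helper: splits hotel links into batches while enforcing a global cap.
// maxHotels = 0 is treated as "unlimited", mirroring the documented input semantics.
function planBatches(hotelLinks, { maxHotels = 10, batchSize = 10 } = {}) {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
        throw new RangeError('batchSize must be a positive integer');
    }
    const limit = maxHotels > 0 ? maxHotels : hotelLinks.length;
    const capped = hotelLinks.slice(0, limit);

    const batches = [];
    for (let i = 0; i < capped.length; i += batchSize) {
        batches.push(capped.slice(i, i + batchSize));
    }
    return batches;
}

// 23 links found, capped at 12 hotels, processed 5 at a time.
const links = Array.from({ length: 23 }, (_, i) => `hotel-${i + 1}`);
const batches = planBatches(links, { maxHotels: 12, batchSize: 5 });
console.log(batches.map((b) => b.length)); // [ 5, 5, 2 ]
```

Applying the cap before splitting keeps the total bounded regardless of how many results a search page returns.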
Technical Capabilities
- Robust Navigation: Multiple fallback selectors for dynamic Booking.com interface
- Proxy Support: Apify Proxy integration with residential IPs and country selection
- Error Handling: Comprehensive retry logic and graceful failure recovery
- Real-time Logging: Detailed progress tracking and debugging information
- Memory Efficient: Optimized for large-scale scraping operations
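As an illustration of the fallback-selector idea, here is a small sketch. It is an assumption about the general technique, not the Actor's implementation: `findWithFallback` is a hypothetical helper, and the selectors shown are placeholders rather than the ones the Actor uses.

```javascript
// Hypothetical sketch of a fallback-selector lookup. `query` stands in for any
// function that returns a matched element or null (for example, a wrapper around
// document.querySelector); the selectors below are illustrative only.
function findWithFallback(query, selectors) {
    for (const selector of selectors) {
        const match = query(selector);
        if (match) {
            return { selector, match };
        }
    }
    return null; // nothing matched: the caller can retry or fail gracefully
}

// Simulated page where only the second selector exists.
const fakePage = { 'input[name="ss"]': { tag: 'input' } };
const result = findWithFallback(
    (selector) => fakePage[selector] ?? null,
    ['#destination-input', 'input[name="ss"]', '[data-testid="destination"]'],
);
console.log(result.selector); // input[name="ss"]
```

Ordering the list from most specific to most generic keeps the lookup stable when the page layout changes.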
Data Quality
- Address Validation: Clean and validate extracted addresses
- Coordinate Extraction: Multiple methods to find precise GPS coordinates
- Price Normalization: Consistent price formatting when available; gracefully handles unavailable prices
- Rating Accuracy: Extract guest ratings, qualitative review labels, review counts, and location scores
- Rich Snippet Capture: Captures Booking.com's headline description so you can display contextual property summaries
- Room Information: Detailed room data including types, bed configurations, capacity, prices, and amenities
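Price normalization can be sketched as follows. `parsePrice` is a hypothetical helper written for this example; it assumes that "," is a thousands separator and "." a decimal point, which may not hold for every locale Booking.com displays.

```javascript
// Hypothetical price parser: splits a display string such as "£450" or "€ 1,250"
// into a numeric amount and a currency symbol. Returns null when no price is shown.
// Assumes "," is a thousands separator and "." a decimal point.
function parsePrice(raw) {
    if (raw === null || raw === undefined) return null;
    const text = String(raw).trim();
    const match = text.match(/(\d[\d,]*(?:\.\d+)?)/);
    if (!match) return null;

    const amount = Number(match[1].replace(/,/g, ''));
    const currency = text.replace(match[1], '').trim() || null;
    return { amount, currency };
}

console.log(parsePrice('£450'));    // { amount: 450, currency: '£' }
console.log(parsePrice('€ 1,250')); // { amount: 1250, currency: '€' }
console.log(parsePrice(null));      // null
```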
Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `destination` | string | "Paris" | City or location to search for hotels |
| `maxHotels` | number | 10 | Maximum hotels to extract (0 = unlimited) |
| `batchSize` | number | 10 | Hotels per batch for processing |
| `getDetails` | boolean | true | Visit individual hotel pages for detailed data and room information |
| `timeout` | number | 600 | Maximum time in seconds for request handler timeout (300-3600). Note: The Actor Run Timeout (in Settings) must be higher than this value. |
| `startUrls` | array | ["https://www.booking.com"] | Starting URLs for crawling |
| `proxyConfiguration` | object | See below | Proxy settings for anti-detection |
| `newUrlFunction` | string | - | Custom proxy URL function (advanced) |
| `checkin` | string | "2025-07-06" | Check-in date (YYYY-MM-DD format) |
| `checkout` | string | "2025-07-07" | Check-out date (YYYY-MM-DD format) |
| `groupAdults` | string | "2" | Number of adults |
| `groupChildren` | string | "0" | Number of children |
| `noRooms` | string | "1" | Number of rooms |
| `priceMin` | number | - | Minimum price per night (€) |
| `priceMax` | number | - | Maximum price per night (€) |
| `stars` | array | - | Star rating filter (1-5), e.g. ["4", "5"] |
| `minReviewScore` | number | - | Minimum review score (0-10) |
| `freeCancellation` | boolean | false | Filter for free cancellation |
| `petsAllowed` | boolean | false | Filter for pets allowed |
| `adultsOnly` | boolean | false | Filter for adults-only properties |
| `districts` | array | - | Filter by districts/neighborhoods |
| `maxDistanceFromCenter` | number | - | Max distance from center (km) |
| `customFilters` | array | - | Custom filters using Booking.com nflt codes |
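Before starting a run, it can help to check an input object against the constraints listed in the parameter table. The sketch below is a hypothetical pre-flight check, not part of the Actor; the rule that `checkout` must be later than `checkin` is an assumption added for this example.

```javascript
// Hypothetical pre-flight check for the input object, based only on the
// constraints listed in the parameter table. Returns a list of problems.
function validateInput(input) {
    const errors = [];
    const isDate = (value) =>
        /^\d{4}-\d{2}-\d{2}$/.test(value) && !Number.isNaN(Date.parse(value));

    const { timeout = 600, checkin, checkout, stars = [], minReviewScore } = input;

    if (timeout < 300 || timeout > 3600) {
        errors.push('timeout must be between 300 and 3600 seconds');
    }
    for (const [name, value] of [['checkin', checkin], ['checkout', checkout]]) {
        if (value !== undefined && !isDate(value)) {
            errors.push(`${name} must use the YYYY-MM-DD format`);
        }
    }
    // Assumption: a stay needs at least one night. ISO dates compare correctly as strings.
    if (isDate(checkin) && isDate(checkout) && checkout <= checkin) {
        errors.push('checkout must be later than checkin');
    }
    if (stars.some((s) => !['1', '2', '3', '4', '5'].includes(String(s)))) {
        errors.push('stars must contain values from 1 to 5');
    }
    if (minReviewScore !== undefined && (minReviewScore < 0 || minReviewScore > 10)) {
        errors.push('minReviewScore must be between 0 and 10');
    }
    return errors;
}

console.log(validateInput({
    destination: 'Tokyo',
    checkin: '2025-07-20',
    checkout: '2025-07-15',
    stars: ['4', '6'],
}));
// [ 'checkout must be later than checkin', 'stars must contain values from 1 to 5' ]
```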
Proxy Configuration
```json
{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"],
  "apifyProxyCountry": "FR"
}
```
Example Input
Basic Usage
```json
{
  "destination": "New York",
  "maxHotels": 50,
  "batchSize": 10
}
```
Advanced Usage with Proxy and Filters
```json
{
  "destination": "Tokyo",
  "maxHotels": 50,
  "batchSize": 5,
  "getDetails": true,
  "checkin": "2025-07-15",
  "checkout": "2025-07-20",
  "groupAdults": "2",
  "priceMin": 100,
  "priceMax": 300,
  "stars": ["4", "5"],
  "minReviewScore": 8,
  "freeCancellation": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "JP"
  }
}
```
Custom Start URLs
```json
{
  "destination": "London",
  "maxHotels": 25,
  "startUrls": ["https://www.booking.com/searchresults.html?ss=London"]
}
```
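Custom start URLs can also be generated instead of written by hand. The following is a sketch under assumptions: `buildSearchUrl` is a hypothetical helper, and apart from `ss` (which appears in the example above) the query parameter names are inferred from commonly seen Booking.com URLs, not confirmed by this README.

```javascript
// Hypothetical helper that turns input fields into a search-results URL for startUrls.
// The query parameter names (checkin, checkout, group_adults, group_children,
// no_rooms) are assumptions based on commonly seen Booking.com URLs.
function buildSearchUrl({
    destination,
    checkin,
    checkout,
    groupAdults = '2',
    groupChildren = '0',
    noRooms = '1',
}) {
    const url = new URL('https://www.booking.com/searchresults.html');
    url.searchParams.set('ss', destination);
    if (checkin) url.searchParams.set('checkin', checkin);
    if (checkout) url.searchParams.set('checkout', checkout);
    url.searchParams.set('group_adults', groupAdults);
    url.searchParams.set('group_children', groupChildren);
    url.searchParams.set('no_rooms', noRooms);
    return url.toString();
}

const startUrls = ['Paris', 'New York'].map((destination) =>
    buildSearchUrl({ destination, checkin: '2025-07-15', checkout: '2025-07-20' }));
console.log(startUrls);
```

Using `URL` and `searchParams` handles the encoding of spaces and special characters in destination names.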
Output Format
Standard Mode Output
```json
{
  "hotelName": "Hôtel l'Inattendu",
  "price": null,
  "rating": "9.3",
  "reviewText": "Wonderful",
  "reviewCount": "81",
  "locationScore": "9.5",
  "description": "Hôtel l'Inattendu 6th arr., Paris 2.2 Subway AccessThe hotel Chaplain Rive Gauche is located in central Paris, 1148 feet from Jardin du Luxembourg and 1969 feet from Montparnasse. Scored 9.3 9.3Wonderful 81 reviewsLocation 9.5",
  "imageUrl": "https://cf.bstatic.com/xdata/images/hotel/square240/739999722.webp?k=dfdbee513e9c35d594db2ef2817546074d7737cdcff7d3f5ca4ee6be4ea7b3da&o=",
  "address": "6th arr., Paris",
  "latitude": null,
  "longitude": null,
  "hotelLink": "https://www.booking.com/hotel/fr/hotel-inattendu.html",
  "rooms": null,
  "scrapedAt": "2025-10-02T04:37:24.053Z",
  "pageType": "search_results",
  "detailRequested": false,
  "detailFetched": false
}
```
The description field preserves Booking.com's headline blurb (with light cleanup), so it may include location context, transport hints, and review snapshots exactly as visitors see them on the card.
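In the sample record, `rating`, `reviewCount`, and `locationScore` are strings. If numeric values are needed downstream, a post-processing step along these lines could be used; `toNumberOrNull` and `normalizeRecord` are hypothetical helpers, not functions provided by the Actor.

```javascript
// Hypothetical post-processing step: converts the string scores of a
// search-result record into numbers and leaves missing values as null.
function toNumberOrNull(value) {
    if (value === null || value === undefined || value === '') return null;
    const parsed = Number(String(value).replace(/,/g, ''));
    return Number.isFinite(parsed) ? parsed : null;
}

function normalizeRecord(record) {
    return {
        ...record,
        rating: toNumberOrNull(record.rating),
        reviewCount: toNumberOrNull(record.reviewCount),
        locationScore: toNumberOrNull(record.locationScore),
    };
}

const item = {
    hotelName: "Hôtel l'Inattendu",
    price: null,
    rating: '9.3',
    reviewCount: '81',
    locationScore: '9.5',
};
// rating, reviewCount, and locationScore become numbers; price stays null.
console.log(normalizeRecord(item));
```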
Detailed Mode Output (with getDetails: true)
```json
{
  "hotelName": "The Ritz London",
  "price": "£450",
  "rating": "9.2",
  "address": "150 Piccadilly, St. James's, London W1J 9BR, United Kingdom",
  "latitude": 51.5074,
  "longitude": -0.1378,
  "hotelLink": "https://www.booking.com/hotel/gb/the-ritz-london.html",
  "rooms": [
    {
      "roomType": "Deluxe Double Room",
      "bedType": "1 double bed",
      "persons": 2,
      "price": 450,
      "currency": "£",
      "available": true,
      "features": ["WiFi", "TV", "Air conditioning", "Minibar"]
    },
    {
      "roomType": "Executive Suite",
      "bedType": "1 king bed",
      "persons": 2,
      "price": 750,
      "currency": "£",
      "available": true,
      "features": ["WiFi", "TV", "Air conditioning", "Minibar", "Balcony"]
    }
  ],
  "scrapedAt": "2025-01-15T10:30:00.000Z",
  "pageType": "search_results",
  "detailFetched": true,
  "detailAddress": "150 Piccadilly, St. James's, Westminster, London, W1J 9BR, United Kingdom",
  "detailLatitude": 51.5074,
  "detailLongitude": -0.1378,
  "detailScrapedAt": "2025-01-15T10:30:00.000Z",
  "detailBreadcrumbs": [],
  "detailQualityRating": "5 out of 5 stars",
  "detailIsPreferredPartner": false,
  "detailDescription": "Luxury hotel in the heart of London...",
  "detailPopularFacilities": ["WiFi", "Parking", "Restaurant", "Spa"],
  "detailPropertyHighlights": [],
  "detailSustainability": [],
  "detailReviewBreakdown": [],
  "detailFeaturedReviews": [],
  "detailNearby": [],
  "detailFaq": []
}
```
Note: The `rooms` field contains an array of room objects when `getDetails: true` is enabled. Each room object includes:
- `roomType`: Type of room (e.g., "Double Room", "Suite")
- `bedType`: Bed configuration (e.g., "1 double bed", "2 single beds")
- `persons`: Maximum capacity in persons
- `price`: Room price (numeric)
- `currency`: Currency symbol (€, $, £, etc.)
- `available`: Boolean indicating room availability
- `features`: Array of room features/amenities
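As a usage example for the room fields, the sketch below picks the cheapest available room from a record. `cheapestAvailableRoom` is a hypothetical helper that relies only on the `rooms`, `available`, and `price` fields described above.

```javascript
// Hypothetical helper: picks the cheapest room marked as available.
// Returns null when `rooms` is null (standard mode) or nothing is available.
function cheapestAvailableRoom(hotel) {
    const rooms = Array.isArray(hotel.rooms) ? hotel.rooms : [];
    const available = rooms.filter(
        (room) => room.available && typeof room.price === 'number',
    );
    if (available.length === 0) return null;
    return available.reduce((best, room) => (room.price < best.price ? room : best));
}

const hotel = {
    hotelName: 'The Ritz London',
    rooms: [
        { roomType: 'Executive Suite', persons: 2, price: 750, currency: '£', available: true },
        { roomType: 'Deluxe Double Room', persons: 2, price: 450, currency: '£', available: true },
    ],
};
const room = cheapestAvailableRoom(hotel);
console.log(`${room.roomType}: ${room.currency}${room.price}`); // Deluxe Double Room: £450
```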
Usage Guide
1. Deploy on Apify Platform
- Upload this actor to your Apify account
- Configure input parameters in the web interface
- Run the actor and monitor progress
- Download results from the dataset
2. Local Development
```bash
# Install dependencies
npm install

# Run locally with Apify CLI
apify run

# Run with custom input
apify run --input='{"destination": "Paris", "maxHotels": 20}'
```
3. API Integration
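```javascript
const { Actor } = require('apify');

(async () => {
    // Outside the Apify platform, the SDK needs the APIFY_TOKEN environment variable.
    await Actor.init();

    const input = {
        destination: 'Barcelona',
        maxHotels: 30,
        getDetails: true,
    };

    // The input object is passed directly as the second argument.
    const run = await Actor.call('your-actor-id', input);
    console.log(`Run ${run.id} finished with status ${run.status}`);

    await Actor.exit();
})();
```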
Configuration Tips
Performance Optimization
- Batch Size: Use 5-10 for a good balance between speed and resource usage
- Max Hotels: Set realistic limits to avoid timeouts
- Proxy Groups: Use RESIDENTIAL for best success rates
Data Quality
- getDetails: Enable for precise addresses, coordinates, and room information
- Destination: Use specific city names for better results
- Proxy Country: Match destination country for local results
- Room Data: Room information is only extracted when `getDetails: true` is enabled
Anti-Detection
- Residential Proxies: Essential for reliable scraping
- Batch Processing: Helps avoid rate limiting
- Human-like Behavior: Built-in scrolling and mouse movements
Advanced Features
Custom Proxy Functions
```javascript
// Custom proxy URL function
const newUrlFunction = `return 'http://username:password@proxy.example.com:8080';`;
```
Multiple Destinations
```json
{
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=Paris",
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}
```
Advanced Filtering
```json
{
  "destination": "Paris",
  "maxHotels": 50,
  "getDetails": true,
  "checkin": "2025-07-15",
  "checkout": "2025-07-20",
  "priceMin": 100,
  "priceMax": 300,
  "stars": ["4", "5"],
  "minReviewScore": 8,
  "freeCancellation": true,
  "petsAllowed": false,
  "districts": ["Le Marais", "5e arr."],
  "maxDistanceFromCenter": 3,
  "customFilters": ["ht_id=204", "ht_id=207"]
}
```
Filter Options:
- `priceMin` / `priceMax`: Filter by price range (€ per night)
- `stars`: Filter by star rating (1-5), can specify multiple: `["4", "5"]`
- `minReviewScore`: Minimum review score (0-10)
- `freeCancellation`: Filter for properties with free cancellation
- `petsAllowed`: Filter for properties that allow pets
- `adultsOnly`: Filter for adults-only properties
- `districts`: Filter by specific districts/neighborhoods
- `maxDistanceFromCenter`: Maximum distance from city center (km)
- `customFilters`: Custom filters using Booking.com nflt codes (advanced)
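To show how filter codes might end up in a search URL, here is a rough sketch. It rests on assumptions this README does not confirm: that `nflt` codes are joined with ";" and that star ratings map to `class=<n>`, as commonly seen in Booking.com URLs. `buildNflt` is a hypothetical helper, and the Actor may translate filters differently.

```javascript
// Hypothetical sketch: joins filter codes into a single `nflt` query value.
// Assumptions (not confirmed by this README): codes are separated by ";" and
// star ratings map to "class=<n>", as commonly seen in Booking.com URLs.
function buildNflt({ stars = [], customFilters = [] } = {}) {
    const codes = [
        ...stars.map((star) => `class=${star}`),
        ...customFilters,
    ];
    // Drop duplicates while keeping the original order.
    return [...new Set(codes)].join(';');
}

const nflt = buildNflt({ stars: ['4', '5'], customFilters: ['ht_id=204', 'ht_id=207'] });
console.log(nflt); // class=4;class=5;ht_id=204;ht_id=207

const url = new URL('https://www.booking.com/searchresults.html?ss=Paris');
url.searchParams.set('nflt', nflt);
console.log(url.toString());
```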
Performance Metrics
- Success Rate: >95% with proper proxy configuration
- Speed: 10-50 hotels per minute (depending on settings)
- Memory Usage: Optimized for large-scale operations
- Reliability: Built-in retry logic and error recovery
- Room Extraction: Available when `getDetails: true` is enabled
- Default Actor Run Timeout: 3600 seconds (1 hour) for long-running extractions
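The quoted speed range gives a rough way to check whether a run fits within the one-hour timeout. The sketch below is plain arithmetic on the 10-50 hotels per minute figure; `estimateRunSeconds` is a hypothetical helper, and real throughput depends on proxies, `getDetails`, and batch size.

```javascript
// Rough run-time estimate from the quoted 10-50 hotels per minute.
// Purely arithmetic; real throughput depends on proxies, getDetails, and batch size.
function estimateRunSeconds(maxHotels, { slowPerMin = 10, fastPerMin = 50 } = {}) {
    return {
        bestCase: Math.ceil((maxHotels / fastPerMin) * 60),
        worstCase: Math.ceil((maxHotels / slowPerMin) * 60),
    };
}

const estimate = estimateRunSeconds(500);
console.log(estimate); // { bestCase: 600, worstCase: 3000 }
console.log(
    estimate.worstCase <= 3600
        ? 'fits in the 1-hour run timeout'
        : 'raise the run timeout',
);
```

For 500 hotels the worst case is 500 / 10 × 60 = 3000 seconds, which still fits within 3600 seconds; 1000 hotels at the slow rate would not.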
Important Notes
Rate Limiting
- Booking.com has anti-bot measures
- Always use residential proxies
- Respect reasonable request rates
- Monitor for IP blocks
Data Accuracy
- Prices may vary based on availability
- Ratings are real-time from Booking.com
- Addresses are validated and cleaned
- Coordinates are extracted from multiple sources
Legal Compliance
- Respect Booking.com's Terms of Service
- Use data responsibly and ethically
- Consider rate limiting and delays
- Monitor for policy changes
Troubleshooting
Common Issues
"Destination input not found"
- Booking.com interface changes frequently
- Actor includes multiple fallback selectors
- Try re-running the Actor or using a different proxy
"No hotels extracted"
- Check destination spelling
- Verify proxy configuration
- Increase timeout values if needed
"Incomplete data"
- Enable `getDetails` for full addresses, coordinates, and room data
- Check proxy country matches destination
- Verify network connectivity
"Actor run timeout of 300 seconds"
- This error occurs when the Actor Run Timeout is set to 300s instead of the default 3600s
- Solution: Check Actor Settings → Default Run Options → Timeout
- Ensure it's set to 3600 seconds (1 hour) or higher
- If launching via API, ensure `timeout` in run options is ≥ 3600
- Note: The `timeout` input parameter (300-3600s) is for request handler timeout, not the Actor Run Timeout
- The Actor Run Timeout must always be higher than the request handler timeout
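When launching via the API, the relationship between the two timeouts can be enforced before the request is sent. The sketch below prepares a request for the Apify REST API "run Actor" endpoint with the `timeout` query parameter; `buildRunRequest` is a hypothetical helper, and `your-actor-id` and the token are placeholders.

```javascript
// Hypothetical helper that prepares a "run Actor" request for the Apify REST API
// and refuses a run timeout that is not above the request handler timeout.
// 'your-actor-id' and the token are placeholders.
function buildRunRequest(actorId, token, input, runTimeoutSecs = 3600) {
    const handlerTimeout = input.timeout ?? 600; // input default from the parameter table
    if (runTimeoutSecs <= handlerTimeout) {
        throw new RangeError(
            `Run timeout (${runTimeoutSecs}s) must be higher than the request handler timeout (${handlerTimeout}s)`,
        );
    }
    const url = new URL(`https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`);
    url.searchParams.set('timeout', String(runTimeoutSecs));
    return {
        url: url.toString(),
        options: {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                Authorization: `Bearer ${token}`,
            },
            body: JSON.stringify(input),
        },
    };
}

const request = buildRunRequest('your-actor-id', 'YOUR_APIFY_TOKEN', {
    destination: 'Paris',
    timeout: 900,
});
console.log(request.url); // https://api.apify.com/v2/acts/your-actor-id/runs?timeout=3600
// To start the run (Node 18+): await fetch(request.url, request.options);
```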
"No rooms extracted"
- Room extraction requires `getDetails: true`
- Booking.com may not display room information for all properties
- Room data structure may vary by hotel type and availability
Debug Mode
Enable detailed logging by setting the `APIFY_LOG_LEVEL` environment variable:
```bash
export APIFY_LOG_LEVEL=DEBUG
```
Use Cases
Market Research
- Hotel price analysis and room pricing comparison
- Competitive intelligence
- Market trend monitoring
- Room availability and capacity analysis
Travel Applications
- Hotel comparison tools with room details
- Travel planning platforms
- Booking aggregators
- Room search and filtering
Data Analysis
- Geographic distribution analysis
- Price correlation studies
- Rating analysis
- Room type and capacity analysis
- Amenity and feature comparison
License
This project is licensed under the MIT License - see the LICENSE file for details.
Related Actors
- CNN Top Headlines - Scrape the latest top news headlines from CNN's homepage and article pages with optional full article content extraction.
Built with ❤️ using Apify and Playwright
For support, feature requests, or bug reports, please open an issue in the repository.