
Booking Scraper
Pricing
$29.00/month + usage

Booking Scraper
This Apify actor scrapes hotel data from Booking.com. It supports robust navigation, proxy configuration, batch processing, and flexible extraction limits.
4.3 (2)
Last modified
5 days ago
🏨 Booking.com Hotel Scraper - Advanced Web Scraping Actor
Professional-grade Booking.com scraper built with Playwright and Apify SDK. Extract comprehensive hotel data including prices, ratings, addresses, coordinates, and direct booking links with anti-detection measures.
🚀 Features
Core Functionality
- Comprehensive Hotel Data Extraction: Name, price, rating, address, coordinates, and direct booking links
- Advanced Anti-Detection: Human-like scrolling, mouse movements, and browser fingerprinting
- Flexible Extraction Limits: Configurable maximum hotels with global counter tracking
- Batch Processing: Process hotels in configurable batches for optimal performance
- Detailed Mode: Visit individual hotel pages for complete addresses and precise coordinates
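The interaction between batch processing and the extraction limit can be sketched as follows. This is a minimal illustration, not the actor's source: `planBatches` is a hypothetical helper, and it assumes that `maxHotels: 0` means "unlimited", as stated in the input parameter table.

```javascript
// Hypothetical helper: split a list of hotel links into batches while
// honoring a global cap. maxHotels = 0 is treated as "unlimited".
function* planBatches(items, batchSize, maxHotels = 0) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  // The cap applies across all batches, not per batch.
  const limit = maxHotels > 0 ? Math.min(maxHotels, items.length) : items.length;
  for (let start = 0; start < limit; start += batchSize) {
    yield items.slice(start, Math.min(start + batchSize, limit));
  }
}

const links = Array.from({ length: 23 }, (_, i) => `hotel-${i + 1}`);
const batches = [...planBatches(links, 10, 15)];
console.log(batches.map((batch) => batch.length)); // [ 10, 5 ]
```

With 23 candidate links, a batch size of 10, and a cap of 15, the second batch is cut short so the total never exceeds the cap.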
Technical Capabilities
- Robust Navigation: Multiple fallback selectors for dynamic Booking.com interface
- Proxy Support: Apify Proxy integration with residential IPs and country selection
- Error Handling: Comprehensive retry logic and graceful failure recovery
- Real-time Logging: Detailed progress tracking and debugging information
- Memory Efficient: Optimized for large-scale scraping operations
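The README does not describe how the retry logic is implemented. A generic pattern that fits the description is retry with exponential backoff; the sketch below is illustrative only, and `withRetries` along with its option names are assumptions rather than part of the actor.

```javascript
// Generic retry helper (illustrative; not taken from the actor's source).
async function withRetries(task, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // All attempts failed: surface the most recent error to the caller.
  throw lastError;
}
```

A caller would wrap a flaky step, for example `await withRetries(() => page.goto(url), { retries: 2 })`, where `page` is a Playwright page object supplied by the surrounding code.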
Data Quality
- Address Validation: Clean and validate extracted addresses
- Coordinate Extraction: Multiple methods to find precise GPS coordinates
- Price Normalization: Consistent price formatting when available; gracefully handles unavailable prices
- Rating Accuracy: Extract guest ratings, qualitative review labels, review counts, and location scores
- Rich Snippet Capture: Captures Booking.com's headline description so you can display contextual property summaries
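The output examples show `price` as either a display string such as `"£450"` or `null`. If you need a numeric amount downstream, a small parser along these lines may help. `parsePrice` is a hypothetical helper, and it assumes a currency symbol prefix with comma thousands separators; other locale formats would need extra handling.

```javascript
// Hypothetical helper: turn a display price into { currency, amount }.
// Assumes "<symbol><digits>" with optional comma thousands separators.
function parsePrice(raw) {
  if (typeof raw !== 'string') return null; // unavailable prices stay null
  const match = raw.trim().match(/^([^\d\s]+)\s*([\d,]+(?:\.\d+)?)$/);
  if (!match) return null;
  return {
    currency: match[1],
    amount: Number(match[2].replace(/,/g, '')),
  };
}

console.log(parsePrice('£450')); // { currency: '£', amount: 450 }
console.log(parsePrice(null)); // null
```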
📋 Input Parameters
Parameter | Type | Default | Description |
---|---|---|---|
destination | string | "Paris" | City or location to search for hotels |
maxHotels | number | 100 | Maximum hotels to extract (0 = unlimited) |
batchSize | number | 10 | Hotels per batch for processing |
getDetails | boolean | false | Visit individual hotel pages for detailed data |
startUrls | array | ["https://www.booking.com"] | Starting URLs for crawling |
proxyConfiguration | object | See below | Proxy settings for anti-detection |
newUrlFunction | string | - | Custom proxy URL function (advanced) |
Proxy Configuration
{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"],
  "apifyProxyCountry": "FR"
}
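Before starting a run, it can be useful to apply the documented defaults and reject obviously invalid values. The sketch below copies the defaults from the parameter table; the validation rules and the `normalizeInput` name are assumptions, since the README does not state how the actor validates its input.

```javascript
// Defaults copied from the parameter table; validation rules are assumptions.
const DEFAULTS = {
  destination: 'Paris',
  maxHotels: 100,
  batchSize: 10,
  getDetails: false,
  startUrls: ['https://www.booking.com'],
};

function normalizeInput(input = {}) {
  const merged = { ...DEFAULTS, ...input };
  if (typeof merged.destination !== 'string' || merged.destination.trim() === '') {
    throw new TypeError('destination must be a non-empty string');
  }
  if (!Number.isInteger(merged.maxHotels) || merged.maxHotels < 0) {
    throw new RangeError('maxHotels must be an integer >= 0 (0 = unlimited)');
  }
  if (!Number.isInteger(merged.batchSize) || merged.batchSize < 1) {
    throw new RangeError('batchSize must be an integer >= 1');
  }
  return { ...merged, destination: merged.destination.trim() };
}

console.log(normalizeInput({ destination: ' Tokyo ', batchSize: 5 }));
```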
📥 Example Input
Basic Usage
{
  "destination": "New York",
  "maxHotels": 50,
  "batchSize": 10
}
Advanced Usage with Proxy
{
  "destination": "Tokyo",
  "maxHotels": 100,
  "batchSize": 5,
  "getDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "JP"
  }
}
Custom Start URLs
{
  "destination": "London",
  "maxHotels": 25,
  "startUrls": ["https://www.booking.com/searchresults.html?ss=London"]
}
📤 Output Format
Standard Mode Output
{
  "hotelName": "Hôtel l'Inattendu",
  "price": null,
  "rating": "9.3",
  "reviewText": "Wonderful",
  "reviewCount": "81",
  "locationScore": "9.5",
  "description": "Hôtel l'Inattendu 6th arr., Paris 2.2 Subway AccessThe hotel Chaplain Rive Gauche is located in central Paris, 1148 feet from Jardin du Luxembourg and 1969 feet from Montparnasse. Scored 9.3 9.3Wonderful 81 reviewsLocation 9.5",
  "imageUrl": "https://cf.bstatic.com/xdata/images/hotel/square240/739999722.webp?k=dfdbee513e9c35d594db2ef2817546074d7737cdcff7d3f5ca4ee6be4ea7b3da&o=",
  "address": "6th arr., Paris",
  "latitude": null,
  "longitude": null,
  "hotelLink": "https://www.booking.com/hotel/fr/hotel-inattendu.html",
  "scrapedAt": "2025-10-02T04:37:24.053Z",
  "pageType": "search_results"
}
The `description` field preserves Booking.com's headline blurb (with light cleanup), so it may include location context, transport hints, and review snapshots exactly as visitors see them on the card.
Detailed Mode Output (with `getDetails: true`)
{
  "hotelName": "The Ritz London",
  "price": "£450",
  "rating": "9.2",
  "address": "150 Piccadilly, St. James's, London W1J 9BR, United Kingdom",
  "detailedAddress": "150 Piccadilly, St. James's, Westminster, London, W1J 9BR, United Kingdom",
  "hotelLink": "https://www.booking.com/hotel/gb/the-ritz-london.html",
  "latitude": 51.5074,
  "longitude": -0.1378,
  "scrapedAt": "2025-01-15T10:30:00.000Z",
  "pageType": "hotel_details",
  "hotelIndex": 1
}
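In both output examples, `rating`, `reviewCount`, and `locationScore` are strings, and coordinates may be `null`. For sorting or filtering, you will likely want numeric values. The sketch below shows one way to convert them; `toNumber` and `normalizeRecord` are hypothetical helpers, not part of the actor.

```javascript
// Hypothetical helpers for post-processing dataset items.
function toNumber(value) {
  if (value === null || value === undefined || value === '') return null;
  const parsed = Number(String(value).replace(/,/g, ''));
  return Number.isFinite(parsed) ? parsed : null;
}

function normalizeRecord(record) {
  const latitude = toNumber(record.latitude);
  const longitude = toNumber(record.longitude);
  // Keep coordinates only when both are present and within valid ranges.
  const hasValidCoordinates = latitude !== null && longitude !== null
    && Math.abs(latitude) <= 90 && Math.abs(longitude) <= 180;
  return {
    ...record,
    rating: toNumber(record.rating),
    reviewCount: toNumber(record.reviewCount),
    locationScore: toNumber(record.locationScore),
    latitude: hasValidCoordinates ? latitude : null,
    longitude: hasValidCoordinates ? longitude : null,
  };
}

const item = normalizeRecord({ hotelName: 'The Ritz London', rating: '9.2', latitude: 51.5074, longitude: -0.1378 });
console.log(item.rating, item.latitude, item.longitude); // 9.2 51.5074 -0.1378
```

Fields that are missing from a record, such as `reviewCount` in detailed mode, come back as `null` rather than `undefined`, which keeps the record shape uniform.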
🛠️ Usage Guide
1. Deploy on Apify Platform
- Upload this actor to your Apify account
- Configure input parameters in the web interface
- Run the actor and monitor progress
- Download results from the dataset
2. Local Development
# Install dependencies
npm install

# Run locally with Apify CLI
apify run

# Run with custom input
apify run --input='{"destination": "Paris", "maxHotels": 20}'
3. API Integration
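const { Actor } = require('apify');

const input = {
  destination: "Barcelona",
  maxHotels: 30,
  getDetails: true,
};

// Outside the Apify platform, set APIFY_TOKEN in the environment first.
(async () => {
  // Actor.call() takes the input object itself as its second argument.
  const run = await Actor.call('your-actor-id', input);
})();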
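If you prefer plain HTTP over the SDK, the same run can be started through the Apify REST API. The sketch below assumes the `run-sync-get-dataset-items` endpoint of Apify API v2 and Bearer token authentication; `buildRunRequest` and `scrapeHotels` are illustrative names, and `your-username~your-actor-name` is a placeholder for the real actor ID.

```javascript
// Builds the request without sending it, so it can be inspected or logged.
function buildRunRequest(actorId, input, token) {
  const url = new URL(
    `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/run-sync-get-dataset-items`,
  );
  return {
    url: url.toString(),
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(input),
    },
  };
}

// Sends the request and resolves with the dataset items (an array of hotels).
async function scrapeHotels(actorId, input, token) {
  const { url, options } = buildRunRequest(actorId, input, token);
  const response = await fetch(url, options); // global fetch needs Node.js 18+
  if (!response.ok) {
    throw new Error(`Apify API responded with HTTP ${response.status}`);
  }
  return response.json();
}

const request = buildRunRequest('your-username~your-actor-name', { destination: 'Barcelona', maxHotels: 30 }, 'MY_TOKEN');
console.log(request.url);
```

Synchronous runs are convenient for small jobs; for long runs with `getDetails` enabled, starting the run asynchronously and reading the dataset afterwards is usually the safer choice.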
⚙️ Configuration Tips
Performance Optimization
- Batch Size: Use 5-10 for a good balance between speed and stability
- Max Hotels: Set realistic limits to avoid timeouts
- Proxy Groups: Use RESIDENTIAL for best success rates
Data Quality
- getDetails: Enable for precise addresses and coordinates
- Destination: Use specific city names for better results
- Proxy Country: Match destination country for local results
Anti-Detection
- Residential Proxies: Essential for reliable scraping
- Batch Processing: Helps avoid rate limiting
- Human-like Behavior: Built-in scrolling and mouse movements
🔧 Advanced Features
Custom Proxy Functions
// Custom proxy URL function
const newUrlFunction = `return 'http://username:password@proxy.example.com:8080';`;
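The README does not say how the actor evaluates the `newUrlFunction` string. One plausible reading, given the `return` statement in the example, is that it is treated as a function body. The sketch below works under that assumption; `compileNewUrlFunction` and the `sessionId` parameter are hypothetical and shown only so you can check your string locally before passing it as input.

```javascript
// Assumption: the input string is a function body that returns a proxy URL.
function compileNewUrlFunction(source) {
  const fn = new Function('sessionId', source);
  return (sessionId) => {
    const proxyUrl = fn(sessionId);
    // new URL() throws if the returned value is not a parseable URL.
    const parsed = new URL(proxyUrl);
    if (!['http:', 'https:'].includes(parsed.protocol)) {
      throw new Error(`Unsupported proxy protocol: ${parsed.protocol}`);
    }
    return proxyUrl;
  };
}

const newUrl = compileNewUrlFunction("return 'http://username:password@proxy.example.com:8080';");
console.log(newUrl('session-1')); // http://username:password@proxy.example.com:8080
```

Because `new Function` executes arbitrary code, only compile strings you wrote yourself.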
Multiple Destinations
{
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=Paris",
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}
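For more than a handful of destinations, the `startUrls` array can be generated instead of written by hand. This sketch assumes the `searchresults.html?ss=` URL pattern shown in the examples; `buildStartUrls` is an illustrative helper. Using `URL` and `searchParams` takes care of encoding names that contain spaces or accents.

```javascript
// Illustrative helper: one search URL per destination name.
function buildStartUrls(destinations) {
  return destinations.map((destination) => {
    const url = new URL('https://www.booking.com/searchresults.html');
    url.searchParams.set('ss', destination);
    return url.toString();
  });
}

const input = { maxHotels: 50, startUrls: buildStartUrls(['Paris', 'New York']) };
console.log(input.startUrls);
// [
//   'https://www.booking.com/searchresults.html?ss=Paris',
//   'https://www.booking.com/searchresults.html?ss=New+York'
// ]
```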
📊 Performance Metrics
- Success Rate: >95% with proper proxy configuration
- Speed: 10-50 hotels per minute (depending on settings)
- Memory Usage: Optimized for large-scale operations
- Reliability: Built-in retry logic and error recovery
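The quoted speed range gives a rough way to budget run time before choosing `maxHotels`. The sketch below only does the arithmetic on the 10-50 hotels per minute figure; `estimateRunMinutes` is a hypothetical helper, and real runs will vary with proxy quality and whether `getDetails` is enabled.

```javascript
// Rough run-time bounds from the stated throughput range (hotels per minute).
function estimateRunMinutes(hotelCount, { slowest = 10, fastest = 50 } = {}) {
  return {
    min: Math.ceil(hotelCount / fastest),
    max: Math.ceil(hotelCount / slowest),
  };
}

console.log(estimateRunMinutes(100)); // { min: 2, max: 10 }
```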
🚨 Important Notes
Rate Limiting
- Booking.com has anti-bot measures
- Always use residential proxies
- Respect reasonable request rates
- Monitor for IP blocks
Data Accuracy
- Prices may vary based on availability
- Ratings are real-time from Booking.com
- Addresses are validated and cleaned
- Coordinates are extracted from multiple sources
Legal Compliance
- Respect Booking.com's Terms of Service
- Use data responsibly and ethically
- Consider rate limiting and delays
- Monitor for policy changes
🐛 Troubleshooting
Common Issues
"Destination input not found"
- Booking.com interface changes frequently
- Actor includes multiple fallback selectors
- Try rerunning the actor or using a different proxy
"No hotels extracted"
- Check destination spelling
- Verify proxy configuration
- Increase timeout values if needed
"Incomplete data"
- Enable `getDetails` for full addresses
- Check proxy country matches destination
- Verify network connectivity
Debug Mode
Enable detailed logging by setting environment variable:
export APIFY_LOG_LEVEL=DEBUG
📈 Use Cases
Market Research
- Hotel price analysis
- Competitive intelligence
- Market trend monitoring
Travel Applications
- Hotel comparison tools
- Travel planning platforms
- Booking aggregators
Data Analysis
- Geographic distribution analysis
- Price correlation studies
- Rating analysis
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🔗 Related Actors
- CNN Top Headlines - Scrape the latest top news headlines from CNN's homepage and article pages with optional full article content extraction.
Built with ❤️ using Apify and Playwright
For support, feature requests, or bug reports, please open an issue in the repository.