
Booking Scraper
Pricing
$99.00 / 1,000 results

This Apify actor scrapes hotel data from Booking.com. It supports robust navigation, proxy configuration, batch processing, and flexible extraction limits.
Booking.com Hotel Scraper - Advanced Web Scraping Actor
A Booking.com scraper built with Playwright and the Apify SDK. It extracts hotel data including prices, ratings, addresses, coordinates, and direct booking links, and applies anti-detection measures while browsing.
Features
Core Functionality
- Comprehensive Hotel Data Extraction: Name, price, rating, address, coordinates, and direct booking links
- Advanced Anti-Detection: Human-like scrolling, mouse movements, and browser fingerprinting
- Flexible Extraction Limits: Configurable maximum hotels with global counter tracking
- Batch Processing: Process hotels in configurable batches for optimal performance
- Detailed Mode: Visit individual hotel pages for complete addresses and precise coordinates
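How a maximum-hotel limit and a batch size interact can be sketched as follows. This is only an illustration under assumptions, not the actor's actual code: `planBatches` and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper: split `available` hotels into batches of `batchSize`,
// stopping once `maxHotels` have been scheduled (0 means no limit).
function planBatches(available, maxHotels, batchSize) {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
        throw new RangeError('batchSize must be a positive integer');
    }
    const limit = maxHotels > 0 ? Math.min(available, maxHotels) : available;
    const batches = [];
    for (let start = 0; start < limit; start += batchSize) {
        // Each batch is a half-open index range [start, end).
        batches.push({ start, end: Math.min(start + batchSize, limit) });
    }
    return batches;
}

// 23 hotels found, limit of 20, batches of 8
// -> index ranges [0, 8), [8, 16), [16, 20)
console.log(planBatches(23, 20, 8));
```

The last batch is shortened so the total never exceeds the limit, which is the role a global counter would play in a real crawler.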
Technical Capabilities
- Robust Navigation: Multiple fallback selectors for dynamic Booking.com interface
- Proxy Support: Apify Proxy integration with residential IPs and country selection
- Error Handling: Comprehensive retry logic and graceful failure recovery
- Real-time Logging: Detailed progress tracking and debugging information
- Memory Efficient: Optimized for large-scale scraping operations
Data Quality
- Address Validation: Clean and validate extracted addresses
- Coordinate Extraction: Multiple methods to find precise GPS coordinates
- Price Normalization: Consistent price formatting across currencies
- Rating Accuracy: Extract verified guest ratings and review counts
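As an illustration of what price normalization can involve, the sketch below splits a displayed price into a numeric amount and its currency text. The helper `normalizePrice` is hypothetical, and it assumes "," is a thousands separator and "." is the decimal point; the actor's own normalization rules are not documented here and may differ.

```javascript
// Hypothetical helper: turn a displayed price such as "£450" or "US$1,299.50"
// into a numeric amount plus the currency text that surrounded it.
// Assumes "," is a thousands separator and "." is the decimal point.
function normalizePrice(raw) {
    const text = String(raw);
    const match = text.match(/\d[\d,]*(?:\.\d+)?/);
    if (!match) return null; // no digits found
    const amount = Number(match[0].replace(/,/g, ''));
    const currency = text.replace(match[0], '').trim() || null;
    return { amount, currency };
}

console.log(normalizePrice('£450'));        // { amount: 450, currency: '£' }
console.log(normalizePrice('US$1,299.50')); // { amount: 1299.5, currency: 'US$' }
```

Prices written with a decimal comma (for example "450,00 €") would need a locale-aware variant.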
Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| destination | string | "Paris" | City or location to search for hotels |
| maxHotels | number | 100 | Maximum hotels to extract (0 = unlimited) |
| batchSize | number | 10 | Hotels per batch for processing |
| getDetails | boolean | false | Visit individual hotel pages for detailed data |
| startUrls | array | ["https://www.booking.com"] | Starting URLs for crawling |
| proxyConfiguration | object | See below | Proxy settings for anti-detection |
| newUrlFunction | string | - | Custom proxy URL function (advanced) |
Proxy Configuration
{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"],
  "apifyProxyCountry": "FR"
}
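The sketch below shows one way these input fields could be translated into `groups` and `countryCode` options, the kind of object commonly passed to the Apify SDK's `Actor.createProxyConfiguration()`. The helper `toProxyOptions` is hypothetical, and whether the actor performs this mapping itself is an assumption.

```javascript
// Hypothetical helper: translate the input-style proxy settings shown above
// into `groups` / `countryCode` options. Returns null when Apify Proxy is off.
function toProxyOptions(proxyInput = {}) {
    if (!proxyInput.useApifyProxy) return null;
    const options = {};
    const groups = proxyInput.apifyProxyGroups;
    if (Array.isArray(groups) && groups.length > 0) {
        options.groups = [...groups];
    }
    if (proxyInput.apifyProxyCountry) {
        // Country codes are two-letter codes such as "FR" or "JP".
        options.countryCode = proxyInput.apifyProxyCountry.toUpperCase();
    }
    return options;
}

console.log(toProxyOptions({
    useApifyProxy: true,
    apifyProxyGroups: ['RESIDENTIAL'],
    apifyProxyCountry: 'fr',
}));
// { groups: [ 'RESIDENTIAL' ], countryCode: 'FR' }
```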
Example Input
Basic Usage
{
  "destination": "New York",
  "maxHotels": 50,
  "batchSize": 10
}
Advanced Usage with Proxy
{
  "destination": "Tokyo",
  "maxHotels": 100,
  "batchSize": 5,
  "getDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "JP"
  }
}
Custom Start URLs
{
  "destination": "London",
  "maxHotels": 25,
  "startUrls": ["https://www.booking.com/searchresults.html?ss=London"]
}
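Fields omitted from an input fall back to the defaults in the parameter table. A minimal sketch of that merge, with some basic validation, might look like this. The default values are copied from the table; the function `resolveInput` and its checks are hypothetical and not taken from the actor.

```javascript
// Defaults copied from the parameter table; the function itself is hypothetical.
const DEFAULTS = {
    destination: 'Paris',
    maxHotels: 100,
    batchSize: 10,
    getDetails: false,
    startUrls: ['https://www.booking.com'],
};

function resolveInput(input = {}) {
    // Later spreads win, so user-supplied fields override the defaults.
    const resolved = { ...DEFAULTS, ...input };
    if (typeof resolved.destination !== 'string' || resolved.destination.trim() === '') {
        throw new TypeError('destination must be a non-empty string');
    }
    if (!Number.isInteger(resolved.maxHotels) || resolved.maxHotels < 0) {
        throw new RangeError('maxHotels must be a non-negative integer (0 = unlimited)');
    }
    if (!Number.isInteger(resolved.batchSize) || resolved.batchSize < 1) {
        throw new RangeError('batchSize must be a positive integer');
    }
    return resolved;
}

// Only two fields supplied; batchSize, getDetails and startUrls use defaults.
console.log(resolveInput({ destination: 'New York', maxHotels: 50 }));
```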
Output Format
Standard Mode Output
{
  "hotelName": "The Ritz London",
  "price": "£450",
  "rating": "9.2",
  "address": "150 Piccadilly, St. James's, London W1J 9BR, United Kingdom",
  "hotelLink": "https://www.booking.com/hotel/gb/the-ritz-london.html",
  "latitude": 51.5074,
  "longitude": -0.1378,
  "scrapedAt": "2025-01-15T10:30:00.000Z",
  "pageType": "search_results",
  "hotelIndex": 1
}
Detailed Mode Output (with getDetails: true)
{
  "hotelName": "The Ritz London",
  "price": "£450",
  "rating": "9.2",
  "address": "150 Piccadilly, St. James's, London W1J 9BR, United Kingdom",
  "detailedAddress": "150 Piccadilly, St. James's, Westminster, London, W1J 9BR, United Kingdom",
  "hotelLink": "https://www.booking.com/hotel/gb/the-ritz-london.html",
  "latitude": 51.5074,
  "longitude": -0.1378,
  "scrapedAt": "2025-01-15T10:30:00.000Z",
  "pageType": "hotel_details",
  "hotelIndex": 1
}
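Because `rating` is delivered as a string in the records above, downstream code has to convert it before comparing. The following sketch filters records by a minimum rating; `filterByRating` is a hypothetical post-processing helper, and two of the sample records are made up for illustration.

```javascript
// Hypothetical post-processing: `rating` arrives as a string (e.g. "9.2"),
// so convert it to a number before comparing.
function filterByRating(records, minRating) {
    return records.filter((record) => {
        const rating = Number.parseFloat(record.rating);
        // Records with a missing or non-numeric rating are dropped.
        return Number.isFinite(rating) && rating >= minRating;
    });
}

const sample = [
    { hotelName: 'The Ritz London', rating: '9.2' },
    { hotelName: 'Example Inn', rating: '7.8' },  // made-up record
    { hotelName: 'No Reviews Yet', rating: '' },  // made-up record
];

console.log(filterByRating(sample, 9).map((r) => r.hotelName));
// [ 'The Ritz London' ]
```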
Usage Guide
1. Deploy on Apify Platform
- Upload this actor to your Apify account
- Configure input parameters in the web interface
- Run the actor and monitor progress
- Download results from the dataset
2. Local Development
# Install dependencies
npm install

# Run locally with Apify CLI
apify run

# Run with custom input
apify run --input='{"destination": "Paris", "maxHotels": 20}'
3. API Integration
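const { Actor } = require('apify');

// Actor.main() initializes the SDK and runs the async function,
// which avoids top-level await in a CommonJS module.
Actor.main(async () => {
    const input = {
        destination: 'Barcelona',
        maxHotels: 30,
        getDetails: true,
    };
    // The input object is passed directly as the second argument.
    const run = await Actor.call('your-actor-id', input);
    console.log(`Run finished with status: ${run.status}`);
});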
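Outside the SDK, a run can also be started over HTTP through the Apify API's run-actor endpoint (`POST /v2/acts/{actorId}/runs`), where the JSON request body becomes the actor input. The sketch below only builds the request; `buildRunRequest`, the actor ID, and the token are placeholders, and nothing is sent unless the result is passed to `fetch`.

```javascript
// Build the arguments for fetch() without sending anything.
// `actorId` and `token` are placeholders supplied by the caller.
function buildRunRequest(actorId, token, input) {
    const url = `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`;
    return {
        url,
        options: {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                Authorization: `Bearer ${token}`,
            },
            // The request body is used as the actor input.
            body: JSON.stringify(input),
        },
    };
}

const request = buildRunRequest('your-actor-id', 'YOUR_API_TOKEN', {
    destination: 'Barcelona',
    maxHotels: 30,
    getDetails: true,
});
// To actually start a run: fetch(request.url, request.options)
console.log(request.url);
```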
Configuration Tips
Performance Optimization
- Batch Size: Use 5-10 for a good balance between speed and stability
- Max Hotels: Set realistic limits to avoid timeouts
- Proxy Groups: Use RESIDENTIAL for best success rates
Data Quality
- getDetails: Enable for precise addresses and coordinates
- Destination: Use specific city names for better results
- Proxy Country: Match destination country for local results
Anti-Detection
- Residential Proxies: Essential for reliable scraping
- Batch Processing: Helps avoid rate limiting
- Human-like Behavior: Built-in scrolling and mouse movements
Advanced Features
Custom Proxy Functions
// Custom proxy URL function
const newUrlFunction = `return 'http://username:password@proxy.example.com:8080';`;
Multiple Destinations
{
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=Paris",
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}
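For longer lists of destinations, the `startUrls` array can be generated rather than typed by hand. The sketch below follows the `searchresults.html?ss=` pattern shown above and lets the `URL` API handle encoding; `buildStartUrls` is a hypothetical helper.

```javascript
// Hypothetical helper: build one search URL per destination, following the
// `searchresults.html?ss=<destination>` pattern from the example above.
function buildStartUrls(destinations) {
    return destinations.map((destination) => {
        const url = new URL('https://www.booking.com/searchresults.html');
        // searchParams.set() encodes spaces and non-ASCII characters.
        url.searchParams.set('ss', destination);
        return url.toString();
    });
}

console.log(JSON.stringify({ startUrls: buildStartUrls(['Paris', 'New York']) }, null, 2));
```

Spaces are encoded as `+`, so "New York" becomes `ss=New+York`.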
Performance Metrics
- Success Rate: >95% with proper proxy configuration
- Speed: 10-50 hotels per minute (depending on settings)
- Memory Usage: Optimized for large-scale operations
- Reliability: Built-in retry logic and error recovery
Important Notes
Rate Limiting
- Booking.com has anti-bot measures
- Always use residential proxies
- Respect reasonable request rates
- Monitor for IP blocks
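One common way to keep request rates reasonable after a failure or block is to retry with exponentially growing delays. The sketch below shows that general technique; it is not the actor's built-in retry logic, and `backoffDelay`, `withRetry`, and their default values are hypothetical.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
    return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `task` and retry it up to `retries` times, waiting longer after each failure.
async function withRetry(task, { retries = 3, baseMs = 1000 } = {}) {
    for (let attempt = 0; ; attempt += 1) {
        try {
            return await task();
        } catch (error) {
            if (attempt >= retries) throw error; // out of retries
            await sleep(backoffDelay(attempt, baseMs));
        }
    }
}

// Delays for the first four retries with the defaults: 1000, 2000, 4000, 8000 ms
console.log([0, 1, 2, 3].map((n) => backoffDelay(n)));
```

A caller would wrap a single page visit, for example `withRetry(() => visitHotelPage(url))`, where `visitHotelPage` stands in for whatever function performs the request.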
Data Accuracy
- Prices may vary based on availability
- Ratings are real-time from Booking.com
- Addresses are validated and cleaned
- Coordinates are extracted from multiple sources
Legal Compliance
- Respect Booking.com's Terms of Service
- Use data responsibly and ethically
- Consider rate limiting and delays
- Monitor for policy changes
Troubleshooting
Common Issues
"Destination input not found"
- Booking.com interface changes frequently
- Actor includes multiple fallback selectors
- Try rerunning the actor or using a different proxy
"No hotels extracted"
- Check destination spelling
- Verify proxy configuration
- Increase timeout values if needed
"Incomplete data"
- Enable getDetails for full addresses
- Check proxy country matches destination
- Verify network connectivity
Debug Mode
Enable detailed logging by setting the APIFY_LOG_LEVEL environment variable before running the actor:
export APIFY_LOG_LEVEL=DEBUG
Use Cases
Market Research
- Hotel price analysis
- Competitive intelligence
- Market trend monitoring
Travel Applications
- Hotel comparison tools
- Travel planning platforms
- Booking aggregators
Data Analysis
- Geographic distribution analysis
- Price correlation studies
- Rating analysis
License
This project is licensed under the MIT License - see the LICENSE file for details.
Related Actors
- CNN Top Headlines - Scrape the latest top news headlines from CNN's homepage and article pages with optional full article content extraction.
Built with ❤️ using Apify and Playwright
For support, feature requests, or bug reports, please open an issue in the repository.