Booking Scraper

Pricing

$99.00 / 1,000 results

Developed by Runtime

Maintained by Community

This Apify actor scrapes hotel data from Booking.com. It supports robust navigation, proxy configuration, batch processing, and flexible extraction limits.

🏨 Booking.com Hotel Scraper - Advanced Web Scraping Actor

A Booking.com scraper built with Playwright and the Apify SDK. It extracts hotel data, including prices, ratings, addresses, coordinates, and direct booking links, and applies anti-detection measures while browsing.

🚀 Features

Core Functionality

  • Comprehensive Hotel Data Extraction: Name, price, rating, address, coordinates, and direct booking links
  • Advanced Anti-Detection: Human-like scrolling, mouse movements, and browser fingerprinting
  • Flexible Extraction Limits: Configurable maximum hotels with global counter tracking
  • Batch Processing: Process hotels in configurable batches for optimal performance
  • Detailed Mode: Visit individual hotel pages for complete addresses and precise coordinates

Technical Capabilities

  • Robust Navigation: Multiple fallback selectors for dynamic Booking.com interface
  • Proxy Support: Apify Proxy integration with residential IPs and country selection
  • Error Handling: Comprehensive retry logic and graceful failure recovery
  • Real-time Logging: Detailed progress tracking and debugging information
  • Memory Efficient: Optimized for large-scale scraping operations

Data Quality

  • Address Validation: Clean and validate extracted addresses
  • Coordinate Extraction: Multiple methods to find precise GPS coordinates
  • Price Normalization: Consistent price formatting across currencies
  • Rating Accuracy: Extract verified guest ratings and review counts
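
The address and coordinate checks listed above could look roughly like the sketch below. Both helpers (`cleanAddress`, `isValidCoordinate`) are hypothetical names introduced for illustration; the actor's actual cleaning logic is not shown in this README.

```javascript
// Hypothetical helpers; the actor's real cleaning logic is not shown in this README.

// Collapse whitespace, drop empty comma-separated parts, and rejoin with ", ".
function cleanAddress(raw) {
    return String(raw ?? '')
        .split(',')
        .map((part) => part.replace(/\s+/g, ' ').trim())
        .filter((part) => part.length > 0)
        .join(', ');
}

// Accept only finite numbers inside the valid latitude/longitude ranges.
function isValidCoordinate(latitude, longitude) {
    return Number.isFinite(latitude) && Number.isFinite(longitude)
        && Math.abs(latitude) <= 90 && Math.abs(longitude) <= 180;
}

console.log(cleanAddress('  150 Piccadilly ,, London   W1J 9BR , '));
// prints "150 Piccadilly, London W1J 9BR"
console.log(isValidCoordinate(51.5074, -0.1378));
// prints true
```

A record that fails either check can be flagged for a second pass with getDetails enabled rather than silently dropped.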

📋 Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| destination | string | "Paris" | City or location to search for hotels |
| maxHotels | number | 100 | Maximum hotels to extract (0 = unlimited) |
| batchSize | number | 10 | Hotels per batch for processing |
| getDetails | boolean | false | Visit individual hotel pages for detailed data |
| startUrls | array | ["https://www.booking.com"] | Starting URLs for crawling |
| proxyConfiguration | object | See below | Proxy settings for anti-detection |
| newUrlFunction | string | - | Custom proxy URL function (advanced) |
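
To make the defaults in the table concrete, here is a minimal sketch of how an input object might be merged with them and sanity-checked before a run. The `normalizeInput` helper and its validation rules are assumptions for illustration; only the default values come from the table.

```javascript
// Defaults copied from the parameter table; the validation rules are assumptions.
const DEFAULTS = {
    destination: 'Paris',
    maxHotels: 100,
    batchSize: 10,
    getDetails: false,
    startUrls: ['https://www.booking.com'],
};

// Merge user input over the defaults and reject obviously invalid values.
function normalizeInput(raw = {}) {
    const input = { ...DEFAULTS, ...raw };
    if (typeof input.destination !== 'string' || input.destination.trim() === '') {
        throw new Error('destination must be a non-empty string');
    }
    if (!Number.isInteger(input.maxHotels) || input.maxHotels < 0) {
        throw new Error('maxHotels must be a non-negative integer (0 = unlimited)');
    }
    if (!Number.isInteger(input.batchSize) || input.batchSize < 1) {
        throw new Error('batchSize must be a positive integer');
    }
    return { ...input, destination: input.destination.trim() };
}

console.log(normalizeInput({ destination: ' Tokyo ', maxHotels: 0 }));
```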

Proxy Configuration

{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"],
  "apifyProxyCountry": "FR"
}

📥 Example Input

Basic Usage

{
  "destination": "New York",
  "maxHotels": 50,
  "batchSize": 10
}

Advanced Usage with Proxy

{
  "destination": "Tokyo",
  "maxHotels": 100,
  "batchSize": 5,
  "getDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "JP"
  }
}

Custom Start URLs

{
  "destination": "London",
  "maxHotels": 25,
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}

📤 Output Format

Standard Mode Output

{
  "hotelName": "The Ritz London",
  "price": "£450",
  "rating": "9.2",
  "address": "150 Piccadilly, St. James's, London W1J 9BR, United Kingdom",
  "hotelLink": "https://www.booking.com/hotel/gb/the-ritz-london.html",
  "latitude": 51.5074,
  "longitude": -0.1378,
  "scrapedAt": "2025-01-15T10:30:00.000Z",
  "pageType": "search_results",
  "hotelIndex": 1
}

Detailed Mode Output (with getDetails: true)

{
  "hotelName": "The Ritz London",
  "price": "£450",
  "rating": "9.2",
  "address": "150 Piccadilly, St. James's, London W1J 9BR, United Kingdom",
  "detailedAddress": "150 Piccadilly, St. James's, Westminster, London, W1J 9BR, United Kingdom",
  "hotelLink": "https://www.booking.com/hotel/gb/the-ritz-london.html",
  "latitude": 51.5074,
  "longitude": -0.1378,
  "scrapedAt": "2025-01-15T10:30:00.000Z",
  "pageType": "hotel_details",
  "hotelIndex": 1
}
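
Because `price` and `rating` arrive as strings in both output modes, downstream analysis usually needs numeric values. The sketch below shows one way to convert them. `parsePrice` and `toNumericRecord` are hypothetical helpers, and the parser assumes a currency prefix followed by digits with "," as the thousands separator, which will not hold for every locale.

```javascript
// Assumes prices look like "£450" or "US$1,250": a currency prefix followed by
// digits, with "," as the thousands separator. Other locales need other rules.
function parsePrice(price) {
    const match = /^([^\d\s]+)\s*([\d.,]+)$/.exec(String(price).trim());
    if (!match) return null;
    const amount = Number.parseFloat(match[2].replace(/,/g, ''));
    return Number.isNaN(amount) ? null : { currency: match[1], amount };
}

// Convert one scraped record into numeric fields for analysis.
function toNumericRecord(record) {
    const parsed = parsePrice(record.price);
    return {
        hotelName: record.hotelName,
        currency: parsed ? parsed.currency : null,
        priceAmount: parsed ? parsed.amount : null,
        rating: Number.parseFloat(record.rating),
    };
}

console.log(toNumericRecord({ hotelName: 'The Ritz London', price: '£450', rating: '9.2' }));
```

Records whose price cannot be parsed keep `null` in the numeric fields, so they can be filtered out explicitly instead of skewing averages.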

🛠️ Usage Guide

1. Deploy on Apify Platform

  1. Upload this actor to your Apify account
  2. Configure input parameters in the web interface
  3. Run the actor and monitor progress
  4. Download results from the dataset

2. Local Development

# Install dependencies
npm install
# Run locally with Apify CLI
apify run
# Run with custom input
apify run --input='{"destination": "Paris", "maxHotels": 20}'

3. API Integration

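const { Actor } = require('apify');

const input = {
    destination: 'Barcelona',
    maxHotels: 30,
    getDetails: true,
};

// Actor.call() takes the input object directly as its second argument.
// CommonJS modules have no top-level await, so the call is wrapped in Actor.main().
// Outside the Apify platform, set APIFY_TOKEN in the environment first.
Actor.main(async () => {
    const run = await Actor.call('your-actor-id', input);
    console.log(`Run finished with status ${run.status}`);
});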
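
For callers that do not use the Apify SDK, the same run can be started over HTTP. The sketch below only builds the request for the Apify API's run-sync-get-dataset-items endpoint; `buildRunRequest` is a hypothetical helper and the actor ID is a placeholder, not this actor's real ID.

```javascript
// Build the URL and fetch() options for a synchronous run that returns dataset items.
// "your-username/your-actor-name" is a placeholder, not this actor's real ID.
function buildRunRequest(actorId, input, token) {
    // The API expects "username~actor-name" rather than "username/actor-name".
    const id = actorId.replace('/', '~');
    return {
        url: `https://api.apify.com/v2/acts/${encodeURIComponent(id)}/run-sync-get-dataset-items`,
        options: {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                Authorization: `Bearer ${token}`,
            },
            body: JSON.stringify(input),
        },
    };
}

const request = buildRunRequest(
    'your-username/your-actor-name',
    { destination: 'Barcelona', maxHotels: 30, getDetails: true },
    process.env.APIFY_TOKEN,
);
console.log(request.url);

// To execute the run (Node 18+ ships a global fetch):
// const items = await (await fetch(request.url, request.options)).json();
```

Synchronous runs suit small jobs; for large `maxHotels` values, starting the run and reading the dataset afterwards is likely the safer pattern.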

⚙️ Configuration Tips

Performance Optimization

  • Batch Size: Use 5-10 for a good balance between speed and stability
  • Max Hotels: Set realistic limits to avoid timeouts
  • Proxy Groups: Use RESIDENTIAL for best success rates
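
To illustrate how `batchSize` and `maxHotels` interact, here is a hypothetical sketch of batching under a global cap. It is not the actor's implementation; `takeBatches` is an illustrative name, and the only behavior taken from this README is that `maxHotels` of 0 means unlimited.

```javascript
// Hypothetical sketch: split items into batches while honoring a global cap.
// maxHotels = 0 means unlimited, as in the input table.
function* takeBatches(items, batchSize, maxHotels = 0) {
    const limit = maxHotels > 0 ? Math.min(maxHotels, items.length) : items.length;
    for (let start = 0; start < limit; start += batchSize) {
        yield items.slice(start, Math.min(start + batchSize, limit));
    }
}

const hotels = Array.from({ length: 25 }, (_, i) => `hotel-${i + 1}`);
for (const batch of takeBatches(hotels, 10, 23)) {
    console.log(`processing ${batch.length} hotels`);
}
// prints batch sizes 10, 10, 3
```

The final batch is truncated so the total never exceeds the cap, which is why a cap of 23 yields a last batch of 3.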

Data Quality

  • getDetails: Enable for precise addresses and coordinates
  • Destination: Use specific city names for better results
  • Proxy Country: Match destination country for local results
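
Matching the proxy country to the destination can be done when assembling the input. The following sketch uses the field names from the Proxy Configuration example; `buildProxyConfiguration` itself is a hypothetical helper, and choosing the right country code for a destination is left to the caller.

```javascript
// Field names come from the Proxy Configuration example; the helper is hypothetical.
function buildProxyConfiguration(countryCode) {
    if (!/^[A-Za-z]{2}$/.test(countryCode)) {
        throw new Error(`Expected a two-letter country code, got "${countryCode}"`);
    }
    return {
        useApifyProxy: true,
        apifyProxyGroups: ['RESIDENTIAL'],
        apifyProxyCountry: countryCode.toUpperCase(),
    };
}

const input = {
    destination: 'Tokyo',
    maxHotels: 100,
    proxyConfiguration: buildProxyConfiguration('jp'),
};
console.log(JSON.stringify(input, null, 2));
```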

Anti-Detection

  • Residential Proxies: Essential for reliable scraping
  • Batch Processing: Helps avoid rate limiting
  • Human-like Behavior: Built-in scrolling and mouse movements

🔧 Advanced Features

Custom Proxy Functions

// Custom proxy URL function
const newUrlFunction = `
return 'http://username:password@proxy.example.com:8080';
`;

Multiple Destinations

{
  "startUrls": [
    "https://www.booking.com/searchresults.html?ss=Paris",
    "https://www.booking.com/searchresults.html?ss=London"
  ]
}
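
When the list of destinations is long, the `startUrls` array can be generated instead of typed by hand. This is a minimal sketch that assumes the `?ss=` query parameter seen in the examples is the only one needed; `buildSearchUrl` is an illustrative name.

```javascript
// The ?ss= query parameter is taken from the startUrls examples in this README.
function buildSearchUrl(destination) {
    const url = new URL('https://www.booking.com/searchresults.html');
    url.searchParams.set('ss', destination);
    return url.toString();
}

const startUrls = ['Paris', 'London', 'New York'].map(buildSearchUrl);
console.log(JSON.stringify({ startUrls }, null, 2));
```

Using `URL` and `searchParams` handles encoding of spaces and non-ASCII city names, which string concatenation would get wrong.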

📊 Performance Metrics

  • Success Rate: >95% with proper proxy configuration
  • Speed: 10-50 hotels per minute (depending on settings)
  • Memory Usage: Optimized for large-scale operations
  • Reliability: Built-in retry logic and error recovery

🚨 Important Notes

Rate Limiting

  • Booking.com has anti-bot measures
  • Always use residential proxies
  • Respect reasonable request rates
  • Monitor for IP blocks
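
One common way to keep request rates reasonable after a block or timeout is to retry with exponential backoff. The sketch below is a generic illustration under that assumption, not the actor's built-in retry logic; `withRetry` and its option names are hypothetical.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry an async task, doubling the wait after each failure.
async function withRetry(task, { retries = 3, baseDelayMs = 1000 } = {}) {
    let lastError;
    for (let attempt = 0; attempt <= retries; attempt += 1) {
        try {
            return await task(attempt);
        } catch (error) {
            lastError = error;
            if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
        }
    }
    throw lastError;
}

// Demo: a task that fails twice (as if blocked) and then succeeds.
let calls = 0;
withRetry(async () => {
    calls += 1;
    if (calls < 3) throw new Error('blocked');
    return 'ok';
}, { baseDelayMs: 10 }).then((result) => console.log(result, 'after', calls, 'calls'));
```

With the default settings the waits are 1 s, 2 s, and 4 s, so a persistently blocked request gives up after roughly seven seconds instead of hammering the site.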

Data Accuracy

  • Prices may vary based on availability
  • Ratings reflect the values shown on Booking.com at scrape time
  • Addresses are validated and cleaned
  • Coordinates are extracted from multiple sources

Responsible Use

  • Respect Booking.com's Terms of Service
  • Use data responsibly and ethically
  • Consider rate limiting and delays
  • Monitor for policy changes

🐛 Troubleshooting

Common Issues

"Destination input not found"

  • Booking.com interface changes frequently
  • Actor includes multiple fallback selectors
  • Try rerunning the actor or using a different proxy
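
The fallback-selector idea mentioned above can be sketched as a loop that tries each candidate in order. The selectors listed here are hypothetical examples, not the actor's real list, and a stub object stands in for a Playwright `Page` so the snippet runs without a browser. With a real page, `page.locator(selector).count()` is the Playwright call being imitated.

```javascript
// Hypothetical selectors for illustration; the actor's real list is not published here.
const DESTINATION_SELECTORS = [
    'input[name="ss"]',
    'input[placeholder*="Where are you going"]',
    '[data-testid="destination-container"] input',
];

// Return the first selector that matches at least one element, or null.
async function findFirstSelector(page, selectors) {
    for (const selector of selectors) {
        const count = await page.locator(selector).count();
        if (count > 0) return selector;
    }
    return null;
}

// Stub standing in for a Playwright Page so the sketch runs without a browser.
const stubPage = {
    locator: (selector) => ({
        count: async () => (selector.includes('placeholder') ? 1 : 0),
    }),
};

findFirstSelector(stubPage, DESTINATION_SELECTORS)
    .then((selector) => console.log('matched:', selector));
```

A `null` result corresponds to the "Destination input not found" error: none of the known selectors matched the page that was served.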

"No hotels extracted"

  • Check destination spelling
  • Verify proxy configuration
  • Increase timeout values if needed

"Incomplete data"

  • Enable getDetails for full addresses
  • Check proxy country matches destination
  • Verify network connectivity

Debug Mode

Enable detailed logging by setting environment variable:

APIFY_LOG_LEVEL=DEBUG

📈 Use Cases

Market Research

  • Hotel price analysis
  • Competitive intelligence
  • Market trend monitoring

Travel Applications

  • Hotel comparison tools
  • Travel planning platforms
  • Booking aggregators

Data Analysis

  • Geographic distribution analysis
  • Price correlation studies
  • Rating analysis

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Built with ❤️ using Apify and Playwright

For support, feature requests, or bug reports, please open an issue in the repository.