
Fast Booking Scraper
voyager/fast-booking-scraper
Scrape Booking with this hotel scraper and get data about accommodation on Booking.com. Extract data by keywords or URLs for hotel prices, ratings, location, number of reviews, and stars. Scrape and download data from Booking.com in JSON, Excel, HTML, and CSV.
This changelog summarizes all changes to the Booking actors provided by the Voyager organization. The specific actors affected are listed for each change.
2023-11-27
Features
- Added back the 'overcome 1000 results limit' feature, which activates when the `maxItems` limit is set above 1000 (`booking-scraper`, `fast-booking-scraper`)
- Added full input parsing and validation - may break runs that use incorrect inputs (`booking-scraper`, `fast-booking-scraper`)
Fixes
- Fixed last page detection for certain page formats (`booking-scraper`, `fast-booking-scraper`)
2023-11-24
Features
- Added hotel chain/brand extraction under the `hotelChain` output field (`booking-scraper`)
2023-10-25
Features
- Added time of scrape to each extracted place under the `timeOfScrapeISO` field (`booking-scraper`, `fast-booking-scraper`)
  - The format is ISO 8601, e.g. `2023-10-25T12:00:00.000Z` (GMT/UTC+0)
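For consumers of the output, a minimal sketch of reading the `timeOfScrapeISO` value into a timezone-aware datetime might look like the following. It assumes the field always carries millisecond precision and a trailing `Z`, as in the example above; the `item` dictionary and the function name are hypothetical.

```python
from datetime import datetime, timezone


def parse_time_of_scrape(value: str) -> datetime:
    """Parse a timestamp such as 2023-10-25T12:00:00.000Z into an aware UTC datetime."""
    # The trailing "Z" denotes UTC; strptime matches it literally here,
    # so the UTC timezone is attached explicitly afterwards.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


# Hypothetical output item; only timeOfScrapeISO comes from the changelog.
item = {"name": "Example Hotel", "timeOfScrapeISO": "2023-10-25T12:00:00.000Z"}
scraped_at = parse_time_of_scrape(item["timeOfScrapeISO"])
print(scraped_at.isoformat())
```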
2023-10-21
Features
- Added hotel facilities extraction under the `facilities` field (`booking-scraper`)
2023-10-06
Fixes
- Fixed parsing of inputted hotel start URLs, which was causing the scraper to fail
2023-08-10
Fixes
- Fixed hotel stars extraction
- Fixed place type extraction
- Fixed "Get more than 1000 results" functionality
- Fixed hotel extraction when `simple: true`
- Fixed timeout on new pages - actor should run faster now
- Fixed price range input filter
Features
- Added relative dates for check-in and check-out fields (relative to the run's start date)
  - e.g. "1 day", "2 weeks", "6 months"
  - note: can only be used in JSON input
- Updated duplicate place detection from name to place URL
  - Places with identical names are now scraped properly
- Reworked "Get more than 1000 results" with price ranges and location filters
  - Price ranges activate when both `checkIn` and `checkOut` dates are specified (can be relative)
  - Location filters are the default (they are not as accurate as price ranges though)
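To illustrate what a relative check-in or check-out value such as "1 day", "2 weeks" or "6 months" could resolve to, here is a rough sketch. It is not the actor's own code: the function names, the accepted grammar, and the month-end clamping rule are all assumptions made for the example.

```python
import calendar
import re
from datetime import date, timedelta

# Assumed grammar: "<number> <day|week|month>[s]"; the actor may accept more.
_RELATIVE = re.compile(r"^\s*(\d+)\s+(day|week|month)s?\s*$", re.IGNORECASE)


def add_months(start: date, months: int) -> date:
    """Move forward by whole months, clamping the day to the target month's length."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def resolve_relative_date(value: str, run_start: date) -> date:
    """Turn a relative value into a calendar date counted from the run's start date."""
    match = _RELATIVE.match(value)
    if match is None:
        raise ValueError(f"Unsupported relative date: {value!r}")
    amount, unit = int(match.group(1)), match.group(2).lower()
    if unit == "day":
        return run_start + timedelta(days=amount)
    if unit == "week":
        return run_start + timedelta(weeks=amount)
    return add_months(run_start, amount)


run_start = date(2023, 8, 10)
print(resolve_relative_date("1 day", run_start))
print(resolve_relative_date("2 weeks", run_start))
print(resolve_relative_date("6 months", run_start))
```

Clamping to the last day of the month is only one possible convention for month arithmetic; the actor may resolve such edge cases differently.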
Deprecations
- The `useFilters` field was deprecated; update your input to use the new `overcomeResultsLimit` field (Get more than 1000 results)
  - `useFilters` will still work for now
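For stored inputs that still use the deprecated field, a small migration helper along these lines may be handy. This is a sketch only: it assumes both fields hold the same kind of value (presumably a boolean), and the `search` key in the example input is hypothetical.

```python
def migrate_input(actor_input: dict) -> dict:
    """Return a copy of the input with useFilters renamed to overcomeResultsLimit."""
    migrated = dict(actor_input)  # shallow copy, the caller's dict is left untouched
    if "useFilters" in migrated:
        legacy_value = migrated.pop("useFilters")
        # An explicitly set new field takes precedence over the deprecated one.
        migrated.setdefault("overcomeResultsLimit", legacy_value)
    return migrated


old_input = {"search": "Prague", "useFilters": True}  # hypothetical stored input
print(migrate_input(old_input))
```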
2023-04-05
- Handle shared URLs
2023-02-16
- Fixed issue with `stars` extraction
2023-01-30
- Fixed issue with category reviews extraction
- Added `Resorts` option to property types
2023-01-27
- Fixed issue with reviews extraction
2023-01-17
- Rewrote the scraper to use Crawlee
- Fixed issue extracting `renderedCurrency` from the website
2023-01-16
- Fixed issue where validation failed because `selected_currency` was not present in the URL
2023-01-04
- Fixed issue where 'reviews' were not scraped correctly when `simple: true`
2022-12-04
- Fixed missing months in review dates
2022-11-30
- Fixed missing prices and malformed price format
2022-11-29
- Fixed filter parameters in the search URL (`propertyType` and `minMaxPrice`)
2022-11-25
- Fixed incorrect currencies in the output - request retries
2022-11-23
- Added support for Booking-generated shared list of properties
2022-10-27
- Excluded `rating: null` results from the output if `minScore` is set
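The effect of this change on the output can be pictured with a short sketch. It assumes that `rating` is numeric and that `minScore` is an inclusive lower bound; the function name and the sample items are hypothetical.

```python
from typing import Optional


def filter_by_min_score(items: list[dict], min_score: Optional[float]) -> list[dict]:
    """Drop unrated and low-rated places, but only when a minimum score is given."""
    if min_score is None:
        # No minScore set: unrated places stay in the output.
        return list(items)
    return [
        item
        for item in items
        if item.get("rating") is not None and item["rating"] >= min_score
    ]


places = [
    {"name": "A", "rating": 9.1},
    {"name": "B", "rating": None},
    {"name": "C", "rating": 7.9},
]
print(filter_by_min_score(places, 8.4))
```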
2022-10-12
- Removed preview reviews extraction - all reviews are now extracted from separate pages.
- Decreased timeout error rate by increasing timeout seconds, enabling browser fingerprinting and limiting max concurrency.
2022-06-27
- Fixed `image` extraction from listing page (for `simple: true` scrape parameter)
- Added the possibility to combine `useFilters` (circumventing Booking's limit of 1000 results) with scrape filters on property type (hotels, apartments, etc.) or price range
2022-04-03
- Added user reviews extraction from both detail page and reviews pagination pages
- Added category reviews extraction from detail page
- Removed default settings `minScore = 8.4`
- Fixed language settings for detail page (`language` input field was not respected)
- Fixed `stars` extraction from detail page
- Fixed `checkInFrom` and `checkInTo` extraction from detail page
- Handled global state with external package `apify-global-store`
- Split code into more source files, created `extraction` and `routes` folders
2022-01-10
- Fixed rejection of current date in `checkIn` and `checkOut` fields
2021-12-28
- Set custom `minMaxPrice` filter to provide more specific filtering than booking.com API
- Added rooms scraping support without `checkIn` and `checkOut` set (simple output with basic info only)
- Implemented `useFilters` to overcome 1000 results limit by setting filters one by one and combining them
- Refactored `handlePageFunction`
2021-11-22
- Fixed broken url search
- Fixed outdated selectors to scrape more detailed info
- Fixed `minMaxPrice` search filter
- Maximized results count when `maxPages` is set (included `minScore` and `priceRange` into search URL)
- Prevented infinite run when no `maxPages` restriction is set
2021-08-24
- Extracted all images
2021-01-22
Features:
- Added screenshots for errors
- Added SessionPool
Fixes:
- Removed broken currency check (the main bug that prevented the scraper from working)
- Fixed scraper getting into infinite error loop
- Major code refactor (will help with future fixes and UX)