Google Maps Reviews Scraper
Pay $0.50 for 1,000 reviews
Extract all reviews of Google Maps places using place URLs. Get review text, published date, response from owner, review URL, and reviewer's details. Download scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.
2024-10-29 (0.14.321)
Features
- Added mountain peak category
2024-10-01 (0.14.304)
Features
- Added option to skip places based on (not) having website
2024-09-17 (0.14.300)
Features
- Added swimming pool repair service category
2024-09-16 (0.14.297)
Features
- Scraping user notes from public place lists - stored in userListNote
2024-09-03 (0.14.292)
Fixes
- Added support for various formats of places lists URLs
2024-08-23 (0.14.291)
Features
- Input fields that are related to reviews personal data (scrapeReviewerName, scrapeReviewerId, scrapeReviewerUrl, scrapeReviewId, scrapeReviewUrl) were merged into one - scrapeReviewsPersonalData.
2024-08-21 (0.14.290)
Features
- Added vitamin & supplements store category
2024-08-08 (0.14.282)
Fixes
- Fixed "Failed to parse coordinates" error that was occurring when start URLs didn't contain coordinates
2024-07-16 (0.14.280)
Features
- Added bookingLinks
2024-07-16 (0.14.277)
Features
- Added tableReservationLinks
2024-07-16 (0.14.276)
Fixes
- Fixed scraping reviews attributes after Google changed its API
2024-06-03 (0.14.270)
Fixes
- Fix missing review name after change from Google.
- Fix rare problem with image URL causing place to be skipped.
- Fix scraping from user-provided custom place list.
- Properly parse wrongly encoded search terms in user-provided URLs.
Features
- Enlarge review images. They are now in 1920x1080 resolution.
2024-05-17 (0.14.268)
Features
- Added place's fid (feature ID) to the output
2024-05-07 (0.14.267)
Improvements
- Significantly improved Scrape all places (no search term) by changing the scraping method.
  - It is much faster
  - It captures more places. We believe it is very close to 100% of the pins visible on the map. We will focus on validating different areas to ensure we are always getting 100% of all visible places.
  - It doesn't require using the OCR actor anymore. This removes the extra memory requirement and fixes the issue where some places could be missed if your account ran out of memory.
Fixes
- Fixed maxImages: 1 initiating image scrolling which made the scraper slower. This bug was introduced a few weeks ago.
- Fixed consent screen in rare cases blocking a search page (this issue only happened in google-maps-extractor)
2024-05-03 (0.14.266)
Features
- reviewsStartDate now supports both relative dates given as a number and time unit (e.g. 2 weeks, 7 months) and dates with time in partial ISO format (e.g. 2024-05-03T12:00:00). Both currently work only in JSON input, but a visual component is coming soon.
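As an illustration of the two accepted reviewsStartDate formats, the sketch below resolves a value into a cutoff datetime on the consumer side. It is not the Actor's implementation: the function name resolve_reviews_start_date is hypothetical, the day and year units and the 30-day month are assumptions (the changelog only mentions weeks and months), and the Actor may round differently.

```python
import re
from datetime import datetime, timedelta

# Approximate unit lengths in days. "day" and "year", and the 30/365-day
# approximations, are assumptions of this sketch, not documented behavior.
_UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def resolve_reviews_start_date(value: str, now: datetime) -> datetime:
    """Resolve '2 weeks' / '7 months' or a partial ISO string into a cutoff datetime."""
    match = re.fullmatch(r"\s*(\d+)\s+(day|week|month|year)s?\s*", value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(days=amount * _UNIT_DAYS[unit])
    # datetime.fromisoformat accepts 'YYYY-MM-DD' and 'YYYY-MM-DDTHH:MM:SS'.
    return datetime.fromisoformat(value)
```

Passing the current time explicitly keeps the helper deterministic, which makes it easy to check against a fixed reference date.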
Fixes
- Fixed webResults sometimes timing out
2024-04-24 (0.14.265)
Features
- Added hotelStars to search results from search page
2024-04-23 (0.14.265)
Fixes
- Improve capture rate for all_places_no_search_ocr. For several past versions, there was a bug that caused the search overlay to block part of the map.
2024-04-19 (0.14.264)
Features
- Added support for relative dates in reviews date filter (JSON only).
- Added filtering places based on their postalCode if it's defined in the input.
- Added scraping image authors (image attributions). You can enable it by setting scrapeImageAuthors to true. In output, it will be stored in images field, where each item contains imageUrl, authorName, authorUrl and uploadedAt.
- Added handling of gps-proxy images URLs (those were inaccessible to users after scraping finishes) - actor is able to detect and replace them with non-gps URL.
- Added price and cid fields to search results from search page
Improvements
- Added categories: beach and public beach
2024-04-05 (0.14.262)
Features
- You can scrape more questionsAndAnswers using a new maxQuestions input parameter.
Improvements
- Improve capture rate of search using Google's "Search this area" option. This reveals lesser known places that Google doesn't show in normal searches. On average we see about 5-10% more places for the same area but for some rare search terms it can lead to up to 50% improvement.
Fixes
- Scrape streetview image if a place has no images
BREAKING CHANGE
- questionsAndAnswers is now an array, instead of an object. We try to avoid breaking changes where possible but since this questionsAndAnswers output field is not used often we decided to prioritize data cleanliness.
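If you process datasets produced both before and after this change, a small normalizing helper can hide the difference. The sketch below is a consumer-side assumption, not part of the Actor; the function name is hypothetical and it only relies on the field being either a single object (old format) or an array (new format).

```python
from typing import Any


def normalize_questions_and_answers(place: dict[str, Any]) -> list[dict[str, Any]]:
    """Return questionsAndAnswers as a list for both the old and the new output format."""
    value = place.get("questionsAndAnswers")
    if value is None:
        return []
    if isinstance(value, dict):
        # Old format: a single question/answer object.
        return [value]
    # New format: already an array of question/answer objects.
    return list(value)
```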
2024-03-13
Fixes
- Fixed reviews text extraction
2024-03-08 (0.14.261)
Features
- Added reviewOrigin to reviews - whether the review comes from Google, Tripadvisor or elsewhere.
2024-02-15 (0.14.257)
Fixes
- Fixed reviews sorting after Google changed API
Improvements
- Removed any browser interaction when scraping reviews to avoid timeouts. This should make the reviews scraping more stable and slightly faster.
2024-01-29 (0.14.252)
Fixes
- Fixed image scraping
- Fixed processing large amounts of search terms
2024-01-27 (0.14.251)
Fixes
- Fixed placeMinimumStars. Although Google completely removed this search option, the scraper now post-filters the places. This option might be deprecated in the future.
2024-01-09 (0.14.249)
Features
- Added more places categories
2023-12-04 (0.14.246)
Breaking change
- Set minimum allowed memory to 512MB. The actor doesn't work well with less memory and runs were often crashing due to that.
Features
- Added a new visitedIn field to the review data in the output. This field provides the date or time period of the reviewer's visit to the place
2023-12-04 (0.14.245)
Fixes
- Fixed issue where scraping images timed out if a place had no images
Features
- Add support for custom place list URLs
2023-11-23 (0.14.244)
Features
- Added scrapeDirectories option to scrape places' directories (e.g. places inside malls)
2023-11-02 (0.14.243)
Fixes
- neighborhood should contain correct value
- Avoid wrong price values
2023-10-26 (0.14.242)
Fixes
- Fix actor sometimes crashing completely. This was caused by the previous release; you can now resurrect the failed runs to continue where they left off.
2023-10-25 (0.14.241)
Fixes
- Fix scraping locations that don't have an address
- Increase capture rate for "All places no search" option by about 10% by increasing the browser viewport
2023-10-23 (0.14.239)
Fixes
- Fix search page scrolling after Google changed layout. This bug appeared today and caused only the first 20 places to be captured on each search page
2023-10-18 (0.14.237)
Fixes
- Properly handle places with no web results
2023-10-18 (0.14.236)
Fixes
- Fixed geolocation: country resolution
Features
- Added hotelAds field to output
2023-10-12 (0.14.232)
Features
- Allow visualizing all scraped places in the "Live view" tab, allowing for real-time updates of the scraped places as the scraper runs.
2023-10-11 (0.14.231)
Fixes
- Fixed reviews sorting after API change
2023-10-09 (0.14.230)
Fixes
- Fixed reviews extraction that broke today after Google completely revamped their API. The new reviews scraping is a bit slower because it is paginating every 20 reviews.
2023-10-06 (0.14.229)
Features
- Added categoryFilterWords to select & limit what exact Google categories should be scraped. You can also use the categories instead of search terms.
2023-09-21 (0.14.228)
Fixes
- Update selector for displayedUrl in web results
Features
- Released a separate Google Maps Scraper Orchestrator actor. It allows passing a list of locations and returns deduplicated results collected from multiple Google Maps runs. By running multiple runs in the background, it can utilize all your Apify account memory for maximum speed scraping.
2023-09-06 (0.14.227)
Features
- Added hotelReviewSummary field to output
2023-08-21 (0.14.226)
Fixes
- Fixed reviewsFilterString after Google redesign
- Fixed orderBy for some specific place/language combinations
- Improved subTitle extraction by extracting also from JSON data (Google uses this field for various meanings)
- Fix formatting of peopleAlsoSearch if empty (it is now an empty array; it used to be an array of null values)
Features
- Added imageCategories field to output
2023-08-08 (0.14.225)
Fixes
- Fixed orderBy extraction
- Skip search terms that redirect to a 'directions' type of page. This can happen for very long search terms that Google doesn't recognize as a place.
Features
- Added skipClosedPlaces to input to skip permanently or temporarily closed places.
Changes
- Scrape places also from partially matched search results (they include a section with Partial matches or Don't see what you're looking for?). These can be the places you want but Google is not sure about it. Usually, it has only about 2 places in the results. You can see a warning message in the log if the match was only partial.
2023-08-01 (0.14.223)
Fixes
- Resolved issue with reviews and image scraping not working properly on some places after Google design change
- Retry "bad query" search page result a few times because it can sometimes work only with specific proxy
2023-07-20 (0.14.222)
Fixes
- Improve handling of searches that redirect to a single place. It is faster and more reliable now. Previously, the scraper retriggered the search which in rare cases might have loaded a different place.
2023-07-19 (0.14.219)
Hotfixes
- Fixed changed layout for the heading of the detail page that completely broke the scraper for a moment
Features
- Visualized results map is now auto updating with newly scraped places after being opened
- allPlacesNoSearchAction can now be used in combination with startUrls if they don't contain a search term. This allows scraping only a single map page for all places (no search term).
2023-06-27 (0.14.217)
Features
- Added placeMinimumStars input option to only include places that have an average rating higher than the provided input.
Fixes
- Correctly handle places with no location (no lat/lon). We include these places because they are usually relevant to the search.
- Don't cut off an extra review if it is added as we scrape the place or if Google has a wrong counter on the total reviews
2023-06-23 (0.14.215)
Fixes
- Load data into the results map correctly for large datasets (was getting rate limited because of too fast loading).
2023-06-23 (0.14.213)
Features
- Added an HTML page of a map that nicely visualizes all scraped places. It is stored in a Key-Value Store of each run. Currently, you need to reload the page if you want to display new places while the scraper is still running. Auto-refreshing will come soon.
- Added extraction of places (mainly hotels) from external services. Google shows these places when searching in certain locations. They are however not regular places with pins on the map and offer only partial data. These places are marked with 3 extra output fields: isExternalServicePlace, externalServiceProvider (e.g. Tripadvisor.com, Booking.com), and externalId.
Changes
- deeperCityScrape now scans all cities from the database, not just those over 10,000 population.
Fixes
- deeperCityScrape now creates circle polygons for cities with no polygon data in the database and does that also when the city area includes too large regions. This results in better performance and more places found.
Notes for future
- We plan to merge the default whole-country search with the deeperCityScrape option in the next release (likely next week). This should bring the best of both worlds and scan the cities deeper while also traversing the countryside, just with a smaller zoom. This will result in higher compute usage for the same searches (while finding more results) but we will also enable adjusting the zoom level up or down to control the coverage and cost better.
2023-06-02 (0.14.210)
Features
- Add deeperCityScrape input option to extract more places from larger areas like countries and regions. See the readme and input for a detailed explanation. This option will become the default behavior in the future.
2023-06-01 (0.14.209)
Fixes
- Skip places that are imported from external services like Tripadvisor on search URLs like this. These places don't contain placeId and have generally different data format. Eventually, we plan to support them. For now, it seems like a rare occurrence.
2023-05-30 (0.14.208)
Fixes
- Properly skip non-existent places (in case the user provides a wrong URL).
- Don't auto-resurrect runs out of memory if they started with lower memory than 512MB or failed too often (this means the user needs to increase the memory before starting again).
2023-05-25 (0.14.206)
Fixes
- Searches that redirect to a single place work again (this broke in the last update)
- Properly handle search that doesn't find anything (Google changed the design of the page)
- If the scraper reaches maxCrawledPlacesPerSearch or gets too many empty/redirect searches in a row (currently 20), it will quickly remove the rest of the searches for the same term from the queue. Previously it would blindly keep trying to scrape something there wasting resources.
- Automatically resurrect the run if it hits Out of memory error. This started happening recently so we are still investigating the root cause.
- Shuffle the starting map locations to reduce the number of duplicates or missing queries at the start.
- Respect maxCrawledPlacesPerSearch when onlyDataFromSearchPage is used.
Features
- On every failed run, the actor triggers another actor lukaskrivka/actor-fail-manager via webhook.
  - It analyses the error and if appropriate it resurrects the run (e.g. in case of Out of memory error)
  - It sends a report to the author to be able to promptly fix the issue or improve user experience if needed
2023-05-22
Fixes
- Retry pages with allPlacesNoSearchAction that produce captchas (can happen at very high scraping speed)
- Rotate browsers more often with allPlacesNoSearchAction to prevent captchas. This slows down the scraping slightly.
- Better validate customGeolocation longitude and latitude order. Fail if out of bounds and add a warning if the values look unreasonable (inside the ocean).
Changes
- Add a new option onlyDataFromSearchPage (replaces exportPlaceUrls) which allows the extraction of some of the data from the search page without going to the place's detail page.
- Deprecate exportPlaceUrls input option. As usual, we keep it backward compatible for a long time.
- Each output place item can contain a maximum of 5000 reviews so in case there are more reviews for that place, a duplicate place is stored with the next 5000 reviews and so on. E.g. in the case of 50,000 reviews, the resulting dataset will have 10 items with the same place. This limitation is due to the size limit of a single item in the Apify dataset.
- Deprecate "allPlacesNoSearchAction": "all_places_no_search_mouse" input option as it was extremely slow. It now automatically falls back to all_places_no_search_ocr which is the only option now.
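Because a place with more than 5000 reviews is split across several dataset items, consumers may want to stitch those items back together. The following is a minimal sketch under stated assumptions: items are grouped by a placeId field and reviews live in a reviews array (both field names are assumed here), and the helper names are hypothetical.

```python
from typing import Any

MAX_REVIEWS_PER_ITEM = 5000  # per-item limit stated in the changelog entry above


def expected_item_count(total_reviews: int, limit: int = MAX_REVIEWS_PER_ITEM) -> int:
    """Number of dataset items one place is split into (always at least one)."""
    return max(1, -(-total_reviews // limit))  # ceiling division


def merge_split_places(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge items that share a placeId, concatenating their reviews in dataset order."""
    merged: dict[str, dict[str, Any]] = {}
    for item in items:
        key = item["placeId"]  # assumed grouping key
        if key not in merged:
            # Copy the first item so the input list is left untouched.
            merged[key] = {**item, "reviews": list(item.get("reviews", []))}
        else:
            merged[key]["reviews"].extend(item.get("reviews", []))
    return list(merged.values())
```

For the 50,000-review example above, expected_item_count(50000) gives 10, matching the changelog.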
2023-05-02
Hotfixes
- Scrolling in search and images didn't work at all for a while because the panel was moved.
- Remove invalid image URLs
2023-04-15
Changes
- Removed proxyConfiguration from input schema. This scraper works well with default datacenter proxies and changing it was causing issues. For special cases, proxy can still be passed in proxyConfiguration field in JSON input.
2023-04-11
BREAKING CHANGES
- Adjusting automatic zooming to set lower zoom for very small areas. Users rightfully complained that such high zoom produces too inefficient scrapes. The zoom curve is now flattened which means slightly higher zoom for larger areas and significantly lower zoom for small areas. The highest automatic zoom is now capped at 17. The new example values are:
- United States - 10 zoom (10,371,139 km2)
- Germany - 12 zoom (380,878 km2)
- London - 15 zoom (1,595 km2)
- Manhattan - 16 zoom (87.5 km2)
- Soho - 17 zoom (0.35 km2)
2023-04-10
Features
- Correctly implement full geoJson specification for customGeolocation, you can now provide any valid type.
- Cache geolocation resolutions in global KV store to speed up the start and lessen dependency on OpenStreetMap API.
2023-04-06
BREAKING CHANGES & fixes
- Fixed and changed reviews translation after Google changed it. Now the text field contains the original text and textTranslated contains the translated text.
- Due to this change, reviewsTranslation input setting is no longer required and was removed and we include both if available.
Features
- Added reviewContext and reviewDetailedRating to all reviews where available. Examples in readme.
2023-04-04
BREAKING CHANGES
- adjustZoomDynamically is now used for all geolocation input types!
- locationQuery is now the prominent location input with prefilled value
Fixes
- customGeolocation applies correct zoom again (this broke during the last release)
2023-03-29
BREAKING CHANGES
- Added adjustZoomDynamically input option. This changes the zoom from a constant table based on geolocation type (country = zoom 12, city = zoom 15, etc.) into a calculated value based on the area of the found location. Realistically, this means that very big countries might have 1-2 smaller zoom while very small areas might have 2-5 higher zoom to get a more detailed scrape. Below are some examples from the new calculation:
- Minimum zoom from this is 19
- United States - 9 zoom (10,371,139 km2)
- Germany - 11 zoom (380,878 km2)
- London - 16 zoom (1,595 km2)
- Manhattan - 17 zoom (87.5 km2)
- Soho - 19 zoom (0.35 km2)
- Set adjustZoomDynamically to true for customGeolocation.
We plan to make this the default zoom setting in the near future.
2023-03-22
Features
- Added support for shortened URLs (e.g. https://goo.gl/maps/...)
Fixes
- Extract permanentlyClosed value from JSON data
- Fixed rare problem that sometimes after migration, not all URLs were processed
2023-03-15
Fixes
- menu output field is now extracted correctly again. The whole URL to the menu is provided now.
2023-03-14
Features
- Added webResults field to output. You have to enable that in input with includeWebResults field. There is a small performance impact when this is enabled.
2023-03-13
Fixes
- Large and low density countries like Russia and Canada are now scraped with lower zoom to make the scrape more efficient. This applies only if whole country should be scraped.
2023-03-09
Features
- Add locationQuery input field. This can be used instead of country, state, city, etc. if those are not matching. This is mostly useful for very small states or regions. But it can also be used for free text description of the location.
- Support Dominica country
2023-02-22
Features
- Add reviewerPhotoUrl and reviewImageUrls field to review output
2023-02-22
Features
- Add similarHotelsNearby field to output.
2023-02-16
Fixes
- Updated price and description extraction to support more languages.
2023-02-15
Features
- Improved reviews extraction, it's now faster and can extract more reviews.
2023-01-25
Fixes
- Fixed issue with temporarilyClosed field not being extracted properly in some cases.
Features
- Added updatesFromCustomers field to output.
"updatesFromCustomers": {
  "text": "Disneyland California Adventure small area with large park all inclusive celebrations. This is a glimpse into Los Reyes parade. I'm a true fan. Thanks",
  "language": "en",
  "postDate": "a week ago",
  "postedBy": {
    "name": "Kayla Arredondo",
    "url": "https://www.google.com/maps/contrib/102968882116587973980?hl=en-US",
    "title": "Local Guide",
    "totalReviews": 225
  },
  "media": [
    {
      "link": "https://lh3.googleusercontent.com/ggms/AF1QipNNaoT0NSbcWOPSduvZNqJ0kSqUs-dod32FeBtr=m18",
      "postTime": "a week ago"
    }
  ]
}
- Added questionsAndAnswers field to output.
"questionsAndAnswers": {
  "question": "Which is the best easier way to drop off a family to Disneyland Park",
  "answer": "best way for drop off family is at down town Disney. Drop them off then you can take a short walk to the park. ",
  "askDate": "5 years ago",
  "askedBy": {
    "name": "Cecilia Salcedo",
    "url": "https://www.google.com/maps/contrib/109041536347893604294"
  },
  "answerDate": "5 years ago",
  "answeredBy": {
    "name": "Gabby Lujan",
    "url": "https://www.google.com/maps/contrib/105966144333216697667"
  }
}
2023-01-24
Fixes
- Fixed reserveTableUrl extraction for restaurants.
Features
- Add reviewsFilterString to input that enables you to filter reviews by search string.
- Add googleFoodUrl field to output.
2023-01-24
Fixes
- Fixed place URL normalization sometimes not working. All place detail URL formats should work now, please open an issue if you find one that doesn't.
2023-01-24
Fixes
- Fixed and reworked peopleAlsoSearch. It is now in this format, more fields will be added to it:
"peopleAlsoSearch": [
  {
    "category": "Czech restaurants",
    "title": "Restaurant Mlýnec",
    "reviewsCount": 2561,
    "totalScore": 4.7
  }
]
Changes
- popularTimesHistogram, openingHours, additionalInfo and peopleAlsoSearch are now added to the data all the time. This means includeHistogram, includeOpeningHours, additionalInfo and includePeopleAlsoSearch input fields no longer have any effect.
  - To exclude these from data on Apify platform, use the omit URL parameter (e.g. add to dataset URL &omit=popularTimesHistogram,openingHours,additionalInfo,peopleAlsoSearch). This can also be chosen in the export UI.
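As a sketch of how the omit parameter can be appended to a dataset export URL, the helper below builds the query string with the standard library. The function name and the DATASET_ID placeholder are hypothetical, and the base URL assumes the Apify API v2 dataset items endpoint; check the export dialog of your own dataset for the exact URL.

```python
from urllib.parse import urlencode


def dataset_items_url(dataset_id: str, omit_fields: list[str], fmt: str = "json") -> str:
    """Build a dataset items URL that leaves out the given top-level fields."""
    # safe="," keeps the comma-separated field list readable instead of %2C-encoded.
    query = urlencode({"format": fmt, "omit": ",".join(omit_fields)}, safe=",")
    # Assumed base URL of the Apify API v2 dataset items endpoint.
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


url = dataset_items_url(
    "DATASET_ID",  # placeholder, replace with a real dataset ID
    ["popularTimesHistogram", "openingHours", "additionalInfo", "peopleAlsoSearch"],
)
```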
2023-01-12
Features
- Add reserveTableUrl field to output for restaurants.
- Add reviewsTags and placesTags fields to output.
2023-01-13
BREAKING CHANGE
- opening hours
- remove trailing "," after day
- always start with Monday (but only for English language)
2023-01-12
Features
- Add description field to output.
- Now we scrape hotel prices, and add the selected checkInDate and checkOutDate fields to the output (The price for hotels is based on these dates).
- If the place is a hotel, add moreHotelsOptions field to output.
2023-01-10
Changes
- The crawler now sets default maximum concurrency based on provided memory GBs. Currently, this is set to 4 times memory, so 4 GB actor will stop scaling up at 16 concurrency. This should prevent the crawler from overscaling and running into network timeouts. You can still override this value with maxConcurrency input field.
- The crawler sets starting concurrency at half the memory GBs, this is just improvement to help it start faster.
- Slowed down upscaling to make the crawling smoother and reduce timeouts.
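The concurrency rules above reduce to simple arithmetic, sketched below. The function name is hypothetical, and rounding down with a floor of 1 for the starting value is an assumption, since the changelog does not say how fractional values are handled.

```python
def default_concurrency(memory_gb: float) -> tuple[int, int]:
    """Return (starting concurrency, maximum concurrency) for a given memory size in GB."""
    max_concurrency = int(memory_gb * 4)       # "4 times memory"
    start_concurrency = int(memory_gb / 2)     # "half the memory GBs"
    # Assumption: never start below 1 and never above the maximum.
    start_concurrency = max(1, min(start_concurrency, max(1, max_concurrency)))
    return start_concurrency, max_concurrency


start, maximum = default_concurrency(4)  # a 4 GB run: starts at 2, caps at 16
```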
2023-01-09
Fixes
- Gas price updateAt field is extracted correctly again (before this fix all dates were from 1970).
2023-01-02
Fixes
- All tiny countries (and states) now work properly (some only if used without other geolocation parameters like city).
2022-12-22
Features
- Add searchMatching to input that enables you to specify how the search term should match the place name.
2022-12-16
Fixes
- Some countries like Korea, Tanzania and Congo were not found by the scraper.
2022-12-06
Features
- Added hotelStars to output (example value "5-star hotel").
2022-11-22
Changes (to simplify input)
- Removed lat and lng input fields from input schema but it will keep working as it is passed in input. Prefer using geolocation options like city or country instead. You can also still use it in direct URLs.
- Removed maxAutomaticZoomOut input field from input schema. It will also keep working as it is.
Features
- Added claimThisBusiness to output.
2022-11-21
Fixes
- Fixed wrong location assigned to some smaller countries.
2022-11-10
Features
- Added imagesCount to output. It is displayed even if you don't extract their URLs.
2022-09-23
Fixes
- BREAKING CHANGE: Removed maxCrawledPlaces from input completely (use maxCrawledPlacesPerSearch instead)
- Fixed maxCrawledPlacesPerSearch causing the scraper to hang in some cases
2022-09-06
Fixes
- Fixes unstable image extraction
2022-09-05
Fixes
- Final round of optimizations and fixes of the search process. The scraper is now probably the fastest it has ever been, finally reaching about 100 places per 1 compute unit even when using geolocation.
2022-09-02
Fixes
- Several optimizations to speed up the search page (scrolling & enqueueing places)
- Fixed extraction of images
2022-08-16
Fixes
- Improve extraction of additional infos for hotels.
2022-08-15
Fixes
- Fixed actor sometimes finishing prematurely when there were still requests in the queue (caused by the new background enqueueing system)
2022-08-05
Fixes
- Fixed reviews duplications that sometimes happened.
- Fixed extraction of the temporarilyClosed field.
2022-08-03
Fixes
- Fixed reviews extraction. After Google's change, the scraper was giving only up to 10 reviews. Now it works fully again. Sorting by newest doesn't work properly yet, though.
2022-07-21
Fixes
- Finish fast when less than 120 places are found on a page. Previous implementation waited several seconds extra.
2022-07-20
Fixes
- Search pages now use scrolling instead of pagination. This makes the crawling a little slower and reduces the maximum number of places per page from 400 to 120. Use geolocation with zoom to work around this reduction. We might increase the default zoom by 1 in the near future.
2022-05-19
Features
- Added gasPrices to output. Available only for gas stations in US to the best of our knowledge.
2022-05-02
Fixes
- subTitle extraction works now
2022-04-04
Fixes
- Blocked responses on the search page now properly retry the request (no more unhandled promise rejection)
- Smoother search page pagination
- More informative logs
- Fixed consent approval if browser crashes
2022-03-16
Fixes
- maxCrawledPlaces + exportPlaceUrls was giving inconsistent number of results.
2022-03-14
Features
- Added allPlacesNoSearch to input. This option allows you to scrape all places shown on the map without the need for any search term.
- Added reviewsStartDate to input to extract only reviews newer than this date.
- Added radiusKm to the Point type in customGeolocation
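A Point-type customGeolocation with radiusKm might be assembled as in the sketch below. The helper name is hypothetical and the exact shape the Actor accepts is an assumption based on this entry; the [longitude, latitude] coordinate order follows the GeoJSON convention, and the bounds checks are only a local safeguard against swapped values.

```python
from typing import Any


def point_geolocation(latitude: float, longitude: float, radius_km: float) -> dict[str, Any]:
    """Build a GeoJSON-style Point with a radiusKm property (assumed input shape)."""
    if not -90 <= latitude <= 90:
        raise ValueError(f"latitude out of bounds: {latitude}")
    if not -180 <= longitude <= 180:
        raise ValueError(f"longitude out of bounds: {longitude}")
    if radius_km <= 0:
        raise ValueError(f"radius must be positive: {radius_km}")
    # GeoJSON positions are ordered [longitude, latitude].
    return {"type": "Point", "coordinates": [longitude, latitude], "radiusKm": radius_km}


london_center = point_geolocation(latitude=51.5072, longitude=-0.1276, radius_km=5)
```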
2022-03-04
Improvement
- additionalInfo extraction is faster now.
- additionalInfo extraction for hotels and similar categories is more complete now: Data which is not displayed on the Google page but present in the Google response is also extracted.
2022-03-03
- Lowering the default zoom values. The past setup made the scraping too slow and costly. The new defaults will speed up the scraping a lot while missing only a few places. You can still manually override the zoom parameter. New default values are:
  - country or state -> 12
  - county -> 14
  - city -> 15
  - postalCode -> 16
  - no geolocation -> 12
2022-02-28
Fixes
- location extraction works in (almost) all cases now (search URLs and URLs with place IDs will always work).
2022-02-21
Features
- Added oneReviewPerRow to input to enable expanding reviews one per output row
2022-02-17
Fixes
- openingHours extraction works in almost all cases now (search URLs and URLs with place IDs will always work).
2022-01-12
- Start URLs now correctly work from uploaded CSV files or Google Sheets. It used to trim part of the URL.
2022-01-11
- Changed polygon input field to customGeolocation
- Added a deeper section to the Readme on how you can provide your own exact coordinates
2022-01-11
Breaking changes
We decided it is time to change several default parameters to make the user experience smoother. These changes should not have a big effect on current users.
- city and other geolocation parameters will have preference over lat & long if both are used (in 99% cases users want to use the automatic location splitting to get the most results which doesn't work with direct lat & long)
- zoom will no longer have a default value 12. Instead, it will change based on geolocation type like this:
  - country or state -> 12
  - county -> 14
  - city -> 17
  - postalCode -> 18
  - no geolocation -> 12
Users will still be able to specify the zoom and override this behavior.
See Readme for more details
2021-12-14
Breaking change
- reviewsSort is now set to newest by default. This is because some places don't yield all reviews on other sortings (we are not sure if this is a bug or silent block on Google's side)
2021-11-15
Fixes
- exportPlaceUrls now properly dedupes the URLs
- added categories fields listing all categories the place is listed in
2021-11-11
Fixes
- Fixed additionalInfo for hotels
- Fixed exportPlaceUrls not checking for correct geolocation
2021-11-09
Fixes
- website field now displays the full URL. This fixes issue of blank facebook.com links.
2021-11-05
Fixes
- Fixed new layout of additionalInfo
2021-11-03
Fixes
- Improved reliability of scraping place detail, reviews and images (improving scrolling and back button interaction)
2021-10-13
Features
- Added menu to output
- Added price to output
2021-10-07
Fixes
- Fixed popularTimesHistogram which caused crash on some pages
2021-09-27
Fixes
- Fixed image extraction & make it optional (it should not crash the whole scrape)
2021-09-15
Fixes
- Fixed temporarilyClosed and permanentlyClosed
- Added a step for normalizing input Start URLs because those with wrong format don't contain JSON data
2021-09-14
Fixes
- Fixed popular times live and histogram
2021-09-10
https://github.com/drobnikj/crawler-google-places/pull/185 https://github.com/drobnikj/crawler-google-places/issues/181
Fixes
- In about 10% of cases, the reviews are in the wrong order and there are fewer of them. We haven't found a root cause yet but we retry the page so the output gets corrected.
2021-09-07
Breaking fix
- If you did not pass maxReviews in the input at all (undefined), it scraped 5 reviews as default. That was against the input schema description so it is now fixed to scrape 0 reviews in those cases.
2021-09-01
Fixes
- Fixed placeId extraction that was broken for some inputs
- Fixed missing imageUrls
Features
- Added option to input URLs with CID (Google My Business Listing ID) to start URLs, e.g. https://maps.google.com/?cid=12640514468890456789
- Added cid to output
2021-08-25
Fixes
- Fixed maxCrawledPlaces not finishing quickly for large country-wise searches. maxCrawledPlacesPerSearch still has this problem
2021-08-12
Fixes
- Fixed problem that startUrls was not picking up all provided URLs sometimes (due to automatic uniqueKey resolution)
- likesCount in reviews
2021-08-06
Fixes
- maxCrawledPlaces now compares to total sum of all places
Features
- Added maxCrawledPlacesPerSearch to limit max places per search term or search URL
2021-07-26
Fixes
- Address is now parsed correctly into components even when you supply direct place IDs
- Migrated code from apify 0.22.5 to 1.3.1
2021-07-13
- Added county to geolocation options
2021-06-03
Fixes (hopefully last fixes after the layout change)
- Scraping all images per place works again
- Fixed additionalInfo
- Fixed openingHours
2021-06-03
Fixes
- Fix handling of search pages without results
- Skip empty searches that sometimes users accidentally post
2021-05-25
Features
- Added orderBy attribute to result scrape
2021-05-18
Fixes
- Fully or partially fixed consent screen issues
- Should also help with Failed to set the 'innerHTML' property on 'Element': This document requires 'TrustedHTML' assignment. which is caused by injecting JQuery into the consent screen
2021-04-29
Fixes
- Fixed reviewsTranslation
2021-04-28
Fixes after Google changed layout, not everything was fixed. Next batch of fixes asap!
- Fixed additional data
- Fixed search pagination getting into infinite loop
- Fixed empty search handling
- Fixed reviews not being scraped
- Fixed totalScore
2021-03-22
Warning - Next version will be a breaking one as we will remove personal data from reviews by default. You will have to explicitly enable the fields below.
Features
- Added input fields to selectively pick which personal data fields to scrape - scrapeReviewerName, scrapeReviewerId, scrapeReviewerUrl, scrapeReviewId, scrapeReviewUrl, scrapeResponseFromOwnerText
2021-03-17
Fixes
- Removed duplicate reviews + all reviews scraped correctly
- reviewsSort finally works correctly
- Reviews scraping is now significantly faster
- Handle error that irregularly happened when scraping huge amount of reviews
Features
- Added reviewsDistribution
- Added publishedAtDate (exact date), responseFromOwnerDate and responseFromOwnerText for each review
2021-03-10
Fixes:
- totalScore and reviewsCount are now correctly extracted for all languages
- startUrls now correctly work on non-.com domains and on detail places
2021-02-02
Fixes:
- Search keyword that links only to a single place (like "London Eye") now works correctly
2021-01-27
Features:
- Address is parsed into neighborhood, street, city, postalCode, state and countryCode fields
- Added reviewsTranslation option to adjust how Google translates reviews from non-English languages
- Parsing ads. This means a bit more results. Those that are ads have "isAdvertisement": true field.
- Added useCachedPlaces option to load places from your KV Store. Useful if you need to scrape the same places regularly.
- Added polygon option to provide your own geolocation polygon.
Fixes:
- This one is big. We removed the infamous Place is outside of required location (polygon) error. The location of a place is now checked during paginating and these places are skipped. This means a massive speed up of the scraper.
2021-01-11
Features:
- Automatic screenshots of errors to see what went wrong
- Added searchPageUrl to output
- Added PLACES-OUT-OF-POLYOGON record to Key-Value store. You can check what places were excluded.
Fixes:
- Fixed rare bug with saving stats
- Improvement in review sorting - but it is still not ideal, more work needs to be done
2020-11-16
- Added postal code geolocation to input
- Improved errors when location is not found
- Optimization - Removed geolocation data from intermediate requests
2020-10-29
- Fixed handling of Google consent screen
- Better input validation and deprecation logs
- Changed default for maxImages to 1 as it doesn't require scrolling for the main image
- imageUrls are returned with the highest resolution
2020-10-27
- Removed forceEng input in favor of language
2020-10-15
- The default setup now uses maxImages: 0 and maxReviews: 0 to improve efficiency
to improve efficiency
2020-10-01
- added several browser options to input - maxConcurrency, maxPageRetries, pageLoadTimeoutSec, maxPagesPerBrowser, useChrome
- revamped input schema and readme
- Added reviewerNumberOfReviews and isLocalGuide to reviews
2020-09-22
- added few extra review fields (ID, URL)
2020-07-23 small features
New features
- add an option for caching place location
- add an option for sorting of reviews
- add stats logging
2020-07 polygon search and bug fixes
breaking change
- reworked input search string
Bug fixes
- opening hour parsing (#39)
- separate locatedIn field (#32)
- update readme
New features
- extract additional info - Service Options, Highlights, Offerings,.. (#41)
- add maxReviews, maxImages (#40)
- add temporarilyClosed and permanentlyClosed flags (#33)
- allow to scrape only places urls (#29)
- add forceEnglish flag into input (#24, #21)
- add searching in polygon using nominatim.org
- add startUrls
- added maxAutomaticZoomOut to limit how far can Google zoom out (it naturally zooms out as you press next page in search)
Actor Metrics
701 monthly users
-
77 stars
99% runs succeeded
4 days response time
Created in Feb 2022
Modified a day ago