Google Maps Scraper

compass/crawler-google-places

Extract data from hundreds of Google Maps locations and businesses. Get Google Maps data including reviews, images, contact info, opening hours, location, popular times, prices & more. Export scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.

2024-04-05 (0.14.262)

Features

  • You can scrape more questionsAndAnswers using a new maxQuestions input parameter.

Improvements

  • Improve capture rate of search using Google's "Search this area" option. This reveals lesser known places that Google doesn't show in normal searches. On average we see about 5-10% more places for the same area but for some rare search terms it can lead to up to 50% improvement.

Fixes

  • Scrape streetview image if a place has no images

BREAKING CHANGE

  • questionsAndAnswers is now an array, instead of an object. We try to avoid breaking changes where possible but since this questionsAndAnswers output field is not used often we decided to prioritize data cleanliness.
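Code that reads datasets produced both before and after this change may need to accept either shape. A minimal sketch, assuming each dataset item is a plain dict loaded from JSON; the helper name `normalize_questions_and_answers` is hypothetical and not part of the scraper:

```python
from typing import Any


def normalize_questions_and_answers(item: dict[str, Any]) -> list[dict[str, Any]]:
    """Return questionsAndAnswers as a list, whatever shape the item uses."""
    value = item.get("questionsAndAnswers")
    if value is None:
        return []        # field missing or null
    if isinstance(value, dict):
        return [value]   # old format: a single object
    return list(value)   # new format: already an array


# Hypothetical sample items, one in each format.
old_item = {"questionsAndAnswers": {"question": "Q1", "answer": "A1"}}
new_item = {"questionsAndAnswers": [{"question": "Q1", "answer": "A1"}]}
same = normalize_questions_and_answers(old_item) == normalize_questions_and_answers(new_item)
print(same)
```

Wrapping the old object in a one-element list keeps the rest of a pipeline unaware of which scraper version produced the data.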

2024-03-13

Fixes

  • Fixed reviews text extraction

2024-03-08 (0.14.261)

Features

  • Added reviewOrigin to reviews, indicating whether the review comes from Google, Tripadvisor, or another source.

2024-02-15 (0.14.257)

Fixes

  • Fixed reviews sorting after Google changed API

Improvements

  • Removed any browser interaction when scraping reviews to avoid timeouts. This should make the reviews scraping more stable and slightly faster.

2024-01-29 (0.14.252)

Fixes

  • Fixed image scraping
  • Fixed processing large amounts of search terms

2024-01-27 (0.14.251)

Fixes

  • Fixed placeMinimumStars. Although Google completely removed this search option, the scraper now post-filters the places. This option might be deprecated in the future.
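The same kind of post-filtering can be applied to an exported dataset. A minimal sketch, assuming each place is a dict with a numeric `totalScore` field; the function name, the strict "higher than" comparison, and dropping places with no rating are all assumptions, not the scraper's confirmed behavior:

```python
def post_filter_by_stars(places, min_stars):
    """Keep only places whose average rating is higher than min_stars."""
    kept = []
    for place in places:
        score = place.get("totalScore")
        # Assumption: places without any rating are dropped.
        if score is not None and score > min_stars:
            kept.append(place)
    return kept


# Hypothetical sample data.
places = [
    {"title": "A", "totalScore": 4.7},
    {"title": "B", "totalScore": 3.9},
    {"title": "C", "totalScore": None},
]
print([p["title"] for p in post_filter_by_stars(places, 4.0)])
```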

2024-01-09 (0.14.249)

Features

  • Added more places categories

2023-12-04 (0.14.246)

Breaking change

  • Set minimum allowed memory to 512MB. The actor doesn't work well with less memory and runs were often crashing due to that.

Features

  • Added a new visitedIn field to the review data in the output. This field provides the date or time period of the reviewer's visit to the place

2023-12-04 (0.14.245)

Fixes

  • Fixed an issue where image scraping timed out if a place has no images

Features

  • Add support for custom place list URLs

2023-11-23 (0.14.244)

Features

  • Added scrapeDirectories option to scrape places' directories (e.g. places inside malls)

2023-11-02 (0.14.243)

Fixes

  • neighborhood now contains the correct value
  • Avoid wrong price values

2023-10-26 (0.14.242)

Fixes

  • Fix actor sometimes crashing completely. This was caused by the previous release. You can now resurrect the failed runs to continue where they left off.

2023-10-25 (0.14.241)

Fixes

  • Fix scraping locations that don't have an address
  • Increase capture rate for "All places no search" option by about 10% by increasing the browser viewport

2023-10-23 (0.14.239)

Fixes

  • Fix search page scrolling after Google changed layout. This bug appeared today and caused only the first 20 places to be captured on each search page

2023-10-18 (0.14.237)

Fixes

  • Properly handle places with no web results

2023-10-18 (0.14.236)

Fixes

  • Fixed geolocation: country resolution

Features

  • Added hotelAds field to output

2023-10-12 (0.14.232)

Features

  • Allow visualizing all scraped places in the "Live view" tab, allowing for real-time updates of the scraped places as the scraper runs.

2023-10-11 (0.14.231)

Fixes

  • Fixed reviews sorting after API change

2023-10-09 (0.14.230)

Fixes

  • Fixed reviews extraction that broke today after Google completely revamped their API. The new reviews scraping is a bit slower because it is paginating every 20 reviews.

2023-10-06 (0.14.229)

Features

  • Added categoryFilterWords to select & limit what exact Google categories should be scraped. You can also use the categories instead of search terms.

2023-09-21 (0.14.228)

Fixes

  • Update selector for displayedUrl in web results

Features

  • Released a separate Google Maps Scraper Orchestrator actor. It allows passing a list of locations and returns deduplicated results collected from multiple Google Maps runs. By running multiple runs in the background, it can utilize all your Apify account memory for maximum speed scraping.

2023-09-06 (0.14.227)

Features

  • Added hotelReviewSummary field to output

2023-08-21 (0.14.226)

Fixes

  • Fixed reviewsFilterString after Google redesign
  • Fixed orderBy for some specific place/language combinations
  • Improved subTitle extraction by extracting also from JSON data (Google uses this field for various meanings)
  • Fix formatting of peopleAlsoSearch if empty (it is now an empty array; it used to be an array of null values)

Features

  • Added imageCategories field to output

2023-08-08 (0.14.225)

Fixes

  • Fixed orderBy extraction
  • Skip search terms that redirect to a 'directions' type of page. This can happen for very long search terms that Google doesn't recognize as a place.

Features

  • Added skipClosedPlaces to input to skip permanently or temporarily closed places.

Changes

  • Scrape places also from partially matched search results (they include a section with Partial matches or Don't see what you're looking for?). These can be the places you want but Google is not sure about it. Usually, it has only about 2 places in the results. You can see a warning message in the log if the match was only partial.

2023-08-01 (0.14.223)

Fixes

  • Resolved issue with reviews and image scraping not working properly on some places after Google design change
  • Retry "bad query" search page result a few times because it can sometimes work only with specific proxy

2023-07-20 (0.14.222)

Fixes

  • Improve handling of searches that redirect to a single place. It is faster and more reliable now. Previously, the scraper retriggered the search which in rare cases might have loaded a different place.

2023-07-19 (0.14.219)

Hotfixes

  • Fixed changed layout for the heading of the detail page that completely broke the scraper for a moment

Features

  • Visualized results map is now auto updating with newly scraped places after being opened
  • allPlacesNoSearchAction can now be used in combination with startUrls if they don't contain a search term. This allows scraping only a single map page for all places (no search term).

2023-06-27 (0.14.217)

Features

  • Added placeMinimumStars input option to only include places that have an average rating higher than the provided input.

Fixes

  • Correctly handle places with no location (no lat/lon). We include these places because they are usually relevant to the search.
  • Don't cut off an extra review if it is added as we scrape the place or if Google has a wrong counter on the total reviews

2023-06-23 (0.14.215)

Fixes

  • Load data into the results map correctly for large datasets (was getting rate limited because of too fast loading).

2023-06-23 (0.14.213)

Features

  • Added an HTML page of a map that nicely visualizes all scraped places. It is stored in a Key-Value Store of each run. Currently, you need to reload the page if you want to display new places (if the scraper is still running). Auto-refreshing will come soon.
  • Added extraction of places (mainly hotels) from external services. Google shows these places when searching in certain locations. They are however not regular places with pins on the map and offer only partial data. These places are marked with 3 extra output fields: isExternalServicePlace, externalServiceProvider (e.g. Tripadvisor.com, Booking.com), and externalId.

Changes

  • deeperCityScrape now scans all cities from the database, not just those over 10,000 population.

Fixes

  • deeperCityScrape now creates circle polygons for cities with no polygon data in the database and does that also when the city area includes too large regions. This results in better performance and more places found.
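To illustrate what a circle polygon around a city center can look like, here is a rough sketch. It is not the scraper's actual algorithm: the function name, the 32-point resolution, and the flat-earth approximation of roughly 111.32 km per degree of latitude are all assumptions made for the example.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


def circle_polygon(longitude, latitude, radius_km, points=32):
    """Approximate a circle as a closed GeoJSON Polygon ring."""
    ring = []
    # Degrees of longitude shrink with the cosine of the latitude.
    km_per_degree_lon = KM_PER_DEGREE_LAT * math.cos(math.radians(latitude))
    for i in range(points):
        angle = 2 * math.pi * i / points
        d_lon = (radius_km / km_per_degree_lon) * math.cos(angle)
        d_lat = (radius_km / KM_PER_DEGREE_LAT) * math.sin(angle)
        ring.append([longitude + d_lon, latitude + d_lat])
    ring.append(ring[0])  # GeoJSON rings must end where they start
    return {"type": "Polygon", "coordinates": [ring]}


# Hypothetical city center and radius.
polygon = circle_polygon(14.42, 50.09, radius_km=5)
print(len(polygon["coordinates"][0]))
```

The approximation is only reasonable for small radii away from the poles; a production implementation would more likely use a geodesic library.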

Notes for future

  • We plan to merge the default whole-country search with the deeperCityScrape option in the next release (likely next week). This should bring the best of both worlds and scan the cities deeper while also traversing the countryside, just with a smaller zoom. This will result in higher compute usage for the same searches (while finding more results) but we will also enable adjusting the zoom level up or down to control the coverage and cost better.

2023-06-02 (0.14.210)

Features

  • Add deeperCityScrape input option to extract more places from larger areas like countries and regions. See the readme and input for a detailed explanation. This option will become the default behavior in the future.

2023-06-01 (0.14.209)

Fixes

  • Skip places that are imported from external services like Tripadvisor on search URLs like this. These places don't contain placeId and have generally different data format. Eventually, we plan to support them. For now, it seems like a rare occurrence.

2023-05-30 (0.14.208)

Fixes

  • Properly skip non-existent places (in case the user provides a wrong URL).
  • Don't auto-resurrect runs out of memory if they started with lower memory than 512MB or failed too often (this means the user needs to increase the memory before starting again).

2023-05-25 (0.14.206)

Fixes

  • Searches that redirect to a single place work again (this broke in the last update)
  • Properly handle search that doesn't find anything (Google changed the design of the page)
  • If the scraper reaches maxCrawledPlacesPerSearch or gets too many empty/redirect searches in a row (currently 20), it will quickly remove the rest of the searches for the same term from the queue. Previously it would blindly keep trying to scrape something there wasting resources.
  • Automatically resurrect the run if it hits Out of memory error. This started happening recently so we are still investigating the root cause.
  • Shuffle the starting map locations to reduce the number of duplicates or missing queries at the start.
  • Respect maxCrawledPlacesPerSearch when onlyDataFromSearchPage is used.

Features

  • On every failed run, the actor triggers another actor lukaskrivka/actor-fail-manager via webhook.
    • It analyses the error and if appropriate it resurrects the run (e.g. in case of Out of memory error)
    • It sends a report to the author to be able to promptly fix the issue or improve user experience if needed

2023-05-22

Fixes

  • Retry pages with allPlacesNoSearchAction that produce captchas (can happen at very high scraping speed)
  • Rotate browsers more often with allPlacesNoSearchAction to prevent captchas. This slows down the scraping slightly.
  • Better validate customGeolocation longitude and latitude order. Fail if out of bounds and add a warning if the values look unreasonable (inside the ocean).

Changes

  • Add a new option onlyDataFromSearchPage (replaces exportPlaceUrls) which allows the extraction of some of the data from the search page without going to the place's detail page.
  • Deprecate exportPlaceUrls input option. As usual, we keep it backward compatible for a long time.
  • Each output place item can contain a maximum of 5000 reviews so in case there are more reviews for that place, a duplicate place is stored with the next 5000 reviews and so on. E.g. in the case of 50,000 reviews, the resulting dataset will have 10 items with the same place. This limitation is due to the size limit of a single item in the Apify dataset.
  • Deprecate "allPlacesNoSearchAction": "all_places_no_search_mouse" input option as it was extremely slow. It now automatically falls back to all_places_no_search_ocr, which is the only option now.
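Because of the 5000-review limit per item, consumers who want one record per place need to merge the duplicated items back together. A minimal sketch, assuming each item carries `placeId` and a `reviews` list; the helper names are hypothetical:

```python
import math

MAX_REVIEWS_PER_ITEM = 5000  # limit stated in the changelog entry


def expected_item_count(total_reviews):
    """Number of dataset items a place with this many reviews is split into."""
    return max(1, math.ceil(total_reviews / MAX_REVIEWS_PER_ITEM))


def merge_split_reviews(items):
    """Merge duplicate place items, concatenating their reviews in order."""
    merged = {}
    for item in items:
        place_id = item["placeId"]
        reviews = item.get("reviews") or []
        if place_id not in merged:
            # Copy so the input items are not mutated.
            merged[place_id] = {**item, "reviews": list(reviews)}
        else:
            merged[place_id]["reviews"].extend(reviews)
    return list(merged.values())


# Hypothetical sample: one place split over two items, plus another place.
items = [
    {"placeId": "p1", "title": "Cafe", "reviews": [{"stars": 5}]},
    {"placeId": "p1", "title": "Cafe", "reviews": [{"stars": 4}]},
    {"placeId": "p2", "title": "Bar", "reviews": []},
]
print(expected_item_count(50_000), len(merge_split_reviews(items)))
```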

2023-05-02

Hotfixes

  • Scrolling in search and images didn't work at all for a while because the panel was moved.
  • Remove invalid image URLs

2023-04-15

Changes

  • Removed proxyConfiguration from input schema. This scraper works well with default datacenter proxies and changing it was causing issues. For special cases, a proxy can still be passed in the proxyConfiguration field in JSON input.

2023-04-11

BREAKING CHANGES

  • Adjusting automatic zooming to set lower zoom for very small areas. Users rightfully complained that such high zoom produces too inefficient scrapes. The zoom curve is now flattened, which means slightly higher zoom for larger areas and significantly lower zoom for small areas. The highest automatic zoom is now capped at 17. The new example values are:
  • United States - 10 zoom (10,371,139 km2)
  • Germany - 12 zoom (380,878 km2)
  • London - 15 zoom (1,595 km2)
  • Manhattan - 16 zoom (87.5 km2)
  • Soho - 17 zoom (0.35 km2)

2023-04-10

Features

  • Correctly implement full geoJson specification for customGeolocation, you can now provide any valid type.
  • Cache geolocation resolutions in global KV store to speed up the start and lessen dependency on OpenStreetMap API.

2023-04-06

BREAKING CHANGES & fixes

  • Fixed and changed reviews translation after Google changed it. Now the text field contains the original text and textTranslated contains the translated text.
  • Due to this change, reviewsTranslation input setting is no longer required and was removed and we include both if available.
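With both fields present in the output, a consumer can choose which one to display. A minimal sketch, assuming each review is a dict with `text` and an optional `textTranslated`; the function name and the fallback order are illustrative choices:

```python
def review_display_text(review, prefer_translation=True):
    """Pick the translated text when available, else the original."""
    original = review.get("text")
    translated = review.get("textTranslated")
    if prefer_translation and translated:
        return translated
    return original


# Hypothetical sample reviews.
reviews = [
    {"text": "Skvělé jídlo", "textTranslated": "Great food"},
    {"text": "Nice place", "textTranslated": None},
]
print([review_display_text(r) for r in reviews])
```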

Features

  • Added reviewContext and reviewDetailedRating to all reviews where available. Examples in readme.

2023-04-04

BREAKING CHANGES

  • adjustZoomDynamically is now used for all geolocation input types!
  • locationQuery is now the prominent location input with prefilled value

Fixes

  • customGeolocation applies correct zoom again (this broke during the last release)

2023-03-29

BREAKING CHANGES

  • Added adjustZoomDynamically input option. This changes the zoom from a constant table based on geolocation type (country = zoom 12, city = zoom 15, etc.) into a value calculated from the area of the found location. Realistically, this means that very big countries might get a zoom 1-2 levels lower, while very small areas might get a zoom 2-5 levels higher for a more detailed scrape. Below are some examples from the new calculation:
  • Maximum zoom from this calculation is 19
  • United States - 9 zoom (10,371,139 km2)
  • Germany - 11 zoom (380,878 km2)
  • London - 16 zoom (1,595 km2)
  • Manhattan - 17 zoom (87.5 km2)
  • Soho - 19 zoom (0.35 km2)
  • Set adjustZoomDynamically to true for customGeolocation.

We plan to make this the default zoom setting in the near future.

2023-03-22

Features

  • Added support for shortened URLs (e.g. https://goo.gl/maps/...)

Fixes

  • Extract permanentlyClosed value from JSON data
  • Fixed rare problem that sometimes after migration, not all URLs were processed

2023-03-15

Fixes

  • menu output field is now extracted correctly again. The whole URL to the menu is provided now.

2023-03-14

Features

  • Added webResults field to output. You have to enable that in input with includeWebResults field. There is a small performance impact when this is enabled.

2023-03-13

Fixes

  • Large, low-density countries like Russia and Canada are now scraped with lower zoom to make the scrape more efficient. This applies only if the whole country is being scraped.

2023-03-09

Features

  • Add locationQuery input field. This can be used instead of country, state, city, etc. if those are not matching. This is mostly useful for very small states or regions. But it can also be used for free text description of the location.
  • Support Dominica country

2023-02-22

Features

  • Add reviewerPhotoUrl and reviewImageUrls field to review output

2023-02-22

Features

  • Add similarHotelsNearby field to output.

2023-02-16

Fixes

  • Updated price and description extraction to support more languages.

2023-02-15

Features

  • Improved reviews extraction, it's now faster and can extract more reviews.

2023-01-25

Fixes

  • Fixed issue with temporarilyClosed field not being extracted properly in some cases.

Features

  • Added updatesFromCustomers field to output.
"updatesFromCustomers": {
    "text": "Disneyland California Adventure small area with large park all inclusive celebrations. This is a glimpse into Los Reyes parade.  I'm a true fan. Thanks",
    "language": "en",
    "postDate": "a week ago",
    "postedBy": {
        "name": "Kayla Arredondo",
        "url": "https://www.google.com/maps/contrib/102968882116587973980?hl=en-US",
        "title": "Local Guide",
        "totalReviews": 225
    },
    "media": [
        {
            "link": "https://lh3.googleusercontent.com/ggms/AF1QipNNaoT0NSbcWOPSduvZNqJ0kSqUs-dod32FeBtr=m18",
            "postTime": "a week ago"
        }
    ]
}
  • Added questionsAndAnswers field to output.
"questionsAndAnswers": {
    "question": "Which is the best easier way to drop off a family to Disneyland Park",
    "answer": "best way for drop off family is at down town Disney. Drop them off then you can take a short walk to the park. ",
    "askDate": "5 years ago",
    "askedBy": {
        "name": "Cecilia Salcedo",
        "url": "https://www.google.com/maps/contrib/109041536347893604294"
    },
    "answerDate": "5 years ago",
    "answeredBy": {
        "name": "Gabby Lujan",
        "url": "https://www.google.com/maps/contrib/105966144333216697667"
    }
}

2023-01-24

Fixes

  • Fixed reserveTableUrl extraction for restaurants.

Features

  • Add reviewsFilterString to input that enables you to filter reviews by search string.
  • Add googleFoodUrl field to output.

2023-01-24

Fixes

  • Fixed place URL normalization sometimes not working. All place detail URL formats should work now, please open an issue if you find one that doesn't.

2023-01-24

Fixes

  • Fixed and reworked peopleAlsoSearch. It is now in this format, more fields will be added to it:
"peopleAlsoSearch": [
    {
        "category": "Czech restaurants",
        "title": "Restaurant Mlýnec",
        "reviewsCount": 2561,
        "totalScore": 4.7
    }
]

Changes

  • popularTimesHistogram, openingHours, additionalInfo and peopleAlsoSearch are now added to the data all the time. This means includeHistogram, includeOpeningHours, additionalInfo and includePeopleAlsoSearch input fields no longer have any effect.
  • To exclude these from data on Apify platform, use the omit URL parameter (e.g. add to dataset URL &omit=popularTimesHistogram,openingHours,additionalInfo,peopleAlsoSearch). This can also be chosen in the export UI.
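The omit parameter can be added to a dataset URL programmatically. A minimal sketch using only the standard library; the base URL pattern and the `format` parameter follow the public Apify dataset items endpoint as we understand it, and `DATASET_ID` is a placeholder, so treat both as assumptions to check against the API reference:

```python
from urllib.parse import urlencode


def dataset_items_url(dataset_id, omit_fields, fmt="json"):
    """Build a dataset items URL that leaves out the given output fields."""
    base = f"https://api.apify.com/v2/datasets/{dataset_id}/items"
    # safe="," keeps the comma-separated field list readable in the URL.
    query = urlencode({"format": fmt, "omit": ",".join(omit_fields)}, safe=",")
    return f"{base}?{query}"


url = dataset_items_url(
    "DATASET_ID",  # placeholder, replace with a real dataset ID
    ["popularTimesHistogram", "openingHours", "additionalInfo", "peopleAlsoSearch"],
)
print(url)
```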

2023-01-12

Features

  • Add reserveTableUrl field to output for restaurants.
  • Add reviewsTags and placesTags fields to output.

2023-01-13

BREAKING CHANGE

  • opening hours
    • remove trailing "," after day
    • always start with Monday (but only for English language)

2023-01-12

Features

  • Add description field to output.
  • Now we scrape hotel prices, and add the selected checkInDate and checkOutDate fields to the output (The price for hotels is based on these dates).
  • If the place is a hotel, add moreHotelsOptions field to output.

2023-01-10

Changes

  • The crawler now sets the default maximum concurrency based on the provided memory in GB. Currently, this is set to 4 times the memory, so a 4 GB actor will stop scaling up at a concurrency of 16. This should prevent the crawler from overscaling and running into network timeouts. You can still override this value with the maxConcurrency input field.
  • The crawler sets the starting concurrency at half the memory in GB. This is just an improvement to help it start faster.
  • Slowed down upscaling to make the crawling smoother and reduce timeouts.
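The arithmetic behind these defaults can be sketched as follows. The function name, the rounding down, and the floor of 1 are assumptions; only the "4 times memory" and "half the memory" rules come from the entry above:

```python
import math


def default_concurrency(memory_mb, max_concurrency_override=None):
    """Return (starting, maximum) concurrency for a given memory size in MB."""
    memory_gb = memory_mb / 1024
    # Maximum: 4 times the memory in GB, unless the user overrides it.
    maximum = max(1, math.floor(4 * memory_gb))
    if max_concurrency_override is not None:
        maximum = max_concurrency_override
    # Start: half the memory in GB, never above the maximum.
    starting = min(maximum, max(1, math.floor(memory_gb / 2)))
    return starting, maximum


print(default_concurrency(4096))
```

For a 4 GB run this yields a starting concurrency of 2 and a maximum of 16, matching the example in the changelog entry.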

2023-01-09

Fixes

  • Gas price updateAt field is extracted correctly again (before this fix all dates were from 1970).

2023-01-02

Fixes

  • All tiny countries (and states) now work properly (some only if used without other geolocation parameters like city).

2022-12-22

Features

  • Add searchMatching to input that enables you to specify how the search term should match the place name.

2022-12-16

Fixes

  • Some countries like Korea, Tanzania and Congo were not found by the scraper.

2022-12-06

Features

  • Added hotelStars to output (example value "5-star hotel").

2022-11-22

Changes (to simplify input)

  • Removed lat and lng input fields from input schema, but they will keep working if passed in the input. Prefer using geolocation options like city or country instead. You can also still use them in direct URLs.
  • Removed maxAutomaticZoomOut input field from input schema. It will also keep working as it is.

Features

  • Added claimThisBusiness to output.

2022-11-21

Fixes

  • Fixed wrong location assigned to some smaller countries.

2022-11-10

Features

  • Added imagesCount to output. It is displayed even if you don't extract their URLs.

2022-09-23

Fixes

  • BREAKING CHANGE: Removed maxCrawledPlaces from input completely (use maxCrawledPlacesPerSearch instead)
  • Fixed maxCrawledPlacesPerSearch causing the scraper to hang in some cases

2022-09-06

Fixes

  • Fixes unstable image extraction

2022-09-05

Fixes

  • Final round of optimizations and fixes of the search process. The scraper is now probably the fastest it has ever been, finally reaching about 100 places per 1 compute unit even when using geolocation.

2022-09-02

Fixes

  • Several optimizations to speed up the search page (scrolling & enqueueing places)
  • Fixed extraction of images

2022-08-16

Fixes

  • Improve extraction of additional infos for hotels.

2022-08-15

Fixes

  • Fixed actor sometimes finishing prematurely when there were still requests in the queue (caused by the new background enqueueing system)

2022-08-05

Fixes

  • Fixed reviews duplications that sometimes happened.
  • Fixed extraction of the temporarilyClosed field.

2022-08-03

Fixes

  • Fixed reviews extraction. After Google's change, the scraper was giving only up to 10 reviews. Now it works fully again. Sorting by newest doesn't work properly yet, though.

2022-07-21

Fixes

  • Finish fast when less than 120 places are found on a page. Previous implementation waited several seconds extra.

2022-07-20

Fixes

  • Search pages now use scrolling instead of pagination. This makes the crawling a little slower and reduces the maximum number of places per page from 400 to 120. Use geolocation with zoom to work around this reduction. We might increase the default zoom by 1 in the near future.

2022-05-19

Features

  • Added gasPrices to output. Available only for gas stations in US to the best of our knowledge.

2022-05-02

Fixes

  • subTitle extraction works now

2022-04-04

Fixes

  • Blocked responses on the search page now properly retry the request (no more unhandled promise rejection)
  • Smoother search page pagination
  • More informative logs
  • Fixed consent approval if browser crashes

2022-03-16

Fixes

  • maxCrawledPlaces + exportPlaceUrls was giving inconsistent number of results.

2022-03-14

Features

  • Added allPlacesNoSearch to input. This option allows you to scrape all places shown on the map without the need for any search term.
  • Added reviewsStartDate to input to extract only reviews newer than this date.
  • Added radiusKm to the Point type in customGeolocation
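A Point with a radius might be built like this. This is a sketch only: the exact placement of `radiusKm` next to `type` and `coordinates` is our assumption about the input shape, and the helper name and coordinates are made up. The longitude-first order follows the GeoJSON specification.

```python
import json


def point_geolocation(longitude, latitude, radius_km):
    """Build a customGeolocation Point with a search radius in kilometers."""
    if not -180 <= longitude <= 180:
        raise ValueError(f"longitude out of bounds: {longitude}")
    if not -90 <= latitude <= 90:
        raise ValueError(f"latitude out of bounds: {latitude}")
    if radius_km <= 0:
        raise ValueError("radius_km must be positive")
    return {
        "type": "Point",
        # GeoJSON order: longitude first, then latitude.
        "coordinates": [longitude, latitude],
        "radiusKm": radius_km,
    }


# Hypothetical input fragment for a point near Basel.
actor_input = {"customGeolocation": point_geolocation(7.5503, 47.5590, 8)}
print(json.dumps(actor_input, indent=4))
```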

2022-03-04

Improvement

  • additionalInfo extraction is faster now.
  • additionalInfo extraction for hotels and similar categories is more complete now: Data which is not displayed on the Google page but present in the Google response is also extracted.

2022-03-03

  • Lowering the default zoom values. The past setup made the scraping too slow and costly. The new defaults will speed up the scraping a lot while missing only a few places. You can still manually override the zoom parameter. New default values are:
    • country or state -> 12
    • county -> 14
    • city -> 15
    • postalCode -> 16
    • no geolocation -> 12
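The defaults from this entry can be read as a simple lookup with a manual override. A minimal sketch; the function and constant names are hypothetical, and how the scraper picks a type when several geolocation fields are set is not covered here:

```python
# Default zoom per geolocation type, as listed in this changelog entry.
DEFAULT_ZOOM_BY_TYPE = {
    "country": 12,
    "state": 12,
    "county": 14,
    "city": 15,
    "postalCode": 16,
}
NO_GEOLOCATION_ZOOM = 12


def resolve_zoom(geolocation_type=None, user_zoom=None):
    """Return the zoom to use, letting an explicit user value win."""
    if user_zoom is not None:
        return user_zoom
    return DEFAULT_ZOOM_BY_TYPE.get(geolocation_type, NO_GEOLOCATION_ZOOM)


print(resolve_zoom("city"), resolve_zoom(), resolve_zoom("city", user_zoom=17))
```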

2022-02-28

Fixes

  • location extraction works in (almost) all cases now (search URLs and URLs with place IDs will always work).

2022-02-21

Features

  • Added oneReviewPerRow to input to enable expanding reviews one per output row
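The same expansion can be done after the fact on a dataset scraped without this option. A minimal sketch, assuming each place has a `reviews` list; nesting each review under a `review` key is an illustrative choice and may differ from the scraper's actual row layout:

```python
def one_review_per_row(place):
    """Expand a place into one row per review, repeating the place fields."""
    reviews = place.get("reviews") or []
    base = {key: value for key, value in place.items() if key != "reviews"}
    if not reviews:
        # Assumption: keep a single row for places without reviews.
        return [{**base, "review": None}]
    return [{**base, "review": review} for review in reviews]


# Hypothetical sample place.
place = {
    "title": "Cafe",
    "placeId": "p1",
    "reviews": [{"stars": 5, "text": "Great"}, {"stars": 3, "text": "OK"}],
}
rows = one_review_per_row(place)
print(len(rows), rows[0]["review"]["stars"])
```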

2022-02-17

Fixes

  • openingHours extraction works in almost all cases now (search URLs and URLs with place IDs will always work).

2022-01-12

  • Start URLs now correctly work from uploaded CSV files or Google Sheets. It used to trim part of the URL.

2022-01-11

  • Changed polygon input field to customGeolocation
  • Added a more detailed section to the Readme on how you can provide your own exact coordinates

2022-01-11

Breaking changes

We decided it is time to change several default parameters to make the user experience smoother. These changes should not have a big effect on current users.

  • city and other geolocation parameters will have preference over lat & long if both are used (in 99% cases users want to use the automatic location splitting to get the most results which doesn't work with direct lat & long)
  • zoom will no longer have a default value 12. Instead, it will change based on geolocation type like this:

    • country or state -> 12
    • county -> 14
    • city -> 17
    • postalCode -> 18
    • no geolocation -> 12

Users will still be able to specify the zoom and override this behavior.

See Readme for more details

2021-12-14

Breaking change

  • reviewsSort is now set to newest by default. This is because some places don't yield all reviews on other sortings (we are not sure if this is a bug or silent block on Google's side)

2021-11-15

Fixes

  • exportPlaceUrls now properly dedupes the URLs
  • added categories fields listing all categories the place is listed in

2021-11-11

Fixes

  • Fixed additionalInfo for hotels
  • Fixed exportPlaceUrls not checking for correct geolocation

2021-11-09

Fixes

  • website field now displays the full URL. This fixes issue of blank facebook.com links.

2021-11-05

Fixes

  • Fixed new layout of additionalInfo

2021-11-03

Fixes

  • Improved reliability of scraping place detail, reviews and images (improving scrolling and back button interaction)

2021-10-13

Features

  • Added menu to output
  • Added price to output

2021-10-07

Fixes

  • Fixed popularTimesHistogram which caused crash on some pages

2021-09-27

Fixes

  • Fixed image extraction & make it optional (it should not crash the whole scrape)

2021-09-15

Fixes

  • Fixed temporarilyClosed and permanentlyClosed
  • Added a step for normalizing input Start URLs because those with wrong format don't contain JSON data

2021-09-14

Fixes

  • Fixed popular times live and histogram

2021-09-10

https://github.com/drobnikj/crawler-google-places/pull/185 https://github.com/drobnikj/crawler-google-places/issues/181

Fixes

  • In about 10% of cases, the reviews are in the wrong order and there are fewer of them. We haven't found the root cause yet, but we retry the page so the output gets corrected.

2021-09-07

Breaking fix

  • If you did not pass maxReviews in the input at all (undefined), it scraped 5 reviews as default. That was against the input schema description so it is now fixed to scrape 0 reviews in those cases.

2021-09-01

Fixes

  • Fixed placeId extraction that was broken for some inputs
  • Fixed missing imageUrls

Features

2021-08-25

Fixes

  • Fixed maxCrawledPlaces not finishing quickly for large country-wise searches. maxCrawledPlacesPerSearch still has this problem

2021-08-12

Fixes

  • Fixed problem that startUrls was not picking up all provided URLs sometimes (due to automatic uniqueKey resolution)
  • likesCount in reviews

2021-08-06

Fixes

  • maxCrawledPlaces now compares to total sum of all places

Features

  • Added maxCrawledPlacesPerSearch to limit max places per search term or search URL

2021-07-26

Fixes

  • Address is now parsed correctly into components even when you supply direct place IDs

  • Migrated code from apify 0.22.5 to 1.3.1

2021-07-13

  • Added county to geolocation options

2021-06-03

Fixes (hopefully last fixes after the layout change)

  • Scraping all images per place works again
  • Fixed additionalInfo
  • Fixed openingHours

2021-06-03

Fixes

  • Fix handling of search pages without results
  • Skip empty searches that sometimes users accidentally post

2021-05-25

Features

  • Added orderBy attribute to result scrape

2021-05-18

Fixes

  • Fully or partially fixed consent screen issues
  • Should also help with the error "Failed to set the 'innerHTML' property on 'Element': This document requires 'TrustedHTML' assignment.", which is caused by injecting jQuery into the consent screen

2021-04-29

Fixes

  • Fixed reviewsTranslation

2021-04-28

Fixes after Google changed layout. Not everything was fixed; the next batch of fixes will follow ASAP!

  • Fixed additional data
  • Fixed search pagination getting into infinite loop
  • Fixed empty search handling
  • Fixed reviews not being scraped
  • Fixed totalScore

2021-03-22

Warning - Next version will be a breaking one as we will remove personal data from reviews by default. You will have to explicitly enable the fields below.

Features

  • Added input fields to selectively pick which personal data fields to scrape - scrapeReviewerName, scrapeReviewerId, scrapeReviewerUrl, scrapeReviewId, scrapeReviewUrl, scrapeResponseFromOwnerText

2021-03-17

Fixes

  • Removed duplicate reviews + all reviews scraped correctly
  • reviewsSort finally works correctly
  • Reviews scraping is now significantly faster
  • Handle error that irregularly happened when scraping huge amount of reviews

Features

  • Added reviewsDistribution
  • Added publishedAtDate (exact date), responseFromOwnerDate and responseFromOwnerText for each review

2021-03-10

Fixes:

  • totalScore and reviewsCount are now correctly extracted for all languages
  • startUrls now correctly work on non-.com domains and on place detail pages

2021-02-02

Fixes:

  • Search keyword that links only to a single place (like "London Eye") now works correctly

2021-01-27

Features:

  • Address is parsed into neighborhood, street, city, postalCode, state and countryCode fields
  • Added reviewsTranslation option to adjust how Google translates reviews from non-English languages
  • Parsing ads. This means a bit more results. Those that are ads have "isAdvertisement": true field.
  • Added useCachedPlaces option to load places from your KV Store. Useful if you need to scrape the same places regularly.
  • Added polygon option to provide your own geolocation polygon.

Fixes:

  • This one is big. We removed the infamous Place is outside of required location (polygon) error. The location of a place is now checked during paginating and these places are skipped. This means a massive speed up of the scraper.

2021-01-11

Features:

  • Automatic screenshots of errors to see what went wrong
  • Added searchPageUrl to output
  • Added PLACES-OUT-OF-POLYOGON record to Key-Value store. You can check what places were excluded.

Fixes:

  • Fixed rare bug with saving stats
  • Improvement in review sorting - but it is still not ideal, more work needs to be done

2020-11-16

  • Added postal code geolocation to input
  • Improved errors when location is not found
  • Optimization - Removed geolocation data from intermediate requests

2020-10-29

  • Fixed handling of Google consent screen
  • Better input validation and deprecation logs
  • Changed default for maxImages to 1 as it doesn't require scrolling for the main image
  • imageUrls are returned with the highest resolution

2020-10-27

  • Removed forceEng input in favor of language

2020-10-15

  • The default setup now uses maxImages: 0 and maxReviews: 0 to improve efficiency

2020-10-01

  • added several browser options to input - maxConcurrency, maxPageRetries, pageLoadTimeoutSec, maxPagesPerBrowser, useChrome
  • revamped input schema and readme
  • Added reviewerNumberOfReviews and isLocalGuide to reviews

2020-09-22

  • added a few extra review fields (ID, URL)

2020-07-23 small features

New features

  • add an option for caching place location
  • add an option for sorting of reviews
  • add stats logging

2020-07 polygon search and bug fixes

breaking change

  • reworked input search string

Bug fixes

  • opening hour parsing (#39)
  • separate locatedIn field (#32)
  • update readme

New features

  • extract additional info - Service Options, Highlights, Offerings,.. (#41)
  • add maxReviews, maxImages (#40)
  • add temporarilyClosed and permanentlyClosed flags (#33)
  • allow to scrape only places urls (#29)
  • add forceEnglish flag into input (#24, #21)
  • add searching in polygon using nominatim.org
  • add startUrls
  • added maxAutomaticZoomOut to limit how far can Google zoom out (it naturally zooms out as you press next page in search)
Developer
Maintained by Apify
Actor metrics
  • Created in Nov 2018