Decode HTML entities (such as &amp; → &) in place names
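As a rough illustration of what entity decoding does to a place name, here is a minimal Python sketch using the standard library's html.unescape. The scraper's own implementation is not shown in this changelog; the helper name decode_place_name and the sample name are hypothetical.

```python
import html


def decode_place_name(raw_name: str) -> str:
    """Decode HTML entities in a scraped name (hypothetical helper)."""
    return html.unescape(raw_name)


if __name__ == "__main__":
    # An encoded ampersand and apostrophe become literal characters.
    print(decode_place_name("Ben &amp; Jerry&#39;s"))
```

Names that contain no entities pass through unchanged, so a helper like this can be applied to every name without checking first.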
2024-04-25 (v0.0.64)
Other
Removed debugLog and maxRequestRetries input options
Removed scrapeReviewerName and scrapeReviewerUrl input options; this data is now added to the output by default
2024-02-26 (v0.0.62)
Fixes
Adjust to new method of extracting reviews (the Yelp website changed and the old method stopped working)
2023-11-22 (v0.0.61)
Features
Added field ownerReplies to each review. This field is an array of objects {"text": "Review response text", "date": "1/1/2023"} for each response to the review.
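A consumer of the output might read the new field roughly as follows. This is only a sketch: the review item is made up, only the shape of ownerReplies comes from the entry above, and the month/day/year reading of the date string is an assumption.

```python
from datetime import date, datetime

# Hypothetical review item; only the ownerReplies shape is taken from the changelog.
review = {
    "text": "Great food!",
    "ownerReplies": [
        {"text": "Review response text", "date": "1/1/2023"},
    ],
}


def parse_reply_dates(item: dict) -> list[date]:
    """Return the reply dates of one review, assuming month/day/year strings."""
    replies = item.get("ownerReplies") or []
    return [datetime.strptime(reply["date"], "%m/%d/%Y").date() for reply in replies]


if __name__ == "__main__":
    print(parse_reply_dates(review))
```

Reviews scraped before this version have no ownerReplies field, which is why the sketch falls back to an empty list.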
2023-11-08 (v0.0.60)
Fixes
Only provide field cuisine for restaurants
Improved extraction of categories
2023-09-01 (v0.0.59)
Features
New field aboutTheBusiness: extracts information about the business provided by the owner, specifically the Specialties and History texts and the year of establishment
New input option: debugLog (disabled by default)
Optimizations
If the scraper is started with reviewLimit: 0, the scraper will now completely skip making requests for the reviews page, saving a small bit of time and data transfer
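The optimization amounts to a guard in front of the reviews request. A minimal sketch of such a guard is below; the function name should_request_reviews is hypothetical, and treating a missing reviewLimit as "scrape reviews" is an assumption rather than documented behavior.

```python
def should_request_reviews(actor_input: dict) -> bool:
    """Return False only when reviewLimit is explicitly 0 (hypothetical guard)."""
    # Assumption: a missing reviewLimit means reviews are still scraped.
    return actor_input.get("reviewLimit") != 0


if __name__ == "__main__":
    print(should_request_reviews({"reviewLimit": 0}))
    print(should_request_reviews({"reviewLimit": 20}))
```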
2023-03-20
Features
Add field alternateNames to output: provides alternative names, especially useful for places in countries using non-Latin characters
2023-03-01
Features
Rewrite of the scraper to use the new SDK V3
2023-01-12
Features
Allow users to specify the reviewsLanguage input field, the language in which reviews should be scraped (only reviews in the selected language will be scraped).
Added availableReviewsLanguages field to output, which contains a list of languages in which the business has reviews.
2021-11-19
Features
Update SDK
Fixes
Random crash when scraping images
Fix log for number of scraped businesses
Fix website extraction
2021-04-19
Fixes
Fixed page layout change (whole scraper was broken)
2021-03-30
Added support for different language domains, such as yelp.fr, in the input URL.
2020-03-25
Features
Enhanced reviews with more fields: language, isFunnyCount, isUsefulCount, isCoolCount, reviewerName, reviewerUrl, reviewerReviewCount, reviewerLocation
Scraping reviewerName and reviewerUrl requires enabling the personal data input fields scrapeReviewerName and scrapeReviewerUrl
Added section about GDPR and personal data protection to README
2020-02-20
searchTerm and location deprecated in favor of searchTerms and locations. You can scrape any number of those in a single run now.
Refactored to SDK 1
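Inputs written for the deprecated fields can be converted mechanically. The sketch below assumes that searchTerms and locations are lists of strings, which the changelog implies but does not state; migrate_input is a hypothetical helper, not part of the scraper.

```python
def migrate_input(old_input: dict) -> dict:
    """Convert deprecated singular fields to the plural list fields."""
    deprecated = ("searchTerm", "location")
    new_input = {key: value for key, value in old_input.items() if key not in deprecated}
    # Existing plural fields take precedence over the deprecated ones.
    if "searchTerm" in old_input and "searchTerms" not in new_input:
        new_input["searchTerms"] = [old_input["searchTerm"]]
    if "location" in old_input and "locations" not in new_input:
        new_input["locations"] = [old_input["location"]]
    return new_input


if __name__ == "__main__":
    print(migrate_input({"searchTerm": "pizza", "location": "Boston"}))
```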
2020-12-01
Fixed for new layout
2020-09-25
Added maxRequestRetries to input and increased its default from 2 to 10
Added cuisine to output
Added website to output
Added images to output
1.0.0
Data format changed (refer to README.md)
Fixed scraping information of business
Updated SDK to 0.21+
Minor changes to code style and linting
Updated dependencies
priceRange field changed from $ / $$$ to actual prices like $10-30
Review dates are now proper ISO date-time strings
Review texts now contain plain text instead of HTML
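Code that consumes the new data format could parse the changed fields along these lines. This is a sketch under assumptions: the priceRange pattern is inferred from the single example "$10-30", the exact ISO variant of the review dates is not specified here, and both helper names are hypothetical.

```python
import re
from datetime import datetime


def parse_price_range(price_range: str) -> tuple[int, int]:
    """Parse a value shaped like "$10-30" into (low, high)."""
    match = re.fullmatch(r"\$(\d+)-(\d+)", price_range)
    if match is None:
        raise ValueError(f"Unexpected priceRange format: {price_range!r}")
    return int(match.group(1)), int(match.group(2))


def parse_review_date(iso_string: str) -> datetime:
    """Parse an ISO date-time string, mapping a trailing "Z" to +00:00."""
    if iso_string.endswith("Z"):
        iso_string = iso_string[:-1] + "+00:00"
    return datetime.fromisoformat(iso_string)


if __name__ == "__main__":
    print(parse_price_range("$10-30"))
    print(parse_review_date("2020-09-25T12:00:00Z"))
```

Values in the older "$ / $$$" style raise a ValueError in this sketch instead of being guessed at.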