Yelp Scraper — Business Profiles + Reviews
Pricing
Pay per usage
Scrape Yelp business profiles and reviews by Yelp URL, alias, or business name + location. Returns reviewer, star rating, full text, date, attached photos, business response, helpful-vote reactions, plus the business profile (rating, total reviews, address, phone, categories, price range, hours).
Developer: yossef Nagy (maintained by Community)
Scrape Yelp business profiles and reviews — by Yelp encoded biz ID (fastest, ~9 seconds for 30 reviews + full profile), Yelp URL, alias, or business name + location. No login, no Yelp API key.
What it does
Give it one or more Yelp businesses (or a search query) and it returns:
- Reviews: every review the public Yelp panel exposes for the business, with reviewer name and location, elite-all-star status and year, profile photo URL, star rating (1–5), full review text in original language, ISO-8601 review date, helpful-vote count, attached review photos and videos, first-reviewer flag, check-in count, and business owner response when present.
- Business profile (optional): overall rating, total review count, full address (line 1/2/3, city, region, postal code, country), phone, categories with parent category root, $$/$$$$ price range, opening hours per day, primary photo URL.
Yelp's frontend is powered by an unauthenticated GraphQL endpoint at /gql/batch. The actor calls that endpoint directly using persisted-query IDs — no Yelp account, no API key, no CSRF token required. Pagination uses Yelp's cursor-based pageInfo.endCursor.
The one step that needs a real browser is resolving the human-readable alias (the-french-laundry-yountville) to Yelp's internal 22-char encoded biz id, which is embedded in the biz page HTML as <meta name="yelp-biz-id"> behind a DataDome challenge. The actor uses headed Chrome + playwright-extra stealth + fingerprint-injector + Apify residential proxy to navigate that one page; if you already have the encoded biz id, you can skip this step entirely — pass the encId directly and the actor will run purely over HTTP.
Use cases
- Reputation monitoring: bulk-pull recent reviews for your own locations on Yelp.
- Competitor sentiment: feed competitor URLs, sort by newest, analyze tone over time.
- Multi-location chains: pass 100s of Yelp aliases or a search query like pizza :: New York, NY → every visible review for every store in one run.
- Owner-response audits: flag reviews missing an owner reply.
- Lead qualification / CRM enrichment: score a prospect's customer experience before outreach (phone, address, categories are returned with the profile).
- AI training data: clean, structured review corpus with reviewer profile, rating, language, and date.
- Complaint mining: set ratingsFilter to ["1","2"] and reviewsSort to newest to pull only recent low-star reviews for a topic-modeling pipeline.
Input
| Field | Type | Description |
|---|---|---|
| businesses | array | Yelp URLs (https://www.yelp.com/biz/the-french-laundry-yountville), bare aliases (the-french-laundry-yountville), encoded biz IDs (the 22-char id Yelp uses internally; fastest if you have it), or free-text fallback "Business Name :: City, State". One run can cover many. |
| searchQueries | array | Alternative input: each entry "what :: where" (e.g. "pizza :: San Francisco, CA"). Yelp search is run and every matching business is scraped, up to maxBusinessesPerSearch. |
| maxBusinessesPerSearch | integer | Cap per search query. Default 20. |
| maxReviewsPerBusiness | integer | Reviews per business. Default 100. Set 0 for "all available". |
| reviewsSort | string | newest, oldest, most_relevant (default), highest_rating, lowest_rating. |
| ratingsFilter | array of strings | Star ratings to include. Default ["5","4","3","2","1"]. Use e.g. ["1","2"] for complaint-only runs. |
| includeBusinessProfile | boolean | Push one extra item per business with full profile. Default true. |
| proxyConfiguration | object | Defaults to Apify RESIDENTIAL, because Yelp's biz page is DataDome-protected and rejects datacenter IPs. |
| maxConcurrency | integer | Businesses processed in parallel. Default 4. |
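As an illustration of how the fields above fit together, here is a minimal Python sketch that assembles a run input with the documented defaults and rejects values the table does not list. The helper name build_run_input and its checks are hypothetical (including the assumption that at least one of businesses or searchQueries must be given), and proxyConfiguration is deliberately left out so the actor's own default applies.

```python
from typing import Any, Dict, List, Optional

# Allowed values copied from the input table above.
ALLOWED_SORTS = {"newest", "oldest", "most_relevant", "highest_rating", "lowest_rating"}
ALLOWED_RATINGS = {"1", "2", "3", "4", "5"}


def build_run_input(
    businesses: Optional[List[str]] = None,
    search_queries: Optional[List[str]] = None,
    *,
    max_businesses_per_search: int = 20,
    max_reviews_per_business: int = 100,  # 0 means "all available"
    reviews_sort: str = "most_relevant",
    ratings_filter: Optional[List[str]] = None,
    include_business_profile: bool = True,
    max_concurrency: int = 4,
) -> Dict[str, Any]:
    """Hypothetical helper: build the actor input dict with documented defaults."""
    # Assumption: a run needs at least one business or one search query.
    if not businesses and not search_queries:
        raise ValueError("provide at least one of businesses or search_queries")
    if reviews_sort not in ALLOWED_SORTS:
        raise ValueError(f"unknown reviewsSort: {reviews_sort!r}")
    if max_reviews_per_business < 0:
        raise ValueError("maxReviewsPerBusiness must be 0 or greater")

    ratings = list(ratings_filter) if ratings_filter is not None else ["5", "4", "3", "2", "1"]
    unknown = set(ratings) - ALLOWED_RATINGS
    if unknown:
        raise ValueError(f"unknown star ratings: {sorted(unknown)}")

    run_input: Dict[str, Any] = {
        "maxBusinessesPerSearch": max_businesses_per_search,
        "maxReviewsPerBusiness": max_reviews_per_business,
        "reviewsSort": reviews_sort,
        "ratingsFilter": ratings,
        "includeBusinessProfile": include_business_profile,
        "maxConcurrency": max_concurrency,
    }
    if businesses:
        run_input["businesses"] = list(businesses)
    if search_queries:
        run_input["searchQueries"] = list(search_queries)
    return run_input


# Complaint-only run for one business, newest reviews first.
complaint_input = build_run_input(
    businesses=["T20VEwi7AzKbY2TuVEt_ig"],
    reviews_sort="newest",
    ratings_filter=["1", "2"],
)
```

The resulting dict can be serialized to JSON and pasted into the actor's input editor or passed to whichever client you use to start the run.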
Identifiers accepted:
- Encoded biz id (22 chars, base64url): T20VEwi7AzKbY2TuVEt_ig. This is what Yelp's API uses internally; passing it skips the alias→encId resolution step entirely (fastest path: ~9s for 30 reviews + profile). To find it: open any Yelp biz page → View Source (Ctrl+U / Cmd+Opt+U) → search for yelp-biz-id.
- Full Yelp URL: https://www.yelp.com/biz/the-french-laundry-yountville. The actor resolves the alias to the encId via a headed browser (adds ~30-90s, may need a retry on a heavily-challenged IP).
- Bare alias: the-french-laundry-yountville. Handled the same way as a URL.
- Free-text fallback: Sweet Maple :: San Francisco, CA. Runs a Yelp search to find the business.
Output
Each review item:
{
  "_kind": "review",
  "review_id": "pJ5tnzX01EC2n6wH7ru0Bw",
  "business_encId": "T20VEwi7AzKbY2TuVEt_ig",
  "business_alias": "the-french-laundry-yountville-7",
  "business_name": "The French Laundry",
  "rating": 5,
  "text": "I've tried several Michelin starred restaurants...",
  "language": "en",
  "review_date": "2026-05-03T01:36:33-07:00",
  "date_of_experience": null,
  "author_id": "OzTSufmqWNFqbEaMlguT2w",
  "author_name": "Lola F.",
  "author_location": "San Rafael, CA",
  "author_review_count": 2,
  "author_friend_count": 0,
  "author_photo_count": 0,
  "is_elite": false,
  "elite_year": null,
  "author_photo_url": null,
  "photos": [],
  "videos": [],
  "likes_helpful": 0,
  "business_response": null,
  "business_response_date": null,
  "first_reviewer": false,
  "check_in_count": 0,
  "url": "https://www.yelp.com/biz/the-french-laundry-yountville-7?hrid=pJ5tnzX01EC2n6wH7ru0Bw",
  "scraped_at": "2026-05-15T14:00:00.000Z"
}
Each business item (_kind: "business"): encBizId, business_alias, business_name, rating, total_reviews, price_range, phone, categories[] (each with title, alias, root), address_line1, address_line2, address_line3, city, region, postal_code, country, hours (one entry per day with isOpen and time ranges[]), primary_photo_url, url.
Fields are null (or [] for lists) only when the value genuinely isn't on the Yelp record — e.g. no phone listed, no photos attached to the review, no owner response yet.
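Because reviews and business profiles share one dataset and are distinguished by _kind, downstream code usually starts by splitting them. The sketch below does that and then runs the owner-response audit from the use cases, relying only on the documented _kind, rating, and business_response fields; the function names and the sample items are made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def split_items(items: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Separate dataset items into (reviews, businesses) using the _kind field."""
    reviews: List[dict] = []
    businesses: List[dict] = []
    for item in items:
        if item.get("_kind") == "review":
            reviews.append(item)
        elif item.get("_kind") == "business":
            businesses.append(item)
    return reviews, businesses


def missing_owner_reply(reviews: Iterable[dict], max_rating: int = 5) -> List[dict]:
    """Reviews at or below max_rating stars that have no owner response yet."""
    return [
        r for r in reviews
        if r.get("business_response") is None and r.get("rating", 0) <= max_rating
    ]


# Illustrative items only; real items carry the full field set shown above.
sample_items: List[Dict] = [
    {"_kind": "business", "business_name": "Example Cafe", "rating": 4.2},
    {"_kind": "review", "review_id": "r1", "rating": 1, "business_response": None},
    {"_kind": "review", "review_id": "r2", "rating": 2, "business_response": "Sorry to hear that."},
    {"_kind": "review", "review_id": "r3", "rating": 5, "business_response": None},
]

sample_reviews, sample_businesses = split_items(sample_items)
unanswered_low_star = missing_owner_reply(sample_reviews, max_rating=2)
```

Here unanswered_low_star holds only review r1: r2 already has a reply and r3 is above the two-star cutoff.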
Pricing
Pay per result:
- $0.002 per review scraped
- $0.005 per business profile (only when includeBusinessProfile is enabled)
For a 100-review scrape with the profile included, that comes to $0.205 (100 × $0.002 + $0.005), which is competitive with other Yelp scrapers on the Store and a fraction of a flat-fee Apollo/ZoomInfo enrichment.
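To budget a larger run, the same arithmetic can be wrapped in a small function. This is a sketch using the two per-result prices listed above; the function name estimate_cost is hypothetical, and Decimal is used so the totals are exact instead of picking up float rounding.

```python
from decimal import Decimal

# Per-result prices from the Pricing section.
PRICE_PER_REVIEW = Decimal("0.002")
PRICE_PER_PROFILE = Decimal("0.005")


def estimate_cost(review_count: int, profile_count: int = 0) -> Decimal:
    """Estimated charge in USD for a given number of reviews and profiles."""
    if review_count < 0 or profile_count < 0:
        raise ValueError("counts must be non-negative")
    return review_count * PRICE_PER_REVIEW + profile_count * PRICE_PER_PROFILE


single_business = estimate_cost(100, 1)   # Decimal("0.205")
chain_of_fifty = estimate_cost(50 * 100, 50)  # 50 businesses, 100 reviews each
```

Note that the estimate depends on how many reviews are actually returned, which can be lower than maxReviewsPerBusiness for businesses with few reviews.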
How it works (technical)
- Pure HTTP GraphQL for business profiles and the review feed. Persisted-query IDs 19c1538e…b2e1 (GetLocalBusinessJsonLinkedData) and 44d0ed38…d9ee (GetBusinessReviewFeed), called against https://www.yelp.com/gql/batch. No CSRF token, no cookies, no login. Cursor-based pagination via pageInfo.endCursor → after. The GraphQL endpoint accepts unauthenticated requests, which is why encId input scrapes complete in seconds.
- Stealth-mode headed Chrome (Playwright + playwright-extra stealth plugin + Apify fingerprint-injector) for the one step that needs a real browser: resolving the human alias to the encoded biz id. Run through xvfb-run in the Apify container. Sessions rotate on block.
- Apify RESIDENTIAL proxy by default for both the GraphQL calls and the browser navigation, with session-id rotation on every retry.
- Migration-safe state: pushedReviewIds, pushedBusinesses, doneAliases persisted to the key-value store after every page, so an Apify host migration mid-run doesn't duplicate or skip work.
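The cursor loop and the duplicate guard described above can be sketched in a few lines. This is an illustration of the general pattern, not the actor's source: fetch_page stands in for the real GraphQL call and is backed here by in-memory fake pages, the page shape (a reviews list plus a pageInfo object) is assumed, and hasNextPage is an assumed field in the usual cursor-connection style. Only endCursor → after and the pushedReviewIds set come from the description.

```python
from typing import Callable, Dict, Iterator, Optional, Set


def iter_new_reviews(
    fetch_page: Callable[[Optional[str]], dict],
    pushed_review_ids: Set[str],
    limit: int = 0,  # 0 means "all available", as with maxReviewsPerBusiness
) -> Iterator[dict]:
    """Walk pages by cursor, yielding only reviews not already pushed."""
    after: Optional[str] = None
    yielded = 0
    while True:
        page = fetch_page(after)
        for review in page["reviews"]:
            review_id = review["review_id"]
            if review_id in pushed_review_ids:
                continue  # already pushed, e.g. before a host migration
            pushed_review_ids.add(review_id)
            yield review
            yielded += 1
            if limit and yielded >= limit:
                return
        info = page["pageInfo"]
        # Stop when the (assumed) hasNextPage flag is false or no cursor is given.
        if not info.get("hasNextPage") or not info.get("endCursor"):
            return
        after = info["endCursor"]  # endCursor of this page becomes the next "after"


# Fake two-page feed keyed by the "after" cursor; "b" repeats across pages.
FAKE_PAGES: Dict[Optional[str], dict] = {
    None: {
        "reviews": [{"review_id": "a"}, {"review_id": "b"}],
        "pageInfo": {"endCursor": "cursor-1", "hasNextPage": True},
    },
    "cursor-1": {
        "reviews": [{"review_id": "b"}, {"review_id": "c"}],
        "pageInfo": {"endCursor": None, "hasNextPage": False},
    },
}


def fake_fetch_page(after: Optional[str]) -> dict:
    return FAKE_PAGES[after]


state: Set[str] = set()  # would be restored from the key-value store on resume
collected_ids = [r["review_id"] for r in iter_new_reviews(fake_fetch_page, state)]
```

Persisting the id set after each page, as the actor does with its key-value store, is what lets a resumed run skip reviews it has already pushed.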
Notes
- Residential proxy is the default. Datacenter IPs are reliably blocked by DataDome on Yelp biz-page HTML; the GraphQL endpoint is more permissive but using residential keeps the IP consistent across both kinds of requests.
- Fastest path: pass the encoded biz id. The actor runs purely over HTTP and completes 30 reviews + full profile in under 10 seconds.
- The actor handles Apify host migrations transparently (state persisted to the key-value store) and rotates the proxy session if blocked.
- If you batch-scrape from URLs or aliases, expect ~30-90 seconds extra per business for the browser-based alias resolution, and occasional failures on heavily-challenged proxy IPs. Mix in encIds wherever you have them to keep runs fast.