Yelp Scraper — Business Profiles + Reviews

Pricing

Pay per usage

Scrape Yelp business profiles and reviews by Yelp URL, alias, or business name + location. Returns reviewer, star rating, full text, date, attached photos, business response, helpful-vote reactions, plus the business profile (rating, total reviews, address, phone, categories, price range, hours).

Developer

yossef Nagy

Maintained by Community

4 days ago

Last modified

Scrape Yelp business profiles and reviews — by Yelp encoded biz ID (fastest, ~9 seconds for 30 reviews + full profile), Yelp URL, alias, or business name + location. No login, no Yelp API key.

What it does

Give it one or more Yelp businesses (or a search query) and it returns:

  • Reviews: every review the public Yelp panel exposes for the business, with reviewer name and location, elite-all-star status and year, profile photo URL, star rating (1–5), full review text in the original language, ISO-8601 review date, helpful-vote count, attached review photos and videos, first-reviewer flag, check-in count, and the business owner's response when present.
  • Business profile (optional): overall rating, total review count, full address (line 1/2/3, city, region, postal code, country), phone, categories with parent category root, price range ($ to $$$$), opening hours per day, and primary photo URL.

Yelp's frontend is powered by an unauthenticated GraphQL endpoint at /gql/batch. The actor calls that endpoint directly using persisted-query IDs — no Yelp account, no API key, no CSRF token required. Pagination uses Yelp's cursor-based pageInfo.endCursor.

The one step that needs a real browser is resolving the human-readable alias (the-french-laundry-yountville) to Yelp's internal 22-char encoded biz id, which is embedded in the biz page HTML as <meta name="yelp-biz-id"> behind a DataDome challenge. The actor uses headed Chrome + playwright-extra stealth + fingerprint-injector + Apify residential proxy to navigate that one page; if you already have the encoded biz id, you can skip this step entirely — pass the encId directly and the actor will run purely over HTTP.
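Once the biz page HTML has been fetched, reading the encoded biz id out of it is a small parsing job. The sketch below uses only the Python standard library and is not the actor's own code. It assumes the id sits in the meta tag's content attribute, which the description above does not state; the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class BizIdParser(HTMLParser):
    """Collects the value of <meta name="yelp-biz-id"> from a biz page."""

    def __init__(self) -> None:
        super().__init__()
        self.biz_id: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching meta tag is kept.
        if tag != "meta" or self.biz_id is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("name") == "yelp-biz-id":
            # Assumption: the id is stored in the `content` attribute.
            self.biz_id = attr_map.get("content")


def extract_biz_id(html: str) -> Optional[str]:
    """Return the encoded biz id found in the HTML, or None if absent."""
    parser = BizIdParser()
    parser.feed(html)
    return parser.biz_id


sample = (
    '<html><head>'
    '<meta name="yelp-biz-id" content="T20VEwi7AzKbY2TuVEt_ig">'
    '</head></html>'
)
print(extract_biz_id(sample))
```

The same lookup is what the manual "View Source, search for yelp-biz-id" route does by eye.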

Use cases

  • Reputation monitoring — bulk-pull recent reviews for your own locations on Yelp.
  • Competitor sentiment — feed competitor URLs, sort by newest, analyze tone over time.
  • Multi-location chains: pass hundreds of Yelp aliases, or a search query like pizza :: New York, NY, to collect every visible review for every matching store (up to maxBusinessesPerSearch per query) in one run.
  • Owner-response audits — flag reviews missing an owner reply.
  • Lead qualification / CRM enrichment — score a prospect's customer experience before outreach (phone, address, categories are returned with the profile).
  • AI training data — clean, structured review corpus with reviewer profile, rating, language, and date.
  • Complaint mining — set ratingsFilter to ["1","2"] and reviewsSort to newest to pull only recent low-star reviews for a topic-modeling pipeline.

Input

Field | Type | Description
--- | --- | ---
businesses | array | Yelp URLs (https://www.yelp.com/biz/the-french-laundry-yountville), bare aliases (the-french-laundry-yountville), encoded biz IDs (the 22-char id Yelp uses internally; fastest if you have it), or free-text fallback "Business Name :: City, State". One run can cover many.
searchQueries | array | Alternative input: each entry "what :: where" (e.g. "pizza :: San Francisco, CA"). Yelp search is run and every matching business is scraped, up to maxBusinessesPerSearch.
maxBusinessesPerSearch | integer | Cap per search query. Default 20.
maxReviewsPerBusiness | integer | Reviews per business. Default 100. Set 0 for "all available".
reviewsSort | string | newest, oldest, most_relevant (default), highest_rating, lowest_rating.
ratingsFilter | array of strings | Star ratings to include. Default ["5","4","3","2","1"]. Use e.g. ["1","2"] for complaint-only runs.
includeBusinessProfile | boolean | Push one extra item per business with full profile. Default true.
proxyConfiguration | object | Defaults to Apify RESIDENTIAL, because Yelp's biz page is DataDome-protected and rejects datacenter IPs.
maxConcurrency | integer | Businesses processed in parallel. Default 4.
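As an illustration of how these fields fit together, here is a minimal sketch that builds and sanity-checks a run input in Python. The field names and defaults come from the table above; the helper names build_run_input and run_actor are hypothetical, and the actor id passed to run_actor is a placeholder you would copy from the Store page. run_actor assumes the apify-client package is installed and is not called in the example.

```python
from typing import Any, Dict, List, Optional

VALID_SORTS = {"newest", "oldest", "most_relevant", "highest_rating", "lowest_rating"}
VALID_RATINGS = {"1", "2", "3", "4", "5"}


def build_run_input(
    businesses: Optional[List[str]] = None,
    search_queries: Optional[List[str]] = None,
    max_reviews_per_business: int = 100,
    reviews_sort: str = "most_relevant",
    ratings_filter: Optional[List[str]] = None,
    include_business_profile: bool = True,
) -> Dict[str, Any]:
    """Build an input dict using the field names from the table above."""
    if not businesses and not search_queries:
        raise ValueError("provide at least one business or search query")
    if reviews_sort not in VALID_SORTS:
        raise ValueError(f"unknown reviewsSort: {reviews_sort!r}")
    ratings = list(ratings_filter) if ratings_filter else ["5", "4", "3", "2", "1"]
    if not set(ratings) <= VALID_RATINGS:
        raise ValueError("ratingsFilter entries must be the strings '1' to '5'")
    for query in search_queries or []:
        if "::" not in query:
            raise ValueError(f"search query must look like 'what :: where': {query!r}")
    run_input: Dict[str, Any] = {
        "maxReviewsPerBusiness": max_reviews_per_business,
        "reviewsSort": reviews_sort,
        "ratingsFilter": ratings,
        "includeBusinessProfile": include_business_profile,
    }
    if businesses:
        run_input["businesses"] = list(businesses)
    if search_queries:
        run_input["searchQueries"] = list(search_queries)
    return run_input


def run_actor(token: str, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor and return its dataset items (actor_id is a placeholder)."""
    # Imported lazily so build_run_input works without the client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Complaint-only run: newest 1-star and 2-star reviews for one encoded biz id.
complaint_run = build_run_input(
    businesses=["T20VEwi7AzKbY2TuVEt_ig"],
    reviews_sort="newest",
    ratings_filter=["1", "2"],
)
print(complaint_run)
```

proxyConfiguration and maxConcurrency are left out so the actor's documented defaults apply.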

Identifiers accepted:

  • Encoded biz id (22 chars, base64url): T20VEwi7AzKbY2TuVEt_ig — this is what Yelp's API uses internally; passing it skips the alias→encid resolution step entirely (fastest path: ~9s for 30 reviews + profile). To find it: open any Yelp biz page → View Source (Ctrl+U / Cmd+Opt+U) → search for yelp-biz-id.
  • Full Yelp URL: https://www.yelp.com/biz/the-french-laundry-yountville — actor resolves the alias to encId via headed browser (adds ~30-90s, may need retry on heavily-challenged IP).
  • Bare alias: the-french-laundry-yountville — same as URL.
  • Free-text fallback: Sweet Maple :: San Francisco, CA — runs Yelp search to find the business.
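To make the four identifier shapes concrete, the following sketch classifies an input string the way the list above describes. It is a heuristic written for illustration, not the actor's own routing logic: classify_identifier is a hypothetical name, and the rule "exactly 22 base64url characters means encoded id" could misfire on an alias that happens to be 22 characters long.

```python
import re
from typing import Tuple
from urllib.parse import urlparse

# 22 characters from the base64url alphabet (letters, digits, "-", "_").
ENC_ID_RE = re.compile(r"^[A-Za-z0-9_-]{22}$")


def classify_identifier(value: str) -> Tuple[str, str]:
    """Return (kind, normalized_value); kind is 'encId', 'alias' or 'search'."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        # Expect a path like /biz/<alias>; the query string is ignored.
        parts = [p for p in urlparse(value).path.split("/") if p]
        if len(parts) >= 2 and parts[0] == "biz":
            return ("alias", parts[1])
        raise ValueError(f"not a Yelp biz URL: {value!r}")
    if "::" in value:
        # Free-text fallback: "Business Name :: City, State".
        name, _, location = value.partition("::")
        return ("search", f"{name.strip()} :: {location.strip()}")
    if ENC_ID_RE.match(value):
        return ("encId", value)
    return ("alias", value)


for raw in [
    "T20VEwi7AzKbY2TuVEt_ig",
    "https://www.yelp.com/biz/the-french-laundry-yountville",
    "the-french-laundry-yountville",
    "Sweet Maple :: San Francisco, CA",
]:
    print(classify_identifier(raw))
```

Sorting a batch so that 'encId' entries come first is one way to get the fast HTTP-only results before the slower browser-resolved ones.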

Output

Each review item:

{
  "_kind": "review",
  "review_id": "pJ5tnzX01EC2n6wH7ru0Bw",
  "business_encId": "T20VEwi7AzKbY2TuVEt_ig",
  "business_alias": "the-french-laundry-yountville-7",
  "business_name": "The French Laundry",
  "rating": 5,
  "text": "I've tried several Michelin starred restaurants...",
  "language": "en",
  "review_date": "2026-05-03T01:36:33-07:00",
  "date_of_experience": null,
  "author_id": "OzTSufmqWNFqbEaMlguT2w",
  "author_name": "Lola F.",
  "author_location": "San Rafael, CA",
  "author_review_count": 2,
  "author_friend_count": 0,
  "author_photo_count": 0,
  "is_elite": false,
  "elite_year": null,
  "author_photo_url": null,
  "photos": [],
  "videos": [],
  "likes_helpful": 0,
  "business_response": null,
  "business_response_date": null,
  "first_reviewer": false,
  "check_in_count": 0,
  "url": "https://www.yelp.com/biz/the-french-laundry-yountville-7?hrid=pJ5tnzX01EC2n6wH7ru0Bw",
  "scraped_at": "2026-05-15T14:00:00.000Z"
}

Each business item (_kind: "business"): encBizId, business_alias, business_name, rating, total_reviews, price_range, phone, categories[] (each with title, alias, root), address_line1, address_line2, address_line3, city, region, postal_code, country, hours (one entry per day with isOpen and time ranges[]), primary_photo_url, url.

Fields are null (or [] for lists) only when the value genuinely isn't on the Yelp record — e.g. no phone listed, no photos attached to the review, no owner response yet.
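Because review and business items share one dataset, downstream code usually starts by splitting on _kind. The sketch below does that and then flags low-star reviews with no owner reply, as in the owner-response audit use case. It relies only on the _kind, rating, review_id and business_response fields shown above; the helper names and the sample items are made up for illustration.

```python
from typing import Any, Dict, Iterable, List, Tuple

Item = Dict[str, Any]


def split_items(items: Iterable[Item]) -> Tuple[List[Item], List[Item]]:
    """Separate dataset items into (reviews, businesses) using `_kind`."""
    reviews: List[Item] = []
    businesses: List[Item] = []
    for item in items:
        if item.get("_kind") == "review":
            reviews.append(item)
        elif item.get("_kind") == "business":
            businesses.append(item)
    return reviews, businesses


def unanswered_reviews(reviews: Iterable[Item], max_rating: int = 2) -> List[Item]:
    """Reviews rated max_rating or lower that have no owner response."""
    flagged = []
    for review in reviews:
        rating = review.get("rating")
        if rating is None or rating > max_rating:
            continue
        if review.get("business_response") is None:
            flagged.append(review)
    return flagged


# Hypothetical items shaped like the output described above.
items = [
    {"_kind": "business", "business_name": "Example Cafe", "rating": 4.0},
    {"_kind": "review", "review_id": "r1", "rating": 1, "business_response": None},
    {"_kind": "review", "review_id": "r2", "rating": 2, "business_response": "Sorry!"},
    {"_kind": "review", "review_id": "r3", "rating": 5, "business_response": None},
]
reviews, businesses = split_items(items)
print([r["review_id"] for r in unanswered_reviews(reviews)])
```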

Pricing

Pay per result:

  • $0.002 per review scraped
  • $0.005 per business profile (only when includeBusinessProfile is enabled)

For a 100-review scrape with the profile included, that comes to $0.205 (100 × $0.002 + $0.005), which is competitive with every existing Yelp scraper on the Store and a fraction of any flat-fee Apollo/ZoomInfo enrichment.
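The same arithmetic scales to larger runs. A small estimator, using the two per-result prices listed above and Decimal to avoid floating-point rounding, might look like this (estimate_cost is a hypothetical helper, and it assumes every business yields the full number of reviews requested):

```python
from decimal import Decimal

PRICE_PER_REVIEW = Decimal("0.002")   # USD per review scraped
PRICE_PER_PROFILE = Decimal("0.005")  # USD per business profile


def estimate_cost(
    reviews_per_business: int,
    businesses: int = 1,
    include_profile: bool = True,
) -> Decimal:
    """Upper-bound cost in USD for a run, given the prices above."""
    cost = PRICE_PER_REVIEW * reviews_per_business * businesses
    if include_profile:
        cost += PRICE_PER_PROFILE * businesses
    return cost


print(estimate_cost(100))                 # one business, 100 reviews + profile
print(estimate_cost(100, businesses=50))  # fifty businesses
```

Businesses with fewer public reviews than the cap cost less, since billing is per review actually scraped.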

How it works (technical)

  • Pure HTTP GraphQL for business profiles and the review feed. Persisted-query IDs 19c1538e…b2e1 (GetLocalBusinessJsonLinkedData) and 44d0ed38…d9ee (GetBusinessReviewFeed), called against https://www.yelp.com/gql/batch. No CSRF token, no cookies, no login. Cursor-based pagination: each page's pageInfo.endCursor is sent as the after argument of the next request. The GraphQL endpoint accepts unauthenticated requests, which is why encId input scrapes complete in seconds.
  • Stealth-mode headed Chrome (Playwright + playwright-extra stealth plugin + Apify fingerprint-injector) for the one step that needs a real browser: resolving the human alias to the encoded biz id. Run through xvfb-run in the Apify container. Sessions rotate on block.
  • Apify RESIDENTIAL proxy by default for both the GraphQL calls and the browser navigation, with session-id rotation on every retry.
  • Migration-safe state: pushedReviewIds, pushedBusinesses, doneAliases persisted to the key-value store after every page, so an Apify host migration mid-run doesn't duplicate or skip work.
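The pagination and de-duplication ideas above can be sketched without touching the network. In the example below, fetch_page is a hypothetical callable standing in for one review-feed request; the page shape (a reviews list plus pageInfo.endCursor) is an assumption chosen for illustration, not Yelp's actual response format. The seen_ids set plays the role of pushedReviewIds: restoring it from storage after a restart is what would prevent duplicates.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Set

Page = Dict[str, Any]


def iter_reviews(
    fetch_page: Callable[[Optional[str]], Page],
    max_reviews: int = 0,
    seen_ids: Optional[Set[str]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield unseen reviews page by page; max_reviews=0 means no cap."""
    seen = seen_ids if seen_ids is not None else set()
    cursor: Optional[str] = None
    yielded = 0
    while True:
        page = fetch_page(cursor)  # cursor is passed as the `after` value
        for review in page["reviews"]:
            if review["review_id"] in seen:
                continue  # already pushed, e.g. before a migration
            seen.add(review["review_id"])
            yield review
            yielded += 1
            if max_reviews and yielded >= max_reviews:
                return
        cursor = page["pageInfo"].get("endCursor")
        if not cursor or not page["reviews"]:
            return  # no further pages


# Fake two-page feed keyed by cursor; "b" repeats across pages on purpose.
PAGES: Dict[Optional[str], Page] = {
    None: {
        "reviews": [{"review_id": "a"}, {"review_id": "b"}],
        "pageInfo": {"endCursor": "c1"},
    },
    "c1": {
        "reviews": [{"review_id": "b"}, {"review_id": "c"}],
        "pageInfo": {"endCursor": None},
    },
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return PAGES[cursor]


print([r["review_id"] for r in iter_reviews(fake_fetch)])
```

Passing the callable in keeps the loop independent of how requests, proxies, and retries are actually made.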

Notes

  • Residential proxy is the default. Datacenter IPs are reliably blocked by DataDome on Yelp biz-page HTML; the GraphQL endpoint is more permissive but using residential keeps the IP consistent across both kinds of requests.
  • Fastest path: pass the encoded biz id. The actor runs purely over HTTP and completes 30 reviews + full profile in under 10 seconds.
  • The actor handles Apify host migrations transparently (state persisted to the key-value store) and rotates the proxy session if blocked.
  • If you batch-scrape from URLs or aliases, expect ~30-90 seconds extra per business for the browser-based alias resolution, and occasional failures on heavily-challenged proxy IPs. Mix in encIds wherever you have them to keep runs fast.