Yelp Business Info Scraper

Pricing

$19.99/month + usage

🔎 Yelp Business Info Scraper pulls structured data from Yelp—business names, addresses, phones, ratings, reviews, categories, hours & websites. 🚀 Ideal for lead gen, local SEO, and market research. 📊 Keyword/location targeting. 📦 Exports CSV/JSON.

Rating

0.0 (0)

Developer

ScrapePilot

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 3
Monthly active users: 0
Last modified: 2 days ago

Scrape detailed business information from Yelp business pages at scale. This Apify Actor extracts title, rating, reviews, address, phone, hours, images, categories, services, business owner, "about", review highlights, and more from one or many Yelp URLs.

How it works

The actor performs a Chrome-impersonated HTTP request (via curl_cffi) to each Yelp business page, then parses the embedded Apollo/GraphQL cache (<script data-apollo-state>) to produce a structured record. It does not use a headless browser.
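The parsing step can be sketched with the standard library alone. This is a minimal illustration, not the actor's own code: the function name `extract_apollo_state`, the regular expression, and the handling of an HTML-comment wrapper and HTML-escaped quotes are all assumptions about how the embedded JSON might be serialized.

```python
import html
import json
import re
from typing import Any, Dict, Optional

# Assumption: the cache lives in a <script ... data-apollo-state> element.
APOLLO_RE = re.compile(
    r"<script[^>]*\bdata-apollo-state\b[^>]*>(.*?)</script>",
    re.DOTALL | re.IGNORECASE,
)


def extract_apollo_state(page_html: str) -> Optional[Dict[str, Any]]:
    """Return the embedded Apollo cache as a dict, or None if it is absent."""
    match = APOLLO_RE.search(page_html)
    if match is None:
        return None

    payload = match.group(1).strip()

    # Assumption: the JSON may be wrapped in an HTML comment.
    if payload.startswith("<!--") and payload.endswith("-->"):
        payload = payload[4:-3].strip()

    # Try the raw text first, then an HTML-unescaped copy (&quot; -> ").
    for candidate in (payload, html.unescape(payload)):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None
```

The returned dictionary would still need to be walked to build the output record; that mapping depends on Yelp's cache keys and is not shown here.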

Anti-blocking strategy

  1. Apify residential proxy for every request.
  2. curl_cffi Chrome 131 impersonation so the TLS/JA3 fingerprint matches a real browser.
  3. Retries with exponential backoff: up to 3 attempts per URL.
  4. DataDome challenge solving: when a var dd = {...} challenge with rt='i' is detected, the actor calls CapSolver AntiDatadomeTask with the challenge parameters, then refetches the URL with the returned datadome cookie.
  5. Hard-reject skip: rt='c' challenges have no puzzle, so the actor stops retrying that IP pool instead of burning attempts.
  6. Translate-proxy fallback: as a last resort, the URL is fetched through translate.google.com, which Yelp's CDN treats as benign and which preserves the Apollo JSON intact.
  7. Live saving: every record is pushed to the dataset as soon as it's parsed.
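Steps 3 to 5 can be sketched as a small retry loop. The names `challenge_type` and `fetch_with_retries`, the regular expressions, and the one-second base delay are hypothetical; the `fetch` argument stands in for whatever HTTP client is used, and the CapSolver call for rt='i' challenges is deliberately left out.

```python
import re
import time
from typing import Callable, Optional

# Assumption: the challenge page contains a JavaScript object literal
# such as  var dd = {'rt':'i', ...}
DD_BLOCK_RE = re.compile(r"var\s+dd\s*=\s*\{(.*?)\}", re.DOTALL)
RT_RE = re.compile(r"""['"]?rt['"]?\s*:\s*['"](\w+)['"]""")


def challenge_type(body: str) -> Optional[str]:
    """Return the rt value of a DataDome challenge, or None for a normal page."""
    block = DD_BLOCK_RE.search(body)
    if block is None:
        return None
    rt = RT_RE.search(block.group(1))
    return rt.group(1) if rt else None


def fetch_with_retries(
    fetch: Callable[[str], str],
    url: str,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Fetch url, backing off exponentially; give up early on a hard reject."""
    for attempt in range(max_attempts):
        body = fetch(url)
        rt = challenge_type(body)
        if rt is None:
            return body  # no challenge detected: treat as a real page
        if rt == "c":
            return None  # hard reject: retrying the same pool is pointless
        # An rt == "i" challenge would be handed to a solver here (omitted).
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return None
```

Passing `sleep` as a parameter keeps the loop testable without real delays.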

Input

| Field | Type | Required | Description |
|---|---|---|---|
| startUrls | array | Yes | List of Yelp business page URLs (e.g. https://www.yelp.com/biz/east-village-pizza-new-york). |
| proxyConfiguration | object | No | Apify proxy settings (collapsed by default). Defaults to residential proxy. |

Example input:

{
  "startUrls": [
    { "url": "https://www.yelp.com/biz/east-village-pizza-new-york" }
  ],
  "proxyConfiguration": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"] }
}
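When the URL list comes from a file or spreadsheet, the same input object can be assembled in code. The helper below is a sketch: `build_actor_input` is a hypothetical name, and the check that each URL is a yelp.com address under /biz/ is an assumption based on the example URL, not a rule documented by the actor.

```python
from typing import Any, Dict, Iterable, List
from urllib.parse import urlparse


def build_actor_input(urls: Iterable[str], use_residential: bool = True) -> Dict[str, Any]:
    """Build an input object in the shape shown in the example input."""
    start_urls: List[Dict[str, str]] = []
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        is_yelp = host == "yelp.com" or host.endswith(".yelp.com")
        # Assumption: business pages live under /biz/ on a yelp.com host.
        if parsed.scheme not in ("http", "https") or not is_yelp:
            raise ValueError(f"Not a Yelp URL: {url!r}")
        if not parsed.path.startswith("/biz/"):
            raise ValueError(f"Not a Yelp business page: {url!r}")
        start_urls.append({"url": url})

    actor_input: Dict[str, Any] = {"startUrls": start_urls}
    if use_residential:
        actor_input["proxyConfiguration"] = {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        }
    return actor_input
```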

Output

Each business is pushed to the default dataset with this shape:

| Field | Description |
|---|---|
| title | Business name |
| rating | Numeric rating (e.g. "4.1") |
| reviewCount | Review count as text (e.g. "651 reviews") |
| isClaimed | "Claimed" or "Unclaimed" |
| priceLevel | Price tier (e.g. "$", "$$") |
| categories | Comma-separated categories |
| fullAddress, city, state, zipcode | Address fields |
| phoneNumber | Formatted phone number |
| images | Array of large image URLs |
| website | Business website URL |
| hours | Map of Mon/Tue/…/Sun (plus upcoming special dates) to hour strings, with "Open now" or "Closed now" appended to today's entry |
| businessOwnerName, about | Owner display name and combined specialties/history |
| reviewhighlights | Array of review-highlight snippets |
| businessServices | Object of service name → boolean (delivery, take-out, accessibility, payment, etc.) |
| yelp_biz_id | Yelp internal business ID |
| timestamp | UTC scrape time |
| url, is_page_not_found, status | Source URL, 404 flag, and run status ("SUCCEEDED" or "FAILED") |
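Several fields arrive as display strings, so downstream code will often want typed values. The sketch below converts a few of them; `normalize_record` is a hypothetical helper, and the string formats it expects are inferred from the examples in the table rather than guaranteed by the actor.

```python
import re
from typing import Any, Dict, Optional


def normalize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a dataset record with a few fields converted to typed values."""
    # rating: "4.1" -> 4.1 (None if missing or unparseable)
    try:
        rating: Optional[float] = float(record.get("rating"))
    except (TypeError, ValueError):
        rating = None

    # reviewCount: "651 reviews" -> 651
    count_match = re.search(r"\d[\d,]*", str(record.get("reviewCount") or ""))
    review_count = int(count_match.group(0).replace(",", "")) if count_match else None

    # categories: "A, B" -> ["A", "B"]
    raw_categories = str(record.get("categories") or "")
    categories = [part.strip() for part in raw_categories.split(",") if part.strip()]

    return {
        **record,
        "rating": rating,
        "reviewCount": review_count,
        "categories": categories,
        "isClaimed": record.get("isClaimed") == "Claimed",
    }
```

Fields that are not listed in the returned dictionary pass through unchanged.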

Run via API (cURL)

curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{"startUrls":[{"url":"https://www.yelp.com/biz/east-village-pizza-new-york"}]}' \
  "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs"

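The same call can be prepared from Python with only the standard library. This sketch builds the request without sending it; `build_run_request` is a hypothetical helper, and `YOUR_ACTOR_ID` and `YOUR_APIFY_TOKEN` remain placeholders, as in the cURL example.

```python
import json
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"


def build_run_request(
    actor_id: str, token: str, actor_input: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare a POST request that starts a run with the given input."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The token travels in a header so it stays out of the URL.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# To send it (performs a real network call):
#   request = build_run_request("YOUR_ACTOR_ID", "YOUR_APIFY_TOKEN", {...})
#   with urllib.request.urlopen(request) as response:
#       run = json.load(response)
```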
FAQ

Why did some URLs fail? A page may be removed (is_page_not_found: true), the IP pool may be hard-rejected (rt='c'), or the translate fallback may have returned a short body. Check the log for [fetch] messages.
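A quick tally of outcomes can show whether failures are mostly removed pages or blocked requests. The sketch below relies only on the status and is_page_not_found fields from the output; `summarize_outcomes` and its three bucket names are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def summarize_outcomes(records: Iterable[Dict[str, Any]]) -> Counter:
    """Count dataset records by outcome: succeeded, not_found, or failed."""
    counts: Counter = Counter()
    for record in records:
        if record.get("is_page_not_found"):
            counts["not_found"] += 1  # page removed: retrying will not help
        elif record.get("status") == "SUCCEEDED":
            counts["succeeded"] += 1
        else:
            counts["failed"] += 1  # likely blocked: check the [fetch] log lines
    return counts
```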

Cautions

  • Only public Yelp pages are scraped.
  • You are responsible for complying with applicable laws (privacy, data protection, terms of use).