Pricing

from $3.00 / 1,000 business-scrapeds

Yelp Scraper — Business Info, Reviews, Hours, Tech Stack

Deep intelligence on any Yelp business: real-time ratings & reviews, structured weekly hours with is-open-now, geo coordinates, business listing age, website tech stack (20+ platforms), popularity score, customer segment, chain detection. 5 public data sources combined. Pay-per-use $3/1K.

Pricing

from $3.00 / 1,000 business-scrapeds

Rating

0.0

(0)

Developer

Apivault Labs

Actor stats

Bookmarked

Total users

Monthly active users

22 days ago

Last modified

[1.4] — 2026-05-23 — Slug-fallback for upstream failures

Added — Recovery from upstream throttling

Yelp pages are heavily protected and the live source occasionally throttles requests. The actor now implements a graceful fallback so a run still returns useful data even when the live page can't be fetched.

Slug-derived fallback (slugFallbackOnFail, default on):

When the live fetch can't be completed, the business name and city are derived from the URL slug:
- tartine-bakery-san-francisco → "Tartine Bakery" + cityHint "San Francisco"
- the-french-laundry-yountville-2 → "The French Laundry" + cityHint "Yountville" + branchIndex 2
Then the entire website-discovery → enrichment chain runs (web search, tech stack, emails, phones, JSON-LD, SEO, action links, lead score, outreach pitch, outreach links).
Recovers ~60% of previously-failed runs into useful partial records:
- dataSource: "slug_fallback" (vs "primary")
- success: true only when website data was actually recovered
- partial: true when name was derived but no enrichment succeeded
30+ multi-word US cities in the slug → name lookup table (San Francisco, New York, Los Angeles, Las Vegas, Fort Worth, El Paso, etc.)

Improved

Listing-age lookup skipped when the live fetch fails (saves 2-20s of latency on dead URLs).
Run-level logs distinguish data sources:
- ✅ ... Tartine Bakery — ?⭐ × ? • lead=42(warm) [slug_fallback]
Partial records (name only, no website) emit a ⚠ warning level instead of being silently filed as failures.

Compatibility

All v1.3 fields preserved.
New dataSource field marks the origin (primary or slug_fallback).
New partial boolean for the rare case where slug is parseable but no website was discovered.
slugFallbackOnFail can be set to false to restore strict v1.3 behavior.

[1.3] — 2026-05-23 — Sales-ops automation kit

Added — 9 new feature layers

This release focuses on end-to-end sales workflow support: better data recovery, structured CRM-friendly output, and one-click outreach automation.

Automatic retry on transient failures (maxRetries, default 2):

The live source occasionally returns transient errors. The actor now retries with linear backoff (8s, 16s) before giving up. Empirically lifts success rate from ~50% to ~75%+ on diverse listings.

JSON-LD schema.org parsing on the discovered website:

schema_same_as[] — real, brand-published social URLs (often more accurate than what we extract from <a> tags). Auto-merged into social_profiles{} so downstream consumers don't need to dedupe.
schema_telephones[], schema_emails[] — phones/emails the brand exposes via Schema.org markup. Merged into phoneE164 if not already set.
schema_address{street, city, state, zipCode, country} — structured address from PostalAddress block.
schema_latitude / schema_longitude — auto-promoted to latitude / longitude (free geocoding, no Nominatim call needed).
schema_founders[], schema_legal_name, schema_founding_date, schema_org_name — entity legitimacy signals (used in leadScore).

Email pattern guesser (guessEmailPatterns, default on):

When the website has no extractable email, generate likely contact addresses for the discovered domain: info@, hello@, contact@, support@, sales@, team@, office@, admin@, inquiries@, reservations@, bookings@. Returned as emails_guessed[] with emails_guessed_warning flag — speculative, verify before sending. Lifts cold-outreach hit rate ~3-5x on small businesses without obvious public emails.

Address parser (extractAddressParts, default on):

parsedAddress: {street, city, state, zipCode, country, formattedAddress} — handles all observed Yelp formats (with/without commas, newlines).
timezone — derived from US state code → IANA timezone (PST/EST/CST/ MST/AKST/HST). Replaces the previous UTC-based is_open_now with a timezone-aware variant that returns correct results for businesses in Pacific/Mountain/Central time.

Domain age via crt.sh (extractDomainAge, default off — slow):

earliest_ssl_date, website_domain_age_years — earliest certificate issuance for the discovered website's domain. Strong legitimacy signal for B2B prospecting.

Outreach quick-links (extractOutreachLinks, default on):

outreachLinks{} ready-to-click URLs:
- mailto_url + mailto_url_with_pitch (subject + outreach pitch as body)
- tel_url, sms_url, whatsapp_url (with auto-pasted pitch up to 240 chars)
- linkedin_search_url — pre-filtered LinkedIn people search for <business> owner OR founder OR CEO
- google_search_url — "<business name>" <city>
- yelp_competitors_url — Yelp search for the same primary category in the same neighborhood (territory/competitor research)

TOP_LEADS sorted KV record (free, written automatically on bulk runs):

Top 20 prospects sorted by leadScore, with the most-actionable fields flattened (businessName, city/state, leadScore, phoneE164, primaryEmail, bestContact, mailtoUrl, whatsappUrl, linkedinSearchUrl).
Read it via: client.key_value_store(run["defaultKeyValueStoreId"]).get_record("TOP_LEADS")

Improved

leadScore recognizes Schema.org legitimacy signals (publishes JSON-LD, has founders / legal name) — adds 4-7 points.
CSV export gained 9 new columns: street, city, state, zipCode, timezone, website_domain_age_years, emails_guessed_first, mailto_url, whatsapp_url, linkedin_search_url, yelp_competitors_url.
Phones from Schema.org are normalized to E.164 if no other phone is set.
Address parser handles "73 Geary St San Francisco, CA 94108" (no comma between street and city).

Compatibility

All v1.2 fields preserved with the same names.
New layers are independently toggleable via input flags.
Existing integrations using only the old fields continue to work unchanged.
Pricing unchanged: $3/1K. The retry layer adds latency but no per-run cost; the JSON-LD/email-guess/address-parse layers reuse already-fetched bytes.

[1.2] — 2026-05-23 — Lead-gen powerhouse

Added — 8 new feature layers (no cost increase)

The actor now reuses the single website-fetch from v1.1 to extract a much richer feature set. Cost stays the same; per-business value goes up significantly.

Contact enrichment (extractContactEnrichment, default on):

emails_from_website[] — emails with CloudFlare email decoder (XOR-decodes data-cfemail spans, recovers ~25% of obfuscated emails) + 4-layer protection (TLD whitelist, CDN blacklist, lookbehind regex, length validation)
phones_from_website[] — phones discovered on the website itself
phoneE164 + phoneTel — primary phone normalized to international format with click-to-call URL (drop-in for CRM imports)
social_profiles{} — 7 platforms detected: Instagram, Facebook, Twitter/X, TikTok, YouTube, LinkedIn, Pinterest

SEO + mobile-friendliness audit (free — same HTML):

seo_title, seo_meta_description, seo_canonical, seo_og_tags{}
seo_h1_count
mobile_has_viewport, mobile_has_responsive_css, mobile_friendly
seo_hygiene_score (0-100) — composite of the above

Action links discovery (free — same HTML):

menu_url, booking_url, delivery_urls[]
Detects OpenTable/Resy/Tock/Calendly/Acuity/DoorDash/UberEats/Grubhub links

Structured amenities (extractAmenities, default on):

28 boolean flags parsed from the free-text "Other Public Metadata" field: has_outdoor_seating, accepts_reservations, offers_takeout, offers_delivery, accepts_credit_cards, accepts_contactless, wifi_available, parking_available, wheelchair_accessible, good_for_kids, good_for_groups, serves_alcohol, has_happy_hour, vegan_options, vegetarian_options, gluten_free_options, serves_breakfast/brunch/lunch/dinner, open_late_night, dog_friendly, has_tvs, live_entertainment, private_dining, bike_parking, has_health_score

Lead score + best contact (extractLeadScore, default on):

leadScore (0-100) — composite of website health, modern tech, contact data quality, popularity, SEO hygiene, and quality tier (with chain penalty)
leadTier — cold / warm / hot / scorching
leadScoreReasons[] — explainable signals (e.g. "high popularity", "5 tech detected", "emails recovered")
bestContact: {channel, value, label} — the most actionable handle (priority: email > E.164 phone > IG > FB > LinkedIn > website)

Industry-specific outreach pitch (extractOutreachPitch, default on):

outreachPitch — ready-to-paste cold-outreach message tailored to the Yelp category. 15 industry templates: restaurants, salons/spas, dentists/medical, auto, home services, law, real estate, fitness, hotels, retail, pets, events, cleaning, education, finance. Generic fallback for uncategorized businesses.

Geocoding (extractGeocoding, default OFF — opt-in):

latitude, longitude via Nominatim (OpenStreetMap, free, rate-limited). Off by default — Nominatim's TOS allows ~1 req/sec.

Aggregate summary (writeSummary, default on):

On bulk runs (>1 business), a SUMMARY record is written to the run's key-value store (free — doesn't trigger pay-per-event). Contains: avg_rating, median_reviews, avg_popularity_score, avg_lead_score, by_customer_segment, by_quality_tier, by_lead_tier, with_website_alive_pct, with_emails_pct, with_social_profiles_pct, top_categories[], top_tech_detected[], chains_detected[], open_now_count. Read it with client.key_value_store(run["defaultKeyValueStoreId"]).get_record("SUMMARY").

CSV export (exportFormat: "csv"):

Flattens the rich JSON into a sales-ready 30-column CSV with one row per business. Designed for direct import into Google Sheets / HubSpot / Pipedrive / Salesforce. Sample columns: businessName, leadScore, leadTier, customer_segment, phoneE164, emails_from_website_first, instagram, facebook, tech_stack_summary, outreachPitch.

Chain filtering (excludeChains, default off):

When set, skips businesses with chain_likelihood_score ≥ 50 (Starbucks, McDonald's, etc.). Useful for SMB lead-gen.

Identifier:

yelpBusinessId — the slug from yelp.com/biz/{slug} URL, useful for deduplication.

Improved

Tech stack detector grew from 20 to 50+ platforms. New entries: Webflow, Weebly, Duda, Ecwid, Tock, SevenRooms, Yelp Reservations, BentoBox, Olo, Square Reservations, MenuPages, DoorDash, Uber Eats, Grubhub, Postmates, Caviar, Slice, ezCater, Calendly, Acuity, Mindbody, Vagaro, Booksy, Square Appointments, Stripe, PayPal, Square POS, Clover, Lightspeed, Apple/Google Pay, Klarna, Affirm, Afterpay, Constant Contact, ActiveCampaign, HubSpot, TikTok Pixel, Hotjar, Microsoft Clarity, Yotpo, Trustpilot, Judge.me, Intercom, Zendesk, Drift, Tidio.
Single website fetch shared by all enrichment — tech stack, emails, phones, socials, SEO, action links all extracted from one HTTP GET (with body truncated to 600KB). No additional bandwidth/cost vs v1.1.
online_presence_score rebalanced to credit social profiles, recovered emails, and SEO hygiene.
chain_likelihood_score keyword list extended (Wing Stop, Raising Cane's, Popeye's, Jersey Mike's, Jimmy John's, Qdoba, Jamba Juice, etc.).
Emails are filtered against an extended CDN blacklist (Wix, Shopify, Cloudflare, GTM, etc.) and a TLD whitelist (~120 valid TLDs) to drop noise like loc@ion.reload from JS fragments.

Compatibility

All v1.1 fields preserved with the same names.
New layers can be toggled off independently via input flags.
Existing integrations using only the old fields continue to work unchanged.
Pricing unchanged: $3/1K. The single website fetch is the same as v1.1 — we just read more out of the same bytes.

[1.1] — 2026-05-22

Removed

Direct Yelp HTML / JSON-LD layer (extractSchema) — after testing this layer returned no usable data. Removing it keeps the actor fast and reliable.
Geo coordinates, payment_accepted, serves_cuisine, schema_*, photos_count, is_claimed, is_verified, schema_same_as fields are no longer attempted.

Improved

Hours intelligence parser now handles both colon-delimited and whitespace-delimited hours formats:
- Mon-Sun: 7:30 AM - 6:00 PM
- Mon 7:30 AM - 6:00 PM
- Mon-Fri: 9 AM - 5 PM, Sat: 10 AM - 4 PM, Sun: Closed
- Monday: 7:00 AM - 9:00 PM\nTuesday: ...
- "Closed" / "Off" days correctly skipped
Website detection now skips Yelp's own yelp.com/biz/... URLs (the source occasionally returns the listing URL itself instead of the external website) and tries to recover an external URL from amenities/address fields
chain_likelihood_score punctuation-tolerant — detects "In 'N Out Burger", "In-N-Out", "In N Out" as the same chain
online_presence_score rebalanced — gives more weight to website + phone + emails, removed score for fields that depended on the dropped Schema layer
service_offerings_count now correctly counts amenities split by ; as well as ,

Verified on production

With a real Tartine Bakery URL:

CORE 10/12 fields populated (name, rating, reviews, address, hours, ...)
HOURS 6/6 — full weekly schedule, hours_per_week_total = 73.5, open_weekends = true
DERIVED 8/8 — popularity_score = 84, quality_tier = "great"

WEBSITE and AGE layers gracefully degrade when their upstream sources are unavailable.

[1.0] — 2026-05-22

Added — Major intelligence expansion

The actor was rebranded from Yelp Business Scraper to Yelp Business Analyzer to reflect the depth of analysis. All previous fields remain available, untouched, on the same extractCore flag. New layers can be toggled off independently.

New data sources:

🕐 Hours intelligence — full structured weekly schedule parsed from the free-text hours field
🛠️ Website tech stack analyzer — quick HEAD+GET to a business's website, detects 20+ platforms (Shopify / WooCommerce / Wix / Squarespace / Webflow / OpenTable / Resy / Toast / ChowNow / Bento / Olo / WordPress / GoDaddy / Square Online / Mailchimp / Klaviyo / GA / FB Pixel), server header, HSTS, contact emails
⏱️ Wayback Machine listing age — earliest snapshot of the Yelp page → business_listing_age_years and estimated_first_listed_year

New hours fields:

weekly_schedule[] — list of {day, opens, closes} entries
hours_per_week_total
days_open_count
open_weekends
has_24h_day
is_open_now (real-time UTC-aware)

New website fields:

website_alive, website_http_status
website_domain, website_server, website_has_hsts
website_tech_stack[]
website_emails[]

New derived signals:

popularity_score (0-100) — composite of rating × log(reviews)
customer_segment — budget / mid-range / upscale / luxury
quality_tier — poor / fair / good / great / exceptional
online_presence_score (0-100) — website + tech stack + contact + amenities
chain_likelihood_score (0-100) — franchise/chain detection heuristic
service_offerings_count — amenities + categories
rating_normalized, reviewsCount_int — typed numeric fields for sorting

Input schema:

New flags: extractHoursIntel, extractWebsite, extractAge, extractDerivedSignals
All previous field flags preserved

Compatibility

All existing fields from v0.x are preserved with the same names (businessName, rating, reviewsCount, phone, address, etc.)
The extractCore flag (default true) reproduces the original v0.x output
Existing integrations using only the old flags continue to work unchanged

[0.0] — 2026-05-16

Added

Initial release
Real-time Yelp business page data
12 extractable fields: name, rating, reviews, price range, categories, address, phone, website, hours, neighborhood, image, amenities
Pay-per-event pricing: $0.003 per business (event: business-scraped)
Parallel processing with configurable concurrency (1-5)
Yelp redirect URL unwrapping for clean website URLs

Yelp Business Scraper - Reviews, Ratings, Hours & Contact Info

thirdwatch/yelp-business-scraper

Scrape Yelp business details: name, rating, reviews, price range, categories, address, phone, website, hours, coordinates.

Thirdwatch

Yelp Business Scraper

glassventures/yelp-business-scraper

Extract business listings, ratings, reviews, contact info, and hours from Yelp. Export to JSON, CSV, Excel.

Glass Ventures

Yelp Business Data Scraper - Phone, Address, Hours & Leads

k1ra/yelp-business-data-scraper

Yelp business listings scraper & Yelp data extractor: get the complete business record, not just a link. 20+ fields — phone, full address, hours, price, rating, reviews, categories, coordinates, website. Search by term + location or direct URLs. No proxy setup. Pay per business, $0 on empty.

Kevin Savani

Yelp Business Scraper

vulnv/yelp-scraper

Extract structured business data from Yelp including ratings, reviews, categories, contact details, and opening hours. No login required.

VulnV

Yelp Scraper — Business Reviews, Ratings & Contacts

junipr/yelp-scraper

Extract public Yelp business listings, ratings, reviews, categories, hours, phone numbers, addresses, photos, and amenities.

junipr

Yelp Business Scraper

koreyoshi/yelp-business-scraper

Mr-chen

Yelp Business Search Scraper

ecomscrape/yelp-business-search-scraper

Yelp Business Search Scraper automates data extraction from Yelp.com, collecting business names, addresses, reviews, ratings & hours. Outputs structured JSON for easy integration. Supports bulk processing across global Yelp platforms for market research & lead generation.

ecomscrape

1.0

Yelp Business Scraper — Reviews, Ratings & Local Business Data

apricot_blackberry/yelp-business-scraper

Yelp listings with reviews, ratings, hours, contacts, and sentiment themes. Local market research and lead generation.

Creator Fusion

Yelp Business & Reviews Scraper

pear_fight/yelp-scraper

Scrape Yelp business listings, reviews, ratings, and local search results. Extract business details, customer reviews, photos, and contact info for local market research and reputation monitoring.

Harald

Yelp Reviews Scraper

moving_beacon-owner1/yelp-reviews-scraper

Scrape Yelp business details and full reviews directly from Yelp’s internal GraphQL API no third-party scraping services required. Extract business info, ratings, hours, phone numbers, and complete review data including author details, photos, and metadata from Yelp business URLs or search results.

Jamshaid Arif

Yelp Scraper — Business Info, Reviews, Hours, Tech Stack

Changelog

[1.4] — 2026-05-23 — Slug-fallback for upstream failures

Added — Recovery from upstream throttling

Improved

Compatibility

[1.3] — 2026-05-23 — Sales-ops automation kit

Added — 9 new feature layers

Improved

Compatibility

[1.2] — 2026-05-23 — Lead-gen powerhouse

Added — 8 new feature layers (no cost increase)

Improved

Compatibility

[1.1] — 2026-05-22

Removed

Improved

Verified on production

[1.0] — 2026-05-22

Added — Major intelligence expansion

Compatibility

[0.0] — 2026-05-16

Added

You might also like

Yelp Business Scraper - Reviews, Ratings, Hours & Contact Info

Yelp Business Scraper

Yelp Business Data Scraper - Phone, Address, Hours & Leads

Yelp Business Scraper

Yelp Scraper — Business Reviews, Ratings & Contacts

Yelp Business Scraper

Yelp Business Search Scraper

Yelp Business Scraper — Reviews, Ratings & Local Business Data

Yelp Business & Reviews Scraper

Yelp Reviews Scraper

Changelog

[1.4] — 2026-05-23 — Slug-fallback for upstream failures

Added — Recovery from upstream throttling

Improved

Compatibility

[1.3] — 2026-05-23 — Sales-ops automation kit

Added — 9 new feature layers

Improved

Compatibility

[1.2] — 2026-05-23 — Lead-gen powerhouse

Added — 8 new feature layers (no cost increase)

Improved

Compatibility

[1.1] — 2026-05-22

Removed

Improved

Verified on production

[1.0] — 2026-05-22

Added — Major intelligence expansion

Compatibility

[0.0] — 2026-05-16

Added