Yelp Reviews Scraper - Sentiment, Topics, Competitor Delta

Pricing

from $10.00 / 1,000 business summaries

Bulk Yelp review scraper with per-review sentiment, topic clusters (food, service, wait, value, parking), responder tracking, 12-month trend and competitor delta. LLM-ready JSON for reputation, local SEO and chain ops teams.

Developer

Seibs.co

Maintained by Community

Yelp Reviews Pro

TL;DR for local SEO agencies, multi-unit franchise managers, and reputation-management teams: this actor pulls Yelp reviews for one or many businesses and adds built-in sentiment, 9-topic clustering, owner-response tracking, a 12-month trend label, and pairwise competitor delta. Compared to a generic Yelp scraper, you get an intelligence layer on top: sentiment with negation handling, topic tags, a response-gap metric, competitor delta, and an llm_ready Markdown summary mode for AI agents. The $5 platform credit on the free Apify plan covers runs with a small number of business inputs. Pay-per-event (PPE) charges scale with the number of businesses and reviews processed. Upgrade to Apify Starter ($49/mo) for production volume.

Run it in 30 seconds

# Via the Apify Python client (pip install apify-client)
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")
run = client.actor("seibs.co/yelp-reviews-pro").call(run_input={
    "mode": "single_business_deep",  # mode name as listed in the Modes table
    "business_inputs": [
        "https://www.yelp.com/biz/the-french-laundry-yountville"
    ],
    "max_reviews_per_business": 200,
    "include_topic_clustering": True,  # Python boolean, not JSON true
})
# Print every record the run pushed to its default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

Or via curl:

curl -X POST "https://api.apify.com/v2/acts/seibs.co~yelp-reviews-pro/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"mode": "single_business_deep", "business_inputs": ["https://www.yelp.com/biz/the-french-laundry-yountville"], "max_reviews_per_business": 200, "include_topic_clustering": true}'

Or click "Try for free" on this page if you prefer the no-code UI.

What you get

Each run produces:

  • A clean dataset, filterable in the Apify console and downloadable as CSV or JSON
  • An OUTPUT.html dashboard preview of your top records
  • A sample-output preview at ./.actor/sample-output.json

Custom artifacts shipped with this actor:

  • top-negative-reviews.html (with suggested response templates and copy-to-clipboard buttons)
  • sentiment-trend.csv (12-month trend with improving / flat / declining label)
  • competitor-delta.csv (pairwise rating, response-rate, and top-complaint diff)

What does Yelp Reviews Pro do?

It wraps the upstream agents/yelp-reviews actor and layers an analysis pass on top: per-review sentiment with negation handling, topic tagging across nine common categories, owner-response stats, a 12-month time series with an improving / flat / declining trend label, and pairwise competitor delta. Optional LLM-ready Markdown summaries drop straight into agent prompts.

AI / RAG / Agent

Built for AI reputation-management agents and local-SEO bots. Set output_format=llm_ready to get a pre-summarized Markdown block per business (rating trend, top complaints, top praise, response gap, competitor delta) that a model can ingest in a single prompt. Per-review records carry sentiment, review_topics, reviewer_is_elite, and useful_count as embedding metadata. Compatible with LangChain, LlamaIndex, Pinecone, Weaviate, Chroma, and any MCP-aware agent runtime.

from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")
run = client.actor("seibs.co/yelp-reviews-pro").call(run_input={
    "mode": "batch_analysis",
    "business_inputs": [
        "https://www.yelp.com/biz/joes-pizza-new-york-9",
        "https://www.yelp.com/biz/prince-street-pizza-new-york"
    ],
    "max_reviews_per_business": 200,
    "review_sort": "newest",
    "output_format": "llm_ready"
})
# Print only the per-business Markdown summaries.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("record_type") == "business_summary":
        print(item["llm_summary_md"])

Features

  • Per-review sentiment (positive / neutral / negative) with sentiment_score in [-1, 1] - lexicon-based, no API key, no per-call cost.
  • Topic clustering - food_quality, service, wait_time, cleanliness, value_pricing, ambience, staff_friendliness, parking, accessibility, aggregated into topic_distribution and ranked into top_complaint_topics / top_praise_topics.
  • Responder tracking - business-owner response rate, average time-to-respond in days, owner reply sentiment distribution.
  • Time-series breakdown - last 12 months of review counts + average rating, with derived recent_trend.
  • Competitor delta mode - given 2-5 businesses, returns pairwise rating delta, review-count delta, common complaints, unique complaints per side.
  • LLM-ready output - set output_format=llm_ready and every record gets a llm_summary_md markdown block.
  • Yelp-native reviewer signals - reviewer_is_elite, reviewer_review_count, reviewer_friend_count, reviewer_photo_count (shill / credibility weighting).
  • Review reactions - useful_count, funny_count, cool_count per review.
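The competitor delta feature above compares businesses pairwise. A minimal sketch of that idea follows. It assumes summary records shaped like the business_summary output (business_name, current_rating, total_review_count, top_complaint_topics). The helper name pairwise_delta and its output keys are hypothetical and are not the actor's actual schema.

```python
from itertools import combinations


def pairwise_delta(summaries):
    """Compare every pair of business summaries (hypothetical helper)."""
    deltas = []
    for a, b in combinations(summaries, 2):
        complaints_a = set(a["top_complaint_topics"])
        complaints_b = set(b["top_complaint_topics"])
        deltas.append({
            "pair": (a["business_name"], b["business_name"]),
            "rating_delta": round(a["current_rating"] - b["current_rating"], 2),
            "review_count_delta": a["total_review_count"] - b["total_review_count"],
            "common_complaints": sorted(complaints_a & complaints_b),
            "unique_to_first": sorted(complaints_a - complaints_b),
            "unique_to_second": sorted(complaints_b - complaints_a),
        })
    return deltas


# Illustrative data only, not real scrape results.
sample = [
    {"business_name": "A", "current_rating": 4.5, "total_review_count": 800,
     "top_complaint_topics": ["wait_time", "parking"]},
    {"business_name": "B", "current_rating": 4.0, "total_review_count": 500,
     "top_complaint_topics": ["wait_time", "service"]},
    {"business_name": "C", "current_rating": 3.5, "total_review_count": 900,
     "top_complaint_topics": ["cleanliness"]},
]
deltas = pairwise_delta(sample)
print(len(deltas))  # one record per pair
```

Comparing every pair yields n(n-1)/2 records, so three businesses give three comparisons in this sketch.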

Use cases

  • Chain reputation managers monitoring 20-500 Yelp locations who currently glue together a raw review scrape + sentiment notebook + dashboard.
  • Local SEO agencies running monthly client reports - drop the LLM markdown straight into the report template.
  • Restaurant groups / multi-location service businesses with response-rate / time-to-respond / recent-trend KPIs.
  • Competitive intel teams - competitor_delta answers "what do customers complain about us vs the competition?"
  • AI product builders feeding local-business data into LLM workflows who need pre-summarized markdown, not raw JSON.

FAQ

Q: Is this legal? A: Yes - Yelp reviews are public and we go through the upstream agents/yelp-reviews actor, which scrapes the public Yelp frontend (not the paid Yelp Fusion API). Use the data per Yelp's Terms of Service and applicable law.

Q: Why might a run fail? A: (1) Yelp's anti-bot blocks the session - the row comes back with available: false and a reason. RESIDENTIAL proxy is mandatory; datacenter IPs are rejected. (2) Business URL has no /biz/<slug> segment - upstream returns nothing. (3) Pushing max_reviews_per_business above 500 on many businesses at once triggers rate limits - lower concurrency or split the run.
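Since blocked businesses come back as rows with available: false, it can help to separate them from usable rows before analysis. A minimal sketch, assuming the explanation is stored under a key named reason (the exact field name is an assumption) and using a hypothetical helper name:

```python
def split_by_availability(items):
    """Split dataset rows into usable rows and failed rows (hypothetical helper)."""
    ok, failed = [], []
    for item in items:
        # Rows without an "available" key are treated as usable.
        if item.get("available", True):
            ok.append(item)
        else:
            failed.append(item)
    return ok, failed


# Illustrative rows only; the "reason" key and its text are assumptions.
rows = [
    {"yelp_url": "https://www.yelp.com/biz/joes-pizza-new-york-9", "available": True},
    {"yelp_url": "https://www.yelp.com/biz/prince-street-pizza-new-york",
     "available": False, "reason": "blocked"},
]
ok_rows, failed_rows = split_by_availability(rows)
for row in failed_rows:
    print(row["yelp_url"], "->", row.get("reason", "unknown"))
```

The failed URLs can then be fed into a smaller follow-up run.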

Q: How fresh is the data? A: Live at crawl time. Reviews are read directly from the public Yelp page during the run. With review_sort: newest, the most recent reviews come first, and new reviews typically show up within minutes of being posted.

Q: Can I schedule this daily or weekly? A: Yes - weekly is the standard cadence for chain reputation monitoring. Daily for crisis-watch on a single high-volume business. Use Apify Schedules; combine with recent_trend to alert on declining flips.
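One way to alert on declining flips is to keep the recent_trend value from the previous scheduled run and compare it with the current one. The sketch below assumes you store a mapping of yelp_url to recent_trend between runs. The function name and the storage approach are hypothetical.

```python
def declining_flips(previous, current):
    """Return URLs whose trend is 'declining' now but was not in the previous run.

    Both arguments map yelp_url -> recent_trend. A business that is new in
    `current` and already declining is also reported.
    """
    flips = []
    for url, trend in current.items():
        if trend == "declining" and previous.get(url) != "declining":
            flips.append(url)
    return sorted(flips)


# Illustrative values only.
last_week = {"biz-a": "improving", "biz-b": "declining", "biz-c": "flat"}
this_week = {"biz-a": "declining", "biz-b": "declining", "biz-c": "flat"}
print(declining_flips(last_week, this_week))
```

Only biz-a is reported here, because biz-b was already declining in the previous run.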

Q: How do I push results into a CRM or BI tool? A: Two paths. (1) output_format: csv_friendly flattens reviews for direct import into BI dashboards (Looker, Power BI, Sheets). (2) output_format: llm_ready drops llm_summary_md straight into agent prompts or a client report template. Zapier/Make/n8n forward business-summary records to HubSpot, Salesforce, or a Slack channel on negative-trend alerts.
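If you run in JSON mode and still want a flat file, the nested reviews can be flattened on the client side. This is a minimal sketch using only the standard library and the field names shown in the sample output. It is not the actor's csv_friendly implementation, and the column choice is an assumption.

```python
import csv
import io


def reviews_to_csv(summary):
    """Flatten one business_summary record into CSV text, one row per review."""
    fields = ["business_name", "rating", "sentiment", "sentiment_score", "review_topics"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for review in summary.get("reviews", []):
        writer.writerow({
            "business_name": summary["business_name"],
            "rating": review["rating"],
            "sentiment": review["sentiment"],
            "sentiment_score": review["sentiment_score"],
            # Join the topic list so it fits in a single CSV cell.
            "review_topics": ";".join(review["review_topics"]),
        })
    return buffer.getvalue()


record = {
    "business_name": "Joe's Pizza",
    "reviews": [
        {"rating": 5, "sentiment": "positive", "sentiment_score": 0.82,
         "review_topics": ["food_quality"]},
    ],
}
csv_text = reviews_to_csv(record)
print(csv_text)
```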

Q: How is this different from agents/yelp-reviews or tri_angle/yelp-review-scraper? A: Those are the upstream raw-scraper layer - they pull reviews and exit. This actor wraps that scrape and layers an intelligence pass on top: per-review sentiment with negation handling, 9-topic clustering with aggregated top_complaint_topics / top_praise_topics, owner-responder metrics (response rate, time-to-respond, reply sentiment), 12-month time-series with improving / flat / declining trend label, pairwise competitor delta, and LLM-ready markdown summaries. You are paying for the analysis layer, not the scrape - if all you need is raw reviews, use the upstream directly.

Q: How accurate is the sentiment classifier? A: ~85% on English consumer reviews against human labels in spot-check sets; lexicon-based with 2-token negation lookback. Non-English (es, fr, de) runs ~70-75%.
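To make "lexicon-based with 2-token negation lookback" concrete, here is a minimal sketch of the general technique. The lexicon, the negator list, the averaging rule, and the label threshold are all illustrative assumptions and are not the actor's actual word lists or scoring.

```python
import re

# Tiny illustrative lexicon; a production lexicon would be far larger.
LEXICON = {"good": 1.0, "great": 1.0, "best": 1.0, "friendly": 0.5,
           "bad": -1.0, "slow": -0.5, "rude": -1.0, "cold": -0.5}
NEGATORS = {"not", "no", "never", "wasn't", "isn't", "don't"}


def sentiment_score(text, lookback=2):
    """Average lexicon weight in [-1, 1], flipping words preceded by a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total, hits = 0.0, 0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        weight = LEXICON[token]
        # Look at up to `lookback` tokens before the sentiment word.
        window = tokens[max(0, i - lookback):i]
        if any(t in NEGATORS for t in window):
            weight = -weight
        total += weight
        hits += 1
    return total / hits if hits else 0.0


def label(score, threshold=0.1):
    """Map a score to positive / neutral / negative (threshold is an assumption)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


print(label(sentiment_score("The service was not bad")))
print(label(sentiment_score("Not very good")))
```

The lookback window is what lets "not very good" score as negative even though the negator is two tokens away from the sentiment word.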

Q: How does PPE pricing actually work here? A: $0.010 per business_summary, $0.001 per review_record, $0.020 per competitor_delta_record, $0.005 per llm_summary. A 100-review business in JSON mode is about $0.11; a 5-business competitor delta with 100 reviews each is about $0.65.

Related actors

  • Pair with any lead-finder actor (home-services-lead-finder, restaurants-lead-finder, salon-spa-lead-finder, etc.) - those build the lead list, this actor monitors each lead's Yelp reputation as a companion intelligence layer.
  • google-maps-reviews-pro - same intelligence layer applied to Google Maps reviews. Run both for cross-platform sentiment + topic + responder coverage.
  • reddit-topic-watcher - extend reputation monitoring beyond review platforms to Reddit complaint and praise threads.

Integrations

  • Zapier - push to HubSpot/Salesforce/Pipedrive/Apollo/Klaviyo
  • Make.com - workflow automation
  • n8n - self-hosted automation
  • Apify webhooks - POST to your endpoint
  • API + dataset export (JSON/CSV/Excel/XML)
  • MCP / AI agents - call from Claude/GPT/LangChain

Modes

| Mode | What it does | Inputs |
| --- | --- | --- |
| batch_analysis | Independent analysis of N businesses (up to 50). | List of Yelp URLs or business IDs. |
| competitor_delta | Pairwise comparison of 2-5 businesses. | 2-5 inputs. |
| single_business_deep | Max depth on one business; bumps max_reviews_per_business to 500+. | First input only. |

Input

See .actor/INPUT_SCHEMA.json. Sample:

{
  "mode": "batch_analysis",
  "business_inputs": [
    "https://www.yelp.com/biz/joes-pizza-new-york-9",
    "prince-street-pizza-new-york"
  ],
  "max_reviews_per_business": 100,
  "review_sort": "newest",
  "include_sentiment": true,
  "include_topic_clustering": true,
  "include_time_series": true,
  "output_format": "json",
  "apify_proxy_groups": ["RESIDENTIAL"],
  "concurrency": 4
}

Output

One record per business with the analysis layer attached. Sample:

{
  "record_type": "business_summary",
  "business_name": "Joe's Pizza",
  "yelp_url": "https://www.yelp.com/biz/joes-pizza-new-york-9",
  "current_rating": 4.5,
  "total_review_count": 8421,
  "sentiment_distribution": {"positive_pct": 78.0, "negative_pct": 9.0},
  "top_praise_topics": ["food_quality", "value_pricing"],
  "top_complaint_topics": ["wait_time"],
  "responder_metrics": {"response_rate": 12.0, "avg_response_time_days": 4.8},
  "recent_trend": "improving",
  "reviews": [
    {
      "rating": 5,
      "text": "Best slice in NY",
      "sentiment": "positive",
      "sentiment_score": 0.82,
      "review_topics": ["food_quality"],
      "reviewer_is_elite": true,
      "useful_count": 12,
      "funny_count": 2,
      "cool_count": 4
    }
  ],
  "available": true,
  "scraped_at": "2026-05-16T12:00:00Z"
}
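A record shaped like the sample above can be post-processed with a few lines of Python. The sketch below counts which topics appear in negative reviews, using only the sentiment and review_topics fields from the sample. The helper name and the example reviews are hypothetical.

```python
from collections import Counter


def complaint_topics(summary):
    """Count topics across reviews labeled negative, most frequent first."""
    counts = Counter()
    for review in summary.get("reviews", []):
        if review.get("sentiment") == "negative":
            counts.update(review.get("review_topics", []))
    return counts.most_common()


# Illustrative record only, not real scrape output.
summary = {
    "business_name": "Example Diner",
    "reviews": [
        {"sentiment": "negative", "review_topics": ["wait_time", "service"]},
        {"sentiment": "negative", "review_topics": ["wait_time"]},
        {"sentiment": "positive", "review_topics": ["food_quality"]},
    ],
}
ranked = complaint_topics(summary)
print(ranked)
```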

Pricing

Pay-per-event:

| Event | Price | When charged |
| --- | --- | --- |
| business_summary | $0.010 | Once per business successfully analyzed. |
| review_record | $0.001 | Once per individual review extracted. |
| competitor_delta_record | $0.020 | Once per pairwise comparison. |
| llm_summary | $0.005 | Once per business when output_format=llm_ready. |

Typical 100-review business in JSON mode: $0.11. 5-business competitor_delta with 100 reviews each: $0.65.
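The two figures above can be reproduced from the event prices. The sketch below is a rough estimator, not a billing tool, and the function name is hypothetical. The $0.65 figure matches five competitor_delta_record events. If every pair among five businesses were billed (ten pairs), the same formula would give $0.75. Which count applies is not stated here, so confirm delta_records against a real run.

```python
from decimal import Decimal

# Event prices copied from the pricing table.
PRICES = {
    "business_summary": Decimal("0.010"),
    "review_record": Decimal("0.001"),
    "competitor_delta_record": Decimal("0.020"),
    "llm_summary": Decimal("0.005"),
}


def estimate_cost(businesses, reviews_per_business, delta_records=0, llm_ready=False):
    """Estimate pay-per-event cost in USD for one run (hypothetical helper)."""
    total = businesses * PRICES["business_summary"]
    total += businesses * reviews_per_business * PRICES["review_record"]
    total += delta_records * PRICES["competitor_delta_record"]
    if llm_ready:
        total += businesses * PRICES["llm_summary"]
    return total


single = estimate_cost(1, 100)
five_way = estimate_cost(5, 100, delta_records=5)
print(single, five_way)
```

Decimal is used instead of float so that sub-cent prices add up exactly.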

Save your input as an Apify Task

Apify Tasks let you save a configured input once and re-run it with a single click, so you do not have to re-enter business URLs, review limits, or output settings every time. Tasks are the foundation for everything that comes next: schedules, monitor mode, and webhook routing all attach to a saved Task, not to the raw actor.

Steps to save your current input as a Task:

  1. On this actor's Apify Store page, click Run with your input fully configured.
  2. Click the Save as task button at the top of the run page.
  3. Name the task something memorable (e.g. Reviews for top 10 competitors - weekly).
  4. Reload the task page and click Start anytime to re-run with the same inputs.

Tasks unlock the next two features below: scheduling and monitor mode.

Run this weekly with Apify Schedules

Apify Schedules cron-run any saved Task automatically. Pair this with the saved Task above and you get hands-off recurring runs with no manual clicks, no missed weeks, and a steady stream of fresh data into your CRM or warehouse.

Steps to schedule a Task:

  1. Save your input as a Task (see above).
  2. Go to https://console.apify.com/schedules and click Create new schedule.
  3. Pick your Task and set the cron expression. Common patterns:
    • Daily at 9am UTC: 0 9 * * *
    • Weekly on Mondays at 9am UTC: 0 9 * * 1
    • Monthly on the 1st at 9am UTC: 0 9 1 * *
  4. Save. Apify will run your Task on that schedule automatically, push the dataset to whatever integrations you have wired up, and fire run-completion webhooks for downstream automation.
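The cron patterns in step 3 use the standard five-field layout (minute, hour, day of month, month, day of week). As a quick sanity check before pasting an expression into a schedule, a small helper like the one below can label the fields. The helper is hypothetical and only checks the field count, not the validity of each value.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expression):
    """Split a five-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


weekly = parse_cron("0 9 * * 1")
print(weekly)
```

For the weekly pattern, day_of_week is 1 (Monday) and hour is 9, with day_of_month and month left as wildcards.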

Run weekly to track sentiment trends, catch negative reviews fast, and feed fresh review text into your voice-of-customer (VOC) pipeline.

Monitor mode (v2, beta)

Monitor mode is the v2 evolution of this actor and is currently in BETA. It turns a recurring schedule into a true change-feed instead of a firehose of duplicate records.

How it works:

  • When this actor runs under an Apify Schedule, monitor mode is enabled automatically.
  • Instead of emitting ALL records every run, it emits ONLY records that are NEW or CHANGED since the last scheduled run.
  • A digest record summarizes the delta (X new, Y changed, Z removed) at the top of every run.
  • Optional: provide a Slack or email webhook URL in the monitor_webhook_url input field and the digest fires there too, so your team gets the delta in their inbox or channel without polling the dataset.
  • Cost: a single scheduled_delta_run event ($0.05) per scheduled run, plus standard PPE on emitted delta records only. Predictable monthly cost, no surprise bills from re-charging for unchanged records.
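The change-feed behavior described above can be pictured as a diff between two runs. The sketch below shows the general idea only. It assumes each record has a stable key (a hypothetical review ID, which does not appear in the sample output) and it is not the actor's monitor-mode implementation.

```python
def diff_runs(previous, current):
    """Compare two runs keyed by a stable record ID (hypothetical key).

    Returns a digest of counts plus the records that are new or changed.
    """
    new = [k for k in current if k not in previous]
    removed = [k for k in previous if k not in current]
    changed = [k for k in current if k in previous and current[k] != previous[k]]
    digest = {"new": len(new), "changed": len(changed), "removed": len(removed)}
    emitted = [current[k] for k in new + changed]
    return digest, emitted


# Illustrative records only.
last_run = {
    "r1": {"rating": 5, "text": "Great"},
    "r2": {"rating": 2, "text": "Slow"},
    "r3": {"rating": 4, "text": "Fine"},
}
this_run = {
    "r1": {"rating": 5, "text": "Great"},
    "r2": {"rating": 3, "text": "Slow, but fixed"},
    "r4": {"rating": 1, "text": "Cold food"},
}
digest, emitted = diff_runs(last_run, this_run)
print(digest)
```

Unchanged records such as r1 are skipped, which is what keeps a recurring run from re-emitting the full dataset.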

Monitor mode is rolling out first to three actors: hotel-motel-lead-finder, google-maps-reviews-pro, and mcp-accounting-firm-leads. Coverage for the rest of the portfolio, including this actor, is planned by the end of June.

Support

Open an issue on the actor's GitHub or contact via Apify Store. Include the run ID and input config.

Changelog

See ./CHANGELOG.md.

Found this useful?

If this actor saved you time or money, please consider leaving a quick review on the Apify Store. Reviews help other buyers find work that solves their problem and let me prioritize the features paying customers actually use. Leave a review: https://apify.com/seibs.co/yelp-reviews-pro#reviews