Yelp Business Info Scraper
Pricing
$24.99/month + usage
Extract detailed business information from Yelp with the Yelp Business Info Scraper. Collect business names, ratings, review counts, phone numbers, addresses, categories, price ranges, and more. Suited to market research, lead generation, competitive analysis, and local business insights.
Developer

Scrapier
Scrape detailed business information from Yelp business pages at scale. This Apify Actor extracts title, rating, reviews, address, phone, hours, images, categories, services, and more from one or many Yelp URLs.
Why Choose Us?
- Bulk URLs – Process many Yelp business URLs in one run.
- Proxy fallback – Starts without a proxy; if Yelp blocks a request, the Actor automatically falls back to a datacenter proxy, then to a residential proxy, and keeps using residential for the remaining requests.
- Structured output – Same schema every time: title, rating, address, hours, businessServices, images, etc.
- Live saving – Results are pushed to the dataset as they’re scraped, so partial data is saved even if the run stops early.
Key Features
- No proxy by default – Sends requests directly to Yelp and switches to a proxy only after Yelp blocks a request.
- Automatic proxy fallback – On 403 or captcha: try datacenter proxy, then residential (with up to 3 retries), then stick with residential for all remaining requests.
- Clear proxy logging – Logs when using no proxy, datacenter, or residential, and when fallbacks occur.
- Retries – Up to 3 retries per request with delays to reduce blocking.
- Stealth – Playwright with anti-detection and human-like behavior.
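The fallback order described in the list above can be sketched as a small state machine. This is an illustrative sketch only, not the Actor's actual source: the `ProxyFallback` class, the `Blocked` exception, and the `fetch(url, tier)` callable are hypothetical names, and the delays between retries are left out for brevity.

```python
from typing import Callable, Optional

# Escalation order from the feature list: direct, then datacenter, then residential.
TIERS = ["none", "datacenter", "residential"]
MAX_RETRIES = 3  # extra attempts allowed on the residential tier


class Blocked(Exception):
    """Hypothetical signal raised by `fetch` on a 403 or captcha page."""


class ProxyFallback:
    def __init__(self, fetch: Callable[[str, str], str]):
        self.fetch = fetch    # assumed signature: fetch(url, tier) -> page HTML
        self.start_tier = 0   # index into TIERS for the next request

    def get(self, url: str) -> Optional[str]:
        for tier_index in range(self.start_tier, len(TIERS)):
            tier = TIERS[tier_index]
            if tier == "residential":
                # Once residential is reached, later requests start there.
                self.start_tier = tier_index
            attempts = 1 + MAX_RETRIES if tier == "residential" else 1
            for _ in range(attempts):
                try:
                    return self.fetch(url, tier)
                except Blocked:
                    continue  # retry on this tier, or move to the next one
        return None  # every tier was blocked
```

Keeping the sticky state in one integer means a later request can never drop back to a tier that has already been blocked during the run.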
Input
| Field | Type | Required | Description |
|---|---|---|---|
| startUrls | array | Yes | List of Yelp business page URLs (e.g. https://www.yelp.com/biz/business-name-city-1). Supports bulk input. |
| proxyConfiguration | object | No | Proxy settings. Default: no proxy. If Yelp blocks, the actor will fall back to datacenter then residential. |
Example input (JSON):
    {
      "startUrls": [
        { "url": "https://www.yelp.com/biz/dandelion-cafe-houston-3" },
        { "url": "https://www.yelp.com/biz/credence-houston-3" }
      ],
      "proxyConfiguration": { "useApifyProxy": false }
    }
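For bulk runs it can be easier to generate the input from a plain list of URLs. The sketch below is one possible approach: `build_input` is a hypothetical helper, and the URL check (a yelp.com host and a `/biz/` path) is an assumption based on the example URLs, not a rule the Actor is known to enforce.

```python
from urllib.parse import urlparse


def build_input(urls, use_apify_proxy=False):
    """Build an Actor input dict from an iterable of Yelp business URLs."""
    start_urls = []
    seen = set()
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        host = parsed.hostname or ""
        is_biz_page = (
            parsed.scheme in ("http", "https")
            and (host == "yelp.com" or host.endswith(".yelp.com"))
            and parsed.path.startswith("/biz/")
        )
        if not is_biz_page:
            raise ValueError(f"Not a Yelp business page URL: {raw!r}")
        if url not in seen:  # skip exact duplicates, keep first-seen order
            seen.add(url)
            start_urls.append({"url": url})
    return {
        "startUrls": start_urls,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }
```

Passing the result through `json.dumps` gives a payload in the same shape as the example input.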
Output
The Actor pushes one object per business to the default dataset. Each item has the following structure:
| Field | Description |
|---|---|
| title | Business name |
| rating | Star rating (e.g. "4.0") |
| reviewCount | e.g. "99 reviews" |
| isClaimed | e.g. "Claimed" or empty |
| priceLevel | e.g. "$", "$$" |
| categories | Comma-separated categories |
| fullAddress, city, state, zipcode | Address fields |
| phoneNumber | Formatted phone |
| images | Array of image URLs |
| website | Business website URL |
| hours | Object with day names and time ranges |
| businessOwnerName, about | Owner and description |
| reviewhighlights | Array of highlight data |
| businessServices | Object of service names to boolean |
| yelp_biz_id | Yelp internal business ID |
| timestamp | When the record was scraped |
| url | Yelp page URL |
| source_url | Same as url (the source page) |
| is_page_not_found | Boolean |
| status | e.g. "SUCCEEDED" |
Example output item:
    {
      "title": "Dandelion Cafe",
      "rating": "4.0",
      "reviewCount": "99 reviews",
      "city": "Houston",
      "state": "TX",
      "phoneNumber": "(832) 888-1568",
      "url": "https://www.yelp.com/biz/dandelion-cafe-houston-3",
      "yelp_biz_id": "h4UA0ul9Y3grjjQRBvcgXQ",
      "status": "SUCCEEDED",
      "source_url": "https://www.yelp.com/biz/dandelion-cafe-houston-3"
    }
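Several fields arrive as display strings (`rating` as "4.0", `reviewCount` as "99 reviews", `categories` as a comma-separated string), so downstream analysis usually needs a normalization pass. The following is a minimal sketch with a hypothetical `normalize_item` helper; it assumes the string formats shown in the output table and that large counts may contain thousands separators.

```python
import re


def normalize_item(item):
    """Return a copy of a dataset item with typed rating, count, and categories."""
    rating_text = (item.get("rating") or "").strip()
    # Assumption: the count is the first run of digits, possibly with commas.
    count_match = re.search(r"\d[\d,]*", item.get("reviewCount") or "")
    categories_text = item.get("categories") or ""
    return {
        **item,
        "rating": float(rating_text) if rating_text else None,
        "reviewCount": (
            int(count_match.group().replace(",", "")) if count_match else None
        ),
        "isClaimed": item.get("isClaimed") == "Claimed",
        "categories": [c.strip() for c in categories_text.split(",") if c.strip()],
    }
```

Fields that are missing or empty become `None` (or an empty list for `categories`) rather than raising, which keeps partial records usable.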
How to Use the Actor (via Apify Console)
- Log in at https://console.apify.com and go to Actors.
- Find Yelp Business Info Scraper and open it.
- Configure Input:
  - startUrls: Add one or more Yelp business page URLs.
  - proxyConfiguration: Leave the default (no proxy) or enable Apify proxy; the actor will still start without a proxy and fall back if Yelp blocks.
- Click Start.
- Watch Log for progress and proxy messages (no proxy → datacenter → residential).
- Open the Output tab to see the dataset.
- Export to JSON or CSV as needed.
Run via API (cURL)
    curl -X POST \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
      -d '{"startUrls":[{"url":"https://www.yelp.com/biz/dandelion-cafe-houston-3"}],"proxyConfiguration":{"useApifyProxy":false}}' \
      "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs?token=YOUR_APIFY_TOKEN"
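The same request can be prepared from Python using only the standard library. This sketch mirrors the cURL call above with the token sent in the `Authorization` header only; `build_run_request` is a hypothetical helper name, and `YOUR_ACTOR_ID` and `YOUR_APIFY_TOKEN` remain placeholders to fill in.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, actor_input):
    """Prepare (but do not send) the POST request that starts an Actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

To start the run, pass the returned object to `urllib.request.urlopen` and read the JSON response body. Building the request separately from sending it makes the payload easy to inspect before any usage is billed.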
Best Use Cases
- Building local business databases from Yelp.
- Enriching leads with address, phone, hours, and categories.
- Comparing businesses by rating, reviews, and services.
- One-off or scheduled scraping of many Yelp listings.
Frequently Asked Questions
Does the actor use a proxy by default?
No. It starts with no proxy. If Yelp returns 403 or a captcha page, it switches to datacenter proxy, then to residential proxy (with up to 3 retries), and then keeps using residential for the rest of the run.
Can I run it with proxy from the start?
You can enable proxy in the input. The actor still begins by sending requests without proxy and only uses proxy after a block, so behavior stays consistent with the fallback logic.
Why did some URLs fail?
Failures can be due to rate limiting, captcha, or invalid/removed pages. Check the log for [PROXY] and retry messages. For persistent blocks, the actor will have already tried datacenter and residential fallback.
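After a run, the dataset items can be triaged to separate removed pages from URLs worth retrying. The sketch below assumes that failed URLs still appear in the dataset with a `status` other than "SUCCEEDED", which the output table does not confirm; `split_results` is a hypothetical helper.

```python
def split_results(items):
    """Split dataset items into succeeded items, missing pages, and URLs to retry."""
    succeeded, missing, retry_urls = [], [], []
    for item in items:
        if item.get("is_page_not_found"):
            # Invalid or removed page: retrying will not help.
            missing.append(item)
        elif item.get("status") == "SUCCEEDED":
            succeeded.append(item)
        else:
            # Assumption: any other status means a block or rate limit.
            retry_urls.append(item.get("source_url") or item.get("url"))
    return succeeded, missing, retry_urls
```

The `retry_urls` list can then be fed back in as `startUrls` for a follow-up run.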
Support and Feedback
Use the Apify platform support or the actor’s repository for bugs and feature requests.
Cautions
- Data is collected only from publicly available Yelp pages.
- Do not scrape private or password-protected content.
- You are responsible for complying with applicable laws (e.g. privacy, data protection, and terms of use).