# Yelp Scraper (`tutiny/yelp-scraper`) Actor

Scrape Yelp for business listings, reviews, photos, hours, contact info, ratings, and more. Supports keyword + location search with automatic pagination, proxy rotation, and rate limiting.

- **URL**: https://apify.com/tutiny/yelp-scraper.md
- **Developed by:** [Daniel](https://apify.com/tutiny) (community)
- **Categories:** E-commerce
- **Stats:** 2 total users, 1 monthly user, 0.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use, and you only pay for the Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Yelp Scraper

Extract business listings, reviews, photos, hours, and contact information from Yelp at scale. Supports keyword + location search with automatic pagination, built-in rate limiting, Apify proxy rotation, and a Yelp Fusion API fallback for maximum reliability.

### What does Yelp Scraper do?

This Actor searches Yelp for businesses matching your keywords and location(s), then extracts complete details for each result including:

- **Business info** — name, address, phone, website, price range
- **Ratings & reviews** — star rating, review count, full review text, reviewer name, date
- **Categories** — all business categories listed on Yelp
- **Operating hours** — full weekly schedule
- **Photos** — direct URLs to all business photos
- **Coordinates** — latitude / longitude

### Use Cases

- Competitive intelligence & market research
- Lead generation for sales teams
- Local SEO analysis
- Aggregating restaurant/service reviews
- Building location-based datasets

### Input

| Field | Type | Default | Description |
|---|---|---|---|
| `searchQueries` | string[] | `["restaurants"]` | Keywords to search (e.g. `"car wash"`, `"dentist"`) |
| `locations` | string[] | `["Los Angeles, CA"]` | Locations to search |
| `maxResults` | integer | `50` | Max businesses per query+location combo |
| `scrapeReviews` | boolean | `true` | Also scrape reviews |
| `maxReviewsPerBusiness` | integer | `20` | Max reviews per business |
| `scrapePhotos` | boolean | `true` | Collect photo URLs |
| `scrapeHours` | boolean | `true` | Collect operating hours |
| `proxyConfiguration` | object | Apify Proxy | Proxy settings (residential recommended) |
| `requestHandlerTimeoutSecs` | integer | `60` | Timeout in seconds for each page/request |
| `rateLimitDelayMs` | integer | `2000` | Delay between requests (ms) |
| `maxConcurrency` | integer | `3` | Parallel browser sessions |
| `yelpFusionApiKey` | string | — | Optional Yelp API key for fallback |
| `startUrls` | URL[] | — | Advanced: directly provide Yelp URLs |

### Output

Each business is saved as a JSON object in the default Dataset:

```json
{
  "name": "Joe's Pizza",
  "url": "https://www.yelp.com/biz/joes-pizza-new-york-2",
  "phone": "(212) 366-1182",
  "address": "7 Carmine St, New York, NY 10014",
  "rating": 4.5,
  "reviewCount": 4821,
  "categories": ["Pizza", "Italian"],
  "priceRange": "$$",
  "website": "https://www.joespizzanyc.com",
  "hours": {
    "Mon": "10:00–04:00",
    "Tue": "10:00–04:00",
    "Wed": "10:00–04:00",
    "Thu": "10:00–04:00",
    "Fri": "10:00–05:00",
    "Sat": "10:00–05:00",
    "Sun": "10:00–04:00"
  },
  "photos": [
    "https://s3-media2.fl.yelpcdn.com/bphoto/xxx/o/photo1.jpg",
    "https://s3-media3.fl.yelpcdn.com/bphoto/yyy/o/photo2.jpg"
  ],
  "coordinates": { "lat": 40.7303, "lng": -74.0024 },
  "isClosed": false,
  "source": "scrape",
  "scrapedAt": "2024-06-01T12:00:00.000Z",
  "reviews": [
    {
      "userName": "Jane D.",
      "rating": 5,
      "date": "2024-05-15",
      "reviewText": "Best pizza in NYC, hands down. The crust is perfect.",
      "source": "dom"
    }
  ]
}
````
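
For analysis it is often convenient to flatten the nested `reviews` array into one row per review. The sketch below assumes items shaped like the example above; `flatten_reviews` is a hypothetical helper for illustration, not part of the Actor or the Apify client.

```python
from typing import Any, Dict, Iterator


def flatten_reviews(business: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per review, carrying the parent business name and URL."""
    # Assumption: `reviews` can be absent or empty (for example when
    # `scrapeReviews` is false), so fall back to an empty list.
    for review in business.get("reviews") or []:
        yield {
            "businessName": business.get("name"),
            "businessUrl": business.get("url"),
            "userName": review.get("userName"),
            "rating": review.get("rating"),
            "date": review.get("date"),
            "reviewText": review.get("reviewText"),
        }


# Sample item trimmed from the output example above.
sample_item = {
    "name": "Joe's Pizza",
    "url": "https://www.yelp.com/biz/joes-pizza-new-york-2",
    "rating": 4.5,
    "reviews": [
        {
            "userName": "Jane D.",
            "rating": 5,
            "date": "2024-05-15",
            "reviewText": "Best pizza in NYC, hands down. The crust is perfect.",
            "source": "dom",
        }
    ],
}

rows = list(flatten_reviews(sample_item))
for row in rows:
    print(row["businessName"], row["userName"], row["rating"])
```

The resulting rows map directly onto a CSV file or a database table with one review per record.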

### Anti-blocking & Reliability

- **Residential proxy rotation** via Apify Proxy (recommended)
- **Stealth Playwright** — headless Chromium with `webdriver` fingerprint masking
- **Configurable rate limiting** — default 2 second delay between requests
- **Automatic retries** — up to 3 retries with exponential back-off
- **Yelp Fusion API fallback** — if scraping fails, falls back to the official API (requires free API key)
- **State persistence** — survives Actor migrations/restarts
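
The retry behavior listed above can be illustrated with a generic sketch of exponential back-off. This is not the Actor's source code: `retry_with_backoff`, its parameters, and the choice of a doubling delay starting at 2 seconds are assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    task: Callable[[], T],
    max_retries: int = 3,
    base_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `task`, retrying up to `max_retries` times with doubling delays."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            # The delay doubles on every retry: base, 2 * base, 4 * base, ...
            sleep(base_delay_s * (2 ** attempt))
            attempt += 1
```

With the defaults, a task that keeps failing is attempted four times in total (one initial call plus three retries), waiting 2, 4, and 8 seconds between attempts. The `sleep` parameter is injectable so the function can be tested without real waiting.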

### Performance & Cost

| Setup | Speed | Cost estimate |
|---|---|---|
| Residential proxies | ~50 biz/min | ~$5 per 1,000 businesses |
| Datacenter proxies | ~100 biz/min | ~$2 per 1,000 businesses |
| No proxy (testing) | ~30 biz/min | Free (may get blocked) |
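
To budget a run, the table's figures can be combined with the number of query and location combinations. The sketch below is a rough estimate only: `estimate_run` is a hypothetical helper, the speeds and costs are copied from the table above, and the business count is an upper bound because a search may return fewer results than `maxResults`.

```python
from typing import Dict, NamedTuple


class ProxyProfile(NamedTuple):
    businesses_per_minute: float
    usd_per_1000: float


class RunEstimate(NamedTuple):
    max_businesses: int
    minutes: float
    cost_usd: float


# Approximate figures copied from the Performance & Cost table.
PROFILES: Dict[str, ProxyProfile] = {
    "residential": ProxyProfile(businesses_per_minute=50, usd_per_1000=5.0),
    "datacenter": ProxyProfile(businesses_per_minute=100, usd_per_1000=2.0),
}


def estimate_run(
    n_queries: int, n_locations: int, max_results: int, proxy: str = "residential"
) -> RunEstimate:
    """Estimate the upper bound on businesses, run time, and cost for a run."""
    profile = PROFILES[proxy]
    # `maxResults` applies to each query+location combination.
    max_businesses = n_queries * n_locations * max_results
    minutes = max_businesses / profile.businesses_per_minute
    cost_usd = max_businesses * profile.usd_per_1000 / 1000
    return RunEstimate(max_businesses, minutes, cost_usd)


estimate = estimate_run(n_queries=2, n_locations=3, max_results=50)
print(estimate)
```

For 2 queries across 3 locations with `maxResults` of 50, this gives at most 300 businesses, about 6 minutes, and roughly $1.50 on residential proxies.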

### Getting a Yelp Fusion API Key (optional)

1. Go to https://www.yelp.com/developers
2. Create a free app
3. Copy the API key into the `yelpFusionApiKey` input field

Note: The free Fusion API tier returns at most 3 reviews per business and allows 1,000 calls per day. The scraper itself is not subject to these limits.
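
Before passing a key to the Actor, it can be useful to check that the key works. The sketch below builds a request for Yelp's public Fusion business search endpoint using only the standard library. The endpoint, the `term`/`location`/`limit` parameters, and the Bearer header follow Yelp's Fusion documentation, not this Actor; how the Actor calls the API internally is not documented here, and `build_fusion_search_request` is a hypothetical helper.

```python
import urllib.parse
import urllib.request

FUSION_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_fusion_search_request(
    api_key: str, term: str, location: str, limit: int = 5
) -> urllib.request.Request:
    """Build (but do not send) a Fusion business search request."""
    query = urllib.parse.urlencode({"term": term, "location": location, "limit": limit})
    return urllib.request.Request(
        f"{FUSION_SEARCH_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


request = build_fusion_search_request("<YOUR_YELP_API_KEY>", "restaurants", "Los Angeles, CA")
print(request.full_url)

# Sending the request needs network access and a valid key, and each call
# counts against the daily quota mentioned above:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```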

### Legal & Terms of Service

This Actor is provided for educational and research purposes. Always review Yelp's [Terms of Service](https://www.yelp.com/static?p=tos) and robots.txt before scraping. Use responsibly and respect rate limits.

# Actor input Schema

## `searchQueries` (type: `array`):

List of search keywords (e.g. 'restaurants', 'car wash', 'dentist').

## `locations` (type: `array`):

List of locations to search (e.g. 'Los Angeles, CA', 'New York, NY').

## `maxResults` (type: `integer`):

Maximum number of businesses to extract per search combination. Set 0 for unlimited.

## `scrapeReviews` (type: `boolean`):

Whether to also scrape reviews for each business.

## `maxReviewsPerBusiness` (type: `integer`):

Maximum number of reviews to scrape per business. Set 0 for unlimited.

## `scrapePhotos` (type: `boolean`):

Whether to collect photo URLs for each business.

## `scrapeHours` (type: `boolean`):

Whether to scrape opening hours for each business.

## `proxyConfiguration` (type: `object`):

Apify Proxy configuration. Residential proxies are recommended for Yelp.

## `requestHandlerTimeoutSecs` (type: `integer`):

Timeout in seconds for each page/request.

## `maxConcurrency` (type: `integer`):

Maximum number of parallel browser/HTTP sessions.

## `rateLimitDelayMs` (type: `integer`):

Milliseconds to wait between requests to avoid being blocked.

## `yelpFusionApiKey` (type: `string`):

Optional. If provided, used as a fallback when direct scraping fails. Get one free at https://www.yelp.com/developers.

## `startUrls` (type: `array`):

Advanced: directly provide Yelp search or business URLs to scrape.
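
According to the OpenAPI specification in this document, each `startUrls` entry is an object with a required `url` property rather than a bare string. A minimal sketch of wrapping plain URLs into that shape follows; `to_start_urls` is a hypothetical helper, and the yelp.com host check is an added assumption, not something the Actor is known to enforce.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlparse


def to_start_urls(urls: Iterable[str]) -> List[Dict[str, str]]:
    """Wrap plain URL strings as {"url": ...} objects for the `startUrls` input."""
    start_urls = []
    for url in urls:
        host = urlparse(url).hostname or ""
        # Assumption: only yelp.com pages make sense for this Actor.
        if host != "yelp.com" and not host.endswith(".yelp.com"):
            raise ValueError(f"Not a Yelp URL: {url}")
        start_urls.append({"url": url})
    return start_urls


start_urls = to_start_urls(["https://www.yelp.com/biz/joes-pizza-new-york-2"])
print(start_urls)
```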

## Actor input object example

```json
{
  "searchQueries": [
    "restaurants"
  ],
  "locations": [
    "Los Angeles, CA"
  ],
  "maxResults": 50,
  "scrapeReviews": true,
  "maxReviewsPerBusiness": 20,
  "scrapePhotos": true,
  "scrapeHours": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "requestHandlerTimeoutSecs": 60,
  "maxConcurrency": 3,
  "rateLimitDelayMs": 2000,
  "startUrls": []
}
```
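
Checking an input object locally before starting a run can catch out-of-range values early. The sketch below uses the integer bounds listed in the OpenAPI specification in this document; `validate_input` is a hypothetical helper, and the platform's own validation may behave differently.

```python
from typing import Any, Dict, Tuple

# (minimum, maximum) pairs copied from the OpenAPI specification in this document.
INTEGER_BOUNDS: Dict[str, Tuple[int, int]] = {
    "maxResults": (1, 1000),
    "maxReviewsPerBusiness": (0, 500),
    "requestHandlerTimeoutSecs": (10, 300),
    "maxConcurrency": (1, 20),
    "rateLimitDelayMs": (500, 30000),
}


def validate_input(run_input: Dict[str, Any]) -> Dict[str, Any]:
    """Raise if any integer field present in `run_input` is outside its bounds."""
    for field, (low, high) in INTEGER_BOUNDS.items():
        if field not in run_input:
            continue  # omitted fields fall back to the Actor's defaults
        value = run_input[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{field} must be an integer, got {value!r}")
        if not low <= value <= high:
            raise ValueError(f"{field}={value} is outside [{low}, {high}]")
    return run_input


checked = validate_input({
    "searchQueries": ["restaurants"],
    "locations": ["Los Angeles, CA"],
    "maxResults": 50,
    "maxReviewsPerBusiness": 20,
    "requestHandlerTimeoutSecs": 60,
    "maxConcurrency": 3,
    "rateLimitDelayMs": 2000,
})
```

Note that the specification's minimum for `maxResults` is 1, so this check rejects 0 even though the field description mentions 0 for unlimited.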

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQueries": [
        "restaurants"
    ],
    "locations": [
        "Los Angeles, CA"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("tutiny/yelp-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchQueries": ["restaurants"],
    "locations": ["Los Angeles, CA"],
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("tutiny/yelp-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchQueries": [
    "restaurants"
  ],
  "locations": [
    "Los Angeles, CA"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}' |
apify call tutiny/yelp-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=tutiny/yelp-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Yelp Scraper",
        "description": "Scrape Yelp for business listings, reviews, photos, hours, contact info, ratings, and more. Supports keyword + location search with automatic pagination, proxy rotation, and rate limiting.",
        "version": "1.0",
        "x-build-id": "2ddTEXgp78XYCQ4bw"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/tutiny~yelp-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-tutiny-yelp-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/tutiny~yelp-scraper/runs": {
            "post": {
                "operationId": "runs-sync-tutiny-yelp-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/tutiny~yelp-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-tutiny-yelp-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchQueries": {
                        "title": "Search Keywords",
                        "type": "array",
                        "description": "List of search keywords (e.g. 'restaurants', 'car wash', 'dentist').",
                        "default": [
                            "restaurants"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "locations": {
                        "title": "Locations",
                        "type": "array",
                        "description": "List of locations to search (e.g. 'Los Angeles, CA', 'New York, NY').",
                        "default": [
                            "Los Angeles, CA"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxResults": {
                        "title": "Max Results (per query+location combo)",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of businesses to extract per search combination. Set 0 for unlimited.",
                        "default": 50
                    },
                    "scrapeReviews": {
                        "title": "Scrape Reviews",
                        "type": "boolean",
                        "description": "Whether to also scrape reviews for each business.",
                        "default": true
                    },
                    "maxReviewsPerBusiness": {
                        "title": "Max Reviews per Business",
                        "minimum": 0,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum number of reviews to scrape per business. Set 0 for unlimited.",
                        "default": 20
                    },
                    "scrapePhotos": {
                        "title": "Scrape Photo URLs",
                        "type": "boolean",
                        "description": "Whether to collect photo URLs for each business.",
                        "default": true
                    },
                    "scrapeHours": {
                        "title": "Scrape Business Hours",
                        "type": "boolean",
                        "description": "Whether to scrape opening hours for each business.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Apify Proxy configuration. Residential proxies are recommended for Yelp.",
                        "default": {
                            "useApifyProxy": true
                        }
                    },
                    "requestHandlerTimeoutSecs": {
                        "title": "Request Timeout (seconds)",
                        "minimum": 10,
                        "maximum": 300,
                        "type": "integer",
                        "description": "Timeout in seconds for each page/request.",
                        "default": 60
                    },
                    "maxConcurrency": {
                        "title": "Max Concurrency",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "Maximum number of parallel browser/HTTP sessions.",
                        "default": 3
                    },
                    "rateLimitDelayMs": {
                        "title": "Rate Limit Delay (ms)",
                        "minimum": 500,
                        "maximum": 30000,
                        "type": "integer",
                        "description": "Milliseconds to wait between requests to avoid being blocked.",
                        "default": 2000
                    },
                    "yelpFusionApiKey": {
                        "title": "Yelp Fusion API Key (optional fallback)",
                        "type": "string",
                        "description": "Optional. If provided, used as a fallback when direct scraping fails. Get one free at https://www.yelp.com/developers."
                    },
                    "startUrls": {
                        "title": "Start URLs (advanced)",
                        "type": "array",
                        "description": "Advanced: directly provide Yelp search or business URLs to scrape.",
                        "default": [],
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
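
The `run-sync-get-dataset-items` path in the specification above can also be called without a client library. The sketch below builds the POST request with the Python standard library; the URL, the `token` query parameter, and the JSON body come from the specification, while `build_run_sync_request` is a hypothetical helper. The assumption that the response body is a JSON array of dataset items follows the endpoint's summary and is not spelled out in the specification.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/tutiny~yelp-scraper/run-sync-get-dataset-items"


def build_run_sync_request(token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) the POST request that runs the Actor synchronously."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{RUN_SYNC_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


run_input = {"searchQueries": ["restaurants"], "locations": ["Los Angeles, CA"]}
request = build_run_sync_request("<YOUR_API_TOKEN>", run_input)
print(request.get_method(), request.full_url)

# Sending the request starts a real run and incurs platform usage:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)  # assumed to be a JSON array of dataset items
```

Because this call blocks until the run finishes, the asynchronous `/runs` path is likely a better fit for long scrapes.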
