# Recipe Scraper — Extract Recipes from 100+ Cooking Websites (`studio-amba/recipe-scraper`) Actor

Scrape recipes with ingredients, instructions, nutrition, ratings, and cooking times from popular recipe websites. Supports allrecipes.com, bbcgoodfood.com, and any site with Schema.org Recipe markup.

- **URL**: https://apify.com/studio-amba/recipe-scraper.md
- **Developed by:** [Studio Amba](https://apify.com/studio-amba) (community)
- **Categories:** E-commerce
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage it consumes, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full text](https://docs.apify.com/llms-full.txt).


# README

## Recipe Scraper

Extract recipes with ingredients, instructions, nutrition info, ratings, and cooking times from 100+ cooking websites using Schema.org structured data.

### Why use this Actor?

Need recipe data at scale? This scraper works with any website that uses Schema.org Recipe markup -- which includes virtually every major cooking site. Get clean, structured recipe data for meal planning apps, nutrition analysis, recipe aggregators, or content research. No site-specific configuration needed.

### Supported sites

Works out of the box with **any site using Schema.org Recipe markup**, including:

- allrecipes.com
- bbcgoodfood.com
- food.com
- simplyrecipes.com
- delish.com
- epicurious.com
- bonappetit.com
- tasty.co
- foodnetwork.com
- seriouseats.com
- smittenkitchen.com
- budgetbytes.com
- Any WordPress blog with recipe plugins (WP Recipe Maker, Tasty Recipes, etc.)

### Input

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `startUrls` | Array | No | Recipe page URLs or search result pages to scrape |
| `searchQuery` | String | No | Search for recipes by keyword (e.g., "pasta carbonara"); searches allrecipes.com by default |
| `maxResults` | Integer | No | Maximum recipes to return (default: 100, allowed range: 1 to 10,000) |
| `proxyConfiguration` | Object | No | Proxy settings (recommended for protected sites like allrecipes.com) |
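
The fields above can be assembled programmatically. The following is a minimal sketch, not part of the Actor: `build_recipe_input` is a hypothetical helper, the 1 to 10,000 bound comes from the OpenAPI schema later in this document, and the `{"useApifyProxy": True}` proxy shape is an assumption based on the common Apify proxy input, not something this README confirms.

```python
from typing import Any, Optional


def build_recipe_input(
    start_urls: Optional[list[str]] = None,
    search_query: Optional[str] = None,
    max_results: int = 100,
    use_apify_proxy: bool = False,
) -> dict[str, Any]:
    """Assemble an input object for the Actor from plain Python values."""
    if not start_urls and not search_query:
        # Assumption: a run needs at least one source of recipes to be useful.
        raise ValueError("Provide start_urls, search_query, or both")
    if not 1 <= max_results <= 10000:
        raise ValueError("max_results must be between 1 and 10000")

    run_input: dict[str, Any] = {"maxResults": max_results}
    if start_urls:
        # The input schema expects objects with a "url" key, not bare strings.
        run_input["startUrls"] = [{"url": url} for url in start_urls]
    if search_query:
        run_input["searchQuery"] = search_query
    if use_apify_proxy:
        # Assumption: the common Apify proxy input shape, unconfirmed for this Actor.
        run_input["proxyConfiguration"] = {"useApifyProxy": True}
    return run_input
```

The returned dictionary can be passed as `run_input` to the Python client call shown in the API section.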

### Output

Each result contains:

| Field | Type | Example |
|-------|------|---------|
| `name` | String | `"Chicken Piccata"` |
| `description` | String | `"A quick and easy Italian classic..."` |
| `author` | String | `"Chef John"` |
| `prepTime` | String | `"15min"` |
| `cookTime` | String | `"20min"` |
| `totalTime` | String | `"35min"` |
| `servings` | Number | `4` |
| `calories` | Number | `350` |
| `ingredients` | Array | `["4 chicken breasts", "1/2 cup flour", ...]` |
| `instructions` | Array | `["Pound chicken to even thickness...", ...]` |
| `cuisine` | String | `"Italian"` |
| `category` | String | `"Main Course"` |
| `rating` | Number | `4.8` |
| `reviewCount` | Number | `342` |
| `keywords` | Array | `["chicken", "piccata", "lemon"]` |
| `nutrition` | Object | `{"Calories": "350 calories", "Protein": "35g", ...}` |
| `imageUrl` | String | Full image URL |
| `url` | String | Source recipe URL |
| `scrapedAt` | String | ISO timestamp |
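
Time fields arrive as strings such as `"15min"`, so sorting or filtering by duration needs a conversion step. Here is a small sketch, assuming every duration uses only the `h` and `min` units seen in this README's examples; `duration_to_minutes` is a hypothetical helper, not part of the Actor.

```python
import re

# Matches "<number>h" or "<number>min", with optional whitespace before the unit.
_DURATION_PART = re.compile(r"(\d+)\s*(h|min)")


def duration_to_minutes(value: str) -> int:
    """Convert strings such as "15min" or "1h 5min" to whole minutes."""
    parts = _DURATION_PART.findall(value)
    if not parts:
        raise ValueError(f"Unrecognized duration: {value!r}")
    minutes = 0
    for amount, unit in parts:
        minutes += int(amount) * (60 if unit == "h" else 1)
    return minutes
```

If a site reports durations in another format, the function raises `ValueError` rather than guessing.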

### Example output

```json
{
    "name": "Classic Banana Bread",
    "url": "https://www.allrecipes.com/recipe/20144/banana-banana-bread/",
    "scrapedAt": "2026-04-03T12:00:00.000Z",
    "description": "This is the best banana bread recipe. It's easy to make and turns out perfectly moist every time.",
    "author": "Shelley Albeluhn",
    "imageUrl": "https://www.allrecipes.com/thmb/...",
    "prepTime": "15min",
    "cookTime": "1h 5min",
    "totalTime": "1h 20min",
    "servings": 12,
    "calories": 232,
    "ingredients": [
        "2 cups all-purpose flour",
        "1 teaspoon baking soda",
        "1/4 teaspoon salt",
        "3 bananas, mashed",
        "3/4 cup white sugar",
        "1 egg, beaten",
        "1/3 cup melted butter"
    ],
    "instructions": [
        "Preheat oven to 350 degrees F (175 degrees C). Lightly grease a 9x5-inch loaf pan.",
        "Combine flour, baking soda, and salt in a large bowl.",
        "Mix bananas, sugar, egg, and melted butter in a separate bowl.",
        "Stir banana mixture into flour mixture until just combined.",
        "Pour batter into prepared loaf pan.",
        "Bake for 60 to 65 minutes, until a toothpick inserted into center comes out clean."
    ],
    "cuisine": "American",
    "category": "Bread",
    "rating": 4.7,
    "reviewCount": 12847,
    "keywords": ["banana bread", "quick bread", "baking"],
    "nutrition": {
        "Calories": "232 calories",
        "Fat": "7g",
        "Carbohydrates": "40g",
        "Protein": "3g"
    }
}
````
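
The `nutrition` values in this example are strings with units attached. For numeric analysis, one option is to keep only the leading number of each value. The sketch below assumes values always start with a plain decimal number, as in the example; `nutrition_to_numbers` is a hypothetical helper, and it discards the unit.

```python
import re

# Captures a leading integer or decimal number, e.g. "232" in "232 calories".
_LEADING_NUMBER = re.compile(r"^\s*(\d+(?:\.\d+)?)")


def nutrition_to_numbers(nutrition: dict[str, str]) -> dict[str, float]:
    """Keep the leading number of each value, e.g. "7g" becomes 7.0."""
    numbers: dict[str, float] = {}
    for label, text in nutrition.items():
        match = _LEADING_NUMBER.match(text)
        if match:  # Skip values that do not start with a number.
            numbers[label] = float(match.group(1))
    return numbers
```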

### How it works

1. **JSON-LD extraction** (primary) -- Parses `<script type="application/ld+json">` tags for Schema.org Recipe markup. This is the most reliable method and works on 95%+ of recipe sites.
2. **HTML fallback** -- When JSON-LD is not available, falls back to parsing common HTML selectors and meta tags.
3. **Search support** -- Automatically discovers recipe links from search result pages and follows them.
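
To make the first step concrete, here is a rough sketch of JSON-LD extraction using only the Python standard library. It is an illustration of the general technique, not the Actor's actual code (the Actor is described as Cheerio-based); `find_recipes` and the helper names are hypothetical.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self._in_json_ld = False
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_json_ld:
            self.blocks[-1] += data  # Script text may arrive in several chunks.

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False


def _is_recipe(node) -> bool:
    if not isinstance(node, dict):
        return False
    types = node.get("@type", [])
    return "Recipe" in (types if isinstance(types, list) else [types])


def find_recipes(html: str) -> list[dict]:
    """Return every Schema.org Recipe object found in the page's JSON-LD."""
    collector = _JsonLdCollector()
    collector.feed(html)
    recipes = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # Skip malformed JSON-LD instead of failing the page.
        for node in data if isinstance(data, list) else [data]:
            if not isinstance(node, dict):
                continue
            # Some sites nest their entities under an "@graph" list.
            graph = node.get("@graph")
            nested = graph if isinstance(graph, list) else []
            recipes.extend(item for item in [node, *nested] if _is_recipe(item))
    return recipes
```

When `find_recipes` returns an empty list, a scraper following the approach above would move on to the HTML fallback.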

### Cost estimate

This Actor uses approximately **0.005 compute units per recipe** (Cheerio-based, no browser needed). At standard Apify pricing, that's roughly **$0.25 per 1,000 recipes**.
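
The estimate is a simple product: recipes × compute units per recipe × price per compute unit. The two figures above together imply a rate of about $0.05 per compute unit ($0.25 / (1,000 × 0.005)). That rate is derived here from this README's own numbers, not quoted from an Apify price list, so treat `usd_per_cu` in the sketch below as an assumption and substitute your plan's actual rate.

```python
def estimate_cost_usd(
    recipes: int,
    cu_per_recipe: float = 0.005,  # Figure quoted in this README.
    usd_per_cu: float = 0.05,      # Assumption: rate implied by the README's figures.
) -> float:
    """Estimate platform cost as recipes x compute units per recipe x price per unit."""
    if recipes < 0:
        raise ValueError("recipes must not be negative")
    return recipes * cu_per_recipe * usd_per_cu
```

Proxy traffic, if enabled, is billed separately and is not covered by this calculation.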

### Limitations

- Some sites (allrecipes.com, bbcgoodfood.com) have anti-bot protection -- use proxy configuration for best results
- HTML fallback produces less complete data than JSON-LD extraction
- Search pagination support varies by site
- Data is scraped from public websites and may change without notice
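
Because fallback results can be incomplete, it may help to check each dataset item before using it downstream. The following sketch reports which fields are absent or empty; `missing_fields` and the choice of `EXPECTED_FIELDS` are illustrative assumptions, so adjust the tuple to the fields your application relies on.

```python
# Field names taken from the Output table; which ones matter is up to the caller.
EXPECTED_FIELDS = (
    "name",
    "ingredients",
    "instructions",
    "totalTime",
    "servings",
    "nutrition",
)


def missing_fields(item: dict, expected: tuple[str, ...] = EXPECTED_FIELDS) -> list[str]:
    """List expected fields that are absent, None, or empty in a dataset item."""
    empty_values = (None, "", [], {})
    return [field for field in expected if item.get(field) in empty_values]
```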

# Actor input Schema

## `startUrls` (type: `array`):

URLs of recipe pages or search result pages to scrape. Works with allrecipes.com, bbcgoodfood.com, food.com, and any site with Schema.org Recipe markup.

## `searchQuery` (type: `string`):

Search for recipes by keyword (e.g., 'pasta carbonara', 'vegan curry'). Searches allrecipes.com by default.

## `maxResults` (type: `integer`):

Maximum number of recipes to return. Default: 100; allowed range: 1 to 10,000.

## `proxyConfiguration` (type: `object`):

Proxy settings. Recommended for allrecipes.com and other protected sites.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.allrecipes.com/recipe/20144/banana-banana-bread/"
    }
  ],
  "searchQuery": "chicken pasta",
  "maxResults": 100
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.allrecipes.com/recipe/20144/banana-banana-bread/"
        }
    ],
    "searchQuery": "chicken pasta"
};

// Run the Actor and wait for it to finish
const run = await client.actor("studio-amba/recipe-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.allrecipes.com/recipe/20144/banana-banana-bread/" }],
    "searchQuery": "chicken pasta",
}

# Run the Actor and wait for it to finish
run = client.actor("studio-amba/recipe-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.allrecipes.com/recipe/20144/banana-banana-bread/"
    }
  ],
  "searchQuery": "chicken pasta"
}' |
apify call studio-amba/recipe-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=studio-amba/recipe-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Recipe Scraper — Extract Recipes from 100+ Cooking Websites",
        "description": "Scrape recipes with ingredients, instructions, nutrition, ratings, and cooking times from popular recipe websites. Supports allrecipes.com, bbcgoodfood.com, and any site with Schema.org Recipe markup.",
        "version": "0.1",
        "x-build-id": "6uS4fjJ2z50Fn31mA"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/studio-amba~recipe-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-studio-amba-recipe-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/studio-amba~recipe-scraper/runs": {
            "post": {
                "operationId": "runs-sync-studio-amba-recipe-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/studio-amba~recipe-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-studio-amba-recipe-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Recipe URLs",
                        "type": "array",
                        "description": "URLs of recipe pages or search result pages to scrape. Works with allrecipes.com, bbcgoodfood.com, food.com, and any site with Schema.org Recipe markup.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "searchQuery": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search for recipes by keyword (e.g., 'pasta carbonara', 'vegan curry'). Searches allrecipes.com by default."
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of recipes to return.",
                        "default": 100
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings. Recommended for allrecipes.com and other protected sites."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
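
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be used without a client library. Below is a minimal sketch using only the Python standard library; `build_sync_request` is a hypothetical helper that builds the request without sending it, and the response handling is left out because the specification documents only a `200` "OK" response.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/studio-amba~recipe-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that runs the Actor synchronously."""
    # The specification defines the token as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To execute the run, pass the request to `urllib.request.urlopen` and decode the response body. Synchronous runs keep the connection open until the Actor finishes, so a generous timeout is advisable.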
