Food.com Recipe Scraper
Pricing: Pay per event
Scrape recipes from Food.com — one of the largest English community recipe databases with 500K+ recipes. Enumerate the full sitemap or supply specific URLs. Extracts ingredients, instructions, nutrition, ratings, reviews, and tag taxonomy.
Developer: BowTiedRaccoon (maintained by Community)
What it does
This actor enumerates Food.com's full sitemap (or accepts a direct list of recipe URLs) and extracts structured recipe data from each page. All core fields come from the embedded schema.org/Recipe JSON-LD block, supplemented with DOM extraction for Food.com-specific data (tag taxonomy, rating details, image gallery).
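To illustrate the JSON-LD step described above, here is a minimal sketch of how a `schema.org/Recipe` block could be pulled out of a page using only the Python standard library. It is not the actor's actual code: the class and function names (`JsonLdCollector`, `find_recipe`) are hypothetical, and the sample HTML is a hand-written stand-in rather than a real Food.com page.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _is_recipe(node):
    """True if a JSON-LD node declares @type Recipe (string or list form)."""
    if not isinstance(node, dict):
        return False
    node_type = node.get("@type")
    types = node_type if isinstance(node_type, list) else [node_type]
    return "Recipe" in types


def find_recipe(html):
    """Return the first schema.org Recipe object found in the page, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the page
        if isinstance(data, dict) and isinstance(data.get("@graph"), list):
            candidates = data["@graph"]
        elif isinstance(data, list):
            candidates = data
        else:
            candidates = [data]
        for node in candidates:
            if _is_recipe(node):
                return node
    return None


# Hand-written stand-in page, not real Food.com markup.
sample_html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe",
 "name": "Jo Mama's World Famous Spaghetti", "prepTime": "PT20M"}
</script></head><body></body></html>"""

recipe = find_recipe(sample_html)
print(recipe["name"], recipe["prepTime"])
```

The Food.com-specific fields (tags, rating details, image gallery) would still need separate DOM extraction, which this sketch does not attempt.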
Use cases
- Build recommender training datasets (ratings + review counts at 500K+ scale)
- Meal-plan and recipe-app databases
- Food trend analytics and NLP corpora
- RAG pipelines for culinary applications
- Competitive ingredient and nutritional analysis
Input
| Field | Type | Description |
|---|---|---|
| maxItems | integer | Maximum number of recipes to scrape. Set to 0 for the full ~500K corpus. Default: 10 |
| recipeUrls | array | Optional list of specific Food.com recipe URLs to scrape. If provided, sitemap enumeration is skipped. |
Example: specific URLs
{
  "maxItems": 5,
  "recipeUrls": [
    { "url": "https://www.food.com/recipe/jo-mamas-world-famous-spaghetti-22782" },
    { "url": "https://www.food.com/recipe/easy-homemade-chicken-soup-157877" }
  ]
}
Example: sitemap enumeration (first 1000 recipes)
{"maxItems": 1000}
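The two input shapes above can be generated programmatically. The following is a minimal sketch: `build_input` is a hypothetical helper, the field names come from the Input table, and the URL check (host `www.food.com`, path starting with `/recipe/`) is an assumption based on the example URLs rather than a documented rule of the actor.

```python
import json
from urllib.parse import urlparse


def build_input(max_items=10, recipe_urls=None):
    """Build an actor input dict; recipe_urls is a list of plain URL strings."""
    if max_items < 0:
        raise ValueError("max_items must be 0 (full corpus) or a positive integer")
    payload = {"maxItems": max_items}
    if recipe_urls:
        for url in recipe_urls:
            parsed = urlparse(url)
            # Assumed validation rule, inferred from the example URLs.
            if parsed.netloc != "www.food.com" or not parsed.path.startswith("/recipe/"):
                raise ValueError(f"not a Food.com recipe URL: {url}")
        # The example input wraps each URL in an object with a "url" key.
        payload["recipeUrls"] = [{"url": url} for url in recipe_urls]
    return payload


print(json.dumps(build_input(1000)))
print(json.dumps(
    build_input(5, ["https://www.food.com/recipe/jo-mamas-world-famous-spaghetti-22782"]),
    indent=2,
))
```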
Output
Each record in the dataset corresponds to one recipe:
| Field | Type | Description |
|---|---|---|
| recipe_id | string | Unique numeric recipe ID from the URL |
| url | string | Canonical recipe URL |
| name | string | Recipe name |
| author | string | Recipe author username |
| description | string | Full recipe description |
| recipe_category | string | Primary category (e.g. Dessert, Main Dish) |
| recipe_cuisine | string | Cuisine type if specified (e.g. Italian) |
| prep_time | string | Preparation time in ISO 8601 format (e.g. PT15M) |
| cook_time | string | Cook time in ISO 8601 format |
| total_time | string | Total time in ISO 8601 format |
| recipe_yield | string | Servings (e.g. "4 serving(s)") |
| recipe_ingredient | array | Ingredients as formatted strings |
| recipe_instructions | array | Step-by-step instruction strings |
| nutrition | object | Nutritional data: calories, fat_content, saturated_fat, cholesterol, sodium, carbohydrate, fiber, sugar, protein |
| aggregate_rating | number | Average star rating (0-5 scale) |
| rating_count | integer | Total number of ratings |
| review_count | integer | Total number of written reviews |
| keywords | string | Comma-separated keywords (occasion, diet, method tags) |
| tags | array | Food.com topic taxonomy tags |
| image_urls | array | Recipe photo URLs |
| date_published | string | Publication date (ISO 8601) |
Sample output record
{
  "recipe_id": "22782",
  "url": "https://www.food.com/recipe/jo-mamas-world-famous-spaghetti-22782",
  "name": "Jo Mama's World Famous Spaghetti",
  "author": "Sharlene~W",
  "description": "My kids will give up a steak dinner for this spaghetti...",
  "recipe_category": "Spaghetti",
  "recipe_cuisine": null,
  "prep_time": "PT20M",
  "cook_time": "PT1H",
  "total_time": "PT1H20M",
  "recipe_yield": "4 quarts, 10-14 serving(s)",
  "recipe_ingredient": ["2 lbs Italian sausage, casings removed", "..."],
  "recipe_instructions": ["In large, heavy stockpot, brown Italian sausage...", "..."],
  "nutrition": {
    "calories": "555.9",
    "fat_content": "26.3",
    "protein": "29.8"
  },
  "aggregate_rating": 5.0,
  "rating_count": 1376,
  "review_count": 1376,
  "keywords": "Pork,Meat,European,Kid Friendly,Weeknight,Stove Top,< 4 Hours,Easy",
  "tags": ["Spaghetti"],
  "image_urls": ["https://img.sndimg.com/food/image/upload/..."],
  "date_published": "2002-03-17T10:26Z"
}
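Several fields in the record above arrive as strings that downstream code will usually want in a typed form: the time fields are ISO 8601 durations, and `keywords` is a single comma-separated string. A minimal post-processing sketch follows. The helper names `parse_duration` and `split_keywords` are hypothetical, and the parser only handles the time-only `PT…H…M…S` form seen in the sample; durations with day or larger components are not covered and return `None`.

```python
import re
from datetime import timedelta

# Matches time-only ISO 8601 durations such as PT20M, PT1H, PT1H20M.
_DURATION = re.compile(
    r"^PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?$"
)


def parse_duration(value):
    """Convert e.g. 'PT1H20M' to a timedelta; None if missing or unrecognized."""
    if not value:
        return None
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        return None
    parts = {name: int(num) for name, num in match.groupdict().items() if num}
    return timedelta(**parts)


def split_keywords(value):
    """Split the comma-separated keywords string into a clean list."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


record = {
    "prep_time": "PT20M",
    "cook_time": "PT1H",
    "total_time": "PT1H20M",
    "keywords": "Pork,Meat,European,Kid Friendly,Weeknight,Stove Top,< 4 Hours,Easy",
}

total = parse_duration(record["total_time"])
print(int(total.total_seconds() // 60), "minutes")
print(split_keywords(record["keywords"]))
```

Note that nutrition values in the sample are also strings (for example `"555.9"`), so they would need a `float()` conversion before numeric analysis.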
Crawl approach
- Sitemap enumeration: Fetches `https://www.food.com/sitemap.xml` (a 24-child sitemap index with gzip-compressed child files) and collects all `/recipe/` URLs.
- Page scraping: Each recipe page is fetched and parsed via the embedded `schema.org/Recipe` JSON-LD block for structured data, plus DOM extraction for Food.com-specific taxonomy and image gallery.
- Rate limiting: Rate limits are detected automatically and handled with backoff; no manual configuration is needed.
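The sitemap enumeration step can be sketched with the Python standard library. This is an illustration, not the actor's implementation: it assumes the files use the standard sitemaps.org XML namespace, the helper names are hypothetical, and the child file name `sitemap-1.xml.gz` is invented for the in-memory demo. Network fetching is deliberately left out so the parsing logic can be run offline.

```python
import gzip
import xml.etree.ElementTree as ET

# Standard sitemap protocol namespace (assumed to apply to Food.com's files).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(xml_bytes):
    """Return every <loc> value from a sitemap index or a URL set."""
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def recipe_urls_from_child(gz_bytes):
    """Decompress one gzip child sitemap and keep only /recipe/ URLs."""
    xml_bytes = gzip.decompress(gz_bytes)
    return [url for url in extract_locs(xml_bytes) if "/recipe/" in url]


# In-memory stand-ins for the real index and one child file.
index_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.food.com/sitemap-1.xml.gz</loc></sitemap>
</sitemapindex>"""

child_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.food.com/recipe/jo-mamas-world-famous-spaghetti-22782</loc></url>
  <url><loc>https://www.food.com/about</loc></url>
</urlset>"""
child_gz = gzip.compress(child_xml)

print(extract_locs(index_xml))
print(recipe_urls_from_child(child_gz))
```

In a real crawl, each child URL returned by `extract_locs` would be downloaded and passed to `recipe_urls_from_child`, with the result truncated to `maxItems`.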
Performance
- Memory: 512 MB
- No proxy required: Food.com is accessible from datacenter IPs
- Concurrency: 10 parallel requests
- Full corpus (~500K recipes): a complete run takes longer than the default 4-hour timeout, so raise the run timeout before starting a full crawl
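A rough back-of-the-envelope calculation shows why the full corpus does not fit in the default timeout. The sketch below assumes one request per recipe and ignores sitemap fetches, retries, and backoff pauses; the 1-second average latency used at the end is a hypothetical figure for illustration, not a measured one.

```python
def required_rate(total_pages, timeout_hours):
    """Pages per second needed to finish within the timeout."""
    return total_pages / (timeout_hours * 3600)


def latency_budget(concurrency, rate):
    """Maximum average seconds per request at the given concurrency."""
    return concurrency / rate


def estimated_hours(total_pages, concurrency, avg_latency_s):
    """Estimated wall-clock hours for a crawl at a given average latency."""
    return total_pages * avg_latency_s / concurrency / 3600


rate = required_rate(500_000, 4)
budget = latency_budget(10, rate)
print(f"{rate:.1f} pages/s, {budget * 1000:.0f} ms per request")
# prints "34.7 pages/s, 288 ms per request"

# Hypothetical 1 s average latency per page:
print(f"{estimated_hours(500_000, 10, 1.0):.1f} hours")
```

With 10 parallel requests, every page would have to complete in under roughly 0.29 seconds on average to finish inside 4 hours, which leaves little room for rate-limit backoff.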