Recipe Scraper (Universal / schema.org)

Pricing

from $3.00 / 1,000 results

Scrape any schema.org-compliant recipe site like Epicurious, BBC Good Food, Tasty, NYT Cooking, Serious Eats, Food Network, plus thousands of food blogs. Extracts ingredients, instructions, nutrition, ratings, prep/cook time, yield, author, and images via JSON-LD parsing.

Rating: 5.0 (21)

Developer: Crawler Bros (maintained by community)

Actor stats: 21 bookmarked, 2 total users, 1 monthly active user, last modified 2 days ago

Scrape any schema.org-compliant recipe site. Extracts ingredients, step-by-step instructions, nutrition, ratings, prep + cook + total time, yield, author, images, and video URLs from a clean schema.org/Recipe JSON-LD parse.

Works on Epicurious, Tasty, BBC Good Food, NYT Cooking, Bon Appétit, Serious Eats, King Arthur Baking, Food Network, Smitten Kitchen, Budget Bytes, and thousands of food blogs that use the standard recipe schema (most WordPress recipe plugins emit it).

AllRecipes is not supported. It is fronted by Akamai Bot Manager, which blocks both datacenter and residential IPs. AllRecipes URLs are rejected upfront with a typed url_failed record (reason: "unsupported_site"). Use any of the supported sites above instead.
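
The upfront rejection described above could look roughly like the following. This is a minimal sketch, not the actor's actual code: the `UNSUPPORTED_HOSTS` set, the `check_supported` helper, and the exact shape of the returned record (including a `recordType` of `url_failed`) are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: a simple host blocklist; the actor's real check is not documented.
UNSUPPORTED_HOSTS = {"allrecipes.com"}


def check_supported(url: str) -> Optional[dict]:
    """Return a url_failed record for unsupported sites, or None if the URL may proceed."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    blocked = host in UNSUPPORTED_HOSTS or any(
        host.endswith("." + h) for h in UNSUPPORTED_HOSTS
    )
    if blocked:
        # Hypothetical record shape, modelled on the url_failed description above.
        return {"recordType": "url_failed", "url": url, "reason": "unsupported_site"}
    return None
```

Rejecting before any request is made means no proxy traffic is spent on a site that is known to block.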

What you get

Recipe records (recordType=recipe)

| Field | Description |
| --- | --- |
| url | Canonical recipe URL |
| id | Same as url |
| platform | Site slug (epicurious, tasty, bbcgoodfood, nytcooking, bonappetit, seriouseats, …) |
| name | Recipe title |
| description | Short blurb (HTML stripped) |
| image | Hero image URL |
| author | {name, [url]} |
| ratingValue | Average rating 0-5 |
| ratingCount | Number of ratings |
| reviewCount | Number of written reviews |
| prepTimeMinutes | Prep time in minutes |
| cookTimeMinutes | Cook time in minutes |
| totalTimeMinutes | Total time (uses totalTime if present, else prep + cook) |
| recipeYield | Servings / pieces (e.g. 8 slices) |
| recipeCategory | Array (e.g. ["Dessert"]) |
| recipeCuisine | Array (e.g. ["American"]) |
| keywords | Array of free-form keyword tags |
| recipeIngredient | Array of ingredient strings |
| recipeInstructions | Array of step strings |
| nutrition | {calories, proteinContent, carbohydrateContent, fatContent, sodiumContent, …} |
| video | {url, name, thumbnailUrl} if present |
| datePublished | ISO publish date |
| dateModified | ISO last-modified date |
| scrapedAt | ISO 8601 UTC timestamp |
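
The schema.org time properties (prepTime, cookTime, totalTime) are ISO 8601 durations such as PT1H15M, so the *Minutes fields imply a conversion step. The sketch below shows one way to do that conversion and the totalTime fallback described in the table. The helper names are hypothetical, and treating a missing prep or cook time as zero is an assumption, not documented actor behaviour.

```python
import re
from typing import Optional

# Whole-number days/hours/minutes/seconds only; fractional values are not handled.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_minutes(value: Optional[str]) -> Optional[int]:
    """Convert an ISO 8601 duration like 'PT1H15M' to whole minutes, or None if unparseable."""
    if not value:
        return None
    match = _DURATION.match(value.strip())
    if not match or not any(match.groupdict().values()):
        return None
    parts = {key: int(val or 0) for key, val in match.groupdict().items()}
    seconds = (
        parts["days"] * 86400
        + parts["hours"] * 3600
        + parts["minutes"] * 60
        + parts["seconds"]
    )
    return seconds // 60  # leftover seconds are rounded down


def total_time_minutes(recipe: dict) -> Optional[int]:
    """Use totalTime when present, otherwise fall back to prep + cook."""
    explicit = duration_to_minutes(recipe.get("totalTime"))
    if explicit is not None:
        return explicit
    prep = duration_to_minutes(recipe.get("prepTime"))
    cook = duration_to_minutes(recipe.get("cookTime"))
    if prep is None and cook is None:
        return None
    # Assumption: a missing half of the pair counts as zero.
    return (prep or 0) + (cook or 0)
```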

Empty fields are dropped from every record at every depth.
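
Dropping empty fields "at every depth" amounts to a recursive prune. The following is a minimal sketch of that idea; the function names are hypothetical, and the definition of "empty" used here (None, empty string, empty list, empty object) is an assumption about what the actor treats as empty.

```python
def _is_empty(value) -> bool:
    # 0 and False are kept: they are real values, not missing data.
    return value is None or value == "" or value == [] or value == {}


def drop_empty(value):
    """Recursively remove empty values from nested dicts and lists."""
    if isinstance(value, dict):
        cleaned = {key: drop_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if not _is_empty(item)}
    if isinstance(value, list):
        cleaned = [drop_empty(item) for item in value]
        return [item for item in cleaned if not _is_empty(item)]
    return value
```

Children are cleaned before the parent is checked, so an object that contains only empty fields disappears as well.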

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mode | Enum | byUrls | byUrls / byTag / bySitemap |
| recipeUrls | Array | ["https://www.epicurious.com/recipes/food/views/banana-bread"] | Recipe URLs (mode=byUrls) |
| tagUrls | Array | | Epicurious tag URLs or slugs (mode=byTag), e.g. dessert, /ingredient/banana, or a full category URL |
| sitemapUrls | Array | | Sitemap.xml URLs from any recipe site (mode=bySitemap) |
| minRating | Integer | | Drop recipes with ratingValue below this. Scale 0-500 (e.g. 400 = 4.0/5) |
| minRatingCount | Integer | | Drop recipes with fewer ratings than this |
| maxTotalTimeMinutes | Integer | | Drop recipes with total time above this |
| keywordIncludes | String | | Drop recipes whose name + description don't include this keyword |

Example input — by tag (Epicurious)

{
  "mode": "byTag",
  "tagUrls": ["main-course", "/ingredient/banana"],
  "maxItems": 30
}

Example input — by sitemap (BBC Good Food)

{
  "mode": "bySitemap",
  "sitemapUrls": ["https://www.bbcgoodfood.com/sitemaps/2026-Q2-recipe.xml"],
  "maxItems": 50
}
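
For bySitemap mode, the recipe URLs have to be read out of the sitemap XML. A minimal sketch of that step, using only the standard sitemap namespace from sitemaps.org, is shown here; the `sitemap_urls` helper is hypothetical and how the actor actually expands sitemaps (for example, whether it follows sitemap index files) is not documented.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard namespace defined by the sitemaps.org protocol.
SITEMAP_LOC = "{http://www.sitemaps.org/schemas/sitemap/0.9}loc"


def sitemap_urls(xml_text: str, max_items: int = 50) -> List[str]:
    """Return up to max_items <loc> URLs from a sitemap document."""
    root = ET.fromstring(xml_text)
    urls: List[str] = []
    # Matching the namespaced tag skips extension tags such as image:loc.
    for element in root.iter(SITEMAP_LOC):
        if element.text and element.text.strip():
            urls.append(element.text.strip())
            if len(urls) >= max_items:
                break
    return urls
```

In a sitemap index file the `<loc>` entries point at child sitemaps rather than recipes, so a real crawler would need one more level of fetching.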

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| proxy | Object | RESIDENTIAL | Optional; supported sites also work without a proxy |
| maxItems | Integer | 50 | Hard cap on emitted records (1-1000) |

Example input — single Epicurious recipe (no proxy needed)

{
  "recipeUrls": ["https://www.epicurious.com/recipes/food/views/banana-bread"]
}

Example input — bulk recipe URLs across multiple sites

{
  "recipeUrls": [
    "https://tasty.co/recipe/banana-bread",
    "https://www.bbcgoodfood.com/recipes/banana-bread",
    "https://cooking.nytimes.com/recipes/12166-banana-bread",
    "https://www.epicurious.com/recipes/food/views/banana-bread"
  ],
  "minRating": 400,
  "maxTotalTimeMinutes": 90
}

Example input — keyword filter

{
  "recipeUrls": ["https://tasty.co/recipe/banana-bread"],
  "keywordIncludes": "banana",
  "minRatingCount": 100
}
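
The filter parameters used in the examples above can be read as a single predicate over a recipe record. The sketch below is one plausible interpretation, not the actor's implementation: the `passes_filters` function is hypothetical, and both the case-insensitive keyword match and the choice to drop recipes that lack a field a filter needs are assumptions.

```python
from typing import Optional


def passes_filters(
    recipe: dict,
    min_rating: Optional[int] = None,
    min_rating_count: Optional[int] = None,
    max_total_time_minutes: Optional[int] = None,
    keyword_includes: Optional[str] = None,
) -> bool:
    """Return True if the recipe survives every filter that is set."""
    if min_rating is not None:
        rating = recipe.get("ratingValue")
        # minRating is on a 0-500 scale, ratingValue on 0-5, so scale before comparing.
        if rating is None or round(rating * 100) < min_rating:
            return False
    if min_rating_count is not None:
        if recipe.get("ratingCount", 0) < min_rating_count:
            return False
    if max_total_time_minutes is not None:
        total = recipe.get("totalTimeMinutes")
        if total is None or total > max_total_time_minutes:
            return False
    if keyword_includes:
        haystack = f"{recipe.get('name', '')} {recipe.get('description', '')}".lower()
        if keyword_includes.lower() not in haystack:
            return False
    return True
```

With this reading, a recipe rated 4.5 passes "minRating": 400 because 4.5 scales to 450.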

Example output

{
  "recordType": "recipe",
  "url": "https://www.epicurious.com/recipes/food/views/banana-bread",
  "platform": "epicurious",
  "name": "Banana Bread With Variations",
  "description": "Use this versatile banana bread recipe as a base for fun mix-ins...",
  "image": "https://assets.epicurious.com/photos/.../banana-bread.jpg",
  "author": { "name": "Epicurious Editors" },
  "ratingValue": 4.5,
  "ratingCount": 423,
  "prepTimeMinutes": 15,
  "cookTimeMinutes": 60,
  "totalTimeMinutes": 75,
  "recipeYield": "1 loaf",
  "recipeCategory": ["dessert"],
  "recipeCuisine": ["American"],
  "keywords": ["banana", "bread", "baking"],
  "recipeIngredient": [
    "1¾ cups all-purpose flour",
    "3 large overripe bananas, mashed",
    "..."
  ],
  "recipeInstructions": [
    "Preheat oven to 350°F.",
    "In a bowl, mash the bananas...",
    "..."
  ],
  "nutrition": {
    "calories": "320",
    "carbohydrateContent": "55g",
    "proteinContent": "5g"
  },
  "datePublished": "2009-03-30",
  "scrapedAt": "2026-05-06T10:42:18Z"
}

Use cases

  • Recipe SEO / content audits — Pull schema.org Recipe data across competitor sites for content gap analysis.
  • Meal-planning apps — Build recipe libraries by ingesting curated URL lists.
  • Nutrition trackers — Standardize ingredient + nutrition data across sources.
  • AI training data — Construct labelled recipe datasets for ML pipelines (cooking instruction generation, ingredient extraction).
  • Aggregator backends — Power recipe-bookmarking apps that index user-submitted URLs.

FAQ

Why isn't AllRecipes supported? AllRecipes is fronted by Akamai Bot Manager which blocks both datacenter and residential IP ranges aggressively. Every known bypass currently fails. AllRecipes URLs are rejected upfront with a typed url_failed (reason: "unsupported_site") record so users see clear feedback instead of silent failures or fake data.

Which sites are confirmed to work?

  • Epicurious — works without proxy
  • Tasty (BuzzFeed) — works without proxy
  • BBC Good Food — works without proxy
  • NYT Cooking — works without proxy (paywall content blocked, public recipes work)
  • Bon Appétit, Serious Eats, Food Network, King Arthur Baking — work, residential proxy is the safe default
  • Most WordPress food blogs — work without proxy (Smitten Kitchen, Budget Bytes, Pinch of Yum, Minimalist Baker, etc.)
  • AllRecipes — not supported (rejected upfront)

What if a URL has no schema.org/Recipe JSON-LD? The actor emits a url_failed record with reason: "no_recipe_jsonld". Most modern recipe sites embed it; sites that don't (older blogs, sites with custom layouts) won't yield data.
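
To make the "no JSON-LD" case concrete, here is a minimal sketch of how a Recipe node can be located in a page using only the Python standard library. The class and function names are hypothetical, and the actor's own parser may differ; the sketch only assumes that recipes live in `<script type="application/ld+json">` blocks, possibly nested under `@graph`.

```python
import json
from html.parser import HTMLParser
from typing import Optional


class _JsonLdCollector(HTMLParser):
    """Collect the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _walk(node):
    """Yield every dict in a JSON-LD document, descending into lists and @graph."""
    if isinstance(node, list):
        for item in node:
            yield from _walk(item)
    elif isinstance(node, dict):
        yield node
        yield from _walk(node.get("@graph", []))


def find_recipe(html: str) -> Optional[dict]:
    """Return the first schema.org Recipe node in the page, or None if there is none."""
    collector = _JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed block: skip it and try the next one
        for node in _walk(data):
            types = node.get("@type")
            if isinstance(types, str):
                types = [types]
            if isinstance(types, list) and "Recipe" in types:
                return node
    return None
```

A `None` result corresponds to the situation in which a url_failed record with reason "no_recipe_jsonld" would be emitted.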

Do I get instructions as a single block or step-by-step? Step-by-step. The actor parses HowToStep and HowToSection schema types into a flat array of step strings. HTML formatting is stripped.
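
The flattening described above can be sketched as a short recursive walk. This is an illustration under assumptions rather than the actor's code: `flatten_instructions` is a hypothetical name, section titles are discarded, and HTML is stripped with a simple regular expression instead of a full parser.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")


def _clean(text: str) -> str:
    # Tags become spaces so adjacent blocks do not fuse; this can leave a
    # stray space before punctuation that directly follows a closing tag.
    return re.sub(r"\s+", " ", html.unescape(_TAG.sub(" ", text))).strip()


def flatten_instructions(instructions) -> list:
    """Flatten recipeInstructions (strings, HowToStep, HowToSection) into step strings."""
    if isinstance(instructions, (str, dict)):
        instructions = [instructions]
    steps = []
    for item in instructions or []:
        if isinstance(item, str):
            text = _clean(item)
        elif isinstance(item, dict) and item.get("@type") == "HowToSection":
            # A section wraps more steps; recurse and keep only the steps.
            steps.extend(flatten_instructions(item.get("itemListElement")))
            continue
        elif isinstance(item, dict):
            text = _clean(item.get("text") or item.get("name") or "")
        else:
            continue
        if text:
            steps.append(text)
    return steps
```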

What if a page is blocked or returns 403? The actor retries with exponential backoff. URLs that still return 403 emit a url_failed record with reason: "anti_bot_block".
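
As an illustration of retrying with exponential backoff, the sketch below doubles the wait after each 403 and gives up with a url_failed record once the attempts run out. The attempt count, the base delay, the `fetch` callable returning `(status, body)`, and the record shape are all assumptions; the actor's real retry settings are not documented here.

```python
import time
from typing import Callable, Tuple


def fetch_with_backoff(
    url: str,
    fetch: Callable[[str], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(url) until it stops returning 403 or attempts run out."""
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status != 403:
            return {"status": status, "body": body}
        if attempt < max_attempts - 1:
            # Delay doubles each time: base, 2x base, 4x base, ...
            sleep(base_delay * 2 ** attempt)
    # Hypothetical record shape, modelled on the url_failed description above.
    return {"recordType": "url_failed", "url": url, "reason": "anti_bot_block"}
```

Passing `fetch` and `sleep` in as parameters keeps the sketch independent of any HTTP library and makes the timing easy to test.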

How current is the data? Live — every run hits the recipe site at request time. Schedule the actor for daily / weekly refreshes to track rating drift on a recipe portfolio.

Limitations

  • The actor reads schema.org/Recipe JSON-LD only. Sites that don't embed it (or use a custom microformat) won't yield data.
  • AllRecipes is not supported, with or without a residential proxy; its URLs are rejected upfront with a url_failed record (reason: "unsupported_site").
  • Some sites strip nutrition data or vary on prepTime / cookTime; missing fields are simply omitted.
  • Per-comment / per-review data isn't included (only aggregate ratingCount / reviewCount).
  • Video transcripts / step-by-step photos are not captured.