NYT Cooking Recipe Scraper

Pricing

Pay per event

Enumerate all ~25K public NYT Cooking recipes from the official sitemap and extract structured recipe data (ingredients, instructions, nutrition, ratings) from schema.org Recipe JSON-LD.

Developer

BowTiedRaccoon

Maintained by Community

Enumerate the complete NYT Cooking recipe catalog (~25K recipes) from the official sitemap and extract structured recipe data from the public schema.org Recipe JSON-LD embedded in each page.

What it collects

Every record contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| recipe_id | string | Unique NYT Cooking recipe identifier |
| url | string | Canonical recipe URL |
| name | string | Recipe title |
| author | string | NYT Cooking contributor byline |
| description | string | Recipe description / headnote |
| recipe_yield | string | Serving size (e.g. "4 servings") |
| total_time | string | Total cooking time (e.g. "1 hr 30 min") |
| prep_time | string | Preparation time |
| cook_time | string | Active cooking time |
| recipe_category | string | Meal category (e.g. "Dinner, Main Course") |
| recipe_cuisine | string | Cuisine style (e.g. "Mediterranean Inspired") |
| recipe_ingredient | array | List of ingredient strings with quantities |
| recipe_instructions | array | Step-by-step instructions |
| nutrition | string | JSON-serialized nutrition facts (calories, fat, carbs, protein, sodium, etc.) from schema.org NutritionInformation. null for recipes without nutrition data. |
| aggregate_rating | number | Average user rating (1–5 scale) |
| rating_count | integer | Number of user ratings |
| keywords | array | Tags and keywords (ingredient highlights, technique, difficulty, etc.) |
| image_urls | array | Full-resolution image URLs |
| date_published | string | ISO 8601 publication date |
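
Because nutrition arrives as a JSON-serialized string rather than a nested object, downstream code has to decode it and handle the null case. The following is a minimal sketch of that step. The sample record and its values are made up for illustration, and the keys inside the decoded object (such as calories) are assumed to follow schema.org NutritionInformation property names; the actor's actual output may differ.

```python
import json
from typing import Any, Optional


def parse_nutrition(record: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Decode the JSON-serialized `nutrition` field; return None when it is absent or null."""
    raw = record.get("nutrition")
    if raw is None:
        return None
    return json.loads(raw)


# Hypothetical record, trimmed to a few fields; all values are invented.
sample = {
    "recipe_id": "0000000",
    "name": "Example Recipe",
    "nutrition": '{"@type": "NutritionInformation", "calories": "250"}',
    "aggregate_rating": 4.5,
}

facts = parse_nutrition(sample)
print(facts["calories"] if facts else "no nutrition data")
```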

Discovery

By default the actor walks the official NYT Cooking sitemap index (https://www.nytimes.com/sitemaps/new/cooking.xml.gz), which contains monthly sub-sitemaps covering the full recipe inventory. Only /recipes/ paths are collected — article and guide pages are excluded.
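
The discovery step described above can be sketched roughly as follows. This is an illustration of the general technique (read loc entries from a possibly gzipped sitemap, keep only /recipes/ paths), not the actor's own code. The helper names are hypothetical, and the sample sitemap is an in-memory stand-in with invented URLs so the sketch runs without network access.

```python
import gzip
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def decompress_if_gzipped(data: bytes) -> bytes:
    """Gunzip the payload when it starts with the gzip magic bytes."""
    return gzip.decompress(data) if data[:2] == b"\x1f\x8b" else data


def extract_locs(xml_bytes: bytes) -> list[str]:
    """Return every <loc> value; works for a sitemap index and for a sub-sitemap."""
    root = ET.fromstring(decompress_if_gzipped(xml_bytes))
    return [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]


def is_recipe_url(url: str) -> bool:
    """Keep only URLs whose path starts with /recipes/."""
    return urlparse(url).path.startswith("/recipes/")


# Hypothetical sub-sitemap content; the URLs are placeholders.
sub_sitemap = gzip.compress(b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://cooking.nytimes.com/recipes/0000000-example-recipe</loc></url>
  <url><loc>https://cooking.nytimes.com/guides/0-example-guide</loc></url>
  <url><loc>https://cooking.nytimes.com/article/example-article</loc></url>
</urlset>""")

recipe_urls = [u for u in extract_locs(sub_sitemap) if is_recipe_url(u)]
print(recipe_urls)
```

In a real run, extract_locs would first be applied to the sitemap index to list the monthly sub-sitemaps, then to each downloaded sub-sitemap in turn.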

Inputs

| Input | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 10 | Maximum number of recipes to collect. Set to 0 for no limit (full catalog run). |
| startUrls | array | | Optional list of specific NYT Cooking recipe URLs to scrape directly, bypassing sitemap discovery. Useful for targeted single-recipe or small-batch runs. |
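
As an illustration of how the two inputs combine, here is a small sketch that assembles an input object. The build_run_input helper is hypothetical and not part of the actor. The listing does not say whether startUrls entries are plain strings or objects with a url key; plain strings are assumed here, and the URL is a placeholder.

```python
from typing import Any, Optional


def build_run_input(
    max_items: int = 10, start_urls: Optional[list[str]] = None
) -> dict[str, Any]:
    """Assemble the actor input; maxItems=0 means no limit (full catalog run)."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (no limit) or a positive integer")
    run_input: dict[str, Any] = {"maxItems": max_items}
    if start_urls:
        # Assumption: URLs are passed as plain strings.
        run_input["startUrls"] = list(start_urls)
    return run_input


# Default run: first 10 recipes discovered via the sitemap.
print(build_run_input())

# Full catalog run.
print(build_run_input(max_items=0))

# Targeted run on one (placeholder) recipe URL, bypassing sitemap discovery.
print(build_run_input(start_urls=["https://cooking.nytimes.com/recipes/0000000-example-recipe"]))
```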

Data source

All data is extracted from the schema.org/Recipe JSON-LD markup that NYT Cooking embeds in every public recipe page for SEO purposes. Recipe content — including ingredients, instructions, and metadata — is publicly available. The NYT Cooking paywall only gates account-specific features (recipe box, personal notes, collections) and does not restrict access to recipe markup.
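
To make the extraction approach concrete, the sketch below pulls a schema.org Recipe object out of a page's JSON-LD script tags using only the Python standard library. It shows the general technique and is not the actor's implementation. The class and function names are hypothetical, the sample HTML is invented, and the handling of list-valued @type and @graph wrappers is a defensive assumption about how pages may structure their markup.

```python
import json
from html.parser import HTMLParser
from typing import Any, Optional


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def is_recipe(node: Any) -> bool:
    """True when the node's @type is, or includes, "Recipe"."""
    if not isinstance(node, dict):
        return False
    node_type = node.get("@type")
    types = node_type if isinstance(node_type, list) else [node_type]
    return "Recipe" in types


def find_recipe(html: str) -> Optional[dict[str, Any]]:
    """Return the first Recipe object found in the page's JSON-LD, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks
        for node in data if isinstance(data, list) else [data]:
            if is_recipe(node):
                return node
            # Assumption: some pages may nest nodes under @graph.
            if isinstance(node, dict):
                for sub in node.get("@graph", []):
                    if is_recipe(sub):
                        return sub
    return None


# Invented page fragment for illustration.
sample_html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe", "name": "Example Recipe",
 "recipeIngredient": ["1 cup flour", "2 eggs"], "recipeYield": "4 servings"}
</script></head><body></body></html>"""

recipe = find_recipe(sample_html)
print(recipe["name"], recipe["recipeIngredient"])
```

The schema.org property names are camelCase (recipeIngredient, recipeYield), so the actor's snake_case fields (recipe_ingredient, recipe_yield) presumably come from a renaming step that is not shown here.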

Usage notes

  • For a full catalog run (~25K recipes), use maxItems: 0 and allow sufficient run time.
  • Nutrition data (nutrition field) is present on most recipes but absent on some recently published ones; the field is null in those cases.
  • The sitemap updates frequently (new recipes appear within hours of publication). Re-running with maxItems: 0 against the latest sub-sitemaps will catch additions.
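
When re-running to pick up additions, the new dataset will overlap with earlier results. One way to reconcile the two, sketched here under the assumption that recipe_id is stable across runs, is to merge on that key and let the newer record win. The merge_runs helper and the sample records are hypothetical.

```python
from typing import Any


def merge_runs(
    existing: list[dict[str, Any]], fresh: list[dict[str, Any]]
) -> tuple[list[dict[str, Any]], list[str]]:
    """Merge two result sets on recipe_id; return (merged records, newly added ids)."""
    merged = {rec["recipe_id"]: rec for rec in existing}
    added: list[str] = []
    for rec in fresh:
        if rec["recipe_id"] not in merged:
            added.append(rec["recipe_id"])
        merged[rec["recipe_id"]] = rec  # the newer record replaces the older one
    return list(merged.values()), added


# Invented records for illustration.
previous = [{"recipe_id": "1", "name": "Old A"}, {"recipe_id": "2", "name": "B"}]
latest = [{"recipe_id": "1", "name": "New A"}, {"recipe_id": "3", "name": "C"}]

records, new_ids = merge_runs(previous, latest)
print(new_ids)
```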