NYT Cooking Recipe Scraper
Pricing
Pay per event
Developer
BowTiedRaccoon
Maintained by Community
Enumerate the complete NYT Cooking recipe catalog (~25K recipes) from the official sitemap and extract structured recipe data from the public schema.org Recipe JSON-LD embedded in each page.
What it collects
Every record contains the following fields:
| Field | Type | Description |
|---|---|---|
| `recipe_id` | string | Unique NYT Cooking recipe identifier |
| `url` | string | Canonical recipe URL |
| `name` | string | Recipe title |
| `author` | string | NYT Cooking contributor byline |
| `description` | string | Recipe description / headnote |
| `recipe_yield` | string | Serving size (e.g. "4 servings") |
| `total_time` | string | Total cooking time (e.g. "1 hr 30 min") |
| `prep_time` | string | Preparation time |
| `cook_time` | string | Active cooking time |
| `recipe_category` | string | Meal category (e.g. "Dinner, Main Course") |
| `recipe_cuisine` | string | Cuisine style (e.g. "Mediterranean Inspired") |
| `recipe_ingredient` | array | List of ingredient strings with quantities |
| `recipe_instructions` | array | Step-by-step instructions |
| `nutrition` | string | JSON-serialized nutrition facts (calories, fat, carbs, protein, sodium, etc.) from schema.org NutritionInformation. `null` for recipes without nutrition data. |
| `aggregate_rating` | number | Average user rating (1–5 scale) |
| `rating_count` | integer | Number of user ratings |
| `keywords` | array | Tags and keywords (ingredient highlights, technique, difficulty, etc.) |
| `image_urls` | array | Full-resolution image URLs |
| `date_published` | string | ISO 8601 publication date |
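
Because `nutrition` is delivered as a JSON-serialized string rather than a nested object, downstream code has to decode it and handle the `null` case. The following is a minimal sketch, assuming records have been loaded as Python dicts. The helper name `parse_nutrition` and both sample records are hypothetical, and the keys inside the decoded object (such as `calories`) are assumed to follow schema.org NutritionInformation.

```python
import json
from typing import Any, Optional


def parse_nutrition(record: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Decode the JSON-serialized `nutrition` field; return None when it is absent."""
    raw = record.get("nutrition")
    if raw is None:
        return None
    return json.loads(raw)


# Hypothetical records, invented for illustration only.
with_nutrition = {
    "recipe_id": "0000001",
    "name": "Example Recipe",
    "nutrition": '{"@type": "NutritionInformation", "calories": 320}',
}
without_nutrition = {
    "recipe_id": "0000002",
    "name": "Another Example",
    "nutrition": None,
}

print(parse_nutrition(with_nutrition))
print(parse_nutrition(without_nutrition))
```

Returning `None` instead of an empty dict keeps "no nutrition data" distinguishable from "nutrition data with no fields".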
Discovery
By default the actor walks the official NYT Cooking sitemap index (https://www.nytimes.com/sitemaps/new/cooking.xml.gz), which contains monthly sub-sitemaps covering the full recipe inventory. Only /recipes/ paths are collected — article and guide pages are excluded.
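The discovery step can be pictured roughly as follows. This is a sketch under stated assumptions, not the actor's actual code: it assumes the sitemap files use the standard sitemaps.org namespace, it works on already-downloaded bytes (fetching is left out), and the function names and sample URLs are invented for illustration.

```python
import gzip
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Standard sitemaps.org namespace (assumed to be the one the sitemap uses).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_xml(payload: bytes) -> ET.Element:
    """Parse sitemap XML, decompressing first if the payload is gzipped."""
    if payload[:2] == b"\x1f\x8b":  # gzip magic number
        payload = gzip.decompress(payload)
    return ET.fromstring(payload)


def loc_values(root: ET.Element, parent_tag: str) -> list[str]:
    """Return each <loc> under <sitemap> (index file) or <url> (urlset file)."""
    return [
        loc.text.strip()
        for loc in root.findall(f"sm:{parent_tag}/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def sub_sitemaps(index_payload: bytes) -> list[str]:
    """List the sub-sitemap URLs referenced by a sitemap index."""
    return loc_values(read_xml(index_payload), "sitemap")


def recipe_urls(urlset_payload: bytes) -> list[str]:
    """Keep only URLs whose path starts with /recipes/."""
    urls = loc_values(read_xml(urlset_payload), "url")
    return [u for u in urls if urlparse(u).path.startswith("/recipes/")]


# Made-up sample data standing in for a sitemap index and one sub-sitemap.
SAMPLE_INDEX = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.invalid/cooking-2024-01.xml.gz</loc></sitemap>
</sitemapindex>"""

SAMPLE_URLSET = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://cooking.nytimes.com/recipes/1-example</loc></url>
  <url><loc>https://cooking.nytimes.com/guides/1-example</loc></url>
</urlset>"""

print(sub_sitemaps(SAMPLE_INDEX))
print(recipe_urls(gzip.compress(SAMPLE_URLSET)))
```

Filtering on the parsed URL path rather than on a substring avoids false matches when `/recipes/` appears in a query string or elsewhere in the URL.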
Inputs
| Input | Type | Default | Description |
|---|---|---|---|
| `maxItems` | integer | 10 | Maximum number of recipes to collect. Set to 0 for no limit (full catalog run). |
| `startUrls` | array | (none) | Optional list of specific NYT Cooking recipe URLs to scrape directly, bypassing sitemap discovery. Useful for targeted single-recipe or small-batch runs. |
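
As an illustration of how the two inputs combine, here is a small sketch that assembles an input object. The helper `build_input` is hypothetical, and the shape of `startUrls` entries is an assumption: plain URL strings are used here, but some Apify actors expect `{"url": ...}` objects, so check the actor's input schema before relying on this.

```python
import json
from typing import Optional


def build_input(max_items: int = 10, start_urls: Optional[list[str]] = None) -> dict:
    """Assemble an input object from the two documented fields."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (no limit) or a positive integer")
    run_input: dict = {"maxItems": max_items}
    if start_urls:
        # Assumption: entries are plain URL strings.
        run_input["startUrls"] = list(start_urls)
    return run_input


# Full catalog run via sitemap discovery.
full_run = build_input(max_items=0)

# Targeted run on one (made-up) recipe URL, bypassing discovery.
targeted = build_input(
    max_items=1,
    start_urls=["https://cooking.nytimes.com/recipes/1-example"],
)

print(json.dumps(full_run))
print(json.dumps(targeted, indent=2))
```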
Data source
All data is extracted from the schema.org/Recipe JSON-LD markup that NYT Cooking embeds in every public recipe page for SEO purposes. Recipe content — including ingredients, instructions, and metadata — is publicly available. The NYT Cooking paywall only gates account-specific features (recipe box, personal notes, collections) and does not restrict access to recipe markup.
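To make the extraction step concrete, the sketch below pulls a schema.org Recipe object out of a page's JSON-LD using only the Python standard library. It is one possible approach, not necessarily how the actor does it; the class and function names and the sample HTML are invented, and the handling of `@graph` wrappers and list-valued `@type` is a defensive assumption about how JSON-LD may be laid out.

```python
import json
from html.parser import HTMLParser
from typing import Any, Optional


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_recipe(html: str) -> Optional[dict[str, Any]]:
    """Return the first JSON-LD node whose @type includes "Recipe", if any."""
    collector = JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the page
        if isinstance(data, dict):
            nodes = data.get("@graph", [data])
        elif isinstance(data, list):
            nodes = data
        else:
            continue
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Recipe" in types:
                return node
    return None


# Made-up page fragment for illustration.
SAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe", "name": "Example Recipe",
 "recipeIngredient": ["1 cup flour", "2 eggs"]}
</script>
</head><body></body></html>"""

recipe = find_recipe(SAMPLE_HTML)
print(recipe["name"], recipe["recipeIngredient"])
```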
Usage notes
- For a full catalog run (~25K recipes), use `maxItems: 0` and allow sufficient run time.
- Nutrition data (`nutrition` field) is present on most recipes but absent on some recently published ones; the field is `null` in those cases.
- The sitemap updates frequently (new recipes appear within hours of publication). Re-running with `maxItems: 0` against the latest sub-sitemaps will catch additions.
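When re-running to catch additions, one simple way to isolate what is new is to compare `recipe_id` values between the earlier and the latest result sets. The sketch below assumes both sets are available as lists of dicts; the helper `new_records` and the sample data are hypothetical.

```python
from typing import Any, Iterable


def new_records(
    previous: Iterable[dict[str, Any]],
    latest: Iterable[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Return records from `latest` whose recipe_id was not seen in `previous`."""
    seen = {record["recipe_id"] for record in previous}
    return [record for record in latest if record["recipe_id"] not in seen]


# Made-up result sets from two runs.
earlier_run = [{"recipe_id": "1"}, {"recipe_id": "2"}]
later_run = [{"recipe_id": "1"}, {"recipe_id": "2"}, {"recipe_id": "3"}]

print(new_records(earlier_run, later_run))
```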