Diabetic Recipe Scraper

Pricing

Pay per usage

Diabetic Recipe Scraper is a lightweight actor for scraping healthy, sugar-conscious recipes and their nutrition data. To avoid blocking during extraction, residential proxies are strongly advised.

Developer

Shahid Irfan

Maintained by Community

5 days ago

Last modified

Diabetes Food Hub Recipes Scraper

Scrape diabetes-friendly lunch recipes from Diabetes Food Hub. This Apify actor extracts structured recipe data, including ingredients, instructions, and nutrition facts. It parses JSON-LD first and falls back to HTML parsing when structured data is missing. It is suited to collecting diabetic recipes, meal planning, and nutritional analysis.

Features

  • Targeted Scraping: Focuses on lunch recipes from https://diabetesfoodhub.org/recipes/lunch or custom category pages.
  • Data Priority: Leverages JSON-LD structured data for accurate extraction, with robust HTML parsing as backup.
  • Comprehensive Details: Captures titles, descriptions, ingredients lists, step-by-step instructions, prep/cook times, servings, images, and detailed nutrition information.
  • Pagination Support: Automatically navigates through listing pages to collect the desired number of recipes.
  • Flexible Input: Supports search queries, custom start URLs, or category-based scraping for various recipe types.
  • Proxy Integration: Uses Apify Proxy to handle rate limits and access restrictions.
  • Clean Output: Produces normalized, ready-to-use JSON records for easy integration into apps, databases, or analysis tools.
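
The "JSON-LD first, HTML fallback" priority described above can be illustrated with a minimal sketch. This is not the actor's actual code: the function names, the use of the first <h1> as the fallback title, and the extracted_from key are assumptions made only for illustration, and only the standard library is used.

```python
import json
from html.parser import HTMLParser


class _RecipePageParser(HTMLParser):
    """Collects JSON-LD script bodies and the text of the first <h1>."""

    def __init__(self):
        super().__init__()
        self.jsonld_blocks = []
        self.h1_text = None
        self._in_jsonld = False
        self._in_h1 = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld, self._buffer = True, []
        elif tag == "h1" and self.h1_text is None:
            self._in_h1, self._buffer = True, []

    def handle_data(self, data):
        if self._in_jsonld or self._in_h1:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.jsonld_blocks.append("".join(self._buffer))
            self._in_jsonld = False
        elif tag == "h1" and self._in_h1:
            self.h1_text = "".join(self._buffer).strip()
            self._in_h1 = False


def _find_recipe(node):
    """Depth-first search for an object whose @type includes "Recipe"."""
    if isinstance(node, list):
        for item in node:
            found = _find_recipe(item)
            if found:
                return found
    elif isinstance(node, dict):
        types = node.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if "Recipe" in types:
            return node
        return _find_recipe(node.get("@graph", []))
    return None


def extract_recipe(html):
    """Prefer JSON-LD; fall back to plain HTML when no Recipe object exists."""
    parser = _RecipePageParser()
    parser.feed(html)
    for block in parser.jsonld_blocks:
        try:
            recipe = _find_recipe(json.loads(block))
        except json.JSONDecodeError:
            continue  # malformed block: try the next one
        if recipe:
            return {
                "title": recipe.get("name"),
                "ingredients": recipe.get("recipeIngredient", []),
                "extracted_from": "json-ld",  # hypothetical key, for illustration
            }
    # Fallback (assumed): use the first <h1> as the title.
    return {"title": parser.h1_text, "ingredients": [], "extracted_from": "html"}
```

A real fallback would need site-specific selectors for ingredients and instructions; the sketch only shows the order in which the two sources are tried.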

Input

Configure the actor via the Apify input interface or JSON payload. Below is a sample configuration:

{
  "searchQuery": "chicken",
  "startUrls": [
    {
      "url": "https://diabetesfoodhub.org/recipes/roasted-veggie-grain-bowl-seared-tilapia-herbed-yogurt-sauce"
    },
    {
      "url": "https://diabetesfoodhub.org/recipes/lunch"
    }
  ],
  "category": "lunch",
  "results_wanted": 100,
  "max_pages": 30,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

Input Fields

  • searchQuery (string, optional): Keyword to search for recipes (e.g., "chicken", "salad"). If provided, starts from search results and overrides category/startUrls.
  • startUrls (array of objects, optional): List of URLs to start scraping from. Each object should have a url key. Supports listing pages (e.g., category URLs) or individual recipe pages. If provided, overrides category and search.
  • category (string, default: "lunch"): The recipe category slug (e.g., "lunch", "breakfast", "dinner"). Used if searchQuery and startUrls are empty.
  • results_wanted (integer, default: 100): Maximum number of recipes to collect before stopping.
  • max_pages (integer, default: 30): Maximum number of listing pages to visit as a safety limit.
  • proxyConfiguration (object, recommended): Proxy settings to avoid blocking. Enable Apify Proxy for best results.
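
As a rough sketch, the input payload can be assembled programmatically with the defaults listed above. The build_input helper is hypothetical, and the apifyProxyGroups key follows the usual Apify proxy input convention, which is assumed (not confirmed by this page) to apply to this actor.

```python
import json


def build_input(search_query=None, start_urls=None, category="lunch",
                results_wanted=100, max_pages=30, proxy_groups=None):
    """Build an input payload using the documented field names and defaults."""
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("results_wanted and max_pages must be positive")
    payload = {
        "category": category,
        "results_wanted": results_wanted,
        "max_pages": max_pages,
        "proxyConfiguration": {"useApifyProxy": True},
    }
    # Optional fields are omitted entirely when not supplied.
    if search_query:
        payload["searchQuery"] = search_query
    if start_urls:
        payload["startUrls"] = [{"url": url} for url in start_urls]
    if proxy_groups:
        # Assumed key name, based on the common Apify proxy input format.
        payload["proxyConfiguration"]["apifyProxyGroups"] = list(proxy_groups)
    return payload


if __name__ == "__main__":
    print(json.dumps(build_input(category="breakfast", results_wanted=20), indent=2))
```

The printed JSON can be pasted into the Apify input interface or sent as the run payload.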

Output

The actor saves results to the default Apify dataset. Each item is a JSON object with the following structure:

{
  "title": "Turkey Spinach Wrap",
  "description": "A quick and healthy lunch wrap packed with protein and veggies.",
  "ingredients": [
    "2 cups fresh spinach",
    "4 oz sliced turkey breast",
    "1 whole wheat tortilla"
  ],
  "instructions": [
    "Lay the tortilla flat on a plate.",
    "Spread the turkey evenly over the tortilla.",
    "Add the spinach on top.",
    "Roll the tortilla tightly and cut in half."
  ],
  "prep_time": "10 minutes",
  "cook_time": null,
  "total_time": "10 minutes",
  "servings": "2",
  "image": "https://diabetesfoodhub.org/sites/default/files/styles/recipe_image/public/recipe-images/TurkeySpinachWrap.jpg",
  "nutrition": {
    "calories": "220",
    "carbohydrateContent": "18g",
    "proteinContent": "15g",
    "fatContent": "8g"
  },
  "tags": "Lunch",
  "category": "lunch",
  "url": "https://diabetesfoodhub.org/recipes/turkey-spinach-wrap",
  "source": "diabetesfoodhub.org"
}

Field Descriptions

  • title: Recipe name.
  • description: Brief overview of the recipe.
  • ingredients: Array of ingredient strings.
  • instructions: Array of step-by-step directions.
  • prep_time, cook_time, total_time: Time strings (e.g., "10 minutes").
  • servings: Number of servings as a string.
  • image: URL to the recipe image.
  • nutrition: Object with nutritional facts (varies by recipe).
  • tags: Recipe tags or categories.
  • category: The scraped category.
  • url: Full URL to the recipe page.
  • source: Data source identifier.
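
Because times, servings, and nutrition values arrive as strings, downstream analysis usually needs a conversion step. The following is a minimal sketch of one way to do that. The helper names and the output keys (total_minutes, carbs_g) are hypothetical, and the time parser assumes phrasing like "10 minutes" or "1 hour 15 minutes", which may not cover every recipe.

```python
import re


def parse_amount(value):
    """Return the leading number of strings such as "18g" or "220", else None."""
    if value is None:
        return None
    match = re.match(r"\s*(\d+(?:\.\d+)?)", str(value))
    return float(match.group(1)) if match else None


def parse_minutes(value):
    """Convert strings like "10 minutes" or "1 hour 15 minutes" to minutes."""
    if not value:
        return None
    total, found = 0, False
    for number, unit in re.findall(r"(\d+)\s*(hour|hr|minute|min)", value.lower()):
        total += int(number) * (60 if unit in ("hour", "hr") else 1)
        found = True
    return total if found else None


def summarize(item):
    """Reduce one dataset item to a few numeric fields for analysis."""
    nutrition = item.get("nutrition") or {}
    servings = str(item.get("servings", ""))
    return {
        "title": item.get("title"),
        "servings": int(servings) if servings.isdigit() else None,
        "total_minutes": parse_minutes(item.get("total_time")),
        "calories": parse_amount(nutrition.get("calories")),
        "carbs_g": parse_amount(nutrition.get("carbohydrateContent")),
    }
```

Missing or null fields, such as cook_time in the sample above, come back as None instead of raising an error, which matches the note that not every recipe includes full details.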

Usage

  1. Access the Actor: Find and open the "Diabetes Food Hub Recipes Scraper" on the Apify platform.
  2. Configure Input:
    • For lunch recipes, leave searchQuery and startUrls empty and keep category as "lunch".
    • To scrape other categories, change category to "breakfast", "dinner", etc., or provide specific URLs in startUrls.
    • Set results_wanted to your desired number of recipes.
    • Enable proxyConfiguration for reliable scraping.
  3. Run the Actor: Click "Run" to start the scraping process.
  4. Monitor Progress: Check the actor's run log for status updates.
  5. Download Results: Once the run completes, export the dataset as JSON, CSV, or Excel, or access it via the API.
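
For step 5, results can also be pulled programmatically. The sketch below builds a request to the Apify API's dataset items endpoint using only the standard library. The helper names are hypothetical, the dataset ID and token must come from your own run and account, and the endpoint shape should be checked against the current Apify API reference before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json", limit=None, token=None):
    """Build the URL for downloading dataset items in the given format."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    if token:
        params["token"] = token
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id, token=None):
    """Download all items as a list of dicts (performs a network request)."""
    with urlopen(dataset_items_url(dataset_id, token=token)) as response:
        return json.load(response)
```

Calling dataset_items_url("<your-dataset-id>", fmt="csv") yields a link that can be opened directly in a browser or passed to another tool.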

Example Use Cases

  • Meal Planning Apps: Supply diabetes-friendly recipes to apps that serve people managing diabetes.
  • Nutritional Analysis: Use extracted nutrition data for diet tracking.
  • Content Aggregation: Build a database of healthy lunch ideas.
  • Research: Collect data for studies on diabetic meal options.

Configuration Tips

  • Handling Large Volumes: Increase max_pages if you need more recipes, but monitor for rate limits.
  • Custom Categories: Use startUrls for non-standard pages or subcategories.
  • Proxy Settings: For international access, specify a proxy group such as "RESIDENTIAL" in proxyConfiguration.
  • Data Quality: Recipes with JSON-LD provide the most accurate data; HTML fallback ensures coverage.

Notes

  • The scraper respects the site's structure and avoids overloading servers.
  • Nutrition information is extracted as available; not all recipes include full details.
  • For best performance, run during off-peak hours and use proxies.
  • If the website changes its layout, the actor may need minor adjustments (contact support if issues persist).

Changelog

  • 1.0.0: Initial release with JSON-LD priority, HTML fallback, pagination, and full recipe extraction for Diabetes Food Hub.