# italy-grocery-deals-weekly-ads-scraper (`ayrtondavoli97/italy-grocery-deals-weekly-ads-scraper`) Actor

Extract current grocery deals from major Italian chains, including Conad, Lidl, Aldi, Eurospin, MD, Famila/Iperfamila and iN’s Mercato. Get structured product names, prices, validity dates, flyer references, images and data-quality metadata for analytics and price monitoring.

- **URL**: https://apify.com/ayrtondavoli97/italy-grocery-deals-weekly-ads-scraper.md
- **Developed by:** [Francesco Ayrton Davoli](https://apify.com/ayrtondavoli97) (community)
- **Categories:** AI, E-commerce, Integrations
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $1.50 / 1,000 results

This Actor is paid per event and usage. You are charged a fixed price for specific events, plus the cost of Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Italy Grocery Deals & Weekly Ads Scraper

Apify Actor for extracting current promotional offers from Italian grocery flyers into a structured dataset suitable for comparison, price monitoring and downstream analytics.

### Current coverage

The Actor automatically detects active flyers and extracts products from structured flyer-offer APIs where available. Clean structured output is the default behaviour.

| Chain | Current extraction quality | Validated current products* |
| --- | --- | ---: |
| Famila / Iperfamila | Structured API offers — validated | 1,126 |
| Conad Superstore | Structured API offers — validated | 360 |
| Eurospin | Structured API offers — validated | 205 |
| Aldi | Structured API offers — validated | 202 |
| Lidl | Structured API offers — validated | 143 |
| iN's Mercato | Structured API offers — validated | 124 |
| MD Discount | Structured API offers — validated | 87 |
| Esselunga | Preview fallback only when available; excluded by default | — |

*Validated on 2026-06-04. Product totals depend on currently active flyers and can change at each run.

A clean structured validation run confirmed at least **2,247 API-quality current promotional products** across the seven supported grocery chain sources. Famila/Iperfamila was validated separately at its complete current catalogue size after an earlier all-chain run was intentionally capped at 1,000 records for that chain.
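
The total above can be reproduced from the coverage table. The following is only an arithmetic check of the published figures; the second sum illustrates, as an assumption about how a 1,000-record per-chain cap would behave, why Famila/Iperfamila had to be validated separately.

```python
# Validated product counts copied from the coverage table (2026-06-04).
validated_counts = {
    "Famila / Iperfamila": 1126,
    "Conad Superstore": 360,
    "Eurospin": 205,
    "Aldi": 202,
    "Lidl": 143,
    "iN's Mercato": 124,
    "MD Discount": 87,
}

# Sum across the seven structured sources.
total = sum(validated_counts.values())

# Same sum if every chain were capped at 1,000 records
# (assumed cap behaviour, for illustration only).
capped_total = sum(min(count, 1000) for count in validated_counts.values())

print(total)         # 2247
print(capped_total)  # 2121
```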

### Clean output by default

The default input enables `structuredOnly: true`. This excludes preview-only records from the dataset, keeping standard output suitable for data pipelines and price monitoring.

Set `structuredOnly` to `false` only when you intentionally want to include marked fallback records:

```json
{
  "catena": "tutti",
  "structuredOnly": false,
  "maxItemsPerChain": 2000,
  "maxTotalItems": 10000
}
````

Fallback records are always identified with:

```json
{
  "extractionSource": "preview_text",
  "dataQuality": "preview_fallback"
}
```
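
If you run with `structuredOnly: false`, you may want to separate the two record types downstream. A minimal sketch, assuming records are plain dictionaries as returned by the dataset API; the helper name `split_by_quality` and the sample product names are hypothetical.

```python
def split_by_quality(records):
    """Split dataset records into (structured, fallback) lists.

    A record counts as fallback if either marker documented above is present.
    """
    structured, fallback = [], []
    for record in records:
        is_fallback = (
            record.get("dataQuality") == "preview_fallback"
            or record.get("extractionSource") == "preview_text"
        )
        (fallback if is_fallback else structured).append(record)
    return structured, fallback


# Hypothetical sample records, not real Actor output.
sample = [
    {"name": "Example pasta", "extractionSource": "offers_api", "dataQuality": "structured_api"},
    {"name": "Example preview item", "extractionSource": "preview_text", "dataQuality": "preview_fallback"},
]

structured, fallback = split_by_quality(sample)
print(len(structured), len(fallback))  # 1 1
```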

### Output fields

Each structured dataset record can include:

| Field | Description |
| --- | --- |
| `name` | Product name |
| `catena` | Store chain display name; the Famila source can return both `Famila` and `Iperfamila` |
| `chainSlug` | Stable source identifier |
| `priceOffer` | Promotional price as source-compatible text |
| `priceOfferValue` | Promotional price as a numeric value for analytics |
| `priceOriginal` | Original price text, where supplied by the source |
| `priceOriginalValue` | Original price numeric value, where supplied by the source |
| `currency` | Currency code, `EUR` |
| `country` | Country code, `IT` |
| `format` | Quantity or packaging format |
| `validFrom`, `validTo`, `validity` | Flyer validity period when detectable |
| `flyerId`, `pageNumber`, `offerId` | Source flyer/product references |
| `img` | Flyer page or product image URL where supplied |
| `sourcePageUrl` | Chain flyer source page |
| `offersApiUrl` | API endpoint used for structured extraction |
| `extractionSource` | `offers_api` or `preview_text` |
| `dataQuality` | `structured_api` or `preview_fallback` |
| `offerKey` | Stable key for one current promotional offer |
| `productFingerprint` | Product-matching helper for historical comparison |
| `scrapedAt` | ISO timestamp of the Actor run |
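
As an illustration of how `offerKey` and `priceOfferValue` might support price monitoring, the sketch below compares the records of two runs. It assumes that `offerKey` stays identical for the same offer across runs, which the table suggests but does not guarantee; `diff_runs` and the sample keys are hypothetical.

```python
def diff_runs(previous, current):
    """Compare two lists of records by offerKey.

    Returns offer keys that are new, that ended, or whose
    promotional price changed between the two runs.
    """
    prev = {r["offerKey"]: r for r in previous}
    curr = {r["offerKey"]: r for r in current}
    shared = prev.keys() & curr.keys()
    return {
        "new": sorted(curr.keys() - prev.keys()),
        "ended": sorted(prev.keys() - curr.keys()),
        "price_changed": sorted(
            key for key in shared
            if prev[key].get("priceOfferValue") != curr[key].get("priceOfferValue")
        ),
    }


# Hypothetical sample data, not real Actor output.
last_week = [
    {"offerKey": "a", "priceOfferValue": 1.0},
    {"offerKey": "b", "priceOfferValue": 2.0},
]
this_week = [
    {"offerKey": "b", "priceOfferValue": 1.5},
    {"offerKey": "c", "priceOfferValue": 3.0},
]

print(diff_runs(last_week, this_week))
# {'new': ['c'], 'ended': ['a'], 'price_changed': ['b']}
```

For matching the same product across different flyers rather than the same offer, `productFingerprint` would be the field to key on instead.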

### Recommended all-chain input

```json
{
  "catena": "tutti",
  "keyword": "",
  "structuredOnly": true,
  "maxItemsPerChain": 2000,
  "maxTotalItems": 10000,
  "diagnosticMode": false,
  "investigateZeroResults": false
}
```

#### Limit behaviour

For a single selected chain, use `maxItems`.

For `catena: "tutti"`, use:

- `maxItemsPerChain`: maximum products retained from each supermarket, preventing one large catalogue from exhausting the output before other chains are processed. The recommended value is `2000`, which is above the currently validated Famila/Iperfamila catalogue of 1,126 records.
- `maxTotalItems`: overall run safety ceiling.

Example: `maxItemsPerChain: 100` gives each available chain room to return up to 100 products rather than letting the first large chain consume a single global quota.
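
The interaction of the two limits can be modelled roughly as follows. This is a simplified sketch of the behaviour described above, not the Actor's implementation: the processing order, the `apply_limits` helper and the truncation strategy are assumptions.

```python
def apply_limits(products_by_chain, max_items_per_chain, max_total_items):
    """Keep at most max_items_per_chain products per chain,
    and at most max_total_items products overall."""
    kept = {}
    remaining = max_total_items
    for chain, products in products_by_chain.items():
        # Once the global ceiling is reached, later chains get nothing
        # (compare the skipped_total_limit_reached status).
        take = max(0, min(len(products), max_items_per_chain, remaining))
        kept[chain] = list(products[:take])
        remaining -= take
    return kept


# Catalogue sizes taken from the coverage table; the items are placeholders.
catalogues = {
    "famila": list(range(1126)),
    "conad": list(range(360)),
    "lidl": list(range(143)),
}

per_chain = apply_limits(catalogues, max_items_per_chain=100, max_total_items=10000)
print({chain: len(items) for chain, items in per_chain.items()})
# {'famila': 100, 'conad': 100, 'lidl': 100}

global_only = apply_limits(catalogues, max_items_per_chain=2000, max_total_items=100)
print({chain: len(items) for chain, items in global_only.items()})
# {'famila': 100, 'conad': 0, 'lidl': 0}
```

The second call shows the problem the per-chain cap is meant to avoid: with only a small global quota, the first large chain consumes everything.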

### Product keyword filter

Use `keyword` to filter product names, for example:

```json
{
  "catena": "tutti",
  "keyword": "birra",
  "structuredOnly": true,
  "maxItemsPerChain": 2000,
  "maxTotalItems": 10000
}
```

The legacy `categoria` input remains accepted by the code for backward compatibility, but the current source does not consistently provide true category metadata. `keyword` is therefore the accurate public filter.

### Run summary

Every run stores a `RUN_SUMMARY` JSON record in the default key-value store. It includes total products saved, limit settings, quality mode and per-chain extraction status.

Possible statuses include:

- `structured_api`: clean structured products were obtained.
- `preview_fallback`: fallback records were included because `structuredOnly` was disabled.
- `preview_fallback_excluded`: preview products existed but were excluded from clean output.
- `no_active_flyers_detected`: no supported active flyer cards were detected.
- `active_flyers_without_structured_offers`: flyers exist but no structured products were returned.
- `skipped_total_limit_reached`: the global safety ceiling was reached before this chain was processed.
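
A monitoring job could group chains by these statuses. The exact layout of `RUN_SUMMARY` is not documented here, so the sketch below assumes you have already reduced it to a plain `{chain: status}` mapping; the bucket names and the `triage` helper are hypothetical.

```python
# Statuses listed above, grouped into hypothetical follow-up buckets.
BUCKETS = {
    "structured_api": "clean",
    "preview_fallback": "fallback",
    "preview_fallback_excluded": "fallback",
    "no_active_flyers_detected": "empty",
    "active_flyers_without_structured_offers": "empty",
    "skipped_total_limit_reached": "skipped",
}


def triage(status_by_chain):
    """Group chain names by bucket; unlisted statuses go to 'unknown'."""
    report = {"clean": [], "fallback": [], "empty": [], "skipped": [], "unknown": []}
    for chain, status in sorted(status_by_chain.items()):
        report[BUCKETS.get(status, "unknown")].append(chain)
    return report


# Hypothetical statuses for illustration only.
example = {
    "lidl": "structured_api",
    "esselunga": "preview_fallback_excluded",
    "md": "skipped_total_limit_reached",
}
print(triage(example))
# {'clean': ['lidl'], 'fallback': ['esselunga'], 'empty': [], 'skipped': ['md'], 'unknown': []}
```

A non-empty `skipped` bucket would suggest raising `maxTotalItems`.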

### Diagnostics

For development and source validation only:

- `diagnosticMode: true` stores API response diagnostics for structured flyers.
- `investigateZeroResults: true` stores page HTML, screenshot, DOM audit and click probes only for chains with zero structured API products.

### Notes

This Actor extracts publicly displayed promotional information through the flyer source experience. Results represent current visible or structured offers at run time; availability and completeness depend on active flyers and source-site coverage.

# Actor input Schema

## `catena` (type: `string`):

Select one chain or scrape all supported grocery chains.

## `keyword` (type: `string`):

Return only products whose name or source category contains the term, for example: pasta, birra, latte, detersivo.

## `structuredOnly` (type: `boolean`):

Recommended. Excludes preview-only fallback records, such as currently available Esselunga preview items, from the output dataset.

## `maxItems` (type: `integer`):

Used only when you select a single supermarket chain.

## `maxItemsPerChain` (type: `integer`):

Used when scraping all grocery chains (`catena: "tutti"`). The default is above the currently validated Famila catalogue size, so large chains are not silently cut while each chain still has its own safety cap.

## `maxTotalItems` (type: `integer`):

Safety ceiling for an all-chains run. Keep this greater than the expected sum of products across chains for full coverage.

## `diagnosticMode` (type: `boolean`):

Stores API response diagnostics for structured flyers. Use for development only.

## `investigateZeroResults` (type: `boolean`):

Development mode: for chains returning zero API products, saves page HTML, screenshot, DOM audit and click probes. Does not change extracted output.

## `proxyConfig` (type: `object`):

Optional Apify Proxy configuration.

## Actor input object example

```json
{
  "catena": "lidl",
  "keyword": "",
  "structuredOnly": true,
  "maxItems": 200,
  "maxItemsPerChain": 2000,
  "maxTotalItems": 10000,
  "diagnosticMode": false,
  "investigateZeroResults": false,
  "proxyConfig": {
    "useApifyProxy": true
  }
}
```
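
Before starting a run, an input object can be checked locally against the constraints published in the OpenAPI specification on this page (the `catena` enum and the integer ranges). A minimal sketch; `validate_input` is a hypothetical helper, and the platform performs its own validation regardless.

```python
# Allowed values and ranges copied from the input schema on this page.
ALLOWED_CHAINS = {
    "tutti", "conad", "lidl", "eurospin", "md",
    "aldi", "famila", "ins", "esselunga",
}
INTEGER_BOUNDS = {
    "maxItems": (1, 10000),
    "maxItemsPerChain": (1, 10000),
    "maxTotalItems": (1, 50000),
}


def validate_input(actor_input):
    """Return a list of human-readable problems; empty means the input looks valid."""
    errors = []
    chain = actor_input.get("catena", "tutti")
    if chain not in ALLOWED_CHAINS:
        errors.append(f"catena: unsupported value {chain!r}")
    for field, (low, high) in INTEGER_BOUNDS.items():
        if field not in actor_input:
            continue  # the schema supplies a default
        value = actor_input[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{field}: expected an integer")
        elif not low <= value <= high:
            errors.append(f"{field}: {value} is outside {low}..{high}")
    return errors


print(validate_input({"catena": "lidl", "maxItems": 200}))  # []
print(validate_input({"catena": "coop", "maxTotalItems": 60000}))
```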

# Actor output Schema

## `deals` (type: `string`):

Structured current promotional products extracted from Italian grocery flyers. Use the dataset view for table display or retrieve all items via API.

## `runSummary` (type: `string`):

JSON summary containing total output count, quality mode, per-chain coverage, extraction status and limit configuration.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "catena": "lidl",
    "keyword": "",
    "structuredOnly": true,
    "maxItems": 200,
    "maxItemsPerChain": 2000,
    "maxTotalItems": 10000,
    "diagnosticMode": false,
    "investigateZeroResults": false,
    "proxyConfig": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("ayrtondavoli97/italy-grocery-deals-weekly-ads-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "catena": "lidl",
    "keyword": "",
    "structuredOnly": True,
    "maxItems": 200,
    "maxItemsPerChain": 2000,
    "maxTotalItems": 10000,
    "diagnosticMode": False,
    "investigateZeroResults": False,
    "proxyConfig": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("ayrtondavoli97/italy-grocery-deals-weekly-ads-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
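
Once the items are fetched, the numeric price fields can be used directly for analysis. The sketch below computes a discount percentage from `priceOfferValue` and `priceOriginalValue`; the `discount_percent` helper and the sample record are hypothetical, and records without an original price simply yield `None`.

```python
def discount_percent(record):
    """Return the discount as a percentage rounded to one decimal,
    or None when either price is missing or unusable."""
    offer = record.get("priceOfferValue")
    original = record.get("priceOriginalValue")
    if not isinstance(offer, (int, float)) or not isinstance(original, (int, float)):
        return None
    if original <= 0:
        return None
    return round((original - offer) / original * 100, 1)


# Hypothetical record for illustration, not real Actor output.
item = {"name": "Example product", "priceOfferValue": 1.5, "priceOriginalValue": 2.0}
print(discount_percent(item))                      # 25.0
print(discount_percent({"priceOfferValue": 1.5}))  # None
```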

## CLI example

```bash
echo '{
  "catena": "lidl",
  "keyword": "",
  "structuredOnly": true,
  "maxItems": 200,
  "maxItemsPerChain": 2000,
  "maxTotalItems": 10000,
  "diagnosticMode": false,
  "investigateZeroResults": false,
  "proxyConfig": {
    "useApifyProxy": true
  }
}' |
apify call ayrtondavoli97/italy-grocery-deals-weekly-ads-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=ayrtondavoli97/italy-grocery-deals-weekly-ads-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "italy-grocery-deals-weekly-ads-scraper",
        "description": "Extract current grocery deals from major Italian chains, including Conad, Lidl, Aldi, Eurospin, MD, Famila/Iperfamila and iN’s Mercato. Get structured product names, prices, validity dates, flyer references, images and data-quality metadata for analytics and price monitoring.",
        "version": "0.0",
        "x-build-id": "YdD0zrJbr5YUzyGmc"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/ayrtondavoli97~italy-grocery-deals-weekly-ads-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-ayrtondavoli97-italy-grocery-deals-weekly-ads-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/ayrtondavoli97~italy-grocery-deals-weekly-ads-scraper/runs": {
            "post": {
                "operationId": "runs-sync-ayrtondavoli97-italy-grocery-deals-weekly-ads-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/ayrtondavoli97~italy-grocery-deals-weekly-ads-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-ayrtondavoli97-italy-grocery-deals-weekly-ads-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "catena": {
                        "title": "Supermarket chain",
                        "enum": [
                            "tutti",
                            "conad",
                            "lidl",
                            "eurospin",
                            "md",
                            "aldi",
                            "famila",
                            "ins",
                            "esselunga"
                        ],
                        "type": "string",
                        "description": "Select one chain or scrape all supported grocery chains.",
                        "default": "tutti"
                    },
                    "keyword": {
                        "title": "Product keyword filter (optional)",
                        "type": "string",
                        "description": "Return only products whose name or source category contains the term, for example: pasta, birra, latte, detersivo.",
                        "default": ""
                    },
                    "structuredOnly": {
                        "title": "Return only structured API-quality records",
                        "type": "boolean",
                        "description": "Recommended. Excludes preview-only fallback records, such as currently available Esselunga preview items, from the output dataset.",
                        "default": true
                    },
                    "maxItems": {
                        "title": "Maximum products for a single selected chain",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Used only when you select a single supermarket chain.",
                        "default": 2000
                    },
                    "maxItemsPerChain": {
                        "title": "Maximum products per chain when scraping all",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Used for All grocery chains. The default is above the currently validated Famila catalogue size, so large chains are not silently cut while each chain still has its own safety cap.",
                        "default": 2000
                    },
                    "maxTotalItems": {
                        "title": "Maximum total products when scraping all",
                        "minimum": 1,
                        "maximum": 50000,
                        "type": "integer",
                        "description": "Safety ceiling for an all-chains run. Keep this greater than the expected sum of products across chains for full coverage.",
                        "default": 10000
                    },
                    "diagnosticMode": {
                        "title": "Save API diagnostics",
                        "type": "boolean",
                        "description": "Stores API response diagnostics for structured flyers. Use for development only.",
                        "default": false
                    },
                    "investigateZeroResults": {
                        "title": "Investigate chains without structured products",
                        "type": "boolean",
                        "description": "Development mode: for chains returning zero API products, saves page HTML, screenshot, DOM audit and click probes. Does not change extracted output.",
                        "default": false
                    },
                    "proxyConfig": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional Apify Proxy configuration.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
