# Lidl Product Scraper (`shahidirfan/lidl-product-scraper`) Actor

Scrape Lidl's full product database instantly. Extract pricing, descriptions, categories, availability & images. Build price comparison tools, monitor competitor data, power retail analytics, create AI datasets & track market trends. European groceries.

- **URL**: https://apify.com/shahidirfan/lidl-product-scraper.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Lidl Product Scraper

Extract Lidl US product data from search pages, product listings, and individual product pages. Collect current pricing, images, GTINs, stock status, merchandising details, and descriptive product content in a structured dataset for research, catalog monitoring, and assortment analysis.

### Features

- **Keyword and URL input** — Start from a search keyword or a Lidl product URL
- **Supports multiple Lidl page types** — Search pages, product list pages, category-backed list pages, and product detail pages
- **Rich product coverage** — Collect pricing, package details, images, stock status, and product descriptions
- **Built-in pagination control** — Limit collection by result count and maximum pages
- **Clean datasets** — Omits null-only values so exports stay easier to use

### Use Cases

#### Assortment Monitoring
Track Lidl product availability, prices, and merchandising placement across repeated runs. Build structured snapshots of changing product catalogs over time.

#### Price Intelligence
Collect current price, regular price, promotional price, and base-price text for downstream analysis. Compare pricing changes across searches, categories, or recurring products.
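
As an illustration of comparing prices across runs, here is a minimal Python sketch. It assumes the dataset items from two runs have already been loaded as lists of dicts and that `product_id` stays stable between runs, which this README does not confirm. The function name `price_changes` is hypothetical.

```python
from typing import Any


def price_changes(
    previous: list[dict[str, Any]],
    current: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Return products whose current_price differs between two snapshots."""
    # Index the older snapshot by product_id. Items without an id or a price
    # are skipped because the Actor omits fields that have no value.
    old_prices = {
        item["product_id"]: item["current_price"]
        for item in previous
        if "product_id" in item and "current_price" in item
    }
    changes = []
    for item in current:
        product_id = item.get("product_id")
        new_price = item.get("current_price")
        if product_id not in old_prices or new_price is None:
            continue  # product is new, or there is no price to compare
        old_price = old_prices[product_id]
        if new_price != old_price:
            changes.append({
                "product_id": product_id,
                "name": item.get("name"),
                "old_price": old_price,
                "new_price": new_price,
                "delta": round(new_price - old_price, 2),
            })
    return changes
```

Products that appear in only one snapshot are ignored here; tracking additions and removals would need a separate pass.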

#### Product Research
Gather product descriptions, package types, origin details, and GTINs for catalog enrichment. Use the output in internal product databases or analytics workflows.

#### Competitive Analysis
Monitor how Lidl structures search results and broad product listings. Compare assortment breadth and packaging across categories relevant to your business.

---

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | String | No | `https://www.lidl.com/search/products/milk` | Lidl search, product list, product detail, or supported mobile API URL |
| `keyword` | String | No | `milk` | Search term used when `url` is not provided |
| `results_wanted` | Integer | No | `20` | Maximum number of products to collect |
| `max_pages` | Integer | No | `5` | Safety cap on paginated requests |
| `proxyConfiguration` | Object | No | `{"useApifyProxy": false}` | Optional proxy settings |

At least one of `url` or `keyword` must be provided. If both are supplied, `url` takes priority.
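
The rule above can be checked on the client side before starting a run. The following Python sketch is one possible way to do it; the helper name `build_run_input` is hypothetical. The minimum of 1 for `results_wanted` and `max_pages` comes from the input schema in the OpenAPI specification. Sending only `url` when both values are given is a design choice that mirrors the documented priority, not something the Actor requires.

```python
from typing import Any, Optional


def build_run_input(
    url: Optional[str] = None,
    keyword: Optional[str] = None,
    results_wanted: int = 20,
    max_pages: int = 5,
) -> dict[str, Any]:
    """Build an input object for the Actor, enforcing the documented rules."""
    if not url and not keyword:
        raise ValueError("Provide at least one of 'url' or 'keyword'.")
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("'results_wanted' and 'max_pages' must be at least 1.")

    run_input: dict[str, Any] = {
        "results_wanted": results_wanted,
        "max_pages": max_pages,
    }
    # `url` takes priority, so the keyword is only sent when no URL is given.
    if url:
        run_input["url"] = url
    else:
        run_input["keyword"] = keyword
    return run_input
```

For example, `build_run_input(keyword="milk", max_pages=3)` produces an input equivalent to the keyword search example in the Usage Examples section.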

---

### Output Data

Each dataset item can contain:

| Field | Type | Description |
|-------|------|-------------|
| `rank` | Integer | Position of the product in the collected output |
| `source_type` | String | Input mode used for this record |
| `source_url` | String | Lidl page or API URL used as the source |
| `source_keyword` | String | Search keyword, when applicable |
| `source_total_results` | Integer | Total products reported by the source |
| `store_id` | String | Store identifier used for price context |
| `product_id` | String | Lidl product ID |
| `item_id` | String | Lidl item ID |
| `product_url` | String | Lidl product detail URL |
| `name` | String | Product name |
| `gtin` | String | Primary GTIN |
| `alternative_gtins` | Array | Additional GTIN values |
| `description` | String | Short package description |
| `long_description_html` | String | Rich product description |
| `long_description_text` | String | Plain-text product description |
| `primary_image_url` | String | Main product image |
| `image_urls` | Array | All product image URLs |
| `image_ids` | Array | Product image asset IDs |
| `category_ids` | Array | Category IDs associated with the product |
| `brand_names` | Array | Brand names when present |
| `quantity` | Number | Package quantity |
| `package_type` | String | Package type code |
| `weight` | Number | Weight when available |
| `height` | Number | Height when available |
| `length` | Number | Length when available |
| `volume` | Number | Volume when available |
| `origin_country` | String | Country of origin |
| `origin_region` | String | Region of origin |
| `allergens` | Array | Listed allergens |
| `ingredients` | Array | Listed ingredients |
| `current_price` | Number | Current price |
| `regular_price` | Number | Regular price |
| `promotion_price` | Number | Promotional price |
| `mylidl_price` | Number | myLidl member price |
| `upcoming_price` | Number | Upcoming price |
| `currency` | String | Currency code |
| `base_price_text` | String | Base-price text such as per-ounce cost |
| `base_quantity_value` | Number | Base quantity value |
| `base_quantity_unit` | String | Base quantity unit |
| `current_price_type` | String | Price type label |
| `current_price_start_date` | String | Price start date |
| `current_price_end_date` | String | Price end date |
| `stock_status_code` | String | Stock status |
| `merchandising_module` | String | Merchandising module description |
| `aisle` | Number | Aisle number |
| `tags` | Array | Product tags |
| `contains_alcohol` | Boolean | Whether the product contains alcohol |
| `bogo_info` | Object | Buy-one-get-one information when present |

Only fields with real values are saved.
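
Because fields without values are omitted, two items from the same run can have different keys. The Python sketch below shows one way to map items onto a fixed set of columns for tabular processing. The column selection and the names `COLUMNS` and `to_row` are illustrative assumptions, not part of the Actor.

```python
from typing import Any, Optional, Sequence

# Hypothetical column selection taken from the documented output fields.
COLUMNS = (
    "product_id",
    "name",
    "current_price",
    "regular_price",
    "promotion_price",
    "currency",
    "stock_status_code",
)


def to_row(
    item: dict[str, Any],
    columns: Sequence[str] = COLUMNS,
) -> dict[str, Optional[Any]]:
    """Return a dict with every requested column, using None for omitted fields."""
    return {column: item.get(column) for column in columns}
```

Using `item.get(...)` rather than `item[...]` avoids a `KeyError` when a product has, for instance, no `promotion_price`.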

---

### Usage Examples

#### Search by Keyword

```json
{
    "keyword": "milk",
    "results_wanted": 20,
    "max_pages": 3
}
````

#### Start from a Lidl Search URL

```json
{
    "url": "https://www.lidl.com/search/products/milk",
    "results_wanted": 30,
    "max_pages": 4
}
```

#### Collect from a Lidl Product Listing

```json
{
    "url": "https://www.lidl.com/products?category=all&sort=productAtoZ",
    "results_wanted": 25,
    "max_pages": 2
}
```

#### Collect a Single Product Page

```json
{
    "url": "https://www.lidl.com/products/1257738"
}
```

### Sample Output

```json
{
    "rank": 1,
    "source_type": "searchUrl",
    "source_url": "https://www.lidl.com/search/products/milk",
    "source_keyword": "milk",
    "source_total_results": 288,
    "store_id": "US01053",
    "product_id": "1257738",
    "item_id": "245421",
    "product_url": "https://www.lidl.com/products/1257738",
    "name": "fairlife fat free ultra-filtered milk",
    "gtin": "0856312002757",
    "alternative_gtins": [
        "0856312002757",
        "856312002757"
    ],
    "description": "52 fl. oz.",
    "long_description_text": "Next time you pour a nice glass of cold milk, be confident about the quality of what you’re drinking. With 50% less sugar than regular milk, 13 grams of high quality protein per serving and no artificial growth hormones, fairlife fat free ultra-filtered milk is the choice to make when you want the best.",
    "primary_image_url": "https://production-endpoint.azureedge.net/images/74P32D1G6LFJAC1GF0QJ0C0/224fc2e6-a089-46c0-96fc-6db99c0c928d/921405_500x500_500x500.tif.jpg",
    "category_ids": [
        "OCI1000079",
        "OCI2000110",
        "OCI1000136",
        "OC006335"
    ],
    "package_type": "FL",
    "current_price": 5.32,
    "regular_price": 5.32,
    "currency": "USD",
    "base_price_text": "10.2 ¢ per fl.oz.",
    "stock_status_code": "INSTOCK",
    "merchandising_module": "Chiller",
    "aisle": 1,
    "contains_alcohol": false
}
```
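
In this sample, `base_price_text` is consistent with dividing `current_price` by the package size in `description`: 5.32 USD / 52 fl. oz. ≈ 10.2 ¢ per fl. oz. The Python sketch below recomputes that figure as a sanity check. It assumes `description` starts with a numeric package size, which holds for this sample but is not guaranteed for every product. The function name `cents_per_unit` is hypothetical.

```python
import re
from typing import Any, Optional


def cents_per_unit(item: dict[str, Any]) -> Optional[float]:
    """Recompute the per-unit price in cents from price and package size."""
    price = item.get("current_price")
    # Assumption: the description begins with the package size, e.g. "52 fl. oz."
    match = re.match(r"\s*(\d+(?:\.\d+)?)", item.get("description", ""))
    if price is None or match is None:
        return None  # not enough information to compute a unit price
    size = float(match.group(1))
    if size == 0:
        return None
    return round(price * 100 / size, 1)
```

For the sample item, `cents_per_unit({"current_price": 5.32, "description": "52 fl. oz."})` returns `10.2`.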

***

### Tips for Best Results

#### Start Small First

- Begin with `results_wanted: 20` to validate the exact page type you want
- Increase limits only after confirming the returned fields match your needs

#### Use the Most Specific URL You Have

- Use search URLs for focused keyword pulls
- Use product listing URLs when you want broad category-style coverage
- Use product detail URLs when you only need one product

#### Keep Pagination Intentional

- Use `max_pages` as a safety limit for broad product listings
- Combine `results_wanted` and `max_pages` to control runtime and export size
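
To reason about how the two limits combine, the following Python sketch computes an upper bound on the number of items a run can return. The Actor's actual page size is not documented here, so `page_size` is a hypothetical parameter you would need to observe from a small test run.

```python
def upper_bound_items(results_wanted: int, max_pages: int, page_size: int) -> int:
    """Upper bound on collected items when both limits apply.

    `page_size` is an assumed number of products per paginated request.
    """
    if min(results_wanted, max_pages, page_size) < 1:
        raise ValueError("All arguments must be at least 1.")
    # Collection cannot exceed the requested count, nor what the allowed
    # number of pages can hold.
    return min(results_wanted, max_pages * page_size)
```

If a page held 12 products, `results_wanted: 25` with `max_pages: 2` could return at most 24 items, so `max_pages` would be the binding limit.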

### Integrations

Connect your data with:

- **Google Sheets** — Build sortable product and pricing trackers
- **Airtable** — Maintain searchable product catalogs
- **Make** — Trigger monitoring workflows and alerts
- **Zapier** — Send product updates into other business systems
- **Webhooks** — Forward fresh runs into your own pipeline

#### Export Formats

- **JSON** — For APIs, scripts, and structured processing
- **CSV** — For spreadsheet analysis
- **Excel** — For business reporting
- **XML** — For external system integrations

***

### Frequently Asked Questions

#### Can I use a keyword instead of a URL?

Yes. If you provide `keyword`, the Actor collects products from Lidl search results.

#### Which Lidl URLs are supported?

Search URLs, product listing URLs, product detail URLs, and equivalent supported mobile API URLs.

#### What happens if I provide both `url` and `keyword`?

`url` takes priority so the Actor follows the exact page you supplied.

#### Does the Actor save empty or null fields?

No. Fields without useful values are removed before the item is stored.

#### Can I control how many products are returned?

Yes. Use `results_wanted` and `max_pages` together to control collection size.

### Support

For issues or feature requests, contact support through the Apify Console.

#### Resources

- [Apify Documentation](https://docs.apify.com/)
- [Apify API Reference](https://docs.apify.com/api/v2)
- [Apify Schedules](https://docs.apify.com/platform/schedules)

***

### Legal Notice

This Actor is designed for legitimate data collection purposes. Users are responsible for ensuring compliance with Lidl's terms, applicable laws, and their own downstream data usage requirements.

# Actor input Schema

## `url` (type: `string`):

Supported examples: Lidl search URL, product list URL, product detail URL, or supported mobile API URL.

## `keyword` (type: `string`):

Search term used when URL is not provided.

## `results_wanted` (type: `integer`):

Maximum number of products to collect.

## `max_pages` (type: `integer`):

Safety cap for paginated requests.

## `proxyConfiguration` (type: `object`):

Optional proxy settings.

## Actor input object example

```json
{
  "url": "https://www.lidl.com/search/products/milk",
  "keyword": "milk",
  "results_wanted": 20,
  "max_pages": 3,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://www.lidl.com/search/products/milk",
    "keyword": "milk",
    "results_wanted": 20,
    "max_pages": 3
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/lidl-product-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "url": "https://www.lidl.com/search/products/milk",
    "keyword": "milk",
    "results_wanted": 20,
    "max_pages": 3,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/lidl-product-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
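
The example above prints items without checking whether the run finished successfully. A slightly more defensive variant might look like the sketch below. The helper name `run_and_collect` is hypothetical, and the sketch assumes that `call()` returns a dict-like run object (or `None`) and that a successful run reports the status `SUCCEEDED`. Verify both against the client version you use.

```python
from typing import Any

ACTOR_ID = "shahidirfan/lidl-product-scraper"


def run_and_collect(client: Any, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Run the Actor, verify that the run succeeded, and return its dataset items."""
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run did not return run details.")
    if run.get("status") != "SUCCEEDED":
        raise RuntimeError(
            f"Run {run.get('id')} ended with status {run.get('status')}."
        )
    # Materialize the iterator so callers get a plain list of items.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Pass it a client created as in the example above, for instance `run_and_collect(ApifyClient("<YOUR_API_TOKEN>"), run_input)`. In production code, read the token from an environment variable or a secret store rather than hard-coding it.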

## CLI example

```bash
echo '{
  "url": "https://www.lidl.com/search/products/milk",
  "keyword": "milk",
  "results_wanted": 20,
  "max_pages": 3
}' |
apify call shahidirfan/lidl-product-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/lidl-product-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Lidl Product Scraper",
        "description": "Scrape Lidl's full product database instantly. Extract pricing, descriptions, categories, availability & images. Build price comparison tools, monitor competitor data, power retail analytics, create AI datasets & track market trends. European groceries.",
        "version": "0.0",
        "x-build-id": "M9uZL3X9E96HoojOb"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~lidl-product-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-lidl-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~lidl-product-scraper/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-lidl-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~lidl-product-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-lidl-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "url": {
                        "title": "Lidl URL",
                        "type": "string",
                        "description": "Supported examples: Lidl search URL, product list URL, product detail URL, or supported mobile API URL."
                    },
                    "keyword": {
                        "title": "Keyword",
                        "type": "string",
                        "description": "Search term used when URL is not provided."
                    },
                    "results_wanted": {
                        "title": "Results wanted",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of products to collect.",
                        "default": 20
                    },
                    "max_pages": {
                        "title": "Max pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Safety cap for paginated requests.",
                        "default": 5
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional proxy settings.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
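
For projects that do not use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The Python sketch below only builds the request using the standard library; the helper name `build_sync_request` is hypothetical, and the token is passed as the `token` query parameter as the specification describes.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "shahidirfan~lidl-product-scraper"


def build_sync_request(token: str, run_input: dict[str, Any]) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending the request (requires network access and a valid token):
# with urllib.request.urlopen(build_sync_request(token, {"keyword": "milk"})) as response:
#     items = json.load(response)
```

Because the synchronous endpoint holds the connection open until the run finishes, larger collections may be better served by the asynchronous `/runs` path followed by a separate dataset read.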
