Lidl Category Scraper
Pricing
from $5.00 / 1,000 category results
Scrape Lidl's full category tree (lidl.nl): every main category and subcategory with name, id, parent and level, discovered dynamically. Clean JSON/CSV, ideal as input for the Lidl product scraper. Works with the default datacenter proxy; a Dutch (NL) residential proxy is only needed if requests are blocked. Failed lookups are never billed.
Developer
Elena Vance
Maintained by Community
Lidl Category Scraper — Map the Full Lidl.nl Category Tree (IDs, Names, Levels)
Get Lidl Netherlands' entire category tree in one run: every main department and every leaf subcategory, with its stable category ID, name, slug, parent, tree level, product count, and direct link — one clean record per category.
The output is built to be the starting point for scraping Lidl products: pipe the category IDs straight into the Lidl Product Scraper. No login, no setup — and you are never billed for failed requests.
Good to know: the default Automatic (datacenter) proxy works fine and keeps costs low — no Dutch proxy required. (If you ever see blocked requests, switch to a Dutch residential proxy.) Everything else is handled for you, so there's nothing to configure.
Why this Actor
- The whole tree, always current. Lidl's category structure isn't published as a tidy sitemap, and it shifts over time. This Actor returns today's tree on every run, so you get the live structure — not a stale hand-built list.
- Built to feed the product scraper. Each leaf subcategory comes with the exact category ID the Lidl Product Scraper takes as input — map once, then scrape products by category with zero guesswork.
- Main vs. leaf, clearly labeled. Every record says whether it's a top-level department (type: "main", level: 0) or a product-holding subcategory (type: "leaf", level: 1), and links each leaf to its parent, so you can rebuild the hierarchy exactly.
- Product counts per category. Each category carries productCount, so you can see at a glance which departments are big and prioritize what to scrape.
- You never pay for failures. Any category that can't be retrieved is reported in the run summary; it is not written to your dataset and not billed.
- Fast and lightweight. No headless browser, so runs are quick and cheap.
- Clean, consistent output. Whitespace-normalized names, stable IDs, JSON-safe values throughout, ready for a spreadsheet, database, or the next Actor.
Problems this Actor solves
| If you are… | Your problem | How this Actor solves it |
|---|---|---|
| Running the Lidl product scraper | You need the list of category IDs to scrape, and they change over time | Get every current leaf category ID; feed them straight into the product Actor |
| A market researcher / analyst | You want to understand how a retailer organizes its assortment | A complete, dated map of Lidl's departments and subcategories with product counts |
| A price-comparison / catalog builder | Hard-coding category paths breaks whenever the site is reorganized | Re-map the tree on a schedule; detect added/removed/renamed categories automatically |
| A developer building a pipeline | You need a reliable seed list to drive a larger crawl | One normalized record per category — IDs, parents, levels — to orchestrate the rest |
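The "re-map on a schedule and detect changes" row can be sketched as a comparison of two runs keyed by the stable category id. This is a minimal illustration, not part of the Actor: the function name `diff_category_trees` is hypothetical, and the `h00000001` / `h00000002` ids and the renamed display name are made-up sample values.

```python
from typing import Iterable


def diff_category_trees(old: Iterable[dict], new: Iterable[dict]) -> dict:
    """Compare two category snapshots keyed by the stable `id` field."""
    old_by_id = {rec["id"]: rec for rec in old}
    new_by_id = {rec["id"]: rec for rec in new}

    added = sorted(new_by_id.keys() - old_by_id.keys())
    removed = sorted(old_by_id.keys() - new_by_id.keys())
    # Same id in both runs but a different display name counts as a rename.
    renamed = sorted(
        cid
        for cid in old_by_id.keys() & new_by_id.keys()
        if old_by_id[cid]["name"] != new_by_id[cid]["name"]
    )
    return {"added": added, "removed": removed, "renamed": renamed}


# Two small, made-up snapshots (only the fields the diff needs).
last_week = [
    {"id": "s10068374", "name": "Eten & Drinken"},
    {"id": "h10071012", "name": "Fruit & Groenten"},
    {"id": "h00000001", "name": "Voorbeeld A"},
]
today = [
    {"id": "s10068374", "name": "Eten & Drinken"},
    {"id": "h10071012", "name": "Groente & Fruit"},
    {"id": "h00000002", "name": "Voorbeeld B"},
]

changes = diff_category_trees(last_week, today)
print(changes)
# {'added': ['h00000002'], 'removed': ['h00000001'], 'renamed': ['h10071012']}
```

Keying on id rather than name or slug is what lets a rename show up as a rename instead of a remove-plus-add pair.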
What data you get
Each discovered category becomes one dataset record:
| Field | Description |
|---|---|
| id | Lidl's stable category ID, with type prefix: s… for a main category, h… for a leaf subcategory (the record's unique id) |
| name | Category display name (e.g. Fruit & Groenten) |
| slug | URL slug (e.g. fruit-groenten) |
| parentId | The parent category's id: null for a main category, the s… id for a leaf |
| level | Tree depth: 0 = main department, 1 = leaf subcategory |
| type | "main" or "leaf"; leaf categories are the ones that actually hold products |
| productCount | Approximate number of products in the category (may be null if not reported) |
| url | Canonical lidl.nl category page link |
| apiPath | A relative path for this category, handy if you build your own integrations |
| source | Constant tag identifying the producing Actor |
| scrapedAt | ISO 8601 timestamp of the run |
| rawData | Optional: a small raw payload describing the category, when Include raw payload is on |
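The rules in the table (id prefix, level, and parentId per type) can be checked on your side before records enter a pipeline. The sketch below is an assumption-laden illustration: `check_record` is a hypothetical helper, and it only encodes the constraints stated in the table, nothing about how the Actor itself validates data.

```python
from __future__ import annotations


def check_record(rec: dict) -> list[str]:
    """Return a list of rule violations for one category record (empty if none)."""
    problems: list[str] = []
    rec_id = str(rec.get("id", ""))
    kind = rec.get("type")
    parent = rec.get("parentId")

    if kind == "main":
        if not rec_id.startswith("s"):
            problems.append("main id should start with 's'")
        if rec.get("level") != 0:
            problems.append("main level should be 0")
        if parent is not None:
            problems.append("main parentId should be null")
    elif kind == "leaf":
        if not rec_id.startswith("h"):
            problems.append("leaf id should start with 'h'")
        if rec.get("level") != 1:
            problems.append("leaf level should be 1")
        if not (isinstance(parent, str) and parent.startswith("s")):
            problems.append("leaf parentId should be a main 's' id")
    else:
        problems.append("type should be 'main' or 'leaf'")

    # productCount may legitimately be null when Lidl does not report it.
    count = rec.get("productCount")
    if count is not None and not isinstance(count, int):
        problems.append("productCount should be an integer or null")
    return problems


main = {"id": "s10068374", "type": "main", "level": 0,
        "parentId": None, "productCount": 412}
leaf = {"id": "h10071012", "type": "leaf", "level": 1,
        "parentId": "s10068374", "productCount": None}
orphan = dict(leaf, parentId=None)  # a leaf that lost its parent link

print(check_record(main))    # []
print(check_record(leaf))    # []
print(check_record(orphan))  # ["leaf parentId should be a main 's' id"]
```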
Example output
    // A main department (top of the tree)
    {
      "id": "s10068374",
      "name": "Eten & Drinken",
      "slug": "eten-en-drinken",
      "parentId": null,
      "level": 0,
      "type": "main",
      "productCount": 412,
      "url": "https://www.lidl.nl/c/eten-en-drinken/s10068374",
      "scrapedAt": "2026-06-18T08:30:00+00:00"
    }

    // A leaf subcategory under it: this is what you feed the product scraper
    {
      "id": "h10071012",
      "name": "Fruit & Groenten",
      "slug": "fruit-groenten",
      "parentId": "s10068374",
      "level": 1,
      "type": "leaf",
      "productCount": 87,
      "url": "https://www.lidl.nl/h/fruit-groenten/h10071012",
      "scrapedAt": "2026-06-18T08:30:00+00:00"
    }

Categories that could not be retrieved are not written to the dataset (and never
billed). They are listed in the run's SUMMARY record in the key-value store —
{ "failures": [ { "input": "…", "error": "…" } ] } — and the run's status message
tells you at a glance how many categories were returned.
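Because every leaf carries its parent's id in parentId, the flat dataset can be folded back into a two-level tree. The following is a minimal sketch using the two example records; `build_tree` and the shape of its return value are illustrative choices, not something the Actor produces.

```python
from collections import defaultdict


def build_tree(records: list) -> list:
    """Group leaf records under their main department via parentId."""
    leaves_by_parent = defaultdict(list)
    mains = []
    for rec in records:
        if rec["parentId"] is None:
            mains.append(rec)          # level 0: main department
        else:
            leaves_by_parent[rec["parentId"]].append(rec)  # level 1: leaf

    return [
        {
            "id": main["id"],
            "name": main["name"],
            "leaves": [leaf["id"] for leaf in leaves_by_parent.get(main["id"], [])],
        }
        for main in mains
    ]


records = [
    {"id": "s10068374", "name": "Eten & Drinken", "parentId": None},
    {"id": "h10071012", "name": "Fruit & Groenten", "parentId": "s10068374"},
]

tree = build_tree(records)
print(tree)
# [{'id': 's10068374', 'name': 'Eten & Drinken', 'leaves': ['h10071012']}]
```

A leaf whose parent is missing from the dataset (for example, because the parent lookup failed) is simply left out of the result here; you may prefer to collect such orphans separately.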
How to use it (60 seconds)
- Click Try for free / Start.
- There's nothing required to configure — the Actor returns the whole tree by default. (Optionally cap the run with Max items, or turn on Include raw payload.)
- Leave the Automatic (datacenter) proxy as-is — it works out of the box. (Only if you hit blocked requests, switch to a Dutch residential proxy.)
- Click Save & Start. Download results as JSON, CSV, Excel, or via API from the Dataset tab; check Key-value store → SUMMARY for run totals and any failed requests.
- Next step: take the id of every type: "leaf" record and feed it to the Lidl Product Scraper to pull the products in those categories.
Input reference
| Field | Type | Default | Description |
|---|---|---|---|
| Include raw payload | boolean | false | Adds a small raw payload describing each category under rawData. Leave off unless you need it. |
| Proxy configuration | object | Apify Automatic | Automatic (datacenter) proxies work fine and keep costs low. Switch to a Dutch (NL) residential proxy only if you see blocked requests. |
| Max concurrency | integer | 4 | Parallel requests (1–20). Kept moderate to be respectful. |
| Delay between requests | integer | 0 | Politeness delay in seconds before each request (0–10). 0 is fine at moderate concurrency. |
| Max items | integer | 0 | Stop after N category records (0 = unlimited; the full tree). |
There is no category-list or keyword input: this Actor's job is to map the whole tree, so it always returns the full category map.
What you get out of it
Lidl's category structure isn't a static, published list — and it changes over time. This Actor takes care of assembling it for you and returns a clean, de-duplicated map every run: main departments first, then their leaf subcategories, each as one record with its id, name, parent, level, type, product count, and link. There's nothing to configure — just start the run and read the dataset.
Pricing — what a run costs
This Actor uses transparent pay-per-event pricing with a built-in volume discount: a small Actor-start fee, a fixed price per category record returned, and a cheaper rate for results beyond a large per-run threshold. No subscription, no minimums, and failures are never charged. The exact per-result rate is shown on the Actor's Pricing tab.
- A full tree is a small, cheap run. Lidl has on the order of dozens to low hundreds of categories, so a complete run is inexpensive; its real value is as the seed for a much larger product crawl.
- Failed requests are free. Any category that can't be retrieved is reported in the summary, never billed.
- Try it free: an Apify free account includes $5 of monthly platform credit — enough to run this Actor many times before paying anything.
- Stay in control: set Max items and Apify's maximum charge per run; the Actor stops gracefully at your cap, keeping everything already discovered.
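As a rough back-of-the-envelope check, the listed "from $5.00 / 1,000 category results" rate can be turned into a per-run estimate. The sketch below rests on assumptions: the Actor-start fee is not stated on this page, so it defaults to 0 here, the volume discount is ignored, and 150 records is just a sample figure within the "dozens to low hundreds" range. The Pricing tab remains the authoritative source.

```python
def estimate_run_cost(
    records: int,
    price_per_1000: float = 5.00,  # listed "from" rate; check the Pricing tab
    start_fee: float = 0.0,        # assumption: actual start fee not stated here
) -> float:
    """Estimate the cost in USD of one run that returns `records` categories."""
    return round(start_fee + records * price_per_1000 / 1000, 4)


print(estimate_run_cost(150))  # 0.75
```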
Compared to the alternatives
| | This Actor | Hard-code the category list | Build your own crawler | Click through the site |
|---|---|---|---|---|
| Always-current tree | Yes (refreshed every run) | Goes stale on every reorg | You maintain it | Tedious, error-prone |
| Main + leaf, with parent links | Yes | Partial at best | You maintain it | Manual note-taking |
| Stable IDs ready for the product scraper | Yes | You look them up by hand | You extract them | Copy-paste per category |
| Works out of the box | Yes (built in) | – | You build and maintain it | – |
| Never billed for failures | Yes | – | – | – |
| Export JSON / CSV / Excel / API | Yes, built-in | DIY | DIY | Copy-paste |
| Setup time | ~60 seconds | Hours, recurring | Days; breaks on site changes | Hours per refresh |
Integrate the data
- Feed the Lidl Product Scraper (the main use case). Pull this dataset, keep the records where type == "leaf", and pass their id values as the category input to the Lidl Product Scraper. You now have a fully automated map-then-scrape pipeline that adapts when Lidl reorganizes.
- Exports: JSON, CSV, Excel, XML from the Dataset tab, or fetch programmatically:
GET https://api.apify.com/v2/datasets/{datasetId}/items?format=json
- Run on a schedule: use Apify Schedules to refresh the tree weekly and webhooks to kick off the product scraper automatically when the run finishes.
- From code: call the Actor with the Apify API or SDKs (Python / JavaScript) and read the dataset when the run finishes.
- Run summary: every run writes a SUMMARY record (key-value store) with totals, successes, failures, and billing counts, which is ideal for monitoring automated pipelines.
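The fetch-then-filter step from the first two bullets can be sketched with the Python standard library alone. The dataset items endpoint is the one quoted above; passing the API token as a `token` query parameter is an assumption about how you authenticate, and the helper names (`dataset_items_url`, `fetch_records`, `leaf_ids`) are hypothetical. How the resulting ids are handed to the Lidl Product Scraper depends on that Actor's input schema, which is not described here.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, token: Optional[str] = None) -> str:
    """Build the dataset items URL (JSON format) for a finished run."""
    query = {"format": "json"}
    if token:
        query["token"] = token  # assumption: token passed as a query parameter
    return (
        f"{API_BASE}/datasets/{urllib.parse.quote(dataset_id)}/items"
        f"?{urllib.parse.urlencode(query)}"
    )


def fetch_records(dataset_id: str, token: Optional[str] = None) -> list:
    """Download all category records of a dataset (performs a network call)."""
    with urllib.request.urlopen(dataset_items_url(dataset_id, token), timeout=30) as resp:
        return json.load(resp)


def leaf_ids(records: list) -> list:
    """Keep only the ids of product-holding leaf categories."""
    return [rec["id"] for rec in records if rec.get("type") == "leaf"]


# Offline demonstration with the two example records; in a real pipeline
# you would call fetch_records("<your dataset id>") instead.
sample = [
    {"id": "s10068374", "type": "main"},
    {"id": "h10071012", "type": "leaf"},
]
print(leaf_ids(sample))  # ['h10071012']
```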
FAQ
What does this Actor actually return — products or categories? Categories. One record per category (main departments and leaf subcategories), not products. To get products, feed the leaf category IDs into the Lidl Product Scraper.
Do I need a Lidl account or login? No. There is no login or key to manage — just start the run.
Do I need a Dutch proxy? No. The default Automatic (datacenter) proxy works fine — Lidl's API returns the national (NL) category tree regardless of where the request comes from. Only if you ever see blocked requests, switch to a Dutch (NL) residential proxy.
What's the difference between a main and a leaf category?
main (level 0, id s…) are the top departments shown on the homepage; leaf
(level 1, id h…) are the subcategories under them — the ones that actually hold
products. For scraping products, you want the leaves.
How many categories will I get? The full Lidl tree — typically dozens of departments and subcategories. The non-food assortment is broad; the food assortment rotates weekly and can be small, which is normal (a small or empty category is not an error).
Why is productCount sometimes null?
When a product count isn't available for a category, the field is null. The
category is still included.
How fresh is the data? Each run returns the tree as it stands at that moment, so you get exactly the structure lidl.nl shows then. Schedule the Actor to keep your map as fresh as you need.
What formats can I export? JSON, CSV, Excel, XML — from the Console or via the Apify API.
Related Actors
Pair this with the Lidl Product Scraper: that's the whole point. Run this Category
Scraper first to map the tree, then feed the id of every type: "leaf" record into it:
- Lidl Product Scraper — turns those leaf category IDs into full product records (titles, prices, Lidl Plus member prices, unit prices, images). Map once here, scrape products there: a fully automated, reorg-proof map-then-scrape pipeline.
Building broader Dutch-supermarket coverage? The same clean schema and pay-per-event billing run across every chain — mix and match for full market coverage:
- Albert Heijn Product Scraper
- Plus Product Scraper
- Dirk van den Broek Product Scraper
- DekaMarkt Product Scraper
- Hoogvliet Product Scraper — and its own Hoogvliet Category Scraper, the family's other category-tree actor (same map-the-tree-then-scrape pairing as this one).
You're never billed for failed requests in any of them.
Disclaimer
This Actor is intended for personal and research use. You are responsible for ensuring your use complies with Lidl's terms and applicable law. Please scrape responsibly — keep concurrency moderate and delays reasonable. This project is not affiliated with, endorsed by, or sponsored by Lidl.