Lidl Category Scraper

Pricing

from $5.00 / 1,000 category results

Scrape Lidl's full category tree (lidl.nl): every main category and subcategory with name, id, parent and level, discovered dynamically. Clean JSON/CSV, ideal as input for the Lidl product scraper. The default datacenter proxy works; switch to a Dutch (NL) residential proxy only if requests are blocked. Failed lookups are never billed.

Rating

0.0

(0)

Developer

Elena Vance

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

4 days ago

Last modified

Lidl Category Scraper — Map the Full Lidl.nl Category Tree (IDs, Names, Levels)

Get Lidl Netherlands' entire category tree in one run: every main department and every leaf subcategory, with its stable category ID, name, slug, parent, tree level, product count, and direct link — one clean record per category.

The output is built to be the starting point for scraping Lidl products: pipe the category IDs straight into the Lidl Product Scraper. No login, no setup — and you are never billed for failed requests.

Good to know: the default Automatic (datacenter) proxy works fine and keeps costs low — no Dutch proxy required. (If you ever see blocked requests, switch to a Dutch residential proxy.) Everything else is handled for you, so there's nothing to configure.


Why this Actor

  • The whole tree, always current. Lidl's category structure isn't published as a tidy sitemap, and it shifts over time. This Actor returns today's tree on every run, so you get the live structure — not a stale hand-built list.
  • Built to feed the product scraper. Each leaf subcategory comes with the exact category ID the Lidl Product Scraper takes as input — map once, then scrape products by category with zero guesswork.
  • Main vs. leaf, clearly labeled. Every record says whether it's a top-level department (type: "main", level: 0) or a product-holding subcategory (type: "leaf", level: 1), and links each leaf to its parent — so you can rebuild the hierarchy exactly.
  • Product counts per category. Each category carries productCount, so you can see at a glance which departments are big and prioritize what to scrape.
  • You never pay for failures. Any category that can't be retrieved is reported in the run summary — not written to your dataset and not billed.
  • Fast and lightweight. No headless browser, so runs are quick and cheap.
  • Clean, consistent output. Whitespace-normalized names, stable IDs, JSON-safe values throughout, ready for a spreadsheet, database, or the next Actor.

Problems this Actor solves

| If you are… | Your problem | How this Actor solves it |
|---|---|---|
| Running the Lidl product scraper | You need the list of category IDs to scrape, and they change over time | Get every current leaf category ID; feed them straight into the product Actor |
| A market researcher / analyst | You want to understand how a retailer organizes its assortment | A complete, dated map of Lidl's departments and subcategories with product counts |
| A price-comparison / catalog builder | Hard-coding category paths breaks whenever the site is reorganized | Re-map the tree on a schedule; detect added/removed/renamed categories automatically |
| A developer building a pipeline | You need a reliable seed list to drive a larger crawl | One normalized record per category (IDs, parents, levels) to orchestrate the rest |
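
As a rough illustration of the "detect added/removed/renamed categories" row, two runs' datasets can be compared by their stable `id`. This is a minimal sketch, not part of the Actor: `diff_trees` is a hypothetical helper, and it assumes each run's records have already been loaded as lists of dicts.

```python
from typing import Dict, List


def diff_trees(previous: List[dict], current: List[dict]) -> Dict[str, object]:
    """Compare two runs' category records, keyed by the stable category id."""
    prev = {rec["id"]: rec for rec in previous}
    curr = {rec["id"]: rec for rec in current}
    shared = prev.keys() & curr.keys()
    return {
        # ids present only in the newer run
        "added": sorted(curr.keys() - prev.keys()),
        # ids present only in the older run
        "removed": sorted(prev.keys() - curr.keys()),
        # same id, different display name: (old name, new name)
        "renamed": {
            cid: (prev[cid]["name"], curr[cid]["name"])
            for cid in sorted(shared)
            if prev[cid]["name"] != curr[cid]["name"]
        },
    }
```

Keying on `id` rather than `name` or `slug` means a renamed category is reported as a rename, not as one removal plus one addition.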

What data you get

Each discovered category becomes one dataset record:

| Field | Description |
|---|---|
| id | Lidl's stable category ID, with type prefix: s… for a main category, h… for a leaf subcategory (the record's unique id) |
| name | Category display name (e.g. Fruit & Groenten) |
| slug | URL slug (e.g. fruit-groenten) |
| parentId | The parent category's id: null for a main category, the s… id for a leaf |
| level | Tree depth: 0 = main department, 1 = leaf subcategory |
| type | "main" or "leaf"; leaf categories are the ones that actually hold products |
| productCount | Approximate number of products in the category (may be null if not reported) |
| url | Canonical lidl.nl category page link |
| apiPath | A relative path for this category, handy if you build your own integrations |
| source | Constant tag identifying the producing Actor |
| scrapedAt | ISO 8601 timestamp of the run |
| rawData | Optional: a small raw payload describing the category, when Include raw payload is on |
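
The rules in this table (id prefix, level, and parent per type) can be turned into a quick sanity check before records enter a pipeline. The sketch below is illustrative only: `check_record` is a hypothetical helper, and treating `productCount` as an integer is an assumption based on the example output.

```python
from typing import List


def check_record(rec: dict) -> List[str]:
    """Return a list of consistency problems (empty when the record looks right)."""
    kind = rec.get("type")
    if kind not in ("main", "leaf"):
        return [f"unexpected type: {kind!r}"]

    is_main = kind == "main"
    problems = []
    # Main ids start with "s", leaf ids with "h".
    if not str(rec.get("id", "")).startswith("s" if is_main else "h"):
        problems.append("id prefix does not match type")
    # Level 0 = main department, level 1 = leaf subcategory.
    if rec.get("level") != (0 if is_main else 1):
        problems.append("level does not match type")
    parent = rec.get("parentId")
    if is_main and parent is not None:
        problems.append("main category should have a null parentId")
    if not is_main and not str(parent or "").startswith("s"):
        problems.append("leaf should point at a main (s-prefixed) parent id")
    # Assumption: counts are integers; null is allowed when not reported.
    count = rec.get("productCount")
    if count is not None and (not isinstance(count, int) or count < 0):
        problems.append("productCount should be null or a non-negative integer")
    return problems
```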

Example output

// A main department (top of the tree)
{
  "id": "s10068374",
  "name": "Eten & Drinken",
  "slug": "eten-en-drinken",
  "parentId": null,
  "level": 0,
  "type": "main",
  "productCount": 412,
  "url": "https://www.lidl.nl/c/eten-en-drinken/s10068374",
  "scrapedAt": "2026-06-18T08:30:00+00:00"
}
// A leaf subcategory under it: this is what you feed the product scraper
{
  "id": "h10071012",
  "name": "Fruit & Groenten",
  "slug": "fruit-groenten",
  "parentId": "s10068374",
  "level": 1,
  "type": "leaf",
  "productCount": 87,
  "url": "https://www.lidl.nl/h/fruit-groenten/h10071012",
  "scrapedAt": "2026-06-18T08:30:00+00:00"
}
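
Because every leaf carries its parent's id, the flat dataset can be folded back into a two-level tree. A minimal sketch, using trimmed copies of the two example records above; `build_tree` is a hypothetical helper, not something the Actor provides.

```python
from collections import defaultdict
from typing import List

# Trimmed copies of the two example records shown above.
RECORDS = [
    {"id": "s10068374", "name": "Eten & Drinken", "parentId": None, "type": "main"},
    {"id": "h10071012", "name": "Fruit & Groenten", "parentId": "s10068374", "type": "leaf"},
]


def build_tree(records: List[dict]) -> List[dict]:
    """Group leaf categories under their main department via parentId."""
    children = defaultdict(list)
    for rec in records:
        if rec["type"] == "leaf":
            children[rec["parentId"]].append(rec["name"])
    return [
        {"id": rec["id"], "name": rec["name"], "children": children[rec["id"]]}
        for rec in records
        if rec["type"] == "main"
    ]


if __name__ == "__main__":
    for department in build_tree(RECORDS):
        print(department["name"], "->", department["children"])
```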

Categories that could not be retrieved are not written to the dataset (and never billed). They are listed in the run's SUMMARY record in the key-value store — { "failures": [ { "input": "…", "error": "…" } ] } — and the run's status message tells you at a glance how many categories were returned.
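
For monitoring, the `failures` list in that SUMMARY record can be flattened into readable lines. This sketch assumes the record has already been loaded into a dict and relies only on the `failures`, `input`, and `error` keys shown above; `describe_failures` and the sample values in the usage comment are hypothetical.

```python
from typing import List


def describe_failures(summary: dict) -> List[str]:
    """Turn the SUMMARY record's failures list into one line per failed lookup."""
    return [
        f'{failure.get("input", "?")}: {failure.get("error", "unknown error")}'
        for failure in summary.get("failures", [])
    ]


# Usage with a made-up failure entry:
# describe_failures({"failures": [{"input": "h0000", "error": "HTTP 403"}]})
```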


How to use it (60 seconds)

  1. Click Try for free / Start.
  2. There's nothing required to configure — the Actor returns the whole tree by default. (Optionally cap the run with Max items, or turn on Include raw payload.)
  3. Leave the Automatic (datacenter) proxy as-is — it works out of the box. (Only if you hit blocked requests, switch to a Dutch residential proxy.)
  4. Click Save & Start. Download results as JSON, CSV, Excel, or via API from the Dataset tab; check Key-value store → SUMMARY for run totals and any failed requests.
  5. Next step: take the id of every type: "leaf" record and feed it to the Lidl Product Scraper to pull the products in those categories.

Input reference

| Field | Type | Default | Description |
|---|---|---|---|
| Include raw payload | boolean | false | Adds a small raw payload describing each category under rawData. Leave off unless you need it. |
| Proxy configuration | object | Apify Automatic | Automatic (datacenter) proxies work fine and keep costs low. Switch to a Dutch (NL) residential proxy only if you see blocked requests. |
| Max concurrency | integer | 4 | Parallel requests (1–20). Kept moderate to be respectful. |
| Delay between requests | integer | 0 | Politeness delay in seconds before each request (0–10). 0 is fine at moderate concurrency. |
| Max items | integer | 0 | Stop after N category records (0 = unlimited; the full tree). |

There is no category-list or keyword input: this Actor's job is to map the whole tree, so it always returns the full category map.
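
When starting the Actor from code, the input is a JSON object built from the fields above. The sketch below is an assumption-heavy illustration: the table lists display names only, so the keys `includeRawData`, `maxConcurrency`, `requestDelaySecs`, and `maxItems` are hypothetical (check the Actor's Input tab for the real ones). The proxy object follows the shape Apify's standard proxy input commonly uses.

```python
def build_run_input(include_raw=False, max_concurrency=4, delay_secs=0,
                    max_items=0, dutch_residential=False) -> dict:
    """Build an input object, enforcing the ranges from the input reference."""
    if not 1 <= max_concurrency <= 20:
        raise ValueError("Max concurrency must be between 1 and 20")
    if not 0 <= delay_secs <= 10:
        raise ValueError("Delay between requests must be between 0 and 10 seconds")
    if max_items < 0:
        raise ValueError("Max items must be 0 (unlimited) or a positive number")

    # Default: Apify Automatic (datacenter) proxy.
    proxy = {"useApifyProxy": True}
    if dutch_residential:
        # Fallback for blocked requests: Dutch residential proxy.
        proxy.update({"apifyProxyGroups": ["RESIDENTIAL"], "apifyProxyCountry": "NL"})

    return {
        # All key names below are hypothetical placeholders.
        "includeRawData": include_raw,
        "proxyConfiguration": proxy,
        "maxConcurrency": max_concurrency,
        "requestDelaySecs": delay_secs,
        "maxItems": max_items,
    }
```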


What you get out of it

Lidl's category structure isn't a static, published list — and it changes over time. This Actor takes care of assembling it for you and returns a clean, de-duplicated map every run: main departments first, then their leaf subcategories, each as one record with its id, name, parent, level, type, product count, and link. There's nothing to configure — just start the run and read the dataset.


Pricing — what a run costs

This Actor uses transparent pay-per-event pricing with a built-in volume discount: a small Actor-start fee, a fixed price per category record returned, and a cheaper rate for results beyond a large per-run threshold. No subscription, no minimums, and failures are never charged. The exact per-result rate is shown on the Actor's Pricing tab.

  • A full tree is a small, cheap run. Lidl has on the order of dozens to low hundreds of categories, so a complete run is inexpensive; its real value is as the seed for a much larger product crawl.
  • Failed requests are free. Any category that can't be retrieved is reported in the summary, never billed.
  • Try it free: an Apify free account includes $5 of monthly platform credit — enough to run this Actor many times before paying anything.
  • Stay in control: set Max items and Apify's maximum charge per run; the Actor stops gracefully at your cap, keeping everything already discovered.
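
For a back-of-the-envelope budget, the per-record price can be multiplied out. This is only a rough sketch: it uses the listing's "from $5.00 / 1,000 category results" as the default rate, leaves the Actor-start fee as a parameter because its amount is not stated here, and ignores the volume discount because its threshold is not stated either. `estimate_cost` is a hypothetical helper; the Pricing tab remains the authoritative source.

```python
def estimate_cost(records: int, price_per_1000: float = 5.00, start_fee: float = 0.0) -> float:
    """Estimate a run's cost in USD from the number of category records returned."""
    if records < 0:
        raise ValueError("records must be non-negative")
    # Failed lookups are not billed, so only returned records count.
    return start_fee + records * price_per_1000 / 1000
```

Under these assumptions, a run returning 200 category records would come to about $1.00 plus the start fee.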

Compared to the alternatives

| | This Actor | Hard-code the category list | Build your own crawler | Click through the site |
|---|---|---|---|---|
| Always-current tree | Yes (refreshed every run) | Goes stale on every reorg | You maintain it | Tedious, error-prone |
| Main + leaf, with parent links | Yes | Partial at best | You maintain it | Manual note-taking |
| Stable IDs ready for the product scraper | Yes | You look them up by hand | You extract them | Copy-paste per category |
| Works out of the box | Yes (built in) | | You build and maintain it | |
| Never billed for failures | Yes | | | |
| Export JSON / CSV / Excel / API | Yes, built-in | DIY | DIY | Copy-paste |
| Setup time | ~60 seconds | Hours, recurring | Days; breaks on site changes | Hours per refresh |

Integrate the data

  • Feed the Lidl Product Scraper (the main use case). Pull this dataset, keep the records where type == "leaf", and pass their id values as the category input to the Lidl Product Scraper. You now have a fully automated map-then-scrape pipeline that adapts when Lidl reorganizes.
  • Exports: JSON, CSV, Excel, XML from the Dataset tab — or fetch programmatically:
    GET https://api.apify.com/v2/datasets/{datasetId}/items?format=json
  • Run on a schedule: use Apify Schedules to refresh the tree weekly and webhooks to kick off the product scraper automatically when the run finishes.
  • From code: call the Actor with the Apify API or SDKs (Python / JavaScript) and read the dataset when the run finishes.
  • Run summary: every run writes a SUMMARY record (key-value store) with totals, successes, failures, and billing counts — ideal for monitoring automated pipelines.
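
The map-then-scrape handoff described above might look roughly like this. It is a sketch using only the standard library and the dataset endpoint quoted in the list; the helper names are hypothetical, and `categoryIds` is a placeholder for whatever input field the Lidl Product Scraper actually expects.

```python
import json
import urllib.request
from typing import List

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str) -> str:
    """Build the dataset items URL shown in the Exports bullet."""
    return f"{API_BASE}/datasets/{dataset_id}/items?format=json"


def fetch_items(dataset_id: str, token: str) -> List[dict]:
    """Download all category records from a finished run's dataset."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def leaf_ids(items: List[dict]) -> List[str]:
    """Keep only the ids of product-holding leaf categories."""
    return [item["id"] for item in items if item.get("type") == "leaf"]


def product_scraper_input(items: List[dict]) -> dict:
    # "categoryIds" is a hypothetical field name, not confirmed by this page.
    return {"categoryIds": leaf_ids(items)}
```

A scheduled job could call `fetch_items` after each category run and pass the result of `product_scraper_input` to the product Actor.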

FAQ

What does this Actor actually return — products or categories? Categories. One record per category (main departments and leaf subcategories), not products. To get products, feed the leaf category IDs into the Lidl Product Scraper.

Do I need a Lidl account or login? No. There is no login or key to manage — just start the run.

Do I need a Dutch proxy? No. The default Automatic (datacenter) proxy works fine — Lidl's API returns the national (NL) category tree regardless of where the request comes from. Only if you ever see blocked requests, switch to a Dutch (NL) residential proxy.

What's the difference between a main and a leaf category? Main categories (level 0, id s…) are the top departments shown on the homepage; leaf categories (level 1, id h…) are the subcategories under them, the ones that actually hold products. For scraping products, you want the leaves.

How many categories will I get? The full Lidl tree — typically dozens of departments and subcategories. The non-food assortment is broad; the food assortment rotates weekly and can be small, which is normal (a small or empty category is not an error).

Why is productCount sometimes null? When a product count isn't available for a category, the field is null. The category is still included.
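
When ranking categories by size, a null `productCount` needs explicit handling, since comparing `None` with integers fails in Python. One possible approach, shown here as a sketch with a hypothetical `rank_by_size` helper, is to sort the largest categories first and place unknown counts last.

```python
from typing import List


def rank_by_size(records: List[dict]) -> List[dict]:
    """Sort categories by productCount, largest first; null counts go last."""
    return sorted(
        records,
        key=lambda rec: (
            rec.get("productCount") is None,   # False sorts before True
            -(rec.get("productCount") or 0),   # descending by count
        ),
    )
```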

How fresh is the data? Each run returns the tree as it stands at that moment, so you get exactly the structure lidl.nl shows then. Schedule the Actor to keep your map as fresh as you need.

What formats can I export? JSON, CSV, Excel, XML — from the Console or via the Apify API.


Pair this with the Lidl Product Scraper; that's the whole point. Run this Category Scraper first to map the tree, then feed every type: "leaf" id into its companion Actor:

  • Lidl Product Scraper — turns those leaf category IDs into full product records (titles, prices, Lidl Plus member prices, unit prices, images). Map once here, scrape products there: a fully automated, reorg-proof map-then-scrape pipeline.

Building broader Dutch-supermarket coverage? The same clean schema and pay-per-event billing run across every chain — mix and match for full market coverage:

  • Albert Heijn Product Scraper
  • Plus Product Scraper
  • Dirk van den Broek Product Scraper
  • DekaMarkt Product Scraper
  • Hoogvliet Product Scraper — and its own Hoogvliet Category Scraper, the family's other category-tree actor (same map-the-tree-then-scrape pairing as this one).

You are never billed for failed requests in any of them.


Disclaimer

This Actor is intended for personal and research use. You are responsible for ensuring your use complies with Lidl's terms and applicable law. Please scrape responsibly — keep concurrency moderate and delays reasonable. This project is not affiliated with, endorsed by, or sponsored by Lidl.