Hoogvliet Category Scraper

Pricing

from $5.00 / 1,000 category results

Scrape Hoogvliet's full category tree (hoogvliet.com): every category and subcategory with name, hierarchical URI, parent and level. Clean JSON/CSV, ideal as input for the Hoogvliet product scraper. Needs a Dutch (NL) proxy. Failed lookups are never billed.

Developer

Elena Vance

Maintained by Community

Last modified: 3 days ago

Hoogvliet Category Scraper — Map the Full Hoogvliet.com Category Tree

Discover Hoogvliet's entire category structure (hoogvliet.com) as clean, structured JSON or CSV: every category and subcategory with its name, the link you need to address it, its parent category, its depth in the tree, and a flag telling you whether it holds products — one tidy record per category node.

This is the map of the store, not the products on the shelves. The output is the ideal input for the Hoogvliet Product Scraper: pipe the category links straight in and let it pull the products. Use it on its own to analyze site structure, or as the first step of a two-stage pipeline. No login, no HTML wrangling — and you are never billed for failed requests.

Good to know: this Actor runs through a Dutch residential proxy — already pre-selected in the input, just keep it enabled.


Why this Actor

  • The whole tree in one run. Every node Hoogvliet exposes — top-level departments down to the deepest subcategories — flattened into one record per category. No clicking through menus.
  • The exact link you need to fetch products. Bare category ids simply 404. Each record carries the working uri — the single thing the product scraper needs to read a category's products.
  • Know which categories actually hold products. Every record has a hasOnlineProducts flag, so you can feed the product scraper only the leaves that carry products instead of crawling the whole tree blindly.
  • Reconstruct the hierarchy. parentId and level let you rebuild the tree exactly — breadcrumbs, navigation, a category picker, or a coverage report.
  • The perfect companion to the product scraper. Run this once to get the map, then drive the Hoogvliet Product Scraper with the links — or schedule it to catch new categories as Hoogvliet adds them.
  • You never pay for failures. Timeouts and other transient errors are reported in the run summary — not written to your dataset and not billed.
  • Fast and browserless. No headless browser — so a full run finishes in seconds and compute stays minimal.

Problems this Actor solves

| If you are… | Your problem | How this Actor solves it |
| --- | --- | --- |
| Running the Hoogvliet product scraper | You need the category links to scrape, and bare ids don't work | One run hands you every category's working link; paste them straight into the product scraper |
| Building a price/assortment pipeline | You want to scrape only product-bearing categories, not the whole tree | Filter on hasOnlineProducts and feed just the leaves downstream |
| A market researcher / analyst | Understanding how a competitor organizes its assortment is manual and partial | A complete, dated map of the category hierarchy that you can export to pandas, Sheets, or BI |
| Maintaining a category mapping | Hoogvliet adds, renames, and reshuffles categories over time | Schedule a run and diff the tree to catch structural changes automatically |
| An app / chatbot / agent developer | You need Hoogvliet's navigation structure without building a crawler | Pay per category record on demand; a stable, normalized schema you can rely on |
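
The "schedule a run and diff the tree" workflow can be sketched in a few lines of Python. This is a minimal sketch, assuming you have saved two dataset exports as lists of records. The helper name diff_trees and the sample records are made up for illustration; only the field names id, name and parentId come from the output schema.

```python
def diff_trees(old_records, new_records):
    """Compare two category snapshots, keyed by the record id."""
    old = {r["id"]: r for r in old_records}
    new = {r["id"]: r for r in new_records}
    common = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        # Same id, different display name.
        "renamed": sorted(i for i in common if old[i]["name"] != new[i]["name"]),
        # Same id, different parent: the node was reshuffled in the tree.
        "moved": sorted(
            i for i in common if old[i].get("parentId") != new[i].get("parentId")
        ),
    }


# Made-up sample snapshots (not real Hoogvliet data).
yesterday = [
    {"id": "100", "name": "Zuivel, eieren, boter", "parentId": None},
    {"id": "100495", "name": "Verse zuivel", "parentId": "100"},
]
today = [
    {"id": "100", "name": "Zuivel en eieren", "parentId": None},
    {"id": "100495", "name": "Verse zuivel", "parentId": "100"},
    {"id": "100999", "name": "Example new category", "parentId": "100"},
]

changes = diff_trees(yesterday, today)
print(changes)
```

Keying on id rather than name is a design choice: a renamed category then shows up as "renamed" instead of as one removal plus one addition.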

What data you get

Each category node becomes one dataset record:

| Field | Description |
| --- | --- |
| id / externalId | Hoogvliet's category id (the record's unique id) |
| name | Category display name (e.g. Zuivel, eieren, boter) |
| uri | The full category link, required to address the category; bare ids return 404 |
| parentId | The id of the parent category (null for top-level nodes); rebuild the tree from this |
| level | Depth in the tree (0 = top level, increasing downward) |
| hasOnlineProducts | true when the category carries products (the leaves to feed the product scraper) |
| hasOnlineSubCategories | true when the category has child categories |
| categoryUrl | The full URL for the category node |
| source | Constant tag identifying the producing Actor |
| scrapedAt | ISO 8601 timestamp of the run |
| rawData | Optional: the raw category fields, when Include raw category payload is on |
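
Because every record carries parentId, the flat dataset can be turned back into a tree. The following is a minimal Python sketch, assuming the records are already loaded as dictionaries. The helper names build_children_index and breadcrumb are hypothetical, and the sample records are made up.

```python
from collections import defaultdict


def build_children_index(records):
    """Map each parent id (None for top-level nodes) to its child ids."""
    children = defaultdict(list)
    for rec in records:
        children[rec["parentId"]].append(rec["id"])
    return dict(children)


def breadcrumb(records, category_id, separator=" > "):
    """Walk parentId links upward and return the names from root to node."""
    by_id = {r["id"]: r for r in records}
    names = []
    seen = set()  # guards against accidental cycles in the data
    current = category_id
    while current is not None and current in by_id and current not in seen:
        seen.add(current)
        names.append(by_id[current]["name"])
        current = by_id[current]["parentId"]
    return separator.join(reversed(names))


# Made-up sample records using the documented field names.
records = [
    {"id": "100", "name": "Zuivel, eieren, boter", "parentId": None, "level": 0},
    {"id": "100495", "name": "Verse zuivel", "parentId": "100", "level": 1},
]

children = build_children_index(records)
print(children[None])
print(breadcrumb(records, "100495"))
```

The children index is enough to render navigation or a category picker; the breadcrumb helper covers the reverse direction, from a leaf back to its department.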

Example output

{
  "id": "100495",
  "externalId": "100495",
  "name": "Verse zuivel",
  "uri": "org-webshop-Site/-/categories/schappen/100/100495",
  "parentId": "100",
  "level": 1,
  "hasOnlineProducts": true,
  "hasOnlineSubCategories": false,
  "categoryUrl": "https://www.hoogvliet.com/...&/categories/schappen/100/100495",
  "source": "hoogvliet-category-scraper",
  "scrapedAt": "2026-06-18T08:30:00+00:00"
}

In this record hasOnlineProducts is true, so the category is a leaf: feed its uri to the product scraper.

Categories or subtrees that could not be fetched are not written to the dataset (and never billed). They are listed under failures in the run's SUMMARY record in the key-value store:

{ "failures": [ { "input": "…", "source": "hoogvliet-category-scraper", "error": "…" } ] }

The run's status message tells you at a glance how many categories were mapped.
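
If you automate runs, the failures list can drive a retry. Here is a minimal Python sketch, assuming the SUMMARY record has the shape shown above and that each failure's input is the category URI that failed (an assumption, since the listing elides the value). The helper name failed_inputs and the sample error text are made up.

```python
def failed_inputs(summary):
    """Return each failure's `input`, de-duplicated, in first-seen order."""
    inputs = []
    for failure in summary.get("failures", []):
        value = failure.get("input")
        if value is not None and value not in inputs:
            inputs.append(value)
    return inputs


# Made-up SUMMARY content in the documented shape.
summary = {
    "failures": [
        {
            "input": "org-webshop-Site/-/categories/schappen/100",
            "source": "hoogvliet-category-scraper",
            "error": "timeout",
        },
        {
            "input": "org-webshop-Site/-/categories/schappen/100",
            "source": "hoogvliet-category-scraper",
            "error": "timeout",
        },
    ]
}

retry = failed_inputs(summary)
print(retry)
```

The resulting list could be pasted into Category URIs for a second run that covers only the subtrees that failed.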


How to use it (60 seconds)

  1. Click Try for free / Start.
  2. Choose what to map:
    • The whole tree (default): leave Category URIs empty to emit every category in the entire tree.
    • A subtree: add one full category URI per line under Category URIs (e.g. org-webshop-Site/-/categories/schappen/100) to emit only those categories and everything beneath them.
  3. Keep the Dutch residential proxy enabled (required — see above).
  4. Click Save & Start. Download results as JSON, CSV, Excel, or via API from the Dataset tab; check Key-value store → SUMMARY for run totals and any failed requests.
  5. Next step: copy the uri values (filter on hasOnlineProducts if you only want product-bearing ones) and paste them into the Hoogvliet Product Scraper.
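
Step 5 can also be done in code instead of by hand. This is a minimal Python sketch, assuming the dataset has been downloaded as a list of records. The helper name product_bearing_uris is hypothetical, and the hasOnlineProducts value on the parent record is invented for the example.

```python
def product_bearing_uris(records, only_with_products=True):
    """Collect uri values, optionally keeping only product-bearing categories."""
    uris = []
    for rec in records:
        if only_with_products and not rec.get("hasOnlineProducts"):
            continue
        uris.append(rec["uri"])
    return uris


# Sample records; the flags on the first one are assumed for illustration.
records = [
    {
        "uri": "org-webshop-Site/-/categories/schappen/100",
        "hasOnlineProducts": False,
        "hasOnlineSubCategories": True,
    },
    {
        "uri": "org-webshop-Site/-/categories/schappen/100/100495",
        "hasOnlineProducts": True,
        "hasOnlineSubCategories": False,
    },
]

leaf_uris = product_bearing_uris(records)
# One URI per line, ready to paste into the product scraper's input.
print("\n".join(leaf_uris))
```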

Input reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| Category URIs | list | (empty) | One full category URI per line to emit only those categories and everything beneath them. Empty = emit the entire tree. Bare category ids do not work; use the full URI. |
| Include raw category payload | boolean | false | Adds the raw category fields under rawData. Leave off unless you need them. |
| Proxy configuration | object | Apify Residential, NL | Required: keep the Dutch residential default unless you route NL traffic another way. |
| Max concurrency | integer | 5 | Parallel requests (1–20). Kept moderate to be respectful. |
| Delay between requests | integer | 0 | Politeness delay in seconds before each request (0–10). ~0.15s is recommended; 0 is fine at moderate concurrency. |
| Max items | integer | 0 | Stop after N category records (0 = unlimited / the whole tree). |
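
When you start runs from code, it can help to check the ranges above before submitting. The sketch below is a minimal Python example. The JSON keys (categoryUris, includeRawData, proxyConfiguration, maxConcurrency, requestDelaySecs, maxItems) are hypothetical, because this page lists only the display labels; check the Actor's Input tab for the real names. The "/" test for bare ids is a heuristic of this sketch, not the Actor's own validation.

```python
def build_run_input(category_uris=(), include_raw=False, max_concurrency=5,
                    delay_secs=0, max_items=0):
    """Validate the documented ranges and return a run-input dictionary."""
    if not 1 <= max_concurrency <= 20:
        raise ValueError("max_concurrency must be between 1 and 20")
    if not 0 <= delay_secs <= 10:
        raise ValueError("delay_secs must be between 0 and 10")
    if max_items < 0:
        raise ValueError("max_items must be 0 (unlimited) or positive")
    for uri in category_uris:
        # Heuristic: a full URI contains path separators, a bare id does not.
        if "/" not in uri:
            raise ValueError(f"looks like a bare category id, use the full URI: {uri!r}")
    return {
        # All key names below are assumptions, not confirmed by the listing.
        "categoryUris": list(category_uris),
        "includeRawData": include_raw,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
            "apifyProxyCountry": "NL",
        },
        "maxConcurrency": max_concurrency,
        "requestDelaySecs": delay_secs,
        "maxItems": max_items,
    }


run_input = build_run_input(
    category_uris=["org-webshop-Site/-/categories/schappen/100"]
)
print(run_input)
```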

Pricing — what a run costs

This Actor uses transparent pay-per-event pricing with a built-in volume discount: a small Actor-start fee, a fixed price per category record for the first results of a run, and a cheaper rate for every result beyond a high threshold. No subscription, no minimums, and failures are never charged. The exact per-result rate is shown on the Actor's Pricing tab.

  • A category map is small and cheap. Hoogvliet's whole tree is a few hundred categories, so a full run is a tiny, predictable cost — and it is the input that unlocks the much larger product scrape.
  • Failed requests are free. Timeouts and other transient errors are reported in the summary, never billed.
  • Bigger runs cost less per item. The volume-discount tier resets per run (though most category runs stay well under it).
  • Try it free: an Apify free account includes $5 of monthly platform credit — more than enough to map the tree before paying anything.
  • Stay in control: set Max items and Apify's maximum charge per run; the Actor stops gracefully at your cap, keeping everything already scraped.
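
To see how the tiers combine, here is a minimal Python sketch of the arithmetic described above. Only the $5.00 per 1,000 results figure comes from this listing; the start fee, the discounted rate and the threshold are placeholders, so take the real values from the Actor's Pricing tab. The function name estimate_run_cost is hypothetical.

```python
from decimal import Decimal


def estimate_run_cost(n_results, start_fee, base_rate, discount_rate, threshold):
    """Start fee + base rate up to the threshold + discounted rate beyond it."""
    base_n = min(n_results, threshold)
    extra_n = max(n_results - threshold, 0)
    return start_fee + base_n * base_rate + extra_n * discount_rate


cost = estimate_run_cost(
    n_results=400,                   # "a few hundred categories"
    start_fee=Decimal("0.01"),       # placeholder, not from the listing
    base_rate=Decimal("0.005"),      # $5.00 per 1,000 results
    discount_rate=Decimal("0.004"),  # placeholder, not from the listing
    threshold=10_000,                # placeholder, not from the listing
)
print(f"Estimated cost: ${cost:.2f}")
```

Decimal is used instead of float so that per-result prices add up without rounding noise. Failed requests are left out of n_results because they are not billed.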

Compared to the alternatives

| | This Actor | Build your own crawler | Click through the site by hand |
| --- | --- | --- | --- |
| Full category tree, normalized | Yes | You maintain it | Impractical |
| Working full links (not 404-ing ids) | Yes | You build and maintain it | |
| hasOnlineProducts / parentId / level | Yes | You maintain it | |
| Dutch residential proxy built in | Yes | You source proxies | |
| Feeds the product scraper directly | Yes | DIY glue | Copy-paste |
| Never billed for failures | Yes | | |
| Export JSON / CSV / Excel / API | Yes, built-in | DIY | Copy-paste |
| Setup time | ~60 seconds | Days; breaks when the site changes | Hours |

Integrate the data

  • Feed the Hoogvliet Product Scraper (the main use case). Run this Actor, take the uri field from each record (optionally filter on hasOnlineProducts == true to scrape only product-bearing categories), and pass those URIs as the category input to the Hoogvliet Product Scraper. That two-stage pipeline (map the tree, then scrape the products) is exactly what these two Actors are designed for.
  • Exports: JSON, CSV, Excel, XML from the Dataset tab — or fetch programmatically:
    GET https://api.apify.com/v2/datasets/{datasetId}/items?format=json
  • Run on a schedule: use Apify Schedules to refresh the category map periodically, and webhooks to push finished runs into your pipeline (or to trigger the product scraper automatically).
  • From code: call the Actor with the Apify API or SDKs (Python / JavaScript) and read the dataset when the run finishes.
  • Run summary: every run writes a SUMMARY record (key-value store) with totals, successes, failures, and billing counts — ideal for monitoring automated pipelines.
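
The programmatic fetch from the Exports bullet can be sketched with the Python standard library alone. This is a minimal sketch, assuming you already know the run's dataset id. The helper names dataset_items_url and fetch_items are hypothetical, and YOUR_DATASET_ID is a placeholder. The token is sent as a Bearer header, which the Apify API accepts.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json"):
    """Build the dataset items endpoint URL shown in the Exports bullet."""
    quoted_id = urllib.parse.quote(dataset_id, safe="")
    query = urllib.parse.urlencode({"format": fmt})
    return f"{API_BASE}/datasets/{quoted_id}/items?{query}"


def fetch_items(dataset_id, token=None):
    """Download all items of a dataset as a list of dictionaries."""
    request = urllib.request.Request(dataset_items_url(dataset_id))
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Building the URL makes no network call; fetch_items() does.
url = dataset_items_url("YOUR_DATASET_ID")
print(url)
```

A typical follow-up is records = fetch_items(dataset_id, token), then filtering on hasOnlineProducts before handing the URIs to the product scraper. The official Apify SDKs offer the same access with less code.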

FAQ

Is this the product scraper? No — this Actor maps the category tree (one record per category). To get the products themselves, use its companion, the Hoogvliet Product Scraper, and feed it the category URIs this Actor produces.

Do I need a Hoogvliet account or API key? No. There is no login.

Why is a Dutch proxy required? The Actor ships with a Dutch residential proxy pre-selected; keep it on. Without it, requests are blocked and the run returns no categories (the block is reported in SUMMARY).

Why can't I just use a category id? Bare category ids return 404. That is exactly why this Actor exists — it captures the working uri for every category so you don't have to.

What does hasOnlineProducts mean? It is true when the category actually holds products. Filter on it to feed the product scraper only the leaves that have something to scrape.

Does the output include every category or only the ones with products? Every node in the tree — including parent/department categories that hold only subcategories. The hasOnlineProducts and hasOnlineSubCategories flags let you filter to whatever you need.

How fresh is the data? Each run fetches the live tree, so you get exactly Hoogvliet's current category structure. Schedule the Actor to catch new or renamed categories.

What formats can I export? JSON, CSV, Excel, XML — from the Console or via the Apify API.


Related Actors

  • Hoogvliet Product Scraper, the companion Actor (run the Category Scraper first). This Category Scraper is stage one of a two-stage pipeline: run it to map the tree, then feed the uri of each category (filter on hasOnlineProducts == true to scrape only product-bearing leaves) into the Hoogvliet Product Scraper to pull the actual products, prices, and promotions. Map the store, then scrape the shelves.
  • The wider Dutch supermarket family. Product scrapers for every major Dutch chain — Albert Heijn, Lidl, Plus, Dirk van den Broek, and DekaMarkt Product Scrapers — plus the Lidl Category Scraper, the other category-tree mapper (same map-then-scrape pairing as Hoogvliet's two Actors).
  • Same clean schema and pay-per-event billing across every Dutch chain — mix and match for full market coverage, and you are never billed for failures on any of them.

Disclaimer

This Actor is intended for personal and research use. You are responsible for ensuring your use complies with Hoogvliet's terms and applicable law. Please scrape responsibly — keep concurrency moderate and delays reasonable. This project is not affiliated with, endorsed by, or sponsored by Hoogvliet.