Pricing

from $4.00 / 1,000 results

REMA 1000 Scraper - Danish Grocery Products & Prices

Scrape rema1000.dk, Denmark's largest discount-grocery chain, with about 3,900 SKUs. Search products by ingredient, monitor tilbud (special-offer) price changes, and track weekly basket costs over time for grocery planning and price alerts.

Rating

0.0

(0)

Developer

Black Falcon Data

Last modified

9 hours ago

0.6.7 — Pack-size shrinkflation now classifies as UPDATED

Pack-size changes now show up as UPDATED in incremental mode. A product that goes from 4 to 3 sachets per pack at an unchanged shelf price (classic shrinkflation) previously classified as UNCHANGED — invisible if you were watching the lifecycle feed. It now fires UPDATED with previousSeenAt so downstream consumers can diff the old vs new pack-size against the unchanged price.
No output or input changes — same fields, just stricter change detection.
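
As a rough illustration of the stricter change detection, the sketch below compares a fingerprint that includes pack size as well as price. The Snapshot shape, fingerprint(), and classifySnapshot() are hypothetical names for this example; the changelog does not say which fields the actor actually hashes.

```typescript
// Hypothetical record shape; field names follow the changelog, the rest is assumed.
interface Snapshot {
  productId: number;
  price: number | null;
  unitSize: number | null;
  unitMeasure: string | null;
}

type ChangeType = "NEW" | "UPDATED" | "UNCHANGED";

// Build a comparable fingerprint from the fields that should trigger UPDATED.
// Including unitSize and unitMeasure is what makes a pack-size change visible.
function fingerprint(s: Snapshot): string {
  return JSON.stringify([s.price, s.unitSize, s.unitMeasure]);
}

function classifySnapshot(previous: Snapshot | undefined, current: Snapshot): ChangeType {
  if (previous === undefined) return "NEW";
  return fingerprint(previous) === fingerprint(current) ? "UNCHANGED" : "UPDATED";
}

// Same shelf price, pack goes from 4 sachets to 3 (made-up product).
const packBefore: Snapshot = { productId: 1, price: 12, unitSize: 4, unitMeasure: "stk" };
const packAfter: Snapshot = { ...packBefore, unitSize: 3 };
const shrinkflationResult = classifySnapshot(packBefore, packAfter);
```

A fingerprint built from price alone would report UNCHANGED for the same pair, which is the blind spot this release closes.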

0.6.6 — README + schemas catch-up on the 0.6.x additions

README key feature for current price now headlines the discountPercent capability — was buried inside the compact-mode bullet before. The grocery-current-price catalog entry calls out concrete numbers (Bearnaise 61.5%, Tuborg 31.5%) and points at the discount-desc sort option for ranking deals.
Example record now leads with a tilbud product (Bearnaise on -61.5%) so users see discountPercent, priceOverMaxQuantity, and maxQuantity populated with real values instead of a string of nulls.
Schemas were already in sync — input form has all 4 new 0.6.x fields (sortBy, ingredientFilter, quantities, compact), dataset Overview exposes discountPercent next to the campaign flags, and the auto-generated input descriptions are current.
Base feature verification — passing. All applicable baseline checks pass (basic, compact, incremental seed/verify, repost fields).

0.6.5 — Ralph-loop hardening on the 0.6.x additions (3 rounds)

Round 1

EXPIRED rows now respect sortBy in incremental mode. Previously the sort applied only to live items; EXPIRED synthetics were appended after Pass 2 and always landed at the end regardless of direction. The sort now re-runs across the merged set after Pass 2, so EXPIRED rows interleave by the chosen criterion (null prices still sink — they can't carry a discount or per-unit number).
ingredientFilter JSON-parse failures no longer fail silently. If a user supplied a string that opens with [ but doesn't parse as valid JSON (common: missing quotes around terms), the parser used to fall back to a literal substring match against the malformed [...] string, which always returned zero matches. Now logs an explicit warning telling the user the correct shape (["pistacie", "hasselnød"]).
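
A minimal sketch of that parsing behaviour, assuming a hypothetical normalizeIngredientFilter() helper. Returning an empty term list after the warning is an assumption of this example; the changelog only states that a warning is logged.

```typescript
interface ParsedFilter {
  terms: string[];
  warning: string | null;
}

// Hypothetical normalizer: accepts a plain string, a JSON-array string, or an array.
function normalizeIngredientFilter(input: string | string[] | undefined): ParsedFilter {
  if (input === undefined) return { terms: [], warning: null };
  if (Array.isArray(input)) {
    const terms = input.map((t) => t.trim().toLowerCase()).filter((t) => t.length > 0);
    return { terms, warning: null };
  }
  const raw = input.trim();
  if (raw.startsWith("[")) {
    try {
      const parsed: unknown = JSON.parse(raw);
      if (Array.isArray(parsed) && parsed.every((t) => typeof t === "string")) {
        return normalizeIngredientFilter(parsed as string[]);
      }
    } catch {
      // Malformed JSON: fall through to the explicit warning below.
    }
    return {
      terms: [],
      warning: 'ingredientFilter starts with "[" but is not a valid JSON array of strings; expected e.g. ["pistacie", "hasselnød"]',
    };
  }
  // Single-string form becomes a 1-element array.
  return { terms: raw.length > 0 ? [raw.toLowerCase()] : [], warning: null };
}

const malformedFilter = normalizeIngredientFilter("[pistacie, hasselnød]");
const wellFormedFilter = normalizeIngredientFilter('["pistacie", "hasselnød"]');
```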

Round 2

Notification titles now carry the discount percentage for campaign rows. "JORDBÆRMARMELADE — 285 GR. / EASIS (-61.5%)" reads stronger than the same line without the deal signal. Skipped when no real discount was computed (excludes non-campaign rows and SKUs where REMA didn't expose a regular price).
2 new tests cover the title-with-discount and the null-discount fallback.

Round 3

No new findings — re-audited sort + cap interactions, compact + sort ordering, EXPIRED sort placement, attachLineEconomics ordering vs sortItems, and Algolia concurrency. All clean. Closing the loop.

0.6.4 — Audit metadata cleanup

Expanded the Store description to the recommended SEO length while keeping the same feature claims.
Synced package-lock version metadata and Actor manifest minor version with the 0.6.x release line.

0.6.3 — Compact mode promoted + documented

Compact mode already existed, but the description didn't sell the value. Now: "Cuts each row from ~2,850 chars to ~650 (77% smaller) — ideal for AI agents, MCP servers, and LLM context windows where you don't need allergen text." Verified live against a real Bearnaise basket row.
Compact mode is now a top-level key feature in the README, alongside basket-tracking and ingredient-search. Dropped compare-unit from the keyFeatures list (it's already covered by current-price) to stay under the 10-feature cap.
COMPACT_FIELDS already includes the new fields shipped over 0.5.x → 0.6.x (discountPercent, quantity, lineSubtotal, lineDeposit, lineSavings, lineTotal, unitSize, unitMeasure, brand) — verified by running compact mode against a tilbud basket row.

0.6.2 — Discount percentage

New always-present field discountPercent: the campaign discount as a percentage off the regular price. Computed as (priceOverMaxQuantity − price) / priceOverMaxQuantity × 100, rounded to 1 decimal. Verified live: Bearnaise tilbud reports 61.5% off, Tuborg 31.5%, Kyllingelår 35.5%.
Null on non-campaign rows and when the regular price isn't exposed. Independent of basket quantity — pure per-unit ratio, so it's meaningful in browse / search modes too, not just basket.
New sort option discount-desc — rank biggest % off first. Useful for "show me this week's deepest tilbud" queries without writing your own sort step downstream.
Overview dataset view shows the new column ("% off") next to the campaign flags.
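
The formula can be sketched as follows. computeDiscountPercent() is a hypothetical name, and the 10 DKK / 25.95 DKK Bearnaise prices are taken from the over-cap example in the 0.5.2 notes.

```typescript
// Sketch of the discountPercent formula from the notes above.
// Returns null on non-campaign rows and when no regular price is exposed.
function computeDiscountPercent(
  isCampaign: boolean,
  price: number | null,
  priceOverMaxQuantity: number | null,
): number | null {
  if (!isCampaign || price === null || priceOverMaxQuantity === null) return null;
  if (priceOverMaxQuantity <= 0) return null; // guard against division by zero (assumption)
  const percent = ((priceOverMaxQuantity - price) / priceOverMaxQuantity) * 100;
  return Math.round(percent * 10) / 10; // 1 decimal
}

const bearnaiseDiscount = computeDiscountPercent(true, 10, 25.95); // 61.5
const noCampaignDiscount = computeDiscountPercent(false, 25.95, null); // null
```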

0.6.1 — Multi-ingredient AND, sort order, ingredient search promoted

Multi-ingredient filter

ingredientFilter now accepts an array of terms with AND semantics — every term must appear in the ingredient text. Verified live against ~94 ice-cream products: ["pistacie", "hasselnød"] returns 31 products containing both, narrower than the 34 pistacie-only set. Useful for allergen cross-checks, premium ingredient combinations, and preservative-pair audits.
Single-string form ("pistacie") still works — internally normalized to a 1-element array.
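
A minimal sketch of the AND semantics, assuming a hypothetical matchesAllIngredients() helper and made-up ingredient strings:

```typescript
// Every term must appear (case-insensitively) in the ingredient text.
function matchesAllIngredients(ingredients: string | null, terms: string[]): boolean {
  if (terms.length === 0) return true; // no filter supplied
  if (ingredients === null) return false; // nothing to match against
  const haystack = ingredients.toLowerCase();
  return terms.every((term) => haystack.includes(term.toLowerCase()));
}

// Made-up ingredient declarations for illustration only.
const iceCreamBoth = "Fløde, sukker, PISTACIE (5%), HASSELNØD (3%)";
const iceCreamPistachioOnly = "Fløde, sukker, PISTACIE (8%)";

const matchesBoth = matchesAllIngredients(iceCreamBoth, ["pistacie", "hasselnød"]);
const matchesOne = matchesAllIngredients(iceCreamPistachioOnly, ["pistacie", "hasselnød"]);
```

Here matchesBoth is true and matchesOne is false, which is why the two-term result set is never larger than the single-term one.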

Sort order

New input ↕️ Sort order. Eight options: price asc/desc, per-unit price asc/desc, basket line-total asc/desc, biggest tilbud savings first, name (Danish locale collation). Null / missing values always sink to the bottom regardless of direction, so a sort never re-orders missing data over present data.
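
The null-sinking rule can be sketched with a comparator that handles nulls before looking at direction. compareNullsLast() is a hypothetical name and the prices are made up.

```typescript
type SortDirection = "asc" | "desc";

// Nulls sort after every real value, whichever direction is chosen.
function compareNullsLast(a: number | null, b: number | null, direction: SortDirection): number {
  if (a === null && b === null) return 0;
  if (a === null) return 1;
  if (b === null) return -1;
  return direction === "asc" ? a - b : b - a;
}

const samplePrices: Array<number | null> = [25.95, null, 10, 18.5];
const sortedAscending = [...samplePrices].sort((a, b) => compareNullsLast(a, b, "asc"));
const sortedDescending = [...samplePrices].sort((a, b) => compareNullsLast(a, b, "desc"));
// sortedAscending  -> [10, 18.5, 25.95, null]
// sortedDescending -> [25.95, 18.5, 10, null]
```

Simply reversing an ascending sort would put the null first in descending order, which is what the rule is designed to avoid.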

Ingredient search promoted

Added as a top-level feature in the README — was buried as one input field bullet before. Many real use cases (allergen tracking, premium ingredient discovery) lead with this capability.

Description cleanup

Removed bottle deposit from the Store description — the deposit data is still there in the output, just not part of the one-liner pitch.

0.6.0 — Ingredient filter + 50% throughput boost

Ingredient filter

New input 🥗 Ingredient filter — case-insensitive substring match on the ingredient list. Useful for finding products that contain a specific ingredient that isn't part of the product name. Examples: "pistacie" finds chocolate, ice cream, and granola bars containing pistachios; "kaliumsorbat" finds every preservative-treated product; "hasselnød" catches Nutella-style spreads, granolas, and bakery items.
Filter runs after fetch (REMA's own search engine indexes product names and categories, NOT ingredients — we verified directly against the source). Narrow scope first with query or productIds to keep cost down.
Only meaningful in basket and search modes — browse-mode REST data doesn't carry ingredient text.
3 new tests covering the input normalizer.

Concurrency

Bumped page-walk parallelism from 8 to 12 based on a 5-tier benchmark against the store's API. 12 gives a 2.4× speedup with zero errors. 16 is marginally faster but no real gain. 24+ starts crowding the throttle envelope; 32 produces ~50% connection failures.
Full-catalog scrape (3895 products) was 34s at concurrency=8 — should be substantially faster at 12.
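
One common way to cap page-walk parallelism is a small worker pool, sketched below. mapWithConcurrency() is a hypothetical helper written for this example, not the actor's actual implementation.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Track the peak number of simultaneous calls to show the cap is respected.
let inFlight = 0;
let peakInFlight = 0;
const doubledPages = mapWithConcurrency([1, 2, 3, 4, 5, 6], 2, async (page) => {
  inFlight++;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await Promise.resolve(); // stand-in for a network request
  inFlight--;
  return page * 2;
});
```

In the actor's terms the limit would be 12 and the worker a page fetch; results keep the input order regardless of which request finishes first.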

0.5.8 — README + Store example: 10-product basket with quantities

Added a worked 10-product basket example to the README and the Apify Store "Try it" examples. Mix exercises every basket-summary code path: 2 tilbud-with-pant rows (Tuborg 6-pak), 2 tilbud-without-pant (Bearnaise, kyllingelår), 2 normal-with-pant (Icetea, Ribena), and 4 normal items. Quantities range from 1 to 6 per product (24 total units across 10 lines).
Worked numbers for the example: subtotal 330.03 + pant 22.50 = total 352.53 DKK, saving 123.30 DKK on 8 tilbud-units. Numbers come from a real live run, so users can verify against the live catalog.

0.5.7 — Ralph-loop hardening (3 rounds, even LOW issues)

Round 1

EXPIRED quantity leak — synthetic EXPIRED rows in search/browse mode could surface quantity from a stray quantities input, even though quantities semantically belong to basket mode only. Now mode-gated: EXPIRED carries quantity only in basket mode, null elsewhere.
Duplicate productIds warning — pasting "21464, 21464" to mean qty=2 used to dedup silently. Now logs a warning pointing the user at the quantities input as the correct way to express counts.

Round 2

Summary KV write could silently 4xx on certain stateKeys — if a user picked a stateKey containing / or other KV-illegal characters, the basket-summary persistence would fail without any signal. Now the KV key is base64url-encoded (summary__<base64url(stateKey)>) so any user-supplied string is safe; the original stateKey is still stored inside the record for human reference. Write failures now log a warning instead of being silently swallowed.

Round 3

No new findings: an audit of the remaining call paths (notifications, EXPIRED-with-empty-id, lock release ordering, status-message length, findExpiredJobs short-circuit, state-key migration) came up clean. Closing the loop.

0.5.6 — audit-7 basket summary hardening

Fixed incremental basket summaries so unchanged products still count toward the current basket subtotal, pant, total, and campaign savings. Summary now rolls up from the current basket universe instead of only the lifecycle rows emitted to the dataset.
Added an explicit warning when quantities is supplied outside basket mode, where it is ignored.
Updated the Actor input schema so the documented quantities object, array-of-pairs, and JSON-string forms are all accepted at validation time.
Synced package-lock and Actor manifest version metadata with the 0.5.x line.

0.5.5 — `quantities` also accepts array-of-pairs form

The quantities input can now be supplied as an array of {id, qty} pairs in addition to the existing object map. Convenient for callers (Make / n8n / Zapier flows) where the data already arrives in row-shaped form.

// Both of these are equivalent:
"quantities": {"200313": 12, "21464": 6}
"quantities": [{"id": 200313, "qty": 12}, {"id": 21464, "qty": 6}]

Pair keys accept synonyms: id / productId / product_id, and qty / quantity / count.
Invalid entries are dropped and reported in the aggregated warning line; valid entries from the same array are kept.
4 new tests covering the array form. Total: 191 tests.
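
A sketch of how the three accepted forms (object map, array of pairs, JSON string) could be normalized into one map. normalizeQuantities() and pickFirst() are hypothetical names, and keeping a quantity of zero is an assumption; the notes only say that negative and non-numeric entries are dropped and fractional values are floored.

```typescript
const ID_KEYS = ["id", "productId", "product_id"];
const QTY_KEYS = ["qty", "quantity", "count"];

// Return the value of the first synonym key present on the row.
function pickFirst(row: Record<string, unknown>, keys: string[]): unknown {
  for (const key of keys) {
    if (key in row) return row[key];
  }
  return undefined;
}

function normalizeQuantities(input: unknown): { quantities: Record<string, number>; dropped: string[] } {
  const quantities: Record<string, number> = {};
  const dropped: string[] = [];
  let value: unknown = input;
  if (typeof input === "string") {
    try {
      value = JSON.parse(input);
    } catch {
      return { quantities, dropped: [input] };
    }
  }

  // Collect [id, qty] candidates from either the array or the object form.
  const pairs: Array<[unknown, unknown]> = [];
  if (Array.isArray(value)) {
    for (const entry of value) {
      if (typeof entry === "object" && entry !== null) {
        const row = entry as Record<string, unknown>;
        pairs.push([pickFirst(row, ID_KEYS), pickFirst(row, QTY_KEYS)]);
      } else {
        dropped.push(String(entry));
      }
    }
  } else if (typeof value === "object" && value !== null) {
    const map = value as Record<string, unknown>;
    for (const id of Object.keys(map)) pairs.push([id, map[id]]);
  }

  for (const [id, qty] of pairs) {
    const key = String(id);
    const count = typeof qty === "number" ? Math.floor(qty) : NaN;
    if (/^\d+$/.test(key) && Number.isFinite(count) && count >= 0) {
      quantities[key] = count;
    } else {
      dropped.push(key + "=" + String(qty)); // reported in one aggregated warning
    }
  }
  return { quantities, dropped };
}

const fromObjectForm = normalizeQuantities({ "200313": 12, "21464": 6 });
const fromPairForm = normalizeQuantities([{ id: 200313, qty: 12 }, { id: 21464, qty: 6 }]);
```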

0.5.4 — Codex audit-6 fixes

Per-row line economics on the dataset

Each basket-mode row now carries its own quantity, lineSubtotal, lineDeposit, lineSavings, and lineTotal. CSV/Excel users see "12 Tuborg, line total 432 DKK" inline on the row — no need to cross-reference the summary KV record. Same source of truth (computeLine()) drives both per-row attribution and aggregate roll-up, so the numbers always agree.
All five fields are null outside basket mode (per the "all fields always present" schema contract). EXPIRED synthetics carry the user's intended quantity but null line totals (no current price to compute against).
Overview dataset view exposes the new columns next to price/comparePrice.

Locked invariants

Regression test: quantities is explicitly excluded from the incremental state scope. Changing "12 Tuborg" to "6 Tuborg" must NOT reset the basket's price-change history. (Verified: basket scope is [campaignsOnly] only.)
Quantities are still summary-only as far as state goes — a qty change won't trigger a spurious UPDATED.

Validation logs

Invalid quantities (negative, non-numeric, malformed keys) are now reported in one aggregated log.warning line with the dropped entries listed, instead of being silently dropped.

Backwards compatibility

Existing dataset consumers get five new always-present fields. They're null outside basket mode, so non-basket records are unaffected in semantic content. The schema contract (all fields always present) is preserved.

0.5.3 — `quantities` is now an actor input

New input field 🔢 Basket — quantities per product. JSON object mapping productIds to integer counts, e.g. {"200313": 12, "21464": 6} = 12 Tuborg + 6 mælk. Drives subtotal, pant total, and over-cap campaign math. Default qty=1 per product when omitted. Ignored outside basket mode.
Accepts both an object ({"200313": 12}) and a JSON-string form ('{"200313": 12}') for flexibility when posted programmatically.
Fractional values are floored; negative / non-numeric entries are dropped silently.
5 new tests covering the input parser. Total: 186 tests.

0.5.2 — Basket summary: per-product quantities + max-cap rules + clearer counts

Quantities

buildBasketSummary now accepts an optional quantities map ({productId: count}). Each product line is multiplied by its own quantity — the map is per-product, not a single aggregate number. (Not yet wired as an actor input field — usable from the SDK / helper only; let me know if you want it as a real input.)
Subtotal, pant, and savings all respect quantities. Default qty=1 per product when no map is supplied.

Max-quantity (over-cap) rules

Many REMA tilbud have a per-customer cap (maxQuantity). When you buy more than the cap, the first N units get the campaign price; overflow units get billed at priceOverMaxQuantity. The summary now applies this rule. Example: 8× Bearnaise on tilbud 10 DKK with max=6, regular 25.95 → 6×10 + 2×25.95 = 111.90; savings only on the capped 6 units (95.70), not on the 2 overflow units.
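
The over-cap arithmetic can be sketched as below. computeLineSketch() is a hypothetical stand-in for illustration, not the actor's own line-math code; the numbers are the Bearnaise example from the paragraph above.

```typescript
interface LineInput {
  price: number; // current (campaign) price per unit
  priceOverMaxQuantity: number | null; // regular price, charged beyond the cap
  maxQuantity: number | null; // per-customer cap, null when uncapped
  quantity: number;
}

interface LineResult {
  lineSubtotal: number;
  lineSavings: number;
}

const roundToOre = (amount: number): number => Math.round(amount * 100) / 100;

function computeLineSketch(line: LineInput): LineResult {
  const regular = line.priceOverMaxQuantity;
  // Units up to the cap get the campaign price; the rest overflow.
  const capped = line.maxQuantity === null ? line.quantity : Math.min(line.quantity, line.maxQuantity);
  const overflow = line.quantity - capped;
  const overflowPrice = regular === null ? line.price : regular;
  const lineSubtotal = capped * line.price + overflow * overflowPrice;
  // Savings accrue only on the capped units.
  const lineSavings = regular === null ? 0 : capped * (regular - line.price);
  return { lineSubtotal: roundToOre(lineSubtotal), lineSavings: roundToOre(lineSavings) };
}

const bearnaiseLine = computeLineSketch({ price: 10, priceOverMaxQuantity: 25.95, maxQuantity: 6, quantity: 8 });
// bearnaiseLine -> { lineSubtotal: 111.9, lineSavings: 95.7 }
```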

Clearer counts

Split itemCount into two fields so "lines vs units" isn't ambiguous:
- productCount — distinct product lines (e.g. 4 = Tuborg, Knorr, milk, jam)
- unitCount — total units summed across quantities (e.g. 23 = 12+8+2+1)
itemCount is kept as an alias for unitCount for backwards compatibility through 0.5.x.
Log line collapses to one number when every product is qty=1 (the common case), otherwise reports both: 4 product(s), 23 unit(s).

0.5.1 — Basket summary now reports both subtotal and total

Pant (Danish bottle deposit) is paid at the register but refunded when bottles are returned, so there's no single "right" number for a basket total. The summary now exposes both:

subtotal — cost of goods only (pant refunded on return)
depositTotal — pant component, isolated
total — subtotal + depositTotal — what you actually pay at the register today

Log line and status message show both numbers when pant is non-zero, e.g. subtotal 74.45 DKK + pant 6.00 = total 80.45 DKK. When the basket has no pant, only total is shown.
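
A sketch of the roll-up and the two log-line shapes, assuming hypothetical summarizeBasket() and formatBasketSummary() helpers. The two basket lines are made up so that they add up to the figures quoted above.

```typescript
interface BasketLine {
  lineSubtotal: number; // cost of goods for the line, DKK
  lineDeposit: number; // pant for the line, DKK
}

interface BasketTotals {
  subtotal: number;
  depositTotal: number;
  total: number;
}

// Sum in øre (integers) so repeated additions do not accumulate float drift.
function summarizeBasket(lines: BasketLine[]): BasketTotals {
  const toOre = (dkk: number): number => Math.round(dkk * 100);
  const subtotalOre = lines.reduce((sum, line) => sum + toOre(line.lineSubtotal), 0);
  const depositOre = lines.reduce((sum, line) => sum + toOre(line.lineDeposit), 0);
  return {
    subtotal: subtotalOre / 100,
    depositTotal: depositOre / 100,
    total: (subtotalOre + depositOre) / 100,
  };
}

function formatBasketSummary(totals: BasketTotals): string {
  const dkk = (amount: number): string => amount.toFixed(2);
  return totals.depositTotal > 0
    ? `subtotal ${dkk(totals.subtotal)} DKK + pant ${dkk(totals.depositTotal)} = total ${dkk(totals.total)} DKK`
    : `total ${dkk(totals.total)} DKK`;
}

// Made-up lines: one with pant, one without.
const exampleTotals = summarizeBasket([
  { lineSubtotal: 54.5, lineDeposit: 6 },
  { lineSubtotal: 19.95, lineDeposit: 0 },
]);
const exampleSummaryLine = formatBasketSummary(exampleTotals);
```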

0.5.0 — Basket totals + savings + pant rollup

Basket mode now reports a per-run summary so a weekly shopping-list monitor doesn't have to sum its own rows.

Subtotal in DKK across all priced items (EXPIRED rows are skipped — they have no current price).
Pant total rolled up separately so the deposit cost is visible without scanning each item.
Campaign savings — for each isCampaign item with a priceOverMaxQuantity, sums (priceOverMaxQuantity − price) so you can see what your tilbud-tracking actually saved you this week.
Delisted count — number of basket IDs that the catalog no longer recognizes.

Where the summary surfaces:

Logged on the run page as Basket — 3 item(s) · subtotal 73.45 DKK · saving 8.00 DKK on 2 tilbud · 1 delisted.
Set as the run's status message (visible in Apify Console without opening the log).
Persisted to the key-value store under <stateKey>-summary so users can fetch the latest snapshot via the KV API between runs.

Only fires in basket mode (productIds). Search and browse runs don't get a basket summary — the concept doesn't apply.

0.4.9 — Fix: labels populated on basket + search products

labels[] on basket and search records previously emitted only {image: null} entries. The slug strings the rich-data source returns were being placed into a slot that expected an {id, name, image} object, so name and id were both undefined and JSON serialization dropped them. Labels now carry the slug ("rema1000", "keyhole", "no_added_sugar", "ecocert", etc.) as name, with id and image null, so they are usable for filtering and grouping where they were empty before. Browse-mode labels (REST source) are unaffected; they already had full {id, name, image} data.
Type update: RemaLabel.id is now number | null to reflect that the rich-data source doesn't expose numeric ids.

0.4.8 — Critical fix: basket/search compare-price now matches the shelf

Hardening

comparePrice and compareUnit in basket + search modes were wrong. Both fields are sourced from a per-product detail block that exposes two fields: an internal kg-normalized number, and the canonical "X.XX per ." string that consumer law requires on the shelf label. The previous code path used the internal number AND always labelled it "kg", producing a wrong number, a wrong unit, or both:
- 1-liter milk: previously reported "kg" + 10.14; correct is "ltr" + 10.50.
- 285 g jam jar: previously reported "kg" + 79.83; correct is "kg" + 84.04.
- 200 ml plant milk: previously reported "kg" + 32.95; correct is "ltr" + 32.95.
The fix parses the canonical shelf-label string instead, so comparePrice / compareUnit in basket and search modes now match what's printed on the price tag in-store — same as browse mode already did. Verified across a 1000-product sample at 100% parse rate.
5 new tests in tests/parseAlgoliaPricePerUnit.test.ts lock the parser invariants. Total: 167 tests.

Compatibility

Existing dataset records produced before 0.4.8 will have the old wrong values. Incremental-mode users will get UPDATED records on the next run for every basket/search product whose compare-price was previously miscalculated.

0.4.7 — Audit-5 fixes (lock hygiene + mode-conflict surfacing)

Hardening

Lock leak on empty-results early exit fixed. When an incremental run fetched zero current products and had no prior state, it exited without releasing the state lock. Subsequent runs against the same stateKey would hit RMA-0030 (lock conflict) until the 30-minute stale-lock TTL recovered them. The empty-results path now releases the lock before exiting.
releaseLock now verifies ownership before clearing. Previously the catch-block release wrote null unconditionally, which could clobber another run's lease if ours had been stale-overridden mid-execution. Release now reads the current lock and skips if the runId no longer matches — preserving the new owner's lease.
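
The ownership check can be sketched against an in-memory stand-in for the key-value store. MemoryLockStore and releaseLockSketch() are hypothetical and do not reflect the Apify SDK API; the point is only the read-compare-clear order.

```typescript
interface StateLock {
  runId: string;
  acquiredAt: number;
}

// Minimal in-memory stand-in for a key-value store (hypothetical).
class MemoryLockStore {
  private data = new Map<string, StateLock | null>();

  get(key: string): StateLock | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }

  set(key: string, value: StateLock | null): void {
    this.data.set(key, value);
  }
}

// Clear the lock only if this run still owns it; return whether it was released.
function releaseLockSketch(store: MemoryLockStore, key: string, runId: string): boolean {
  const current = store.get(key);
  if (current === null || current.runId !== runId) {
    return false; // lock is gone or was taken over: leave the new owner's lease alone
  }
  store.set(key, null);
  return true;
}

// Run A's lease went stale and run B took over before A reached its cleanup.
const lockStore = new MemoryLockStore();
lockStore.set("lock__my-rema-basket", { runId: "run-B", acquiredAt: 0 });
const releasedByStaleRun = releaseLockSketch(lockStore, "lock__my-rema-basket", "run-A");
const releasedByOwner = releaseLockSketch(lockStore, "lock__my-rema-basket", "run-B");
```

The read and the write are still two separate steps here, so a takeover between them remains possible; the check narrows the window rather than closing it.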

UX

Mode-conflict inputs now warn. If a run supplies both productIds and query, basket mode wins (unchanged behavior) but a warning now logs that query was ignored. Same for departmentIds outside browse mode.
REST-fallback path now preserves isBatchItem + isAvailableInAllStores. When the rich-data Algolia endpoint 404s on a SKU but REST has it, the synthesized record was hard-coding these two booleans to defaults. They now come from the real REST payload.

0.4.6 — Audit-4 fixes

Hardening

Search-discovery failure no longer false-expires prior state. If keyword search failed transiently mid-run, the catch block logged the error but didn't flag coverage as incomplete. Combined with a successful prior state, every previously-active search hit would be classified as EXPIRED on the next run. Coverage is now explicitly marked incomplete on this path, so EXPIRED detection short-circuits until search recovers.
Unparseable-productUrls early exit now emits run.complete telemetry. Previously the guardrail exited before run.start/run.complete fired, leaving the run absent from ops telemetry except for the RMA-0060 error event. The guard now runs after run.start, sets runErrored = true, and finalizes diag before exit.

Compact mode

Added lifecycle fields (firstSeenAt, lastSeenAt, expiredAt) to compact output. EXPIRED records in compact mode previously kept changeType: "EXPIRED" but dropped the timestamp context that makes a delisting alert actionable.

0.4.5 — Audit pass: unit/brand fields wired everywhere

Catalog entry for Structured unit size + brand (priority 88) so the README key-features list highlights the parsing prominently. Full-catalog scan confirmed 14 distinct unit tokens (gr / stk / ml / cl / ltr / kg / par / mtr / pk / sæt / bakke / bdt / pose / rl) at 100% parse rate across 3895 SKUs.
Compact mode now includes unitSize, unitMeasure, brand, comparePrice, and compareUnit — per-unit price is as much a shopper essential as the absolute price.
Field-coverage tests assert that unitSize, unitMeasure, and brand are populated in at least one fixture (was unasserted before).
Internal evidence catalog updated to the full-catalog frequency table rather than a 2000-sample subset.

0.4.4 — Structured unit size + brand

New fields: unitSize, unitMeasure, brand. The shelf-label string (e.g. "1 LTR. / REMA 1000", "285 GR. / EASIS") is now parsed into a numeric quantity, a normalized unit token (ltr, gr, kg, ml, cl, stk, par, rl, mtr, sæt), and the brand text after the slash. Combined with the existing comparePrice + compareUnit fields, this gives shoppers a clean "10.50 DKK per liter, 1-liter pack, REMA 1000 brand" view without regex on the subtitle.
Brand is nullable — some SKUs ship a brand-less subtitle like "33 CL." (typical for soft-drink cans). Parser also handles decimal sizes ("1,5 LTR.", "56.8 CL.") and preserves Danish characters in brand names.
Overview view in the dataset table now exposes Unit Size, Unit, and Brand as dedicated columns next to the raw subtitle.
6 new tests in tests/parseSubtitle.test.ts. Total: 159 tests.
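
A regex-based sketch of this kind of subtitle parsing, using the example strings quoted above. parseSubtitleSketch() is a hypothetical name and the pattern is an assumption; the actor's real parser may differ.

```typescript
interface ParsedSubtitle {
  unitSize: number | null;
  unitMeasure: string | null;
  brand: string | null;
}

function parseSubtitleSketch(subtitle: string): ParsedSubtitle {
  // Everything before the first slash is the size part; the rest is the brand.
  const parts = subtitle.split("/");
  const sizePart = parts[0];
  const brandText = parts.slice(1).join("/").trim();
  const brand = brandText.length > 0 ? brandText : null;

  // Number (comma or dot decimal), optional space, unit token, optional trailing dot.
  const match = /^\s*(\d+(?:[.,]\d+)?)\s*([^\s.\d]+)\.?\s*$/.exec(sizePart);
  if (match === null) return { unitSize: null, unitMeasure: null, brand };
  return {
    unitSize: Number(match[1].replace(",", ".")),
    unitMeasure: match[2].toLowerCase(),
    brand,
  };
}

const parsedMilk = parseSubtitleSketch("1 LTR. / REMA 1000");
const parsedJam = parseSubtitleSketch("285 GR. / EASIS");
const parsedCan = parseSubtitleSketch("33 CL.");
const parsedBottle = parseSubtitleSketch("1,5 LTR.");
```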

0.4.3 — README + evidence catch-up

README now highlights ingredients/nutrition, GPSR manufacturer, and EAN/UPC barcodes as top-level features (previously buried in the field table).
Internal evidence catalog back-filled for every product-detail field (ingredients, nutrition, barcodes, countryOfOrigin, categoryId, categoryName, manufacturer, warnings, itemDisclaimer) and the synthesized lifecycle fields (firstSeenAt, lastSeenAt, expiredAt).
docs/evidence/local/sample-output.json regenerated to lead with two rich Algolia products (cosmetics SKU with GPSR block, jam SKU with ingredients + nutrition + barcodes) so the example record in the README shows the fields populated rather than null.
Added unit tests for sanitizeInputForDiag (notification-secret redaction) and extractUrls (social-platform partitioning + tracking-domain filter). Total: 153 tests.
Pricing analysis aligned to $0.004/result (no direct competitor; operator premium for ingredients + GPSR + barcodes coverage).

0.4.2 — Audit-3 fixes (false-EXPIRED hardening + test coverage)

Hardening

Fetch failures now downgrade coverage to incomplete. Per-product Promise rejection (transient Algolia/REST outage) was previously swallowed silently. Combined with COMPLETE_COVERAGE, a transient failure could classify an active product as EXPIRED on the next incremental run. fetchByIdsViaAlgolia and fetchDepartmentProducts now return a fetchErrors counter; coverage is COMPLETE only when all three conditions hold: !capHit && !searchTruncated && fetchErrors === 0.
maxResults: 0 in search now honours up to 5000 hits. Previous ceiling was 1000 — the documented "unlimited" semantics now match Algolia's hard cap, and any truncation beyond 5000 keeps coverage incomplete to prevent false EXPIRED.
Notifications no longer render a misleading price range. salaryMax is now always null (was priceOverMaxQuantity); only salaryMin carries the price so Telegram/Slack/Discord/WhatsApp formatters don't synthesize an "X–Y" band.
Input schema: "Notify Only Changes" (was "Notify Only New/Updated") and description now mentions EXPIRED.
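
The coverage rule and the short-circuit it protects can be sketched together. isCoverageComplete() and findExpiredSketch() are hypothetical names; only the boolean condition itself comes from the notes above.

```typescript
// Coverage is complete only when nothing could have hidden a live product.
function isCoverageComplete(capHit: boolean, searchTruncated: boolean, fetchErrors: number): boolean {
  return !capHit && !searchTruncated && fetchErrors === 0;
}

// A product is EXPIRED when it was active before and is absent now,
// but only if the current fetch is known to be complete.
function findExpiredSketch(priorActiveIds: string[], currentIds: string[], coverageComplete: boolean): string[] {
  if (!coverageComplete) return []; // short-circuit: absence proves nothing
  const seenNow = new Set(currentIds);
  return priorActiveIds.filter((id) => !seenNow.has(id));
}

const priorIds = ["200313", "21464", "99999"];
const fetchedIds = ["200313", "21464"];

const expiredOnCleanRun = findExpiredSketch(priorIds, fetchedIds, isCoverageComplete(false, false, 0));
const expiredOnFlakyRun = findExpiredSketch(priorIds, fetchedIds, isCoverageComplete(false, false, 1));
// expiredOnCleanRun -> ["99999"], expiredOnFlakyRun -> []
```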

Test coverage

tests/transformAlgolia.test.ts — 4 tests covering GPSR manufacturer mapping, nutrition parsing, campaign flags, and the empty-GPSR null-fallback
tests/notificationAdapter.test.ts — 7 tests locking in OutputItem → NotificationItem mapping (title from name+subtitle, productId fallback, applyUrl=url, ingredients-preferred-over-description, category fallback, changeType passthrough, salaryMax=null)
Total: 145 tests (was 134)

Refactor

Extracted toNotificationItem as a named export from src/main.ts (was an inline closure in dispatchNotifications).

0.4.1 — GPSR / product safety

manufacturer — full EU GPSR (General Product Safety Regulation) responsible-party block on every Algolia-sourced product: {name, street, postalCode, city, countryCode, email, website, securityAlert}. Required by EU regulation since 13 Dec 2024 for many product categories (cosmetics, electricals, toys, food contact). Null when REMA hasn't filed compliance data for the SKU.
warnings[] — array of REMA-filed warnings on the product (separate from hazardStatements, which is the CLP regulatory wording).
itemDisclaimer — free-form supplier disclaimer text.
All three fields are populated in basket + search modes (Algolia); null in browse mode (REST).

0.4.0 — Ingredients, nutrition, barcodes + audit-2 fixes

New fields (basket + search modes)

Per-product detail is now fetched from the Algolia index, which exposes ~30 fields the public REST endpoint does not. The following are added to every record in basket and search modes (null in browse mode, where REST pagination remains for efficiency):

ingredients — full declaration text (e.g. "HVEDEMEL, vand, sukker …") with allergens pre-stripped of Algolia's <b> markup
nutrition — structured array of {name, value, sort} (energy, fat, protein, salt, …)
barcodes — EAN/UPC array for inventory matching
countryOfOrigin — ISO country code
categoryId / categoryName — finer than department (e.g. "Mælk m.v." under department "Mejeri")

Audit-2 fixes

EXPIRED records now reach notifications. Synthetic EXPIRED records are pushed to both toPush AND pushedItems, so the notification dispatcher actually sees them.
Notification adapter. dispatchNotifications now maps grocery OutputItem (name, url, price, currency, departmentName, ingredients) → NotificationItem (title, applyUrl, salaryMin/Max/Currency, location, description, category). Previously Telegram/Slack/Discord/WhatsApp would render "(untitled)" with no clickable link.
Coverage proof uses fetchCap, not raw maxResults. In incremental mode fetchCap is infinity so EXPIRED detection is no longer blocked by collecting more items than maxResults. Earlier behaviour silently disabled EXPIRED on any browse/search run with >200 hits.
Search-mode truncation honoured. searchProductIdsWithMeta() returns {ids, truncated, totalHits}; coverage is marked incomplete when Algolia reports more matches than fetched, preventing false EXPIRED classifications on broad queries.
EXPIRED records emit all fields. Synthetic records now include every OutputItem field (with null defaults) per the README "all fields always present" contract — departmentId, priceHistory, images, labels, nutrition, etc.
EXPIRED exempt from maxResults cap. Lifecycle events are rare and high-value — silently truncating them by the same cap that limits NEW/UPDATED would defeat the user's opt-in.
Actor.fail on lost lock → graceful exit + RMA-0050 typed error. No more crash-and-retry on the rare concurrent-run takeover case.
Bogus-productUrls guardrail. productUrls that don't parse to a single ID now exit with RMA-0060 instead of falling through to a full-catalog scrape (cost-protection).

Versioning

Synced package version and CHANGELOG so release tracking is unambiguous.

0.3.4 — Basket mode requires explicit stateKey

Basket mode + Incremental Mode now hard-requires stateKey. Previously omitting stateKey would auto-generate one — but after the v0.3.3 fix, that auto-key collapsed every basket run into a single storage slot. Two users tracking different baskets without setting stateKey would silently corrupt each other's "active set" and EXPIRED signal. The actor now exits with a clear error (RMA-0040) when basket mode + incrementalMode is used without an explicit stateKey, suggesting "my-rema-basket" as a starter name. Search / Browse mode are unaffected — they still auto-generate stateKey from their natural scope dimensions.
Input schema descriptions for productIds and stateKey updated to reflect the requirement.

0.3.3 — Stable basket state across content changes

Incremental scope is now mode-specific. Previously, adding or removing a single basket item silently created a fresh state slot and wiped all incremental history. Scope fields are now mode-aware: [campaignsOnly] for basket, [query, campaignsOnly] for search, [departmentIds, campaignsOnly] for browse. A user maintaining a basket can now add or remove items without losing the price + delisting history for the products they kept.
Auto-generated stateKey in basket mode no longer keys off productIds either — same rationale.
New tests lock the invariants: basket [a] and [a,b,c] produce identical state scope; query="mælk" and query="ost" produce different ones; departmentIds: [20] and [30] produce different ones.
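
A sketch of mode-aware scoping that satisfies those three invariants. stateScopeSketch() and the way the scope is serialized are assumptions of this example; only the per-mode field lists come from the notes above.

```typescript
type ScrapeMode = "basket" | "search" | "browse";

interface ScopeInput {
  productIds?: number[];
  query?: string;
  departmentIds?: number[];
  campaignsOnly?: boolean;
}

// Only these input fields contribute to the incremental state scope.
const SCOPE_FIELDS: Record<ScrapeMode, Array<keyof ScopeInput>> = {
  basket: ["campaignsOnly"],
  search: ["query", "campaignsOnly"],
  browse: ["departmentIds", "campaignsOnly"],
};

function stateScopeSketch(mode: ScrapeMode, input: ScopeInput): string {
  const scoped = SCOPE_FIELDS[mode].map((field) => {
    const value = input[field];
    return [field, value === undefined ? null : value];
  });
  return JSON.stringify([mode, scoped]);
}

// Basket contents are deliberately not part of the scope.
const smallBasketScope = stateScopeSketch("basket", { productIds: [1] });
const largeBasketScope = stateScopeSketch("basket", { productIds: [1, 2, 3] });
const milkScope = stateScopeSketch("search", { query: "mælk" });
const cheeseScope = stateScopeSketch("search", { query: "ost" });
```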

0.3.2 — Basket-delisting alerts

Basket monitor example now sets emitExpired: true so users who copy the "Try it" template actually receive delisting alerts in their dataset.
selectItemsToNotify now includes EXPIRED in the change set. Previously when notifyOnlyChanges: true + incrementalMode: true was on, EXPIRED records reached the dataset but never reached Telegram / Slack / Discord / WhatsApp / webhook. For basket / inventory monitoring "this product is gone" is one of the most important alerts to fire — silent dropping was a bug.

0.3.1 — Fix: EXPIRED detection actually emits records

Audit-driven fixes for incremental mode.

EXPIRED records now emit. The prior implementation classified expired products correctly but the emit loop only iterated over current items, so EXPIRED records never reached the dataset. Pass 2 now synthesizes a minimal record (productId, url, source, contentHash, lifecycle fields, changeType: "EXPIRED") for every classified expiry. Gated by emitExpired: true.
Coverage proof is now wired. findExpiredJobs() short-circuits on incomplete coverage, and coverage was never marked complete, so EXPIRED detection was permanently disabled. Coverage is now marked complete when the fetch wasn't truncated by maxResults: true for browse without a cap, search up to its hit count, and basket mode (every requested ID was attempted).
Basket 404 → EXPIRED in incremental mode. Missing basket IDs that exist in prior state are now caught by findExpiredJobs() and emitted as EXPIRED records (previously only logged).
firstSeenAt / lastSeenAt / expiredAt now merged into output. README claimed these fields existed on incremental records; they are now actually populated from the classification record.
Footer pricing fixed. Incremental run footer logged $0.0005/result despite actor pricing being $0.004/result — corrected.
Doc cleanup. INPUT_COVERAGE.md previously listed query as SKIP after v0.3 implemented keyword search; now marked IMPLEMENTED. README query examples no longer show the generator's "software engineer" default (added prefill: "mælk" to the schema).

0.3.0 — Input aliases, graceful errors, diagnostics

Input aliases. Common synonyms now resolve to canonical fields: q / search / searchKey / keyword / keywords → query; ids / id → productIds; url / urls → productUrls; category / categories → departmentIds; limit / max → maxResults; tilbud / onSale → campaignsOnly. Case- and punctuation-insensitive. Array aliases merge with their canonical (e.g. productIds + ids are concatenated, not overwritten). Unknown keys log a warning instead of failing the run.
Graceful error handling. Failures that previously called Actor.fail now log a warning and exit with 0 results: department listing failure, no-matching-departments, search-API failure. State-lock conflict already exited gracefully. Every graceful exit emits a typed error event for ops triage.
Run footer. Non-incremental runs now log the same cost/footer summary that incremental mode already did (price-per-result, emitted count, dataset URL).
Diagnostics. run.start (sanitized input + alias telemetry), run.complete (emitted count + classification), and typed errors (RMA-0010 departments-fetch-failed, RMA-0011 department-filter-no-match, RMA-0020 search-failed, RMA-0030 state-lock-conflict, RMA-0999 unexpected-failure) are posted to the org diag sink when OPS_INGEST_URL + OPS_SECRET are configured. Sink failures never bubble up to the run, and diagnostics are a silent no-op in local development.
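
The alias rules above can be sketched roughly as follows. This is an illustrative Python version, not the actor's implementation: the alias table is copied from the 0.3.0 notes, while `resolve_aliases`, the normalization regex, and the choice that a canonical scalar key wins over its alias are assumptions.

```python
import re
from typing import Any, Dict, List, Tuple

# Alias -> canonical field, as listed in the 0.3.0 notes (keys pre-normalized).
ALIASES = {
    "q": "query", "search": "query", "searchkey": "query",
    "keyword": "query", "keywords": "query",
    "ids": "productIds", "id": "productIds",
    "url": "productUrls", "urls": "productUrls",
    "category": "departmentIds", "categories": "departmentIds",
    "limit": "maxResults", "max": "maxResults",
    "tilbud": "campaignsOnly", "onsale": "campaignsOnly",
}
CANONICAL = ("query", "productIds", "productUrls",
             "departmentIds", "maxResults", "campaignsOnly")
ARRAY_FIELDS = {"productIds", "productUrls", "departmentIds"}


def _norm(key: str) -> str:
    # Case- and punctuation-insensitive: "Search_Key" -> "searchkey".
    return re.sub(r"[^a-z0-9]", "", key.lower())


def resolve_aliases(raw: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Map aliased input keys to canonical fields; collect warnings for unknowns."""
    lookup = {_norm(name): name for name in CANONICAL}
    lookup.update(ALIASES)
    resolved: Dict[str, Any] = {}
    warnings: List[str] = []
    for key, value in raw.items():
        target = lookup.get(_norm(key))
        if target is None:
            # Unknown keys warn instead of failing the run.
            warnings.append(f"unknown input key: {key}")
        elif target in ARRAY_FIELDS:
            # Array aliases are concatenated with the canonical field.
            items = value if isinstance(value, list) else [value]
            resolved.setdefault(target, []).extend(items)
        elif _norm(key) == _norm(target) or target not in resolved:
            # Assumption: a canonical scalar key overrides its aliases.
            resolved[target] = value
    return resolved, warnings
```

For example, `resolve_aliases({"Q": "mælk", "productIds": ["1"], "ids": ["2"]})` yields a `query` of `"mælk"` and a merged `productIds` of `["1", "2"]`.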

0.2.0 — Basket tracking + keyword search

Basket monitor — new productIds and productUrls inputs. Paste a weekly shopping list (IDs or browser URLs); combine with Incremental Mode + notifyOnlyChanges for weekly tilbud alerts at ~$0.09/run for a 20-item basket.
Keyword search — new query input ("mælk", "økologisk", …) searches the full catalog and returns matching products with the same output schema as browse/basket modes.
404 on a basket product now classifies as EXPIRED in incremental mode — surfaces delisted SKUs automatically.
Source label. The source field changed from shop.rema1000.dk to rema1000.dk (the brand domain). Product url fields still point at the working storefront subdomain.
New "Basket monitor" + "Search Oksekød" examples in the actor input.

0.1.0 — Initial release

Scrape Danish grocery products and prices from shop.rema1000.dk
15 departments with optional departmentIds filter (Brød, Frugt & grønt, Kød/fisk/fjerkræ, Køl, Frost, Mejeri, Ost, Kolonial, Drikkevarer, Husholdning, Baby, Pleje, Slik, Kiosk, Nemt & hurtigt)
campaignsOnly toggle to surface only tilbud / advertised items
Full price detail: price, campaign / advertised flags, validity window, deposit (pant), compare-unit price (per kg / ltr / stk), maxQuantity per-customer caps
Product metadata: name, size + brand subtitle, labels (organic / REMA-brand / sugar-free / etc.), images (small / medium / large), temperature zone, weight-item / batch-item / self-scale flags, hazard precaution statements
Compact mode for AI-agent workflows (shopper-essentials only)
Incremental mode with NEW / UPDATED / UNCHANGED / EXPIRED classification keyed on stable productId
Notifications: Telegram, Discord, Slack, WhatsApp Cloud API, generic webhook
Output schema with two dataset views: Overview + Campaigns
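
As a rough illustration of classification keyed on a stable productId, the sketch below compares a fingerprint of each product against the one stored from the previous run. It is a simplified Python sketch under stated assumptions, not the actor's logic: `TRACKED_FIELDS`, `fingerprint`, and `classify` are hypothetical, and the real set of fields that trigger UPDATED is not documented here.

```python
import hashlib
import json
from typing import Any, Dict, Iterable

# Hypothetical choice of fields whose change counts as an update.
TRACKED_FIELDS = ("price", "campaign")


def fingerprint(item: Dict[str, Any]) -> str:
    """Hash only the tracked fields so unrelated changes do not fire UPDATED."""
    payload = {field: item.get(field) for field in TRACKED_FIELDS}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def classify(prior: Dict[str, str], items: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Map productId -> NEW / UPDATED / UNCHANGED / EXPIRED.

    `prior` maps productId to the fingerprint stored by the previous run.
    """
    result: Dict[str, str] = {}
    for item in items:
        pid = str(item["productId"])
        if pid not in prior:
            result[pid] = "NEW"
        elif prior[pid] != fingerprint(item):
            result[pid] = "UPDATED"
        else:
            result[pid] = "UNCHANGED"
    # Prior IDs missing from this run; only meaningful if the run saw everything.
    for pid in prior:
        result.setdefault(pid, "EXPIRED")
    return result
```

Keying on productId rather than name or URL keeps the history intact when a product is renamed or its storefront URL changes.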

