Scrape dba.dk — Denmark's largest classifieds platform. Covers both marketplace and vehicle listings in one actor with mode-specific output schemas and incremental change tracking.
accelerationSec (number) — parsed from 0-100 km/t / 0-100 km/h (e.g. "7,4 sek" → 7.4).
drivingRange (number, existing field) — now overlaid from the detail-page Rækkevidde spec when present. The SERP-supplied value is still used as fallback. Detail-page values are typically more reliable than the API's driving_range field for older or imported EVs.
Previously these required parsing vehicleSpecs["Batterikapacitet"] etc. as raw strings — typed fields make them sortable, filterable, and ML-ready.
Dataset schema "all"-view exposes the new fields. No input changes; no breaking changes to existing fields.
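The typed-field parsing and the detail-over-SERP overlay described above can be sketched as follows. This is a minimal illustration, not the actor's actual parser: `parseDanishNumber` and `overlayDrivingRange` are hypothetical names, and the sketch assumes spec values carry no thousands separator.

```typescript
// Hypothetical helper: the actor's real parser is not shown in this changelog.
// Assumption: the value has no thousands separator ("1.234 km" would be read as 1.234).
function parseDanishNumber(raw: string): number | null {
  // Take the first numeric token, allowing a comma or a period as the decimal separator.
  const match = raw.match(/-?\d+(?:[.,]\d+)?/);
  if (!match) return null;
  const value = Number(match[0].replace(",", "."));
  return Number.isFinite(value) ? value : null;
}

// Prefer the detail-page spec value; fall back to the SERP value when it is missing or unparseable.
function overlayDrivingRange(detailSpec: string | undefined, serpValue: number | null): number | null {
  const fromDetail = detailSpec === undefined ? null : parseDanishNumber(detailSpec);
  return fromDetail ?? serpValue;
}

const accelerationSec = parseDanishNumber("7,4 sek"); // 7.4
const drivingRange = overlayDrivingRange(undefined, 420); // 420, the SERP fallback
```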
0.3.0 — 2026-05-12
Behavior change
omitNulls now defaults to true. Output rows drop any field whose value is null, an empty string, an empty array, or an empty object. Set omitNulls: false if you need a stable schema where every row has every documented key (the prior behavior). This affects every row in every run.
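For consumers who want to reproduce or reason about the new default, the dropping rule can be sketched like this. The function names are hypothetical and the actor's real implementation may differ, for example in how it treats nested objects. Treating undefined like null is an assumption.

```typescript
type Row = Record<string, unknown>;

// True for the four "empty" cases listed above.
// Assumption: undefined is treated the same as null.
function isEmptyValue(value: unknown): boolean {
  if (value === null || value === undefined) return true;
  if (value === "") return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === "object") return Object.keys(value as object).length === 0;
  return false;
}

// Shallow filter: only top-level keys are inspected in this sketch.
function applyOmitNulls(row: Row, omitNulls = true): Row {
  if (!omitNulls) return row;
  const out: Row = {};
  for (const key of Object.keys(row)) {
    if (!isEmptyValue(row[key])) out[key] = row[key];
  }
  return out;
}
```

Note that 0 and false are kept, since they are real values rather than missing ones.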
Mobility (vehicle) detail page
New parser reads the <dt>/<dd> spec table, description, postnummer+by, "Sælgers kendskab" Q&A, synsrapport link, "Opdateret" timestamp, and dealer infokort (name, CVR, dealer-page URL, years on DBA, total ads, street address).
New typed fields from the spec table: bodyType, color, seats, doors, previousOwners, power (HP), engineSize (L), co2Emission (g/km), fuelConsumption (km/L), maxTowingWeight (kg), drivetrain, weight (kg), country, firstRegistrationDate, lastInspectionDate, nextInspectionDate, interiorColor, vehicleCategory, vehicleCondition, vehicleType.
color from colour/color (now shared with mobility)
material matches *_material (bags_purses_material, jewellery_main_material)
gender matches gender or *_gender
theme, designer, age, season, platform (gaming via games_group)
depth, height, width, length (cm — furniture)
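The key-matching rules in the list above (exact names such as colour/color, suffix patterns such as *_material) could be expressed as a small rule table. This is a sketch under assumptions: `EXTRA_RULES` and `mapExtras` are hypothetical names, the "first match wins" policy is an assumption, and only three of the listed fields are covered.

```typescript
interface ExtraRule {
  field: string; // typed output field name
  matches: (key: string) => boolean; // predicate over the raw attribute key
}

// Hypothetical rule table covering three of the fields listed above.
const EXTRA_RULES: ExtraRule[] = [
  { field: "color", matches: (k) => k === "colour" || k === "color" },
  { field: "material", matches: (k) => k.endsWith("_material") },
  { field: "gender", matches: (k) => k === "gender" || k.endsWith("_gender") },
];

function mapExtras(raw: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(raw)) {
    const rule = EXTRA_RULES.find((r) => r.matches(key));
    // Assumption: the first raw key that maps to a field wins; later ones are ignored.
    if (rule && !(rule.field in out)) out[rule.field] = value;
  }
  return out;
}

const extras = mapExtras({ jewellery_main_material: "Sølv", colour: "Sort", brand: "Ukendt" });
// extras is { material: "Sølv", color: "Sort" }; unmatched keys such as brand are dropped.
```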
Shared / cross-vertical
zipCode now appears on mobility output. It was previously stripped by mistake.
coordinateAccuracy populated from SERP coords on both subverticals.
mapImageUrl extracted from recommerce detail JSON.
SERP filter
Added fiksFerdig input flag — maps to shipping_exists=true URL param. Restricts marketplace SERP to listings with DBA Fiks færdig (transactable shipping via DBA Pay). ~62% of recommerce listings qualify.
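As an illustration of the flag-to-parameter mapping, the sketch below builds a search URL. The host `https://www.dba.dk` and the `buildSearchUrl` helper are assumptions made for the example. The path is the search endpoint named in the initial-release notes, and the `shipping_exists=true` parameter comes from the entry above.

```typescript
interface SearchInput {
  fiksFerdig?: boolean;
}

// Hypothetical URL builder; the base host is an assumption.
function buildSearchUrl(input: SearchInput): string {
  const url = new URL("/recommerce/forsale/search", "https://www.dba.dk");
  // The flag is only sent when enabled, so the default search stays unrestricted.
  if (input.fiksFerdig) url.searchParams.set("shipping_exists", "true");
  return url.toString();
}

const restricted = buildSearchUrl({ fiksFerdig: true });
// "https://www.dba.dk/recommerce/forsale/search?shipping_exists=true"
```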
Internal
Mixed-mode runs (paste mode with both marketplace and mobility URLs) now strip fields per row, based on the config each row originated from, instead of leaving synthetic false defaults from the wrong subvertical on each row.
Placering parser scoped to its section instead of full body text to avoid false captures from description content.
inspectionReportUrl resolved as absolute URL.
New parseLengthCm helper handles Danish comma decimals, English period decimals, m → cm conversion, and bare numbers.
26 new unit tests in tests/transform.test.ts covering extras matching, strip-set integrity, and length parsing.
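The four input shapes listed for the length helper (Danish comma decimals, English period decimals, m to cm conversion, bare numbers) can be sketched as below. This is not the actor's parseLengthCm; the sketch is named `parseLengthCmSketch` to make that clear, and it assumes bare numbers are already in centimetres and that any other unit yields null.

```typescript
// Illustrative stand-in for the helper described above, not its actual source.
function parseLengthCmSketch(raw: string): number | null {
  // Number with optional comma/period decimals, then an optional "cm" or "m" unit.
  const match = raw.match(/^\s*(\d+(?:[.,]\d+)?)\s*(cm|m)?\s*$/i);
  if (!match) return null;
  const value = Number(match[1].replace(",", "."));
  const unit = match[2]?.toLowerCase();
  // Assumption: a bare number is already in centimetres.
  const cm = unit === "m" ? value * 100 : value;
  // Round to two decimals to hide floating-point noise from the m -> cm multiplication.
  return Math.round(cm * 100) / 100;
}

parseLengthCmSketch("1,5 m"); // 150
parseLengthCmSketch("120 cm"); // 120
parseLengthCmSketch("45.5"); // 45.5
```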
0.2.1 — 2026-05-11
Fixed: Marketplace priceMin/priceMax filters now actually narrow results to the requested price range. Previously they were accepted but ignored.
Fixed: Marketplace sellerType (private / dealer) now actually filters to the requested seller type. Previously it was ignored.
Strengthened: Canary tests now compare every filtered result count against the unfiltered baseline, so a future regression where a filter is silently ignored fails the test instead of passing on "any docs returned".
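The strengthened check can be pictured as a comparison of two counts rather than a "got something back" assertion. The sketch below is an assumption about the rule's shape: `filterWasApplied` is a hypothetical name, and requiring a strictly smaller count than the baseline is one plausible criterion, not necessarily the one the canary suite uses.

```typescript
// Hypothetical criterion: a filter counts as applied only if it returns
// some results and strictly fewer than the unfiltered baseline.
function filterWasApplied(baselineCount: number, filteredCount: number): boolean {
  // A silently ignored filter returns the same count as the unfiltered search.
  return filteredCount > 0 && filteredCount < baselineCount;
}

filterWasApplied(1000, 1000); // false: the filter had no effect
filterWasApplied(1000, 240); // true: the result set was narrowed
```

A strict inequality would flag a legitimately non-narrowing filter as a failure, so a real suite would need filter values known to exclude at least some listings.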
0.2.0 — 2026-05-11
Fixed: Vehicle range filters (yearFrom/yearTo, priceMin/priceMax, mileageFrom/mileageTo) now return results reliably. Previously these combinations silently returned zero rows.
Fixed: Vehicle categorical filters (fuel, transmission, dealerSegment) now actually filter results. They previously had no effect on the returned dataset.
Improved: fuel and transmission are now dropdowns with the full set of supported values (Plug-in Benzin, Plug-in Diesel, Brint, etc.).
Added: Live canary test suite — verifies each filter actually applies on every release.
0.1.x — 2026-04-14
Added: descriptionHtml, descriptionMarkdown output fields (triple-format descriptions for RAG/LLM pipelines)
Added: contentHash output field (stable hash over content-identifying fields, used for change detection)
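The idea behind a stable content hash is to serialise a fixed, ordered subset of fields and hash only that, so volatile fields do not trigger false changes. The sketch below is an illustration only: the field list is an assumption, and it uses a dependency-free 32-bit FNV-1a variant as a stand-in, whereas a production implementation would more likely use a cryptographic digest such as SHA-256.

```typescript
// Assumption: which fields count as "content-identifying" is not stated in the changelog.
const HASH_FIELDS = ["title", "price", "description"] as const;

// 32-bit FNV-1a over UTF-16 code units; a stand-in for a real digest.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

function contentHashSketch(listing: Record<string, unknown>): string {
  // Fixed field order keeps the hash independent of key order in the input object.
  const payload = HASH_FIELDS.map((field) => [field, listing[field] ?? null]);
  return fnv1a(JSON.stringify(payload));
}
```

Fields outside the list, such as a scrape timestamp, leave the hash unchanged, which is what makes it usable for change detection.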
0.1 — 2026-03-22
Initial release
SERP scraping via DBA.dk /recommerce/forsale/search endpoint
JSON-LD structured data extraction (CollectionPage.ItemList + Product schema)
35 output fields per listing: title, price, condition, region, GPS coordinates, up to 10 images, seller info, category path, shipping/buy-now flags
Detail enrichment: full description, nested category tree, zip code, product attributes
Filters: category, condition, region, price range (min/max), seller type, sort order
Pagination with configurable maxResults (0 = unlimited, max 5000)
Incremental mode: track new/changed listings across scheduled runs via KV state
Compact output mode for AI-agent and MCP workflows
descriptionMaxLength truncation
PAY_PER_EVENT pricing: actor-start + result events
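The incremental mode listed above can be thought of as a diff between the previous run's state and the current listings. The following is a minimal sketch, assuming the state maps listing IDs to a content hash. The `diffRun` function, the state shape, and the change labels are all hypothetical, and the real actor would persist the state in a key-value store between runs.

```typescript
interface Listing {
  id: string;
  contentHash: string;
}

type Change = "new" | "changed";

// Assumed state shape: listing id -> hash seen in the previous run.
type RunState = Map<string, string>;

function diffRun(previous: RunState, current: Listing[]) {
  const nextState: RunState = new Map(previous);
  const emitted: Array<Listing & { change: Change }> = [];
  for (const listing of current) {
    const before = previous.get(listing.id);
    if (before === undefined) {
      emitted.push({ ...listing, change: "new" });
    } else if (before !== listing.contentHash) {
      emitted.push({ ...listing, change: "changed" });
    }
    // Unchanged listings are not emitted, but every seen listing refreshes the state.
    nextState.set(listing.id, listing.contentHash);
  }
  return { emitted, nextState };
}
```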