omitNulls now defaults to true. Output rows drop fields whose value is null, an empty string, an empty array, or an empty object. Set omitNulls: false if you need a stable schema where every row has every documented key.
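The following is a minimal sketch of the pruning rule described above, not the scraper's actual code. The names `isEmptyValue` and `applyOmitNulls` are hypothetical, and treating `undefined` like null is an assumption made here for convenience.

```typescript
// Hypothetical helpers illustrating omitNulls semantics; the real transform may differ.
type Row = Record<string, unknown>;

function isEmptyValue(value: unknown): boolean {
  // Assumption: undefined is treated the same as null.
  if (value === null || value === undefined) return true;
  if (typeof value === "string") return value === "";
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === "object") return Object.keys(value as object).length === 0;
  // Numbers and booleans (including 0 and false) are kept in this sketch.
  return false;
}

function applyOmitNulls(row: Row, omitNulls = true): Row {
  if (!omitNulls) return row; // stable schema: every key is kept
  const out: Row = {};
  for (const [key, value] of Object.entries(row)) {
    if (!isEmptyValue(value)) out[key] = value;
  }
  return out;
}
```

With the default, a row such as `{ title: "Sofa", brand: null, images: [] }` would keep only `title`; with `omitNulls: false` the row passes through unchanged.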
Generic-suffix matching (validated on the DBA-equivalent Schibsted Aurora platform across 1320 listings × 320 sub-categories) promotes category-prefixed extras to typed columns:
brand matches *_brand (lego_brand, shoes_brand, mobile_brand, …)
model matches *_model
size matches *_size (shoe_size, women_clothing_size, mobile_memory_size, …)
subType matches *_type excluding ad_type (shoe_type, hifiparts_type, women_clothing_dress_type, …)
color from colour / color extras
material matches *_material (bags_purses_material, jewellery_main_material)
gender matches gender or *_gender
theme matches *_theme (e.g. lego_theme)
designer for art/furniture listings
age matches *_age (pet age)
season for seasonal items (Påske, Jul, …)
platform for gaming (games_group, *_platform)
depth, height, width, length (cm — furniture)
coordinateAccuracy populated from SERP coordinates.accuracy.
mapImageUrl extracted from recommerce detail JSON (itemData.location.position.mapImage).
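As a rough illustration of the suffix matching listed above, the sketch below promotes a few category-prefixed extras to typed columns. The rule table, the `promoteExtras` name, and the first-match-wins behaviour are assumptions for illustration; only the suffixes and the `ad_type` exclusion come from the notes above.

```typescript
// Hypothetical rule table covering a subset of the documented suffixes.
type Extras = Record<string, string>;

interface SuffixRule {
  column: string;     // typed output column
  suffix: string;     // key suffix to match
  exclude?: string[]; // exact keys that must not match
}

const SUFFIX_RULES: SuffixRule[] = [
  { column: "brand", suffix: "_brand" },
  { column: "model", suffix: "_model" },
  { column: "size", suffix: "_size" },
  { column: "subType", suffix: "_type", exclude: ["ad_type"] },
  { column: "material", suffix: "_material" },
  { column: "theme", suffix: "_theme" },
];

function promoteExtras(extras: Extras): Record<string, string> {
  const typed: Record<string, string> = {};
  for (const [key, value] of Object.entries(extras)) {
    for (const rule of SUFFIX_RULES) {
      if (rule.exclude?.includes(key)) continue;
      // Assumption: the first extra that matches a column wins.
      if (key.endsWith(rule.suffix) && !(rule.column in typed)) {
        typed[rule.column] = value;
      }
    }
  }
  return typed;
}
```

For example, `promoteExtras({ lego_brand: "LEGO", lego_theme: "Star Wars", ad_type: "sell" })` would yield `brand` and `theme` columns and no `subType`, since `ad_type` is excluded.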
- Added: fiksFerdig input flag, which maps to the shipping_exists=true URL param. Restricts SERP to listings with Fiks ferdig (transactable shipping). ~86% of recommerce listings qualify (verified live).
- 15 new unit tests in tests/transform.test.ts covering extras matching, length parsing, and omitNulls semantics.
- Fixed: priceMin / priceMax filters now actually narrow results to the requested price range. Previously they were accepted but ignored, and runs returned the full unfiltered list.
- Added: Live canary test suite — verifies each filter actually narrows results against an unfiltered baseline on every release.
- Added: descriptionHtml, descriptionMarkdown output fields (triple-format descriptions for RAG/LLM pipelines)
- Added: contentHash output field (stable hash over content-identifying fields, used for change detection)
- Added: cross-run repost detection (isRepost, repostOfId, repostDetectedAt)
- Added: skipReposts input to exclude detected reposts from output
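One way the contentHash and repost fields above could fit together is sketched below. Everything beyond the field names is an assumption: which fields count as content-identifying (here title, price, description), the hash function (a simple FNV-1a-style hash over UTF-16 code units, chosen only to keep the example dependency-free), and the rule that a repost is a listing whose hash was first seen under a different id.

```typescript
// Illustrative only; the scraper's real hashing and repost logic are not documented here.
interface Listing {
  id: string;
  title: string;
  price: number | null;
  description: string;
}

interface RepostInfo {
  contentHash: string;
  isRepost: boolean;
  repostOfId: string | null;
  repostDetectedAt: string | null;
}

// FNV-1a-style 32-bit hash over UTF-16 code units (stand-in for the real hash).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

function contentHash(listing: Listing): string {
  // Normalise case and surrounding whitespace so cosmetic edits keep the hash stable.
  const parts = [listing.title, String(listing.price ?? ""), listing.description]
    .map((part) => part.trim().toLowerCase());
  return fnv1a(parts.join("\u0000"));
}

// `seen` maps contentHash -> id of the first listing observed with that hash.
// Persisting it between runs is what would make the detection cross-run.
function annotate(listing: Listing, seen: Map<string, string>, now: Date): Listing & RepostInfo {
  const hash = contentHash(listing);
  const firstId = seen.get(hash);
  const repostOfId = firstId !== undefined && firstId !== listing.id ? firstId : null;
  if (firstId === undefined) seen.set(hash, listing.id);
  return {
    ...listing,
    contentHash: hash,
    isRepost: repostOfId !== null,
    repostOfId,
    repostDetectedAt: repostOfId !== null ? now.toISOString() : null,
  };
}

// skipReposts would then simply filter flagged rows out of the output.
function filterReposts<T extends RepostInfo>(rows: T[], skipReposts: boolean): T[] {
  return skipReposts ? rows.filter((row) => !row.isRepost) : rows;
}
```

In this sketch the same listing id seen again is not a repost; only identical content under a new id is flagged.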
All notable changes to FINN.no Torget Scraper are documented here.
- Initial release: direct HTTP fetch + React Query hydration extraction from FINN.no Torget
- 35 structured output fields per listing (title, price, condition, brand, model, category path, GPS coordinates, images, seller info, shipping, buy-now, attributes)
- SERP pagination (53 listings/page) with maxResults cap
- Detail enrichment (includeDetails): description, full category path, zip code, attributes
- Compact output mode for AI-agent/MCP workflows with descriptionMaxLength truncation
- Incremental mode: track new/changed listings across runs with scoped stateKey
- Filters: category (11 categories), condition, region (16 Norwegian counties), price range, seller type, sort order
- PAY_PER_EVENT pricing: actor-start ($0.01) + result ($0.002) events
- No proxy, no external services — direct fetch only
- 97 integration tests passing (vitest)
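For a back-of-the-envelope estimate based on the figures above (53 listings per SERP page, $0.01 per actor start, $0.002 per result), a sketch like the following may help. The `estimateRun` name is hypothetical, and it assumes the run returns exactly maxResults rows and that no other events are billed.

```typescript
// Figures taken from the notes above; the estimate itself is an assumption-laden sketch.
const LISTINGS_PER_PAGE = 53;
const ACTOR_START_USD = 0.01;
const RESULT_USD = 0.002;

function estimateRun(maxResults: number): { serpPages: number; costUsd: number } {
  // Pages needed to reach the cap, assuming every page is full.
  const serpPages = Math.ceil(maxResults / LISTINGS_PER_PAGE);
  const costUsd = ACTOR_START_USD + maxResults * RESULT_USD;
  // Round to 4 decimals to hide floating-point noise.
  return { serpPages, costUsd: Number(costUsd.toFixed(4)) };
}
```

Under these assumptions, a run capped at 500 results would fetch 10 SERP pages and cost about $1.01.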