Locale-aware price parser. Brazilian/European thousands separator (.) was being parsed as a decimal (R$2.800 → 2.8 instead of 2800). New parser handles US ($1,234.56), EU (€1.234,56), BRL (R$2.800), JPY (¥29,000), and multi-grouped (1.234.567) formats with 21/21 unit tests passing.
startUrls hard failure. When every user-provided startUrls entry is invalid or non-Craigslist, the actor now fails fast with an actionable status message instead of silently falling through to scrape the default city: 'newyork'.
0.2 — 2026-05-13
Greatly improved typed attribute extraction. Cars+trucks now parse year, make, model, make_model, odometer (numeric), fuel, transmission, drive, cylinders, title_status, body_type, vin, condition, paint_color. Housing now parses bedrooms, bathrooms, sqft (numeric, from 2BR / 1Ba - 700ft² block AND from JSON-LD Apartment schema), plus pets_allowed, smoking_allowed, cats_ok, dogs_ok, wheelchaccess, airconditioning, ev_charging, rent_period, housing_type, laundry, parking, available, furnished. Jobs continue to expose compensation, employment_type, job_title.
Address now prefers the richer JSON-LD Apartment.address (street, city, region, postal, country) over the bare .mapaddress text, and strips Craigslist's literal (google map) placeholder.
Geo falls back to JSON-LD latitude/longitude when the #map div is absent.
Multi-city auto-fanout via topUSMetros (top 25 US metros) and topGlobalMetros (top 50 worldwide) presets, or arbitrary cities list.
Cross-city repost deduplication via post-body fingerprint (SHA-256 over normalised title + body + price + first image hostname).
Two-stage scraping: search-result cards (cheap listing-scraped event @ $0.005) and optional full post detail (listing-detailed event @ $0.008) for description, full image gallery, attributes, contact reply URL.
Optional public contact extraction (extractContacts) — emails/phones present in the post body or reply URL only; no logged-in scraping, no reply-form bypass.
Residential proxy + Crawlee session pool (max 30 uses/session, retire on 403/429), circuit breaker at >50% error rate over rolling 20 requests.
MCP-ready: 5-part tool description, 4-sentence input field descriptions, flat ≤500-token dataset items, _warnings and _summary sentinel records on partial success.