Scrape Sephora products across 20 storefronts (US, Canada, 9 EU markets, 10 APAC countries) through one unified Python actor. Extract prices, variants, ratings, and catalog details via official mobile APIs with TLS fingerprint impersonation, OAuth2/guest-token auth, and per-market session isolation.
Unified Crawlee HttpCrawler across US + EU + SEA, with synthetic-URL masking, a global circuit breaker, per-market pre-flight, a declarative retry policy, serialized EU guest-token refresh, and EU product-ID case normalization. The maxProducts cap is now honored in EU pagination. CZ, GR, and PPE instrumentation removed; PT is now served via the Spain backend with the pt-PT locale.
NEW: variants[].wishlisted boolean field (SEA-only; analog of US lovesCount).
Feat:
Multi-market support — US + Canada + 9 EU countries + 10 APAC countries in one actor.
Auto-detect market from startUrls hostname; no input changes required for existing US users.
Optional market and locale override inputs for edge cases (bare IDs, region mismatches).
Per-market observability counters persisted to the default KV store as run-summary.
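The hostname auto-detection and override behavior described above could look something like this sketch. The mapping table is illustrative only and covers three hostnames; the actor's real table and the exact market identifiers are not shown in this changelog, so the "us", "eu", and "sea" keys are assumed from the module names.

```python
from typing import Optional
from urllib.parse import urlparse

# Illustrative mapping only (assumption): hostname suffix -> market module key.
HOST_SUFFIX_TO_MARKET = {
    "sephora.com": "us",
    "sephora.fr": "eu",
    "sephora.sg": "sea",
}


def detect_market(start_url: str, market_override: Optional[str] = None) -> str:
    """Return the market for a start URL, preferring an explicit override."""
    if market_override:
        return market_override.lower()
    # Bare product IDs have no hostname, so they fall through to the error below.
    host = (urlparse(start_url).hostname or "").lower()
    for suffix, market in HOST_SUFFIX_TO_MARKET.items():
        if host == suffix or host.endswith("." + suffix):
            return market
    raise ValueError(f"Cannot infer market from {start_url!r}; set the market input")
```

Under these assumptions, a bare ID such as "P12345" resolves only when the market input is supplied, which is the edge case the override inputs exist for.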
Architecture:
Refactored src/main.py into a lightweight dispatcher. Per-region logic lives in src/markets/{us,eu,sea}.py.
EU and SEA ports use curl_cffi for upstream compatibility. US keeps Crawlee's HttpCrawler with the existing session-pool auth flow unchanged.
Strict market-module isolation — per-region auth state cannot cross-contaminate.
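As a rough picture of the dispatcher layout, the sketch below routes URLs to per-market handlers and gives each handler its own state object. Everything here is hypothetical: MarketState, the stub handlers, and dispatch stand in for whatever src/main.py and src/markets/{us,eu,sea}.py actually define.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class MarketState:
    """Auth/session state owned by exactly one market module (hypothetical)."""
    market: str
    tokens: dict = field(default_factory=dict)


Handler = Callable[[MarketState, list], Awaitable[list]]


async def _stub_handler(state: MarketState, urls: list) -> list:
    # Stand-in for a real market module; tags each record with its market.
    return [{"url": url, "market": state.market} for url in urls]


# In the real layout these would be imported from the per-region modules.
HANDLERS: dict = {"us": _stub_handler, "eu": _stub_handler, "sea": _stub_handler}


async def dispatch(urls_by_market: dict) -> list:
    """Run each market's handler with a fresh, unshared state object."""
    results = []
    for market, urls in urls_by_market.items():
        handler = HANDLERS.get(market)
        if handler is None:
            raise ValueError(f"Unknown market: {market!r}")
        state = MarketState(market)  # never reused across markets
        results.extend(await handler(state, urls))
    return results
```

Because the dispatcher creates one state object per market and passes it only to that market's handler, a token stored by the EU handler has no path into the US or SEA handlers.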
Compatibility:
Existing US run configs work unchanged. Pre-v2.0 output fields are preserved; only the additive market key is new.
Standalone autofacts/sephora-eu-scraper and autofacts/sephora-nz-scraper listings remain live but are pinned to their current version. New bug fixes and features land in this repo only.
Deps:
Added: curl_cffi[async] >= 0.7, lxml >= 5.0.
10/03/2026
Feat: add sentiments field to product output, merging sentiment summary and sentiment items