Export properties, wholesalers, pricing, locations, and deal data from InvestorLift. Get a clean dataset ready for CRM, outreach, comps, or deal sourcing in a single run. Turn InvestorLift into a structured deal pipeline.
One run, one dataset, thousands of deals.
Property Detail API: When enriching, fetches /api/properties/{id} first for structured JSON (account_id, account.title, condition, description, main_image); falls back to HTML parsing if the API returns 403/404
Seller rating: wholesaler_rating and wholesaler_review_count from /api/account-stats/{accountId}/rating. Fetched when account_id is available (from API or page). Cached per account to avoid duplicate requests
Wholesaler from API: apiRowToItem now uses API columns (account_title, wholesaler_company, wholesaler_name, account_id) when the properties list API returns them
account_id: Output field for seller account ID. Extracted from Nuxt deal data, profile-photos URL, or API
enrichWithDetails in input schema: Checkbox visible in Apify Console. Required for wholesaler columns to be populated
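The API-first enrichment with HTML fallback and the per-account rating cache above can be sketched as follows. This is a minimal illustration, not the actor's actual code: fetchApi, fetchHtml, and fetchRating are hypothetical injected fetchers standing in for the real got-scraping calls.

```javascript
// Sketch: prefer the structured Property Detail API, fall back to
// HTML parsing on 403/404 (fetchApi/fetchHtml are hypothetical stand-ins).
async function fetchDealDetail(id, { fetchApi, fetchHtml }) {
  try {
    return { source: 'api', data: await fetchApi(`/api/properties/${id}`) };
  } catch (err) {
    // Fall back to the deal page + Cheerio-style parsing on 403/404
    if (err.statusCode === 403 || err.statusCode === 404) {
      return { source: 'html', data: await fetchHtml(id) };
    }
    throw err;
  }
}

// Per-account cache for /api/account-stats/{accountId}/rating:
// storing the promise means concurrent lookups share a single request.
const ratingCache = new Map();
function getRating(accountId, fetchRating) {
  if (!ratingCache.has(accountId)) {
    ratingCache.set(accountId, fetchRating(accountId));
  }
  return ratingCache.get(accountId);
}
```

Caching the promise (rather than the resolved value) is what prevents duplicate requests when many deals from the same wholesaler are enriched in parallel.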
Changed
Full details scope: Now enriches Historical and All modes (Phase 2), not just Active. Wholesaler, description, lot_size, condition populated for historical deals when enabled
fetchDealDetail: Tries Property Detail API first, fallback to HTML page fetch + Cheerio parsing
Fixed
Wholesaler columns empty: Root causes were (1) the parser ignored API columns and (2) Phase 2 never enriched. Both fixed
[1.3.0] - 2026-03-01
Added
Minimalist input UI: Step-based layout (What do you want? → Data quality → Proxy). Airbnb Pro Host style
Auto-detect range: In Historical/All modes, start and end IDs are derived from the API; no manual Start/Stop IDs needed
Historical concurrency: 256 workers by default, up to 384
Proxy sticky per worker: One proxy per worker for connection reuse
API timeout + retry: 30s timeout, 2 retries on API fetch
429 handling: Backoff on rate limit (5s, 10s) before retry
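The retry and 429 backoff behavior above (2 retries; 5s then 10s on rate limit) can be sketched like this. It is an illustration of the stated policy, not the actor's real implementation: sleep is injected so the schedule is testable without waiting, and the 1s pause for non-429 errors is an assumption.

```javascript
// Sketch of the retry policy: up to `retries` extra attempts,
// with longer backoff (5s, then 10s) when the API answers 429.
async function fetchWithRetry(doFetch, { retries = 2, sleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await doFetch();
    } catch (err) {
      if (attempt >= retries) throw err; // retries exhausted
      // Rate limited: 5s before the first retry, 10s before the second;
      // other errors retry after a short fixed pause (assumed 1s here)
      const delayMs = err.statusCode === 429 ? 5000 * (attempt + 1) : 1000;
      await sleep(delayMs);
    }
  }
}
```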
Changed
Puppeteer removed: Switched to HTTP (got-scraping) + Cheerio. Faster, cheaper
Checkpoint: Serialized writes (single writer) to avoid Apify KVS 429s; saved every 2000 IDs
Full details: Clarified that the option only affects Active mode; Historical/Specific modes already fetch full pages
README: Aligned with minimalist UI, auto-detect range, resume steps
Input schema: Removed from UI — useMaxIdFromApi, historicalIdStart, historicalIdEnd, detailConcurrency, historicalConcurrency, excludeIds (still supported via API)
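The single-writer checkpoint scheme above can be sketched as a promise chain: every save is appended to one queue, so only a single key-value-store write is ever in flight. setValue here is a stand-in for the real KVS client call, not the actor's actual function.

```javascript
// Sketch: serialize checkpoint writes through one promise chain so
// parallel workers never issue overlapping KVS set requests.
function makeCheckpointWriter(setValue) {
  let queue = Promise.resolve();
  return function saveCheckpoint(state) {
    queue = queue.then(() => setValue('CHECKPOINT', state));
    return queue; // callers may await, but writes never overlap
  };
}
```

Chaining onto one promise is a deliberately simple alternative to a lock: later saves also preserve order, so the checkpoint on disk always reflects the most recent call.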
Fixed
Too many parallel set requests: Checkpoint writes serialized; single writer
Proxy pool init: Parallel newUrl() calls for faster startup
[1.2.0] - 2026-03-01
Added
CHANGELOG: Version history for the Store page
Modular architecture: Code split into api.js, parser.js, scraper.js, and utils.js for maintainability and testability
Changed
README: Rewritten in English with orias-scraper-style structure (input/output examples, use cases, local development, troubleshooting)
Code organization: Main orchestration in main.js; API fetch logic in api.js; page parsing in parser.js; HTTP/browser scraping in scraper.js; date/checkpoint utilities in utils.js
Removed
Dead code: Unreachable HTML pagination branch (scrape mode is always active_only, historical, or date_range; all paths use the API or the historical loop)
Unused constants: MS_PER_DEAL, batchSize
Unused functions: fetchPage, extractDealLinks (legacy marketplace HTML pagination)
Unused input parsing: maxPages, startPage, currentPage for non-existent mode
[1.1.0] - 2026-02
Added
Enrichment mode: enrichWithDetails visits each deal page with Puppeteer to collect description, condition, lot_size, wholesaler, img_url
API-first: Active and date_range modes use marketplace API for fast bulk fetch
Proxy-chain: Anonymized proxy for Puppeteer when using Apify residential proxy (fixes ERR_INVALID_AUTH)