Extract official French legal announcements: company registrations, insolvency proceedings, business sales, and account filings. Filter by date range, department, or SIREN. No API key required. Ideal for daily lead generation, risk monitoring, and compliance.
All notable changes to this Actor are documented in this file.
The format follows Keep a Changelog. Actor versions use MAJOR.MINOR (Apify convention).
[1.2] - 2026-05-11
Added
Google Maps enrichment (enrichWithMaps) — optional two-phase pipeline: Phase 1 fetches all BODACC records via HTTP; Phase 2 launches Puppeteer workers to search Google Maps for each announcement and merges the business profile when a confident match is found. Added fields: maps_found, maps_name, maps_phone, maps_website, maps_category, maps_address, maps_rating, maps_reviews, maps_hours, maps_url, maps_score, maps_search_query.
mapsWorkers input — number of parallel browser tabs (1–10, default 3). Each tab runs an independent Maps search; cap to avoid OOM.
mapsMinScore input — minimum match confidence score (0–10, default 7) to accept a Maps result. Scores below threshold are discarded.
openaiApiKey input — when provided, GPT-4o-mini validates ambiguous matches and returns a numeric confidence score (0–10). Without it, a fast regex-based name matcher is used (score 8 on match, 0 on miss).
trade_name extracted from BODACC listepersonnes (nomCommercial / nomUsage). Used as the primary search term for Maps enrichment.
Pay-per-event billing on Apify Cloud — Actor.chargeEvent('maps-enrichment') is called for each accepted Maps match.
Progress logging for Phase 2: every 25 records logs searched/total, match count, throughput (rec/s), and ETA.
src/lib/mapsEnricher.js — self-contained Puppeteer module with PagePool (worker pool), namesMatch (token-based fuzzy matching), buildSearchName (trade name or first+last name), and validateWithLLM (GPT-4o-mini scoring with regex fallback). All logic is isolated for unit-testability.
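The regex fallback and the mapsMinScore threshold described above can be illustrated with a minimal sketch. This is not the code in src/lib/mapsEnricher.js: the tokenization rules, the "all tokens of the shorter name must appear in the longer one" criterion, and the helper names tokenize, fallbackScore, and isAccepted are assumptions made for illustration. Only the scores (8 on match, 0 on miss) and the default threshold of 7 come from the entries above.

```javascript
// Hypothetical sketch, not the Actor's actual namesMatch implementation.

// Lowercase, strip accents and punctuation, then split into tokens.
function tokenize(name) {
  return name
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '') // drop combining accent marks
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 1); // ignore empty and one-letter tokens
}

// Assumed criterion: every token of the shorter name appears in the longer one.
function namesMatch(bodaccName, mapsName) {
  const a = tokenize(bodaccName);
  const b = tokenize(mapsName);
  if (a.length === 0 || b.length === 0) return false;
  const [shorter, longer] = a.length <= b.length ? [a, b] : [b, a];
  const longerSet = new Set(longer);
  return shorter.every((token) => longerSet.has(token));
}

// Fallback scoring as described in the changelog: 8 on match, 0 on miss.
function fallbackScore(bodaccName, mapsName) {
  return namesMatch(bodaccName, mapsName) ? 8 : 0;
}

// A result is kept only when its score reaches mapsMinScore (default 7).
function isAccepted(score, mapsMinScore = 7) {
  return score >= mapsMinScore;
}
```

With the default threshold of 7, a fallback match (score 8) is accepted and a miss (score 0) is discarded. Raising mapsMinScore above 8 would reject every fallback match, so in this sketch values above 8 only make sense when LLM validation is enabled.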
Changed
defaultRunOptions.memoryMbytes raised from 256 to 1024 — required when enrichWithMaps: true (each Puppeteer tab uses ~200MB).
defaultRunOptions.timeout raised from 3600s to 7200s to accommodate large Maps enrichment runs.
maxMemoryMbytes raised from 1024 to 4096.
actor.json version updated to 1.2.
actor.json generatedBy updated to Cursor with Claude.
Run logs: removed emoji from Maps match lines (✓ → plain Match: label) to conform to premium/Store log style.
[1.1] - 2026-05-10
Added
Search URL mode — paste any BODACC URL directly (searchUrl input field); the actor extracts keyword, types, date range, and departments automatically. Manual fields override URL values when both are set.
Keyword search (keyword) — full-text filter passed directly to the API (e.g. peinture, plomberie). Matches company names, activity descriptions, and free-text fields.
Multi-keyword search (keywords list) — one keyword per line, combined with OR logic (e.g. peinture OR peintre OR paysagiste). Takes priority over the single keyword field.
creation type added to announcement types — the API's primary value for new registrations (distinct from immatriculation). Default types updated from ["immatriculation", "collective"] to ["creation", "collective"].
modification, radiation, dpc, divers added to available announcement types with correct labels.
output.csv — results are now written progressively to output.csv at the project root during local runs (same behaviour as the other actors in this repo).
csvWriter.js — dedicated CSV writer with BODACC-specific columns: announcement_id, date, announcement_type, company, siren, person_type, city, postal_code, department, region, activity, legal_form, capital_eur, management, registration_date, activity_start_date, judgment_type, judgment_date, url, etc.
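The priority rule between keyword and keywords described above can be sketched as follows. The helper name buildKeywordQuery is hypothetical, and the sketch assumes each line holds a single term; it accepts the keywords input either as a multi-line string or as an array, since the exact input type is not stated here.

```javascript
// Hypothetical helper: turns the `keyword` / `keywords` inputs into one query string.
function buildKeywordQuery({ keyword = '', keywords = '' } = {}) {
  // Accept either an array of terms or a string with one term per line.
  const lines = Array.isArray(keywords) ? keywords : String(keywords).split(/\r?\n/);
  const terms = lines.map((line) => String(line).trim()).filter(Boolean);

  // The keywords list takes priority over the single keyword field.
  if (terms.length > 0) return terms.join(' OR ');
  return keyword.trim();
}

// Usage
console.log(buildKeywordQuery({ keywords: 'peinture\npeintre\npaysagiste' }));
// → peinture OR peintre OR paysagiste
console.log(buildKeywordQuery({ keyword: 'plomberie' }));
// → plomberie
```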
Fixed
Switched from the ODS v2.1 API to the v1 API (/api/records/1.0/search/). The v2.1 API treated the q parameter as a relevance signal rather than a strict filter, which inflated total_count. The v1 API matches the website's behaviour exactly (e.g. peinture + creation + 2026 → 1,877 results, not 203,506).
creation family now correctly triggers registration field extraction (legal_form, capital_eur, management, address, activity, etc.) — previously only immatriculation and vente did.
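For orientation, a request against the v1 endpoint named in the fix above might be built as in the sketch below. Only the path /api/records/1.0/search/ comes from this changelog. The host, the dataset identifier, and the familleavis facet name are assumptions that should be checked against the BODACC open-data portal, and buildV1SearchUrl is a hypothetical helper, not a function of this Actor.

```javascript
// Assumed values for illustration only; verify them against the BODACC portal.
const ASSUMED_BASE_URL = 'https://bodacc-datadila.opendatasoft.com';
const ASSUMED_DATASET = 'annonces-commerciales';

// Hypothetical helper: builds a v1 records-search URL with a strict keyword filter.
function buildV1SearchUrl({ q, type, rows = 100, start = 0 } = {}) {
  const url = new URL('/api/records/1.0/search/', ASSUMED_BASE_URL);
  url.searchParams.set('dataset', ASSUMED_DATASET);
  if (q) url.searchParams.set('q', q); // full-text query
  if (type) url.searchParams.set('refine.familleavis', type); // assumed facet name
  url.searchParams.set('rows', String(rows)); // page size
  url.searchParams.set('start', String(start)); // pagination offset
  return url.toString();
}

// Usage
console.log(buildV1SearchUrl({ q: 'peinture', type: 'creation' }));
```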
[1.0] - 2026-04-28
Added
Initial release — scrape BODACC legal announcements from the official ODS v2.1 API (bodacc.fr/api/explore/v2.1).
Two modes: dateRange (filter by date, types, departments) and sirens (full announcement history per SIREN).
Five announcement types: registrations (immatriculation), insolvency (collective), business sales (vente), account filings (depot), other notices (avis).
Smart defaults: dateFrom / dateTo default to yesterday → today when omitted — ready for daily scheduled runs out of the box.
Resilient HTTP layer: exponential backoff on 429/5xx (up to 10 retries, 600ms inter-page delay + jitter).
Per-SIREN try/catch in SIREN mode: a failing company does not abort the rest of the run.
transformAnnouncement pure function: parses nested JSON strings from the API (listepersonnes, acte, jugement, listeetablissements) into flat, clean records with empty fields omitted.
buildWhereClause pure helper: constructs ODS SQL where clauses from filters — unit-testable without network.
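The retry policy listed under the resilient HTTP layer above can be sketched as follows. This is an illustration, not the Actor's code: only the 429/5xx condition and the cap of 10 retries come from the entry, while the 500 ms base delay, the 30 s cap, the 25% jitter, and the names backoffDelayMs, isRetryable, and fetchWithRetry are assumptions. The fetch and sleep functions are injectable so the loop can be unit-tested without a network.

```javascript
// Hypothetical sketch of retry with exponential backoff and jitter.
const MAX_RETRIES = 10; // from the changelog entry; other constants are assumed

// Delay doubles with each attempt, is capped, then gets up to 25% random jitter.
function backoffDelayMs(attempt, { baseMs = 500, capMs = 30000, random = Math.random } = {}) {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.round(exponential + exponential * 0.25 * random());
}

// Retry only on rate limiting (429) and server errors (5xx).
function isRetryable(status) {
  return status === 429 || (status >= 500 && status <= 599);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Returns the first non-retryable response; throws once the retries are exhausted.
async function fetchWithRetry(url, { fetchFn = fetch, sleep = defaultSleep, maxRetries = MAX_RETRIES } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    const response = await fetchFn(url);
    if (!isRetryable(response.status)) return response;
    if (attempt >= maxRetries) {
      throw new Error(`Giving up on ${url} after ${maxRetries} retries (HTTP ${response.status})`);
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

The sketch retries on HTTP status only. Network-level errors thrown by fetchFn propagate immediately, and a real implementation would likely want to catch and retry those as well.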