Extract insurance, banking & finance intermediaries from the French ORIAS register. Input: SIREN numbers or ORIAS URLs. Output: names, emails, phones, addresses (parsed), ORIAS numbers, associations. Ready for lead generation, CRM enrichment, and compliance checks. Export as JSON, CSV, or Excel.
Resurrection support: Switched back to RequestQueue — when the Actor times out, you can use Resurrect in Apify Console to continue where it left off (no re-discovery, no duplicate work)
README: New "Timeout & Resurrection" section for large crawls (30k+ URLs)
Changed
RequestQueue: Replaced RequestList with RequestQueue for persistent crawl state across run restarts
Default timeout: Increased from 6h to 8h (28800s) in actor.json for full-register runs
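Why a persistent queue makes resurrection cheap can be illustrated with a toy model. The sketch below is not the Apify or Crawlee API: `ToyQueue` and its methods are hypothetical names, and the example.com URLs are placeholders. It only shows the idea that handled requests are remembered across runs, so a resumed run does neither re-discovery nor duplicate work.

```javascript
// Toy stand-in for a persistent request queue (NOT the Apify/Crawlee API).
class ToyQueue {
    constructor(saved = { pending: [], handled: [] }) {
        this.pending = [...saved.pending];
        this.handled = new Set(saved.handled);
    }

    // Adding a URL that is already pending or handled is a no-op.
    add(url) {
        if (this.handled.has(url) || this.pending.includes(url)) return false;
        this.pending.push(url);
        return true;
    }

    // Next URL to process, or null when the queue is drained.
    next() {
        return this.pending[0] ?? null;
    }

    markHandled(url) {
        this.pending = this.pending.filter((u) => u !== url);
        this.handled.add(url);
    }

    // The state a real queue would keep in storage between runs.
    snapshot() {
        return { pending: [...this.pending], handled: [...this.handled] };
    }
}

// First run: enqueue three URLs, handle one, then "time out".
const firstRun = new ToyQueue();
['https://example.com/a', 'https://example.com/b', 'https://example.com/c']
    .forEach((url) => firstRun.add(url));
firstRun.markHandled(firstRun.next());
const saved = firstRun.snapshot();

// Resurrected run: the state is restored, so only the two unhandled URLs remain.
const resumed = new ToyQueue(saved);
const readded = resumed.add('https://example.com/a'); // already handled, ignored
```

With an in-memory list that is rebuilt on every start, the resumed run would have to enqueue and scrape all three URLs again.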
[0.2.0] - 2026-02-26
Added
Category mode: New input value mode: "categories" that automatically discovers and scrapes all intermediaries in the selected categories, so SIRENs no longer need to be provided manually
CSV-based discovery: For COA, CIF, and COBSP, the Actor downloads the official ORIAS CSV export (dynamically resolved URL) and extracts all active SIRENs — up to 27,000+ per category, resolved in seconds
Paginated search discovery: For AGA, MA, MAL, MIA, MOBSP, MOBSPL, and MIOBSP, the Actor uses the ORIAS advanced search with session-based pagination (POST /home/resultAdvancedSearch → GET ?targetPage=N) to enumerate all SIRENs
categories input field: Array of category codes to enumerate (e.g. ["COA", "MIA", "COBSP"]). Supports all 11 registered categories across IAS, IOB, and Finance
mode input field: "urls" (default, existing behavior) or "categories" (new bulk discovery mode)
discoverer.js module: Standalone discovery module — CSV download + paginated search strategies with automatic deduplication across categories
Increased timeout: Default run timeout raised to 3,600s to support full-register runs (40,000+ intermediaries)
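The category-to-strategy split and the cross-category deduplication described above might look roughly like the sketch below. The category codes and the `mode` / `categories` field names come from this changelog. `planDiscovery` and `mergeSirens` are hypothetical helpers rather than the actual discoverer.js exports, and the SIREN values are made up.

```javascript
// Categories served by each discovery strategy, as listed in the entries above.
const CSV_CATEGORIES = new Set(['COA', 'CIF', 'COBSP']);
const SEARCH_CATEGORIES = new Set(['AGA', 'MA', 'MAL', 'MIA', 'MOBSP', 'MOBSPL', 'MIOBSP']);

// Hypothetical helper: split the requested categories by discovery strategy.
function planDiscovery(categories) {
    const plan = { csv: [], search: [], unknown: [] };
    for (const raw of categories) {
        const code = String(raw).trim().toUpperCase();
        if (CSV_CATEGORIES.has(code)) plan.csv.push(code);
        else if (SEARCH_CATEGORIES.has(code)) plan.search.push(code);
        else plan.unknown.push(code);
    }
    return plan;
}

// Hypothetical helper: merge per-category SIREN lists, keeping the first occurrence only.
function mergeSirens(resultsByCategory) {
    const seen = new Set();
    for (const sirens of Object.values(resultsByCategory)) {
        for (const siren of sirens) {
            const digits = String(siren).replace(/\D/g, '');
            if (digits.length === 9) seen.add(digits); // a SIREN is a 9-digit identifier
        }
    }
    return [...seen];
}

// Example input in category mode.
const input = { mode: 'categories', categories: ['COA', 'MIA', 'COBSP'] };
const plan = planDiscovery(input.categories);
// plan.csv → ['COA', 'COBSP'], plan.search → ['MIA']

// An intermediary registered in two categories is kept once.
const sirens = mergeSirens({
    COA: ['552 100 554', '123456789'],
    MIA: ['123456789', '987654321'],
});
// sirens → ['552100554', '123456789', '987654321']
```

Using a Set keyed on the normalized 9-digit string keeps the merge linear in the number of discovered SIRENs, which matters at 27,000+ entries per category.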
Changed
main.js: Refactored to handle two-phase execution in category mode (discovery → scraping)
input_schema.json: New fields mode, categories; startUrls is now optional when using category mode
actor.json: Version bumped to 0.2, description updated, timeout increased
[0.1.7] - 2025-02-25
Added
Website from link: Extract "Site internet" from <a href> when ORIAS displays it as a link instead of a <span>
Website from email domain: Fall back to a URL derived from the email domain when the site is "non renseigné" (not provided), e.g. contact@smatis.fr → https://smatis.fr. Free providers (gmail, laposte, orange, etc.) are excluded
Fixed
Invalid URL error: Website URLs without a protocol are now normalized (www.groupegescoassurances.fr → https://www.groupegescoassurances.fr) to avoid "Failed to construct 'URL': Invalid URL"
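The two website rules in this release (protocol normalization and the email-domain fallback) can be sketched as below. This is a minimal illustration, not the Actor's actual code: the function names are hypothetical, and the exact provider domains in `FREE_EMAIL_DOMAINS` are an assumption based on the providers named above.

```javascript
// Assumed free-mailbox domains. The changelog names gmail, laposte and orange;
// the exact domains and the rest of the real list are not known here.
const FREE_EMAIL_DOMAINS = new Set(['gmail.com', 'laposte.net', 'orange.fr']);

// Prefix https:// when no protocol is present, then validate with the URL constructor.
function normalizeWebsite(raw) {
    const value = (raw || '').trim();
    if (!value || value.toLowerCase() === 'non renseigné') return null;
    const withProtocol = /^https?:\/\//i.test(value) ? value : `https://${value}`;
    try {
        // href always ends a bare origin with "/", so strip one trailing slash.
        return new URL(withProtocol).href.replace(/\/$/, '');
    } catch {
        return null; // still not a valid URL after adding the protocol
    }
}

// Derive a website from the email domain, skipping free providers.
function websiteFromEmail(email) {
    const match = /^[^@\s]+@([^@\s]+\.[^@\s]+)$/.exec((email || '').trim().toLowerCase());
    if (!match) return null;
    const domain = match[1];
    return FREE_EMAIL_DOMAINS.has(domain) ? null : `https://${domain}`;
}

// Prefer the site shown on the profile, fall back to the email domain.
function resolveWebsite(site, email) {
    return normalizeWebsite(site) ?? websiteFromEmail(email);
}

const fixed = normalizeWebsite('www.groupegescoassurances.fr');
// fixed → 'https://www.groupegescoassurances.fr'
const derived = resolveWebsite('non renseigné', 'contact@smatis.fr');
// derived → 'https://smatis.fr'
const skipped = resolveWebsite('', 'someone@gmail.com');
// skipped → null
```

Validating through the `URL` constructor inside a try/catch means a malformed value yields `null` instead of the "Invalid URL" exception.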
[0.1.6] - 2025-02-25
Added
Output schema: Dataset results displayed in Actor Output tab with Overview and Contacts views
Address parsing: Structured fields — address, city, zipcode, country — from raw ORIAS addresses
addressFull: Reconstructed full address for display and mailing
phoneE164: French phone numbers in international E.164 format
Association URLs: Direct ORIAS profile links for each associated company
Progress logging: Real-time progress in Log tab (scraped/total, %, ETA)
CHANGELOG: Version history for the Store page
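As an illustration of the phoneE164 field, a conversion for metropolitan French numbers could be sketched as follows. `toE164Fr` is a hypothetical name and the accepted input formats are assumptions. Overseas numbers, which use other country codes, are not handled.

```javascript
// Convert a French phone number to E.164 (+33 followed by 9 digits), or return null.
function toE164Fr(raw) {
    // Keep digits and "+" only, dropping spaces, dots, dashes and parentheses.
    let digits = String(raw || '').replace(/[^\d+]/g, '');

    if (digits.startsWith('+33')) digits = digits.slice(3);
    else if (digits.startsWith('0033')) digits = digits.slice(4);
    else if (digits.startsWith('0')) digits = digits.slice(1);
    else return null;

    // Numbers written as "+33 (0)1 ..." still carry the national trunk zero.
    if (digits.length === 10 && digits.startsWith('0')) digits = digits.slice(1);

    // A national significant number is 9 digits and does not start with 0.
    return /^[1-9]\d{8}$/.test(digits) ? `+33${digits}` : null;
}

const national = toE164Fr('01 23 45 67 89');
// national → '+33123456789'
const international = toE164Fr('+33 (0)1 23 45 67 89');
// international → '+33123456789'
const invalid = toE164Fr('12345');
// invalid → null
```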
Changed
RequestList: Switched from RequestQueue to RequestList for zero queue writes and faster startup
City cleaning: Removed "FRA" suffix and zipcode in parentheses (e.g. "Paris FRA" → "Paris")
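The city cleaning rule can be sketched with two regular expressions. `cleanCity` is a hypothetical name, and the "(75008)" input shape is an assumption about how ORIAS writes a zipcode in parentheses.

```javascript
// Strip a parenthesized zipcode and a trailing "FRA" country code from a city string.
function cleanCity(raw) {
    return String(raw || '')
        .replace(/\(\s*\d{5}\s*\)/g, ' ') // zipcode in parentheses, e.g. "(75008)"
        .replace(/\s+FRA\s*$/, '')        // trailing country code, as a separate word only
        .replace(/\s+/g, ' ')             // collapse leftover whitespace
        .trim();
}

const paris = cleanCity('Paris FRA');
// paris → 'Paris'
const lyon = cleanCity('Lyon (69003) FRA');
// lyon → 'Lyon'
const untouched = cleanCity('FRANCONVILLE');
// untouched → 'FRANCONVILLE'
```

Requiring whitespace before "FRA" and anchoring it to the end of the string keeps city names that merely contain those letters intact.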