French Companies · Search & SIREN Enrich
Pricing
from $3.99 / 1,000 companies retrieved
Paste a Pappers or data.gouv URL or SIREN list — export-ready rows: directors, VAT, addresses, NAF, finances where published. Public INSEE register. No API key.
Developer
Corentin Robert
Maintained by Community
French Companies · Search & SIREN Enrich · No API Key
Get export-ready French company data in minutes: build prospect lists from a search URL or turn SIRENs into full profiles — names, locations, activities, directors, financials where published, legal form, VAT, and more. Data comes from France’s official register (recherche-entreprises.api.gouv.fr, INSEE-backed).
No login. No API key. No Pappers account required.
Need every branch and office, not just the headquarters? Use the French Establishments Scraper — paste the SIRENs from this Actor and get one row per site (SIRET), with address, NAF, headcount, and status.
Prospecting regulated professions? See the French Accountants Directory Scraper (expert-comptables.org) and Notaires.fr Directory Scraper — same HTTP-first CRM export pattern.
Who is this for?
| You are… | Typical goal | Suggested setup |
|---|---|---|
| B2B sales / SDR team | Prospect list by sector, size, or region | Search URL + exclude GE/ETI + director filters |
| Outbound agency | Enrich a CRM spreadsheet of SIRENs | Mode sirens → export with the CRM export view |
| Local franchise / retail ops | Map competitors in a city or postcode | Mode searchUrl — Pappers link with ville= |
| Compliance / KYB analyst | Verify legal form, status, directors | URL or SIRENs — turn director filters off |
| New-co hunter | Fresh SAS in your sector | Search URL + Created from (last 12 months) |
| Market research | Sector snapshot with director demographics | Search URL + dirigeantAgeMin / Max + optional excludeCategories |
| Data / RevOps | Scheduled master company table | Search URL with maxResults: 0; raise run timeout for large jobs |
| AI / automation builder | Feed an LLM or MCP agent with structured company rows | SIREN list or Search URL → Agent export view or full JSON |
What you get by default: one row per company (SIREN) with identity, address, NAF activity, legal form, VAT, director names where published, and financial bands when the register exposes them. Director email and phone are not in the public register — enrich downstream (e.g. website crawl, FullEnrich).
When to tighten filters: set requireDirigeant: true and requireDirigeantPhysique: true for outreach-ready lists; turn them off for raw compliance extracts. Set excludeCategories: ["GE","ETI"] when you want PME-only prospecting.
Quick start
Search URL (most common)
- Run a search on Pappers, Societe.com, or data.gouv and copy the full URL from the address bar.
- Open this Actor → mode Search URL → paste into Search URLs.
- Optional: tighten filters (exclude GE/ETI, creation dates, director rules) — see Optional filters below.
- Set Max companies (25 for a preview, 0 for full export).
- Click Start → export from the Dataset tab (JSON, CSV, Excel).
SIREN list (enrich IDs you already have)
- Mode SIREN list → paste one SIREN per line (8–9 digits; short values are zero-padded).
- Pass only the fields you need — omitted fields are not stored in your run input; code defaults apply when absent.
- Click Start.
{"mode": "sirens","sirens": ["63200885", "69200210", "303252225"],"requireDirigeant": true,"requireDirigeantPhysique": false}
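The zero-padding rule for short SIRENs can be sketched as follows. This is a minimal illustration, not the Actor's code: `normalizeSiren` is a hypothetical helper, and stripping spaces or other separators before padding is an assumption.

```javascript
// Hypothetical helper (not the Actor's real code).
// Keeps digits only, then left-pads 8-digit values to the 9 digits of a SIREN.
function normalizeSiren(raw) {
  const digits = String(raw).replace(/\D/g, "");
  if (digits.length < 8 || digits.length > 9) {
    return null; // outside the 8 to 9 digit range described above
  }
  return digits.padStart(9, "0");
}

const pasted = ["63200885", "69200210", "303 252 225", "1234"];
const sirens = pasted.map(normalizeSiren).filter((s) => s !== null);
console.log(sirens); // ["063200885", "069200210", "303252225"]
```

Rejecting values that are too short or too long, rather than guessing, keeps unmatched IDs visible instead of silently querying the wrong company.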
What it extracts
| Category | Fields (representative) |
|---|---|
| Identity | siren, nom_complet, nom_raison_sociale, date_creation, etat_administratif |
| Activity | activite_principale (NAF), libelle_activite_principale, categorie_entreprise |
| Address | adresse, code_postal, ville, departement, departement_nom, region, region_nom, latitude, longitude |
| Legal & tax | forme_juridique, nature_juridique, tva_intracommunautaire, siret_siege |
| Scale | effectif_salarie (band), caractere_employeur, nombre_etablissements, nombre_etablissements_ouverts |
| Governance | dirigeants, dirigeant_1…dirigeant_5, role_1…role_5, birth year and nationality where published |
| Outreach helpers | dirigeant_1_age, company_age_years, statut_entreprise, pappers_url, data_gouv_url, google_maps_url, linkedin_search_url |
| BODACC signal (optional) | derniere_annonce_date, derniere_annonce_type, derniere_annonce_libelle, derniere_annonce_url — see includeLastBodaccEvent |
| Financials | chiffre_affaires, resultat_net, annee_finances (when published in the source) |
| Flags | est_entrepreneur_individuel, est_organisme_formation, donnees_diffusibles, convention_collective |
Typical field coverage
Measured on Apify production run HqGPGUFVIsjqahMMR (5 170 SIRENs enriched · 2026-06-05). Your mix will vary by sector and register diffusion rules.
| Field | Coverage | Notes |
|---|---|---|
| siren, nom_complet, etat_administratif | 100% | Core identity |
| adresse, code_postal, ville | 100% | HQ address on this run |
| activite_principale + label | 100% | NAF / APE |
| tva_intracommunautaire | 100% | VAT number |
| forme_juridique | 99.4% | Legal form label |
| dirigeant_1 (primary director) | 100% | This cohort had a named director on every row |
| dirigeant_1_date_naissance (public year) | 90.5% | ~9.5% redacted or missing; use requireDirigeantPhysique to drop masked rows |
| chiffre_affaires, resultat_net | 0% on this run | Finances depend on publication in the register; common on Search URL exports when available (~30–50% in mixed sectors) |
| Email / phone | 0% | Not in this register — use a separate enrichment step |
- Auditors (commissaires aux comptes) are excluded from director columns by design.
- Empty values are omitted from each row (no empty strings).
- Unmatched SIRENs are logged in CSV / RUN_LOG with siren + _error (not written to the dataset, so not billed).
How much does it cost to scrape French companies?
Pay per event — each successful row in the default dataset bills apify-default-dataset-item (Store label: Company retrieved). Failed or not-found SIRENs are not written to the dataset and are not billed (see CSV / RUN_LOG for _error rows). HTTP-only keeps compute low on top of row pricing.
| Scenario | Rows billed | FREE | Bronze | Silver | Gold |
|---|---|---|---|---|---|
| 25 companies (preview) | 25 | ~$0.15 | ~$0.12 | ~$0.11 | ~$0.10 |
| 1,000 SIRENs enriched | 1,000 | ~$5.99 | ~$4.99 | ~$4.49 | ~$3.99 |
| 5,200 SIRENs enriched | 5,200 | ~$31.15 | ~$25.95 | ~$23.35 | ~$20.75 |
| 10,000-row search (maxResults: 0) | 10,000 | ~$59.90 | ~$49.90 | ~$44.90 | ~$39.90 |
Higher Apify subscription tiers unlock lower per-row prices on the Store (Bronze → Gold discounts).
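The table rows follow from a simple per-row calculation. The sketch below reproduces it using the per-1,000 prices read off the table; `estimateRowCost` is a hypothetical helper, and it covers the row charge only, not compute.

```javascript
// Per-1,000-row prices in USD, copied from the pricing table above.
const PRICE_PER_1000 = { FREE: 5.99, BRONZE: 4.99, SILVER: 4.49, GOLD: 3.99 };

// Row-based charge only; the small compute component is billed separately.
function estimateRowCost(rows, tier) {
  const price = PRICE_PER_1000[tier];
  if (price === undefined) throw new Error(`Unknown tier: ${tier}`);
  return Math.round((rows * price) / 10) / 100; // rounded to cents
}

console.log(estimateRowCost(25, "FREE"));     // 0.15
console.log(estimateRowCost(5200, "GOLD"));   // 20.75
console.log(estimateRowCost(10000, "BRONZE")); // 49.9
```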
SIREN enrich speed (handled in code — nothing to configure):
| List size | Apify Cloud behaviour | Typical duration |
|---|---|---|
| < 50 SIRENs | Single IP · 3 workers · 3.5 req/s | Seconds to a few minutes |
| ≥ 50 SIRENs | Auto Apify Proxy · 10 workers · ~6 req/s per IP | ~11 min / 5 200 SIRENs (observed); ~5–15 min / 5k depending on API load |
Large lists scale through parallel proxy IPs instead of one shared cloud egress. Advanced tuning (enrichMaxConcurrency, enrichMaxRequestsPerSecond, enrichDelayMs, proxyConfiguration) is API-only — not shown in the Console form.
Silver and Gold Apify plans reduce the per-row price further (see table above). A small compute component still applies (512 MB default RAM). Raise timeout under Run options for maxResults: 0. See Apify pricing.
Input examples
Search URL — SAS IT & real estate, directors aged 35–45
{"mode": "searchUrl","searchUrls": ["https://www.pappers.fr/recherche?en_activite=true&forme_juridique=5710&activite=62.01Z,62.02A,62.02B,63.11Z","https://www.pappers.fr/recherche?en_activite=true&forme_juridique=5710&activite=68.10Z,68.20A,68.31Z"],"maxResults": 500,"requireDirigeant": true,"requireDirigeantPhysique": true,"dirigeantAgeMin": 35,"dirigeantAgeMax": 45,"dateCreationMin": "2021-01-01","excludeCategories": ["GE", "ETI"],"effectifsMin": 10,"effectifsMax": 49}
Search URL — SAS created in the last 12 months (full export)
{"mode": "searchUrl","searchUrls": ["https://www.pappers.fr/recherche?en_activite=true&forme_juridique=5710&date_creation_min=15-05-2025&date_creation_max=15-05-2026"],"maxResults": 0,"dateCreationMin": "2025-05-15","dateCreationMax": "2026-05-15"}
When a date filter is active, the Actor splits the query into 13 parallel regional shards to cover the full register without hitting the per-query cap.
Agent / MCP — enrich SIRENs for an LLM
Every row includes English alias fields (company_id, summary, …) plus the full French register. In the Console, open the Dataset tab → Agent export (LLM / MCP) view → Export. For API/MCP, read the dataset as JSON and pick the fields you need.
{"mode": "sirens","sirens": ["552032534", "380129866"],"requireDirigeant": true}
Each row returns English keys plus quick links. For long lists, start the run with waitSecs: 0 and poll get-actor-run → get-dataset-items.
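The start-then-poll pattern can be sketched independently of any client library. In this illustration `waitForRun` is a hypothetical helper and `getRun` is a caller-supplied function standing in for whatever performs the get-actor-run call; the status names are the terminal run statuses Apify reports.

```javascript
// Statuses after which an Apify run no longer changes.
const TERMINAL = new Set(["SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// getRun: async function (supplied by the caller) returning the current
// run object, e.g. { status, defaultDatasetId }.
async function waitForRun(getRun, { intervalMs = 5000, maxPolls = 120 } = {}) {
  for (let i = 0; i < maxPolls; i += 1) {
    const run = await getRun();
    if (TERMINAL.has(run.status)) return run;
    await sleep(intervalMs);
  }
  throw new Error(`Run still not finished after ${maxPolls} polls`);
}

// Usage with a stub that finishes on the third poll.
const statuses = ["READY", "RUNNING", "SUCCEEDED"];
const fakeGetRun = async () => ({ status: statuses.shift(), defaultDatasetId: "demo" });
const finished = waitForRun(fakeGetRun, { intervalMs: 1 });
finished.then((run) => console.log(run.status)); // SUCCEEDED
```

Once the run reports SUCCEEDED, its defaultDatasetId is what you would pass to get-dataset-items.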
Example row (Agent export view / English fields):
{"company_id": "732829320","company_name": "Example SAS","legal_name": "EXAMPLE","status": "Active","created_at": "2015-01-01","company_age_years": 11,"legal_form": "SAS","activity_code": "62.01Z","activity_label": "Computer programming activities","city": "Paris","director_name": "Jean Dupont","director_role": "President","director_age": 51,"summary": "Example SAS · Active · Computer programming activities · Paris · director: Jean Dupont (age 51) · 11 yrs old","pappers_url": "https://www.pappers.fr/entreprise/732829320","annuaire_url": "https://annuaire-entreprises.data.gouv.fr/entreprise/732829320","maps_url": "https://www.google.com/maps/search/?api=1&query=...","linkedin_search_url": "https://www.linkedin.com/search/results/all/?keywords=..."}
See Dataset export views below.
Common filter recipes (Console)
Use the optional filters directly — no hidden preset layer.
PME prospecting (outreach-ready)
{"mode": "searchUrl","searchUrls": ["https://www.pappers.fr/recherche?en_activite=true&activite=62.01Z&departement=75"],"excludeCategories": ["GE", "ETI"],"requireDirigeant": true,"requireDirigeantPhysique": true,"maxResults": 100}
New SAS — last 12 months
{"mode": "searchUrl","searchUrls": ["https://www.pappers.fr/recherche?forme_juridique=5710&..."],"dateCreationMin": "2025-06-12","maxResults": 100}
KYB / compliance — full register row
{"mode": "sirens","sirens": ["552032534"],"requireDirigeant": false,"requireDirigeantPhysique": false}
API-only: preset or presetMode (pme_prospection, new_sas_12m, kyb_extract) still pre-fills the same filters for scripted runs.
Optional BODACC last notice (includeLastBodaccEvent)
Adds the most recent official legal announcement per company from bodacc.fr (public gazette, no extra API key). One HTTP call per saved row.
| When to enable | When to leave off |
|---|---|
| Preview runs (25–100 rows) | maxResults: 0 or thousands of rows |
| SIREN lists < ~200 | Scheduled bulk exports |
| Timing signal for outbound (creation, insolvency, sale) | Use BODACC Announcements Scraper for full history |
{"mode": "sirens","sirens": ["995209905"],"includeLastBodaccEvent": true}
Run logs show: [bodacc] Phase — N lookups in Xs · with announcement · none · errors.
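To judge whether the toggle is affordable for a given run, a rough estimate helps. The sketch below uses the range quoted in the Limitations section (about 0.4 to 1 s per saved row) and assumes lookups add up linearly; `bodaccOverheadSeconds` is a hypothetical helper.

```javascript
// Range quoted in the Limitations section: roughly 0.4 to 1 s per saved row.
const SECONDS_PER_ROW = { low: 0.4, high: 1.0 };

// Assumption: the overhead grows linearly with the number of saved rows.
function bodaccOverheadSeconds(savedRows) {
  return {
    low: Math.round(savedRows * SECONDS_PER_ROW.low),
    high: Math.round(savedRows * SECONDS_PER_ROW.high),
  };
}

console.log(bodaccOverheadSeconds(200));   // { low: 80, high: 200 }
console.log(bodaccOverheadSeconds(10000)); // { low: 4000, high: 10000 }
```

Under this assumption a 200-row list adds a few minutes at most, while a 10,000-row export could add one to three hours.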
All input parameters
| Parameter | Type | If omitted | Description |
|---|---|---|---|
| mode | string | (none) | Required. searchUrl · sirens. Legacy search / enrich normalized in code. |
| searchUrls | string[] | [] | [Search URL mode] One URL per line from Pappers, Societe.com, or data.gouv. Required when mode is searchUrl. |
| sirens | string[] | [] | [SIREN mode] One ID per line. Required when mode is sirens. |
| maxResults | integer | 25 | [Search] Global cap for the whole run (all URL shards combined). 0 = no cap. Ignored when a date filter forces full pagination. |
| requireDirigeant | boolean | true | [All] Drop rows with no named director. |
| requireDirigeantPhysique | boolean | true | [All] Drop rows where the primary director has no usable birth year. |
| excludeCategories | string[] | [] | [All] Exclude GE and/or ETI. Empty = no size filter. |
| dateCreationMin | string | (none) | [All] Created on or after (YYYY-MM-DD). Console datepicker. Merged with URL dates by intersection. |
| dateCreationMax | string | (none) | [All] Created on or before (YYYY-MM-DD). |
| effectifsMin | integer | (none) | [All] Min employees (headcount band overlap). Merged with Pappers effectifs_min by intersection. |
| effectifsMax | integer | (none) | [All] Max employees (headcount band overlap). |
| dirigeantAgeMin | integer | (none) | [All] Min age of primary director (dirigeant_1). |
| dirigeantAgeMax | integer | (none) | [All] Max age for the same rule. |
| includeLastBodaccEvent | boolean | false | [All] Last BODACC notice per row. Slower on large runs; see the section above. |
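"Merged by intersection" and "headcount band overlap" can be illustrated with two small functions. This is a sketch under assumptions: `intersect` and `bandOverlaps` are hypothetical names, and a band is represented as a plain `{ min, max }` object rather than the register's band codes.

```javascript
// Intersect two optional [min, max] ranges. Works for numbers and for
// ISO dates (YYYY-MM-DD), which sort correctly as plain strings.
function intersect(a, b) {
  const pick = (x, y, choose) => (x == null ? y : y == null ? x : choose(x, y));
  const min = pick(a.min, b.min, (x, y) => (x > y ? x : y)); // later start wins
  const max = pick(a.max, b.max, (x, y) => (x < y ? x : y)); // earlier end wins
  return { min, max, empty: min != null && max != null && min > max };
}

// A headcount band passes when it overlaps the requested range at all.
// band.max may be null for an open-ended top band.
function bandOverlaps(band, range) {
  const aboveMin = range.min == null || band.max == null || band.max >= range.min;
  const belowMax = range.max == null || band.min <= range.max;
  return aboveMin && belowMax;
}

const fromUrl = { min: "2025-05-15", max: "2026-05-15" };
const fromInput = { min: "2025-06-12", max: undefined };
console.log(intersect(fromUrl, fromInput));
// { min: "2025-06-12", max: "2026-05-15", empty: false }

console.log(bandOverlaps({ min: 10, max: 19 }, { min: 10, max: 49 })); // true
console.log(bandOverlaps({ min: 50, max: 99 }, { min: 10, max: 49 })); // false
```

Intersection means the stricter bound always wins, so an input filter can narrow the dates carried by the URL but never widen them.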
Console form: two sections — Your data (mode, URLs, SIRENs, max) and Optional filters & enrichments.
Dataset views: Companies Overview, CRM export, Agent export (LLM / MCP) — pick the view when you export; no input setting needed.
Dataset export views
| View | Best for |
|---|---|
| Companies Overview | Quick scan in the Console |
| CRM export | Lean French spreadsheet (Lemlist, HubSpot, Excel) |
| Agent export (LLM / MCP) | English keys + summary one-liner for automations |
All views read from the same dataset — each row stores the full register plus helper links and English aliases.
Agent field mapping (selected):
| Agent key | Source (full profile) |
|---|---|
| company_id | siren |
| company_name | nom_complet |
| legal_name | nom_raison_sociale |
| status | statut_entreprise (Active / Closed) |
| director_name | dirigeant_1 |
| director_age | dirigeant_1_age |
| revenue_eur | chiffre_affaires |
| annuaire_url | data_gouv_url |
| maps_url | google_maps_url |
| last_legal_event_date | derniere_annonce_date (when BODACC enabled) |
| last_legal_event_type | derniere_annonce_type |
| summary | Computed English one-liner (includes last legal event when BODACC enabled) |
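If you read the full JSON and want to rebuild the Agent keys yourself, the mapping can be applied in a few lines. This is a sketch, not the Actor's code: `toAgentRow` is a hypothetical helper, and the summary layout is an assumption modelled on the example row in this README.

```javascript
// Agent key → source field, copied from the mapping table above.
const AGENT_FIELDS = {
  company_id: "siren",
  company_name: "nom_complet",
  legal_name: "nom_raison_sociale",
  status: "statut_entreprise",
  director_name: "dirigeant_1",
  director_age: "dirigeant_1_age",
  revenue_eur: "chiffre_affaires",
  annuaire_url: "data_gouv_url",
  maps_url: "google_maps_url",
  last_legal_event_date: "derniere_annonce_date",
  last_legal_event_type: "derniere_annonce_type",
};

// Build an English-keyed row, skipping fields the register did not publish.
function toAgentRow(row) {
  const out = {};
  for (const [agentKey, sourceKey] of Object.entries(AGENT_FIELDS)) {
    if (row[sourceKey] !== undefined && row[sourceKey] !== "") out[agentKey] = row[sourceKey];
  }
  // Assumed summary layout, modelled on the example row in this README.
  const parts = [
    row.nom_complet,
    row.statut_entreprise,
    row.libelle_activite_principale,
    row.ville,
    row.dirigeant_1 && `director: ${row.dirigeant_1}` +
      (row.dirigeant_1_age != null ? ` (age ${row.dirigeant_1_age})` : ""),
    row.company_age_years != null && `${row.company_age_years} yrs old`,
  ];
  out.summary = parts.filter(Boolean).join(" · ");
  return out;
}

const agentRow = toAgentRow({
  siren: "732829320", nom_complet: "Example SAS", statut_entreprise: "Active",
  libelle_activite_principale: "Computer programming activities", ville: "Paris",
  dirigeant_1: "Jean Dupont", dirigeant_1_age: 51, company_age_years: 11,
});
console.log(agentRow.summary);
// Example SAS · Active · Computer programming activities · Paris · director: Jean Dupont (age 51) · 11 yrs old
```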
Local output.csv follows the same profile: Full = wide CSV with helpers; CRM / Agent = profile columns only.
Minimal input: send only mode plus the fields you care about. Omitted keys are not injected into your stored run input; runtime defaults above apply in code only.
API-only parameters (not in Console)
| Parameter | Description |
|---|---|
| verboseLogs | Per-SIREN pagination detail in live logs (default false). |
| maxCompanyAgeYears | Deprecated; use dateCreationMin instead. |
| nearPoint, nearCity, radiusKm, nearActivityCode, nearActiveOnly | Legacy radius search; prefer a Pappers URL with ville=. |
| enrichMaxConcurrency, enrichMaxRequestsPerSecond, enrichDelayMs, proxyConfiguration | SIREN enrich tuning. |
Pappers URL parsing
Paste a Pappers search URL (pappers.fr/recherche?…) — supported parameters are converted automatically:
| Pappers parameter | Applied as |
|---|---|
| en_activite=true | etat_administratif=A |
| activite=62.01Z,62.02A | activite_principale (comma-separated NAF) |
| forme_juridique=5710 | nature_juridique (5710 = SAS in SIRENE) |
| ville=74160 | commune (INSEE code) |
| date_creation_min/max | Post-filter on date_creation (DD-MM-YYYY → ISO) |
| effectifs_min/max | Post-filter via tranche_effectif_salarie bands |
| departement, region, code_postal | Passed to API |
| age_dirigeant_min/max | Not supported; use the dirigeantAgeMin / dirigeantAgeMax inputs |
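The conversion table can be read as a small parsing routine. The sketch below is an illustration of that table, not the Actor's parser: `parsePappersUrl` and the shape of its return value are assumptions, and any parameter not listed in the table is simply reported as unsupported.

```javascript
// Parameters renamed for the register API (from the table above).
const RENAMED = new Map([
  ["activite", "activite_principale"],
  ["forme_juridique", "nature_juridique"],
  ["ville", "commune"],
]);
const PASSED_THROUGH = ["departement", "region", "code_postal"];

// DD-MM-YYYY (Pappers) → YYYY-MM-DD (ISO).
const toIso = (d) => d.split("-").reverse().join("-");

function parsePappersUrl(url) {
  const api = {};         // sent with the API query
  const postFilters = {}; // applied to rows after retrieval
  const unsupported = [];

  for (const [key, value] of new URL(url).searchParams) {
    if (key === "en_activite") {
      if (value === "true") api.etat_administratif = "A";
    } else if (RENAMED.has(key)) {
      api[RENAMED.get(key)] = value;
    } else if (PASSED_THROUGH.includes(key)) {
      api[key] = value;
    } else if (key === "date_creation_min" || key === "date_creation_max") {
      postFilters[key] = toIso(value);
    } else if (key === "effectifs_min" || key === "effectifs_max") {
      postFilters[key] = Number(value);
    } else {
      unsupported.push(key); // e.g. age_dirigeant_min / age_dirigeant_max
    }
  }
  return { api, postFilters, unsupported };
}

const parsed = parsePappersUrl(
  "https://www.pappers.fr/recherche?en_activite=true&forme_juridique=5710&date_creation_min=15-05-2025&date_creation_max=15-05-2026"
);
console.log(parsed.api);         // { etat_administratif: "A", nature_juridique: "5710" }
console.log(parsed.postFilters); // { date_creation_min: "2025-05-15", date_creation_max: "2026-05-15" }
```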
Societe.com URL parsing
Paste a Societe.com search or export URL (societe.com/cgi-bin/search?… or societe.com/fichier?…). Filters are replayed against the official INSEE API (same dataset as Pappers/data.gouv — not Societe.com’s premium fields).
| Societe.com parameter | Applied as |
|---|---|
| ftJlegal=5710 (repeatable) | nature_juridique (one API shard per code when several forms are selected) |
| ftAstatus=0 / 1 | etat_administratif=A / C (open / closed) |
| q or champs | Text search q |
| departement, ftDept | departement |
| ftActivity, code_naf, ape | activite_principale |
| ftRegion, ftZip | region, code_postal |
| Director age, CA filters in UI | Not in URL — use Actor optional filters |
Note: ftJlegal values are INSEE legal-form codes (same family as Pappers 5710 = SAS). A Societe.com URL with 20 legal forms runs 20 parallel API queries; results are deduplicated by SIREN.
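Sharding by legal form and deduplicating by SIREN can be sketched as below. The function names and the example URL are hypothetical, only a few parameters from the table are handled, and keeping the first row seen per SIREN is an assumption.

```javascript
// One API query ("shard") per ftJlegal code; the other filters are shared.
function societeShards(url) {
  const params = new URL(url).searchParams;
  const shared = {};
  const status = params.get("ftAstatus");
  if (status === "0") shared.etat_administratif = "A"; // open
  if (status === "1") shared.etat_administratif = "C"; // closed
  const text = params.get("q") ?? params.get("champs");
  if (text) shared.q = text;

  const legalForms = params.getAll("ftJlegal");
  if (legalForms.length === 0) return [shared];
  return legalForms.map((code) => ({ ...shared, nature_juridique: code }));
}

// Assumption: keep the first row seen for each SIREN across all shards.
function dedupeBySiren(rows) {
  const seen = new Map();
  for (const row of rows) if (!seen.has(row.siren)) seen.set(row.siren, row);
  return [...seen.values()];
}

// Hypothetical URL with two legal forms selected.
const shards = societeShards(
  "https://www.societe.com/cgi-bin/search?champs=conseil&ftAstatus=0&ftJlegal=5710&ftJlegal=5499"
);
console.log(shards.length); // 2
console.log(shards[0]);     // { etat_administratif: "A", q: "conseil", nature_juridique: "5710" }
```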
Output example
{"siren": "732829320","nom_complet": "Example SAS","nom_raison_sociale": "EXAMPLE","activite_principale": "62.01Z","libelle_activite_principale": "Computer programming activities","adresse": "1 rue Example","code_postal": "75001","ville": "Paris","forme_juridique": "SAS","tva_intracommunautaire": "FR12345678901","dirigeant_1": "Jean Dupont","dirigeant_1_date_naissance": "1975-03","dirigeant_1_age": 51,"company_age_years": 11,"statut_entreprise": "Active","pappers_url": "https://www.pappers.fr/entreprise/732829320","google_maps_url": "https://www.google.com/maps/search/?api=1&query=..."}
Export as JSON, CSV, Excel, or HTML from the Dataset tab. A plain-text Run log (phase summary and final counts) is saved to the default key-value store as RUN_LOG.
How it works
- Parse input: mode-specific validation; no demo URLs or SIRENs injected.
- Query the register: Search URL (paginated filters) or SIREN (exact match with pagination).
- Normalize: flat row per company via transform.js; NAF labels, director columns, financial bands.
- Post-filter: creation date, director rules, company size, director age (same rules in all modes).
- Export: optional BODACC lookup · enrich row · push to Dataset (pick a view when downloading).
Throughput, proxy selection, and retries are handled in code — nothing to tune in the Input form. SIREN enrich on Apify Cloud automatically switches to Apify Proxy from 50 SIRENs upward so each worker uses its own IP.
Limitations
- No email or phone in the official register — director names and birth years only where published.
- Not every field on every row — micro-companies and redacted entries may lack financials or birth dates.
- Date-filtered searches scan all pages (no maxResults cap) and may run 15–45+ minutes; increase the timeout.
- The Pappers age_dirigeant filter is not in the public API; use dirigeantAgeMin / dirigeantAgeMax.
- The BODACC toggle adds ~0.4–1 s per saved row; use it on small batches, not bulk maxResults: 0 runs.
Local development
npm install
apify run --purge-none
- apify run validates against storage/key_value_stores/default/INPUT.json; copy it from .actor/INPUT.json or use --input-file=./input.json.
- Local cloud-off runs read the root input.json first, then the KV input (src/main.js).
- Tests: npm test (59 unit tests, no network). CI smoke: npm run test:smoke (Pappers, gouv, Societe.com 20 shards). Full integration: npm run test:integration.
$ npm run run:local # copies input.json → storage/.../INPUT.json, then apify run
Is it legal to scrape French company data?
This Actor reads only the French administration’s public register API (recherche-entreprises.api.gouv.fr) — not private websites or paywalled Pappers pages. Data is published under official open-data rules where diffusion is allowed.
Rows may contain personal data (director names). Ensure your use complies with GDPR and your lawful basis. See Apify’s guide on legal scraping.
FAQ
Do post-filters apply in SIREN mode?
Yes — creation-date bounds, headcount bounds, requireDirigeant, requireDirigeantPhysique, excludeCategories, and director age apply in both modes. Filtered rows are not written; check run logs for filtered out vs not found.
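A mode-independent post-filter might look like the sketch below. It is an illustration under assumptions: `filterReason` is a hypothetical helper, the defaults are those listed under All input parameters, and the size category is assumed to be read from categorie_entreprise. Headcount and age rules are left out for brevity.

```javascript
// Defaults taken from the "All input parameters" table.
const DEFAULTS = { requireDirigeant: true, requireDirigeantPhysique: true, excludeCategories: [] };

// Returns null when the row passes, or a short reason when it is filtered out.
function filterReason(row, input = {}) {
  const opts = { ...DEFAULTS, ...input };
  if (opts.requireDirigeant && !row.dirigeant_1) return "no named director";
  if (opts.requireDirigeantPhysique && !row.dirigeant_1_date_naissance) return "no usable birth year";
  if (opts.excludeCategories.includes(row.categorie_entreprise)) return "excluded size category";
  // ISO dates (YYYY-MM-DD) compare correctly as strings.
  if (opts.dateCreationMin && row.date_creation < opts.dateCreationMin) return "created too early";
  if (opts.dateCreationMax && row.date_creation > opts.dateCreationMax) return "created too late";
  return null;
}

const sample = {
  siren: "732829320", dirigeant_1: "Jean Dupont", dirigeant_1_date_naissance: "1975-03",
  categorie_entreprise: "PME", date_creation: "2015-01-01",
};
console.log(filterReason(sample));                                    // null
console.log(filterReason(sample, { dateCreationMin: "2021-01-01" })); // "created too early"
console.log(filterReason({ siren: "552032534" }));                    // "no named director"
```

Returning a reason instead of a boolean makes it easy to count filtered-out rows separately from not-found SIRENs in a log.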
How do I get a CRM or Agent spreadsheet?
Open the Dataset tab after the run → choose CRM export or Agent export (LLM / MCP) → Export (JSON, CSV, Excel). No input setting required.
Do filters replace my Pappers Search URL?
No. Filters narrow the companies returned from the URL or SIREN list you provide — sector and region always come from your link.
What does includeLastBodaccEvent do?
Adds the latest official notice from bodacc.fr per company (derniere_annonce_* fields). Off by default. Enable for small runs; for full announcement history use the dedicated BODACC Scraper.
Do I need an API key?
No.
How does director age filtering work?
Age is computed from the primary director’s birth year (dirigeant_1). Missing or [NON-DIFFUSIBLE] birth years are excluded when age bounds are set or when requireDirigeantPhysique is true.
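The age rule can be sketched as follows. The helper names are hypothetical, and treating age as the plain difference between calendar years is an assumption, chosen because it is consistent with the example row in this README (born 1975, age 51).

```javascript
// Extract a 4-digit birth year, or null when the value is missing or masked.
function birthYear(value) {
  const match = /^(\d{4})/.exec(value ?? "");
  return match ? Number(match[1]) : null;
}

// Assumption: age is the plain difference between calendar years.
function directorAge(birthValue, referenceYear = new Date().getFullYear()) {
  const year = birthYear(birthValue);
  return year === null ? null : referenceYear - year;
}

function passesAgeBounds(birthValue, { min, max }, referenceYear) {
  const age = directorAge(birthValue, referenceYear);
  if (age === null) return false; // masked or missing: excluded when bounds are set
  return (min == null || age >= min) && (max == null || age <= max);
}

console.log(directorAge("1975-03", 2026));                           // 51
console.log(passesAgeBounds("1975-03", { min: 35, max: 45 }, 2026)); // false
console.log(passesAgeBounds("[NON-DIFFUSIBLE]", { min: 35 }, 2026)); // false
```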
Why scan all pages when a date filter is set?
The API caps at 10 000 results per query with no date sort. Regional sharding (13 parallel queries) avoids missing recent companies.
Why do logs mention retries?
The public service can slow down under load. The Actor waits and retries automatically — no Input tuning required.
How long will my SIREN list take?
Paste your IDs and run — pacing is automatic. Short lists (< 50) use a single IP; larger lists enable Apify Proxy in the background (~11 minutes for 5 200 SIRENs in production). Raise timeout under Run options for very large jobs (10k+).
Also available
- French Accountants Directory Scraper: official expert-comptable register (city, nationwide, or profile URLs)
- Notaires.fr Directory Scraper: official notary offices, full France or your URLs
- French Establishments Scraper: one row per SIRET (branch address, NAF, headcount, status) for your SIREN list
- French Legal Announcements Scraper (BODACC): full BODACC history by date or SIREN (this Actor only adds the last notice when includeLastBodaccEvent is on)
Browse all: apify.com/corent1robert
Support
Questions, custom automation, or integrations: corentin@outreacher.fr
You can also use the Issues tab on this Actor’s Apify page.