Recherche Entreprises Scraper

Pricing

$10.00/month + usage

Extract comprehensive French company data from data.gouv.fr. Search companies using filters (activity, creation date, revenue, location) with automatic pagination. The Actor enriches results with additional information from annuaire-entreprises.data.gouv.fr, including legal details, directors, and financial data.

Rating: 0.0 (0)

Developer: Corentin Robert

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 4 days ago

Extract comprehensive French company data from the official government API. Search companies using advanced filters (activity code, revenue, creation date, location) and automatically enrich data with legal information, directors, and financial details.

What is Recherche Entreprises Scraper?

Recherche Entreprises Scraper is a powerful tool that queries the official French government API (recherche-entreprises.api.gouv.fr) to extract company data. It allows you to search for French companies using multiple filters similar to Pappers.fr, including activity codes (NAF/APE), creation dates, revenue ranges, administrative status, and geographic location. The Actor automatically enriches company data with additional information from annuaire-entreprises.data.gouv.fr, providing comprehensive business intelligence.

What can Recherche Entreprises Scraper do?

This Actor provides several powerful features that make French company data extraction effortless:

🚀 Key Features

  • Textual search - Search companies by name, address, directors, or elected officials
  • Geographic search - Find companies near specific coordinates (latitude/longitude) within a configurable radius
  • Advanced filters - Filter by activity code (NAF/APE), revenue, creation date, administrative status, postal code, department, and region
  • Automatic pagination - Automatically fetch all pages of results (up to 1000 pages, handles API limitations)
  • Data enrichment - Automatically enrich company data with additional information from annuaire-entreprises.data.gouv.fr:
    • Legal information (SIRET siège, TVA Intracommunautaire, Forme juridique)
    • Employee data (Effectif salarié, Taille structure - PME/ETI/GE)
    • Administrative dates (Date inscription Insee, Date immatriculation RNE)
    • Directors information (up to 5 directors: Presidents, Directors, Managers - excluding commissaires aux comptes)
    • Additional address and activity details
  • Optimized performance - Parallel processing with 50 concurrent requests for fast data enrichment
  • Rate limiting - Respects API limits with intelligent request management
  • Date filtering - Filter results by company creation date (post-processing)
  • CSV export - Automatic CSV generation for local testing and analysis
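
The feature list mentions enrichment running with 50 concurrent requests. How the Actor implements this is not documented here; as a minimal sketch, assuming an asyncio-based design, a semaphore can cap the number of requests in flight. `fetch_details` and `enrich_all` are hypothetical names, and `fetch_details` is a stand-in that performs no network I/O.

```python
import asyncio

MAX_CONCURRENCY = 50  # concurrency figure quoted in the feature list


async def fetch_details(siren: str) -> dict:
    """Hypothetical placeholder for one enrichment request (no network I/O)."""
    await asyncio.sleep(0)  # yield control, as a real HTTP call would
    return {"siren": siren, "enriched": True}


async def enrich_all(sirens: list[str], limit: int = MAX_CONCURRENCY) -> list[dict]:
    """Enrich every SIREN while keeping at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(siren: str) -> dict:
        async with semaphore:
            return await fetch_details(siren)

    # gather() returns results in the same order as the input list
    return await asyncio.gather(*(bounded(s) for s in sirens))


if __name__ == "__main__":
    results = asyncio.run(enrich_all(["412730269"]))
    print(results)
```

A semaphore keeps the code simple: all tasks are created up front, but only `limit` of them can hold the semaphore and issue a request at any moment.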

🎯 Platform Advantages

The Actor runs on the Apify platform, so it also benefits from the platform's features:

  • Monitoring & Logs: Real-time execution monitoring with detailed progress logs
  • API Access: Access your data programmatically via Apify API
  • Scheduling: Set up automated runs on a schedule to track company changes
  • Integrations: Connect to Make.com, Zapier, Google Sheets, and more
  • Scalability: Handle large-scale searches with cloud infrastructure
  • Data Storage: Secure dataset storage with multiple export formats (JSON, CSV, Excel, HTML)
  • Proxy Management: Automatic proxy rotation for reliable API access

What data can Recherche Entreprises Scraper extract?

The Actor extracts comprehensive data from French companies. Here's what you can extract:

Basic Information (from API)

Data Category | Fields Extracted | Description
--- | --- | ---
Company Identification | siren, nom_complet, nom_raison_sociale | Company identifiers and names
Location | adresse, code_postal, ville, departement, region, latitude, longitude | Complete address and coordinates
Activity | activite_principale, libelle_activite_principale | NAF/APE code and activity label
Administrative | etat_administratif, date_creation, date_fermeture | Status and dates
Establishments | nombre_etablissements, nombre_etablissements_ouverts | Number of establishments
Company Category | categorie_entreprise, nature_juridique | PME/ETI/GE and legal nature
Financial Data | chiffre_affaires, resultat_net, annee_finances | Revenue and net result (latest year)

Enriched Information (if enrichData: true)

Data Category | Fields Extracted | Description
--- | --- | ---
Legal Information | siret_siege, tva_intracommunautaire, forme_juridique | SIRET headquarters, VAT number, legal form
Employee Data | effectif_salarie, taille_structure | Employee count range and company size
Administrative Dates | date_inscription_insee, date_immatriculation_rne | Registration dates
Directors | dirigeants_array, dirigeants | Up to 5 directors with roles (Presidents, Directors, Managers)
Additional Details | adresse_postale, code_naf_ape, convention_collective | Complete address, activity code, collective agreement

Input Parameters

The Actor accepts the following input parameters organized in clear sections:

The easiest way to use this Actor: Simply paste a URL from Pappers.fr or data.gouv.fr. The Actor will automatically parse the URL and extract all search parameters. No need to manually configure individual parameters!

  • url (string): URL from Pappers.fr or data.gouv.fr to automatically extract search parameters. If provided, all parameters from the URL will be used and override individual parameters.

Example with Pappers URL:

{
  "url": "https://www.pappers.fr/recherche?en_activite=true&activite=58.29C&date_creation_min=05-12-1990&date_creation_max=05-12-2012&chiffre_affaires_min=100000&chiffre_affaires_max=10000000",
  "enrichData": true
}

Example with data.gouv URL:

{
  "url": "https://recherche-entreprises.api.gouv.fr/search?q=&activite_principale=58.29C&etat_administratif=A&ca_min=100000&ca_max=10000000",
  "enrichData": true
}

The Actor automatically:

  • Detects if the URL is from Pappers.fr or data.gouv.fr
  • Parses all search parameters from the URL
  • Converts date formats (DD-MM-YYYY from Pappers to YYYY-MM-DD)
  • Ignores all individual search parameters (query, activite, date_creation_min, etc.) if a URL is provided
  • Keeps only configuration parameters (enrichData, maxResults, per_page, page) that you can still override

Important: When you provide a URL, all individual search parameters (query, activite, date_creation_min, date_creation_max, chiffre_affaires_min, chiffre_affaires_max, code_postal, departement, region, etc.) are ignored. Only the search parameters from the URL are used. However, you can still provide configuration parameters like enrichData, maxResults, per_page, and page to customize the scraping behavior.
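
The URL handling described above can be illustrated with the standard library. This is only a sketch of the documented behaviour (detect the site, read the query parameters, convert Pappers dates), not the Actor's actual parser; `parse_search_url` is a hypothetical name, and mapping the parsed names onto API parameters is left out because that mapping is not documented here.

```python
from datetime import datetime
from urllib.parse import parse_qsl, urlsplit

DATE_KEYS = ("date_creation_min", "date_creation_max")


def parse_search_url(url: str) -> dict:
    """Return the source site and the search parameters found in a search URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.endswith("pappers.fr"):
        source = "pappers"
    elif host.endswith("gouv.fr"):
        source = "data.gouv"
    else:
        raise ValueError(f"Unsupported host: {host!r}")

    # keep_blank_values preserves empty parameters such as "q="
    params = dict(parse_qsl(parts.query, keep_blank_values=True))

    if source == "pappers":
        # Pappers dates are DD-MM-YYYY; normalise them to YYYY-MM-DD
        for key in DATE_KEYS:
            if key in params:
                parsed = datetime.strptime(params[key], "%d-%m-%Y")
                params[key] = parsed.strftime("%Y-%m-%d")

    # Values stay as strings; numeric conversion is left to the caller
    return {"source": source, "params": params}
```

With the Pappers example above, `date_creation_min` would come back as "1990-12-05" and `activite` as "58.29C".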

🔍 Search Configuration

  • searchType (string): Type of search - "textual" or "geographic" (default: "textual")
  • query (string): Search terms (company name, address, directors, elected officials). Can be empty if using filters only.
  • en_activite (boolean): Filter by administrative status - true for active companies, false for closed companies
  • activite (string): NAF/APE activity code (e.g., "58.29C" for software publishing)
  • date_creation_min (string): Minimum creation date (format: "YYYY-MM-DD" or "DD-MM-YYYY")
  • date_creation_max (string): Maximum creation date (format: "YYYY-MM-DD" or "DD-MM-YYYY")
  • chiffre_affaires_min (number): Minimum revenue (chiffre d'affaires) in euros
  • chiffre_affaires_max (number): Maximum revenue (chiffre d'affaires) in euros
  • code_postal (string): Filter by postal code
  • departement (string): Filter by department code (e.g., '75' for Paris)
  • region (string): Filter by region code
  • latitude (number): Latitude (required for geographic search type)
  • longitude (number): Longitude (required for geographic search type)
  • radius (number): Search radius in kilometers (max 50km, default: 5km)
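
Before starting a geographic run, it can help to check the input locally. The sketch below encodes only the constraints listed above (latitude and longitude required, radius at most 50 km, default 5 km) plus the usual coordinate ranges; `validate_geographic_input` is a hypothetical helper and the Actor's own validation may differ.

```python
def validate_geographic_input(run_input: dict) -> dict:
    """Check a geographic search input and apply the documented radius default."""
    if run_input.get("searchType", "textual") != "geographic":
        return dict(run_input)  # nothing to check for textual searches

    for key in ("latitude", "longitude"):
        if run_input.get(key) is None:
            raise ValueError(f"{key} is required for geographic search")

    if not -90 <= run_input["latitude"] <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= run_input["longitude"] <= 180:
        raise ValueError("longitude must be between -180 and 180")

    checked = dict(run_input)
    radius = checked.setdefault("radius", 5)  # documented default: 5 km
    if not 0 < radius <= 50:  # documented maximum: 50 km
        raise ValueError("radius must be greater than 0 and at most 50 km")
    return checked
```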

📄 Pagination

  • page (number): Page number to start from (default: 1)
  • per_page (number): Number of results per page (max 25, default: 10)
  • maxResults (number): Maximum total results to fetch (optional, leave empty to fetch all)
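
To estimate how many requests a run will make, the pagination parameters can be combined as follows. This is a sketch under stated assumptions: `total_results` stands for the match count reported by the API, the 1000-page ceiling is treated as a cap on the number of pages fetched in one run, and `pages_to_fetch` is a hypothetical name.

```python
import math
from typing import Optional

MAX_PER_PAGE = 25  # documented per_page maximum
MAX_PAGES = 1000   # documented page ceiling


def pages_to_fetch(total_results: int, per_page: int = 10,
                   max_results: Optional[int] = None,
                   start_page: int = 1) -> range:
    """Return the page numbers needed to cover the requested results."""
    per_page = max(1, min(per_page, MAX_PER_PAGE))
    # Results on pages before start_page are skipped, not fetched
    available = max(0, total_results - (start_page - 1) * per_page)
    wanted = available if max_results is None else min(available, max_results)
    n_pages = min(math.ceil(wanted / per_page), MAX_PAGES)
    return range(start_page, start_page + n_pages)


if __name__ == "__main__":
    # 100 matches at 25 per page, limited to 30 results
    print(list(pages_to_fetch(100, per_page=25, max_results=30)))
```

Under these assumptions, raising `per_page` from 10 to 25 cuts the number of search requests by more than half for the same result set.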

✨ Data Enrichment

  • enrichData (boolean): Enable comprehensive data enrichment from annuaire-entreprises.data.gouv.fr (default: true). When enabled, the Actor fetches detailed data in three categories:

    1. Financial Information:

    • Latest year revenue (chiffre d'affaires)
    • Net result (résultat net)
    • Financial year (année finances)

    2. Directors Information:

    • Up to 5 directors with full names
    • Roles (President, Director, Manager, etc.)
    • Excludes auditors (commissaires aux comptes)
    • Stored in separate columns: dirigeant_1, role_1, dirigeant_2, role_2, etc.

    3. Company Information:

    • SIRET siège (headquarters SIRET)
    • TVA Intracommunautaire (VAT number)
    • Legal form (forme juridique)
    • Employee count range (effectif salarié)
    • Company size category (PME/ETI/GE - taille structure)
    • Collective agreements (conventions collectives)
    • Registration dates (date inscription Insee, date immatriculation RNE)
    • Complete postal address (adresse postale)
    • Activity label (libellé activité principale)
    • NAF/APE code (code NAF/APE)

    Note: Enrichment increases scraping time but provides much more detailed and complete data. Disable it if you only need basic company information from the API.

Example Input

Just copy-paste a URL from Pappers.fr or data.gouv.fr:

{
  "url": "https://www.pappers.fr/recherche?en_activite=true&activite=58.29C&date_creation_min=05-12-1990&date_creation_max=05-12-2012&chiffre_affaires_min=100000&chiffre_affaires_max=10000000",
  "enrichData": true
}

The Actor will automatically extract all parameters from the URL and reproduce the exact same search. This is the recommended method for ease of use.

Manual Parameters Example (Advanced)

{
  "searchType": "textual",
  "query": "",
  "en_activite": true,
  "activite": "58.29C",
  "date_creation_min": "1990-12-05",
  "date_creation_max": "2012-12-05",
  "chiffre_affaires_min": 100000,
  "chiffre_affaires_max": 10000000,
  "per_page": 25,
  "enrichData": true
}

Geographic Search Example

{
  "searchType": "geographic",
  "latitude": 48.8566,
  "longitude": 2.3522,
  "radius": 10,
  "en_activite": true,
  "per_page": 25,
  "enrichData": true
}

Output

The Actor returns company data in JSON format. Each result contains detailed information about the company.

Example Output

{
  "siren": "412730269",
  "nom_complet": "ARINFO I-MAGINER",
  "nom_raison_sociale": "ARINFO I-MAGINER",
  "date_creation": "1997-06-25",
  "etat_administratif": "A",
  "activite_principale": "58.29C",
  "libelle_activite_principale": "Édition de logiciels applicatifs",
  "adresse": "1-5 1 RUE EMILE MASSON 44000 NANTES",
  "code_postal": "44000",
  "ville": "NANTES",
  "departement": "44",
  "region": "52",
  "nombre_etablissements": 25,
  "nombre_etablissements_ouverts": 8,
  "siret_siege": "41273026900179",
  "tva_intracommunautaire": "FR78412730269",
  "forme_juridique": "SAS, société par actions simplifiée",
  "effectif_salarie": "50 à 99 salariés, en 2023",
  "taille_structure": "Petite ou Moyenne Entreprise (PME), en 2023",
  "convention_collective": "IDCC 1486",
  "date_inscription_insee": "25/06/1997",
  "date_immatriculation_rne": "25/06/1997",
  "nature_juridique": "5710",
  "categorie_entreprise": "PME",
  "latitude": "47.213863",
  "longitude": "-1.546537",
  "chiffre_affaires": 3049771,
  "resultat_net": -120060,
  "annee_finances": "2021",
  "dirigeants_array": [
    {
      "nom": "Philippe PERES",
      "role": "PRESIDENT DE SAS",
      "date_naissance": ""
    }
  ],
  "dirigeants": "Philippe PERES (PRESIDENT DE SAS)"
}
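
When post-processing results, the `dirigeants_array` field can be flattened into the numbered `dirigeant_N` / `role_N` columns mentioned in the enrichment section. The sketch below is illustrative only: `flatten_directors` is a hypothetical helper, and the ", " separator used when there are several directors is an assumption, since the example output shows a single director.

```python
def flatten_directors(record: dict, max_directors: int = 5) -> dict:
    """Derive a summary string and numbered columns from dirigeants_array."""
    directors = record.get("dirigeants_array", [])[:max_directors]
    flat = {
        # Assumption: multiple directors are joined with ", "
        "dirigeants": ", ".join(f"{d['nom']} ({d['role']})" for d in directors)
    }
    for index, director in enumerate(directors, start=1):
        flat[f"dirigeant_{index}"] = director["nom"]
        flat[f"role_{index}"] = director["role"]
    return flat


if __name__ == "__main__":
    sample = {
        "dirigeants_array": [
            {"nom": "Philippe PERES", "role": "PRESIDENT DE SAS", "date_naissance": ""}
        ]
    }
    print(flatten_directors(sample))
```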

Tips for Best Results

  • URL Input - The easiest way to use this Actor is to copy-paste a URL from Pappers.fr or data.gouv.fr. The Actor will automatically parse all parameters and reproduce the exact same search.
  • Date format - You can use either "YYYY-MM-DD" or "DD-MM-YYYY" format for dates. URLs from Pappers.fr use "DD-MM-YYYY" format which is automatically converted.
  • Activity codes - Use the full NAF/APE code (e.g., "58.29C") for precise filtering
  • Rate limiting - The Actor automatically respects the API limit of 7 requests per second
  • Pagination - The Actor automatically fetches all pages (up to 1000) unless maxResults is specified
  • Date filtering - Note that date filtering is done post-processing as the API doesn't support it directly
  • Empty query - You can use an empty query string ("") if you only want to filter by other parameters
  • Enrichment - Enable enrichData for comprehensive data including directors and legal information. Disable it for faster scraping if you only need basic company data.
  • Performance - Use per_page: 25 for faster scraping when dealing with large result sets
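
Since creation-date filtering happens after the results are fetched, the same filter is easy to reproduce on an exported dataset. The following is a sketch, not the Actor's code: `parse_date` and `filter_by_creation_date` are hypothetical names, bounds are assumed to be inclusive, and records without a `date_creation` value are assumed to be dropped.

```python
from datetime import date, datetime
from typing import Iterable, Optional


def parse_date(value: str) -> date:
    """Accept either YYYY-MM-DD or DD-MM-YYYY, the two documented formats."""
    for fmt in ("%Y-%m-%d", "%d-%m-%Y"):
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date: {value!r}")


def filter_by_creation_date(records: Iterable[dict],
                            date_min: Optional[str] = None,
                            date_max: Optional[str] = None) -> list:
    """Keep records whose date_creation lies within the (inclusive) bounds."""
    lower = parse_date(date_min) if date_min else None
    upper = parse_date(date_max) if date_max else None
    kept = []
    for record in records:
        raw = record.get("date_creation")
        if not raw:
            continue  # assumption: undated records are dropped
        created = parse_date(raw)
        if lower and created < lower:
            continue
        if upper and created > upper:
            continue
        kept.append(record)
    return kept
```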

API Limitations

  • Maximum 7 requests per second (automatically handled by the Actor)
  • Maximum 25 results per page
  • Some filters may not be available in the API and are handled post-processing
  • API may report limited total pages, but the Actor can fetch up to 1000 pages
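
If you call the API yourself rather than through the Actor, the 7 requests per second limit has to be enforced on the client side. One simple approach, shown here as a sketch rather than the Actor's actual mechanism, is to space request starts at least 1/7 of a second apart. `RateLimiter` is a hypothetical class; the clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time


class RateLimiter:
    """Space calls so that no more than `rate` of them start per second."""

    def __init__(self, rate: float = 7.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest start time of the next call

    def wait(self) -> float:
        """Block until the next call may start; return the seconds slept."""
        now = self._clock()
        if self._next_allowed is None or now >= self._next_allowed:
            delay = 0.0
            self._next_allowed = now + self.min_interval
        else:
            delay = self._next_allowed - now
            self._sleep(delay)
            self._next_allowed += self.min_interval
        return delay
```

Call `limiter.wait()` immediately before each request. This sketch is for single-threaded use; concurrent callers would need a lock around `wait()`.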

Use Cases

This Actor is perfect for:

  • Business Intelligence - Extract company data for market analysis and research
  • Lead Generation - Find companies by activity, location, or revenue
  • Competitor Analysis - Identify companies in specific sectors and regions
  • Compliance Monitoring - Track company status and administrative information
  • CRM Enrichment - Enrich existing company databases with official data
  • Market Research - Analyze company distribution by sector, region, or size
  • Due Diligence - Gather comprehensive company information including directors and legal details

This Actor uses the official French government API (recherche-entreprises.api.gouv.fr), which is publicly accessible, open, and free to use. The data enrichment feature relies on the official annuaire-entreprises.data.gouv.fr website, which is also a public service.

Support

If you encounter any issues or have questions:

  • Check the Issues tab for known problems and solutions
  • Review the Input and Output tabs for detailed schema information
  • Contact support through Apify Console if you need additional help

We're always open to feedback and suggestions for improving the Actor. Don't hesitate to reach out if you have ideas or encounter any problems!