Recherche Entreprises Scraper

Pricing

$10.00/month + usage


Extract comprehensive French company data from data.gouv.fr. Search companies using filters (activity, creation date, revenue, location) with automatic pagination. Enriches data with additional information from annuaire-entreprises.data.gouv.fr including legal details, directors, and financial data.


Developer: Corentin Robert (maintained by Community)


Last updated: January 2026

Extract comprehensive French company data from the official government API. Search companies using advanced filters (activity code, revenue, creation date, location) and automatically enrich data with legal information, directors, and financial details.


🎯 Why use this scraper?

Extract complete French company data in minutes with official government sources. Perfect for business intelligence, lead generation, market research, and compliance monitoring.

The Problem: Manually searching French companies on Pappers.fr or data.gouv.fr is time-consuming, limited by pagination, and doesn't provide structured data for analysis. Extracting comprehensive company information (directors, finances, legal details) requires visiting multiple pages and websites.

The Solution: This Actor queries the official French government API (recherche-entreprises.api.gouv.fr) with advanced filters, automatically handles pagination (up to 1000 pages), and enriches data with comprehensive information from annuaire-entreprises.data.gouv.fr - all in one automated workflow.

Key Benefits:

  • 10-100x faster than manual extraction
  • 📊 Complete structured data ready for analysis
  • 🔄 Automatic pagination - no manual page clicking
  • Data enrichment - directors, finances, legal details automatically fetched
  • 🔗 URL input support - just paste a Pappers.fr URL and go

✅ What you get

Complete company data including:

Basic Information (from API)

  • Company Identification: SIREN, full name, legal name
  • Location: Complete address, postal code, city, department, region, coordinates
  • Activity: NAF/APE code and activity label
  • Administrative: Status (active/closed), creation date, closure date
  • Establishments: Number of establishments and open establishments
  • Company Category: PME/ETI/GE classification, legal nature
  • Financial Data: Revenue and net result (latest year available)

Enriched Information (when enrichData: true)

  • Legal Information: SIRET siège, TVA Intracommunautaire, legal form
  • Employee Data: Employee count range, company size category
  • Administrative Dates: Insee registration date, RNE registration date
  • Directors: Up to 5 directors with names, roles (Presidents, Directors, Managers), excluding auditors
  • Additional Details: Complete postal address, activity code, collective agreements

🚀 Key Features

📊 Complete Data Extraction

  • Textual search - Search by company name, address, directors, or elected officials
  • Geographic search - Find companies near coordinates within configurable radius
  • Advanced filters - Activity code (NAF/APE), revenue, creation date, location (postal code, department, region)
  • Headquarters filtering - Filter by region with siege=true to get only companies with headquarters in the specified region
  • Automatic pagination - Fetches all pages automatically (up to 1000 pages)
  • Data enrichment - Comprehensive enrichment from annuaire-entreprises.data.gouv.fr

🔗 URL Input Support (Simplest Method)

  • Paste Pappers.fr URLs - Just copy-paste your search URL, all parameters extracted automatically
  • Paste data.gouv.fr URLs - Also supports official API URLs
  • Automatic parameter extraction - No manual configuration needed
  • Date format conversion - Automatically converts DD-MM-YYYY to YYYY-MM-DD
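
The date conversion mentioned in the last bullet can be illustrated with a minimal Python sketch. The function name `normalize_date` is hypothetical and this is not the Actor's own code; it only shows one way to accept both formats and always return YYYY-MM-DD.

```python
from datetime import datetime


def normalize_date(value: str) -> str:
    """Return a date as YYYY-MM-DD, accepting YYYY-MM-DD or DD-MM-YYYY input."""
    # Try the ISO layout first, then the Pappers-style layout.
    for layout in ("%Y-%m-%d", "%d-%m-%Y"):
        try:
            return datetime.strptime(value, layout).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unsupported date format: {value!r}")


print(normalize_date("05-12-1990"))  # 1990-12-05
print(normalize_date("1990-12-05"))  # 1990-12-05
```

Because `%Y` requires four digits, the two layouts cannot be confused with each other, so the order of the attempts does not change the result.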

⚡ Performance & Reliability

  • Optimized parallel processing - 5-8 concurrent requests for fast data enrichment
  • Rate limiting - Respects API limits (7 requests/second) automatically
  • Error handling - Robust retry logic with exponential backoff
  • CSV export - Automatic CSV generation for local testing
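
For readers who want to apply the same ideas in their own client code, here is a hedged Python sketch of request spacing at 7 requests per second combined with exponential backoff. The names `RateLimiter`, `backoff_delays`, and `call_with_retry`, as well as the base delay and cap, are assumptions chosen for illustration; the Actor's internal implementation may differ.

```python
import time

MAX_REQUESTS_PER_SECOND = 7  # API limit quoted in this README


def backoff_delays(retries, base=0.5, cap=8.0):
    """Exponential schedule: base, 2*base, 4*base, ... never above cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


class RateLimiter:
    """Spaces calls so that at most `rate` of them start per second."""

    def __init__(self, rate=MAX_REQUESTS_PER_SECOND,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


def call_with_retry(func, limiter, retries=4, sleep=time.sleep):
    """Call func once, then retry up to `retries` times with backoff."""
    delays = [0.0] + backoff_delays(retries)
    for attempt, delay in enumerate(delays):
        if delay:
            sleep(delay)
        limiter.wait()
        try:
            return func()
        except Exception:
            if attempt == len(delays) - 1:
                raise  # all attempts used, surface the last error
```

The clock and sleep functions are injectable so the limiter can be exercised in tests without real waiting.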

🎯 Platform Advantages

  • Monitoring & Logs - Real-time execution monitoring with detailed progress logs
  • API Access - Access your data programmatically via Apify API
  • Scheduling - Set up automated runs to track company changes
  • Integrations - Connect to Make.com, Zapier, Google Sheets, and more
  • Scalability - Handle large-scale searches with cloud infrastructure
  • Data Storage - Secure dataset storage with multiple export formats (JSON, CSV, Excel, HTML)

💼 Use Cases and Client Benefits

🏢 For Business Intelligence & Market Research

The Problem: Market researchers need to analyze companies by sector, region, or size. Manual data collection from multiple sources is slow and error-prone.

The Solution: Extract thousands of companies with filters (activity code, region, revenue) in minutes. Get complete structured data ready for analysis.

Client Benefits:

  • 📊 Market analysis - Identify companies by sector, region, or size instantly
  • 📈 Competitive intelligence - Track competitors and market trends
  • 💰 Lead generation - Find potential clients by activity and location
  • Time saved - 10-100x faster than manual extraction

ROI: Reduce market research time from days to hours. Extract 10,000 companies in 30 minutes instead of 2 weeks manually.

🏦 For Sales & Business Development

The Problem: Sales teams need to find potential clients by industry, location, or company size. Manual prospecting is inefficient.

The Solution: Generate targeted lead lists with filters (activity code, region, revenue range). Get complete company information including directors for personalized outreach.

Client Benefits:

  • 🎯 Targeted prospecting - Find companies by exact criteria (activity, region, revenue)
  • 👥 Director information - Get decision-maker names and roles for personalized outreach
  • 📍 Geographic targeting - Find companies in specific regions or near locations
  • 💼 Complete company profiles - All information needed for qualification in one dataset

ROI: Generate 1,000 qualified leads in 15 minutes instead of days of manual research.

⚖️ For Compliance & Due Diligence

The Problem: Compliance teams need to verify company information, check legal status, and track administrative changes. Manual verification is time-consuming.

The Solution: Extract complete company data including legal information, registration dates, and administrative status. Monitor changes with scheduled runs.

Client Benefits:

  • Legal verification - Get SIRET, TVA, legal form, registration dates
  • 📋 Status monitoring - Track active/closed status and closure dates
  • 🔄 Automated monitoring - Schedule runs to track company changes
  • 📊 Complete audit trail - All administrative information in structured format

ROI: Verify 500 companies in 10 minutes instead of hours of manual checking.

🔍 For CRM Enrichment

The Problem: CRM databases have incomplete company information. Manual enrichment is slow and expensive.

The Solution: Enrich existing company databases with official government data. Get complete profiles including directors, finances, and legal details.

Client Benefits:

  • 📊 Data completeness - Fill missing fields with official data
  • 👥 Director information - Add decision-makers to CRM records
  • 💰 Financial data - Enrich with revenue and financial results
  • 🔄 Automated updates - Keep CRM data fresh with scheduled runs

ROI: Enrich 10,000 CRM records in 1 hour instead of weeks of manual work.


📈 Concrete Results: Before vs. After

Before (without the scraper)

  • ⏱️ Time required: 2-4 hours per 100 companies manually
  • Limited data: Only basic information from search results
  • No enrichment: Directors and financial data require separate research
  • Pagination issues: Manual clicking through pages
  • No automation: Can't schedule or automate searches
  • Error-prone: Manual copy-paste leads to mistakes

After (with the scraper)

  • Time saved: 10-100x faster - 10,000 companies in 30-60 minutes
  • Complete data: All fields extracted automatically
  • Automatic enrichment: Directors, finances, legal details included
  • Automatic pagination: Handles up to 1000 pages automatically
  • Fully automated: Schedule runs, API access, integrations
  • Fewer errors: Structured data removes manual copy-paste mistakes

Time saved: 95-99% reduction in data collection time
Speed: Extract 1,000 companies in 5-10 minutes
Quality improvement: 100% structured data, no manual errors


💰 Costs and Optimization

💵 Actual Cost Breakdown

| Service | Usage | Cost |
| --- | --- | --- |
| Actor compute | 1,000 companies (10 min) | ~$0.10 |
| Dataset writes | 1,000 items | ~$0.01 |
| Total per 1,000 companies | | ~$0.11 |

Cost per 10,000 companies: ~$1.10
Cost per 100,000 companies: ~$11.00
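
The scaling above is linear, which the following small Python sketch makes explicit. It assumes the ~$0.11 per 1,000 companies figure holds at any volume and covers usage only, not the monthly subscription; `estimate_usage_cost` is a hypothetical helper name.

```python
# Usage cost per 1,000 companies: compute (~$0.10) + dataset writes (~$0.01).
COST_PER_1000_COMPANIES = 0.11


def estimate_usage_cost(companies: int) -> float:
    """Approximate usage cost in USD, assuming cost scales linearly."""
    return round(companies / 1000 * COST_PER_1000_COMPANIES, 2)


for volume in (1_000, 10_000, 100_000):
    print(volume, estimate_usage_cost(volume))
```

Actual costs depend on run duration, so enrichment settings and result counts per page will shift these estimates.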

💡 Optimization Tips

  • Disable enrichment (enrichData: false) for faster scraping if you only need basic API data
  • Use per_page: 25 for faster scraping when dealing with large result sets
  • Set maxResults to limit results for testing or specific use cases
  • Use URL input - fastest way to configure searches (just paste Pappers.fr URL)

📋 Complete Data Fields Extracted

| Category | Field Name | Description | Example | Source |
| --- | --- | --- | --- | --- |
| Company Identification | siren | SIREN number (9 digits) | "412730269" | API |
| | nom_complet | Complete company name | "ARINFO I-MAGINER" | API |
| | nom_raison_sociale | Legal company name | "ARINFO I-MAGINER" | API |
| Location | adresse | Complete address | "1-5 1 RUE EMILE MASSON 44000 NANTES" | API/Enriched |
| | code_postal | Postal code | "44000" | API |
| | ville | City name | "NANTES" | API |
| | departement | Department code | "44" | API |
| | region | Region code | "52" | API |
| | latitude | Latitude of headquarters | "47.213863" | API |
| | longitude | Longitude of headquarters | "-1.546537" | API |
| Activity | activite_principale | NAF/APE activity code | "58.29C" | API |
| | libelle_activite_principale | Activity label | "Édition de logiciels applicatifs" | API/Enriched |
| Administrative | etat_administratif | Status (A=Active, C=Closed) | "A" | API |
| | date_creation | Creation date | "1997-06-25" | API |
| | date_fermeture | Closure date (if applicable) | "" | API |
| Establishments | nombre_etablissements | Total establishments | 25 | API |
| | nombre_etablissements_ouverts | Open establishments | 8 | API |
| Company Category | categorie_entreprise | PME/ETI/GE | "PME" | API |
| | nature_juridique | Legal nature code | "5710" | API |
| Financial Data | chiffre_affaires | Revenue (latest year) | 3049771 | API |
| | resultat_net | Net result (latest year) | -120060 | API |
| | annee_finances | Financial year | "2021" | API |
| Legal Information (Enriched) | siret_siege | SIRET of headquarters | "41273026900179" | Enriched |
| | tva_intracommunautaire | VAT number | "FR78412730269" | Enriched |
| | forme_juridique | Legal form | "SAS, société par actions simplifiée" | Enriched |
| Employee Data (Enriched) | effectif_salarie | Employee count range | "50 à 99 salariés, en 2023" | Enriched |
| | taille_structure | Company size category | "Petite ou Moyenne Entreprise (PME), en 2023" | Enriched |
| Administrative Dates (Enriched) | date_inscription_insee | Insee registration date | "25/06/1997" | Enriched |
| | date_immatriculation_rne | RNE registration date | "25/06/1997" | Enriched |
| Directors (Enriched) | dirigeants_array | Array of directors (up to 5) | [{"nom": "Philippe PERES", "role": "PRESIDENT DE SAS"}] | Enriched |
| | dirigeants | Directors as text | "Philippe PERES (PRESIDENT DE SAS)" | Enriched |
| | dirigeant_1 to dirigeant_5 | Individual director names | "Philippe PERES" | Enriched |
| | role_1 to role_5 | Individual director roles | "PRESIDENT DE SAS" | Enriched |
| Additional Details (Enriched) | adresse_postale | Complete postal address | "1-5 1 RUE EMILE MASSON 44000 NANTES" | Enriched |
| | convention_collective | Collective agreement | "IDCC 1486" | Enriched |

💡 How to Use the Data

📊 Data Analysis

  • Market research: Analyze company distribution by sector, region, or size
  • Competitive analysis: Identify competitors by activity code and location
  • Trend analysis: Track company creation/closure trends over time

🎯 Lead Generation

  • Targeted prospecting: Filter by activity, region, revenue to find ideal clients
  • Director outreach: Use director information for personalized sales emails
  • Geographic targeting: Find companies in specific regions or near locations

🔄 CRM Integration

  • Data enrichment: Fill missing fields in CRM with official government data
  • Automated updates: Schedule runs to keep CRM data fresh
  • Data validation: Verify company information against official sources

⚖️ Compliance & Due Diligence

  • Legal verification: Check SIRET, TVA, legal form, registration dates
  • Status monitoring: Track active/closed status and closure dates
  • Audit trails: Maintain complete administrative information records
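
As an illustration of the legal verification step, the Python sketch below cross-checks the identifiers in a record using the standard rules for French identifiers: SIREN and SIRET numbers follow the Luhn checksum, a SIRET starts with its SIREN, and the French VAT key is (12 + 3 × (SIREN mod 97)) mod 97. The function names are hypothetical, and these checks are not something the Actor is documented to perform; a few special-case entities are known not to follow the Luhn rule, so a failed check is a flag for review rather than proof of an error.

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum, as used by SIREN (9 digits) and SIRET (14 digits)."""
    if not number.isdigit():
        return False
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def expected_vat_number(siren: str) -> str:
    """French intra-community VAT number derived from a SIREN."""
    key = (12 + 3 * (int(siren) % 97)) % 97
    return f"FR{key:02d}{siren}"


def check_identifiers(record: dict) -> dict:
    """Cross-check siren, siret_siege and tva_intracommunautaire."""
    siren = record.get("siren", "")
    siret = record.get("siret_siege", "")
    vat = record.get("tva_intracommunautaire", "")
    siren_ok = len(siren) == 9 and luhn_valid(siren)
    return {
        "siren_ok": siren_ok,
        "siret_ok": len(siret) == 14 and siret.startswith(siren) and luhn_valid(siret),
        "vat_ok": siren_ok and vat == expected_vat_number(siren),
    }


sample = {
    "siren": "412730269",
    "siret_siege": "41273026900179",
    "tva_intracommunautaire": "FR78412730269",
}
print(check_identifiers(sample))
```

The sample values are the ones used in this README's example record, and all three checks pass for them.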

📖 Input Configuration

The easiest way to use this Actor: Simply paste a URL from Pappers.fr or data.gouv.fr. The Actor will automatically parse the URL and extract all search parameters. No need to manually configure individual parameters!

Example with Pappers URL:

{
  "url": "https://www.pappers.fr/recherche?en_activite=true&siege=true&region=76&activite=47.91B",
  "enrichData": true,
  "maxResults": 100
}

Example with data.gouv URL:

{
  "url": "https://recherche-entreprises.api.gouv.fr/search?q=&activite_principale=58.29C&etat_administratif=A&ca_min=100000&ca_max=10000000",
  "enrichData": true
}

The Actor automatically:

  • Detects if the URL is from Pappers.fr or data.gouv.fr
  • Parses all search parameters from the URL
  • Converts date formats (DD-MM-YYYY from Pappers to YYYY-MM-DD)
  • Ignores all individual search parameters (query, activite, date_creation_min, etc.) if a URL is provided
  • Keeps only configuration parameters (enrichData, maxResults, per_page, page) that you can still override

Important: When you provide a URL, all individual search parameters (query, activite, date_creation_min, date_creation_max, chiffre_affaires_min, chiffre_affaires_max, code_postal, departement, region, etc.) are ignored. Only the search parameters from the URL are used. However, you can still provide configuration parameters like enrichData, maxResults, per_page, and page to customize the scraping behavior.
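
The precedence rule described above can be sketched in Python as follows. This is an illustrative model only: `resolve_input` and `CONFIG_KEYS` are hypothetical names, the sketch keeps URL parameters under their original names, and it does not attempt the mapping from Pappers.fr parameter names to API parameter names that the Actor performs.

```python
from urllib.parse import parse_qsl, urlparse

# Configuration parameters that stay effective even when a URL is given.
CONFIG_KEYS = {"enrichData", "maxResults", "per_page", "page"}


def resolve_input(actor_input: dict) -> dict:
    """Split an input into search parameters and configuration parameters.

    If "url" is present, search parameters come from the URL only and any
    individual search parameters in the input are ignored.
    """
    config = {k: v for k, v in actor_input.items() if k in CONFIG_KEYS}
    url = actor_input.get("url")
    if url:
        parsed = urlparse(url)
        host = parsed.hostname or ""
        source = "pappers" if host.endswith("pappers.fr") else "data.gouv"
        search = dict(parse_qsl(parsed.query, keep_blank_values=True))
    else:
        source = "manual"
        search = {k: v for k, v in actor_input.items()
                  if k not in CONFIG_KEYS and k != "url"}
    return {"source": source, "search": search, "config": config}


resolved = resolve_input({
    "url": "https://www.pappers.fr/recherche?en_activite=true&siege=true&region=76&activite=47.91B",
    "activite": "58.29C",   # ignored because a URL is provided
    "maxResults": 100,      # kept as configuration
})
print(resolved)
```

In this example the `activite` value from the URL (47.91B) wins over the individually supplied one, while `maxResults` is preserved.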

🔍 Search Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| searchType | string | "textual" | Type of search: "textual" or "geographic" |
| query | string | "" | Search terms (company name, address, directors, elected officials). Can be empty if using filters only. |
| en_activite | boolean | - | Filter by administrative status: true for active companies, false for closed companies |
| activite | string | - | NAF/APE activity code (e.g., "58.29C" for software publishing) |
| date_creation_min | string | - | Minimum creation date (format: "YYYY-MM-DD" or "DD-MM-YYYY") |
| date_creation_max | string | - | Maximum creation date (format: "YYYY-MM-DD" or "DD-MM-YYYY") |
| chiffre_affaires_min | number | - | Minimum revenue (chiffre d'affaires) in euros |
| chiffre_affaires_max | number | - | Maximum revenue (chiffre d'affaires) in euros |
| code_postal | string | - | Filter by postal code |
| departement | string | - | Filter by department code (e.g., '75' for Paris) |
| region | string | - | Filter by region code. When combined with siege=true, filters only companies whose headquarters (siège social) is in this region. |
| siege | boolean | - | When true and combined with region, filters only companies whose headquarters (siège social) is in the specified region. This ensures that only companies with their main office in the region are returned, not companies that just have establishments in the region. |

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| latitude | number/string | - | Latitude for geographic search (required for geographic search type). Valid range: -90 to 90. Example: 48.8566 |
| longitude | number/string | - | Longitude for geographic search (required for geographic search type). Valid range: -180 to 180. Example: 2.3522 |
| radius | number | 5 | Search radius in kilometers (max 50km) |
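
Before submitting a geographic search, the documented ranges can be checked locally. The Python sketch below is an assumption about how such validation could look, with the hypothetical name `validate_geographic_input`; it accepts numbers or numeric strings, as the parameter types above allow.

```python
def validate_geographic_input(latitude, longitude, radius=5) -> dict:
    """Validate coordinates and radius against the documented ranges."""
    lat, lon, rad = float(latitude), float(longitude), float(radius)
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    if not 0 < rad <= 50:
        raise ValueError(f"radius must be in (0, 50] km: {rad}")
    return {
        "searchType": "geographic",
        "latitude": lat,
        "longitude": lon,
        "radius": rad,
    }


# Central Paris, 10 km radius, latitude given as a string.
print(validate_geographic_input("48.8566", 2.3522, 10))
```

The returned dictionary can be merged with other input fields such as `en_activite` or `enrichData`.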

📄 Pagination

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| page | number | 1 | Page number to start from |
| per_page | number | 10 | Number of results per page (max 25). Higher values = faster scraping but more memory usage. |
| maxResults | number | - | Maximum total results to fetch (optional, leave empty to fetch all results). Useful for testing or limiting large datasets. |
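
To see how these parameters interact, the following Python sketch estimates how many pages a search needs. It assumes the limits stated in this README (at most 25 results per page, at most 1000 pages) and a search starting at page 1; `pages_to_fetch` is a hypothetical helper, not part of the Actor.

```python
import math

MAX_PAGES = 1000     # page limit stated in this README
MAX_PER_PAGE = 25    # per_page limit stated in this README


def pages_to_fetch(total_results: int, per_page: int = 10, max_results=None) -> int:
    """Number of pages needed, starting from page 1."""
    if not 1 <= per_page <= MAX_PER_PAGE:
        raise ValueError(f"per_page must be between 1 and {MAX_PER_PAGE}")
    wanted = total_results if max_results is None else min(total_results, max_results)
    return min(math.ceil(wanted / per_page), MAX_PAGES)


print(pages_to_fetch(1000, per_page=10, max_results=100))  # 10
print(pages_to_fetch(101, per_page=25))                    # 5
print(pages_to_fetch(1_000_000, per_page=25))              # 1000
```

If both limits apply as stated, a single search tops out at 1000 × 25 = 25,000 results, so larger extractions would need to be split into several narrower searches.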

✨ Data Enrichment

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| enrichData | boolean | true | Enable comprehensive data enrichment from annuaire-entreprises.data.gouv.fr. When enabled, fetches: (1) Financial data: Latest year revenue (chiffre d'affaires), net result (résultat net), financial year; (2) Directors: Up to 5 directors with names and roles (President, Director, Manager), excluding auditors; (3) Company details: SIRET siège, TVA Intracommunautaire, Legal form, Employee count, Company size (PME/ETI/GE), Collective agreements, Registration dates (Insee/RNE), Complete address, Activity label, NAF/APE code. Note: Enrichment increases scraping time but provides much more detailed data. |

🚀 Installation and Usage

Local Installation

cd recherche-entreprises-scraper
npm install
npm start

The Actor will read from input.json in the local directory. Results will be saved to output.csv and storage/datasets/default/.

Apify Platform

  1. Push the Actor: apify push
  2. Configure input in the Apify Console (or use the URL input method)
  3. Run the Actor
  4. Download results from the Dataset tab (JSON, CSV, Excel, HTML formats available)
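
The same run can also be started programmatically. The Python sketch below builds a request for the Apify API endpoint that runs an Actor synchronously and returns its dataset items. The Actor ID `username~recherche-entreprises-scraper` and the token are placeholders, not values taken from this README; substitute the real ID shown in Apify Console. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> Request:
    """Build a POST request that runs an Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "username~recherche-entreprises-scraper",  # placeholder Actor ID
    "YOUR_APIFY_TOKEN",                        # placeholder token
    {"activite": "58.29C", "en_activite": True, "maxResults": 100},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

A synchronous call suits short runs such as a `maxResults`-limited test; for long extractions, starting the run asynchronously and reading the dataset afterwards is likely the safer pattern.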

Example Inputs

URL Input (Recommended):

{
  "url": "https://www.pappers.fr/recherche?en_activite=true&siege=true&region=76&activite=47.91B",
  "enrichData": true,
  "maxResults": 100
}

Manual Parameters:

{
  "searchType": "textual",
  "query": "",
  "en_activite": true,
  "activite": "58.29C",
  "date_creation_min": "1990-12-05",
  "date_creation_max": "2012-12-05",
  "chiffre_affaires_min": 100000,
  "chiffre_affaires_max": 10000000,
  "per_page": 25,
  "enrichData": true
}

Geographic Search:

{
  "searchType": "geographic",
  "latitude": 48.8566,
  "longitude": 2.3522,
  "radius": 10,
  "en_activite": true,
  "per_page": 25,
  "enrichData": true
}

📊 Output Format

The Actor returns company data in JSON format. Each result contains detailed information about the company.

Example Output

{
  "siren": "412730269",
  "nom_complet": "ARINFO I-MAGINER",
  "nom_raison_sociale": "ARINFO I-MAGINER",
  "date_creation": "1997-06-25",
  "etat_administratif": "A",
  "activite_principale": "58.29C",
  "libelle_activite_principale": "Édition de logiciels applicatifs",
  "adresse": "1-5 1 RUE EMILE MASSON 44000 NANTES",
  "code_postal": "44000",
  "ville": "NANTES",
  "departement": "44",
  "region": "52",
  "nombre_etablissements": 25,
  "nombre_etablissements_ouverts": 8,
  "siret_siege": "41273026900179",
  "tva_intracommunautaire": "FR78412730269",
  "forme_juridique": "SAS, société par actions simplifiée",
  "effectif_salarie": "50 à 99 salariés, en 2023",
  "taille_structure": "Petite ou Moyenne Entreprise (PME), en 2023",
  "convention_collective": "IDCC 1486",
  "date_inscription_insee": "25/06/1997",
  "date_immatriculation_rne": "25/06/1997",
  "nature_juridique": "5710",
  "categorie_entreprise": "PME",
  "latitude": "47.213863",
  "longitude": "-1.546537",
  "chiffre_affaires": 3049771,
  "resultat_net": -120060,
  "annee_finances": "2021",
  "dirigeants_array": [
    {
      "nom": "Philippe PERES",
      "role": "PRESIDENT DE SAS",
      "date_naissance": ""
    }
  ],
  "dirigeants": "Philippe PERES (PRESIDENT DE SAS)",
  "dirigeant_1": "Philippe PERES",
  "role_1": "PRESIDENT DE SAS"
}
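
The relationship between `dirigeants_array`, `dirigeants`, and the numbered `dirigeant_N` / `role_N` fields can be reproduced with a short Python sketch. It is a reconstruction from the example output, not the Actor's code: `flatten_directors` is a hypothetical name, and the ", " separator used when several directors are present is an assumption, since the example shows only one director.

```python
MAX_DIRECTORS = 5  # the README states that up to 5 directors are kept


def flatten_directors(directors: list) -> dict:
    """Derive the text and numbered fields from a list of director dicts."""
    kept = directors[:MAX_DIRECTORS]
    flat = {
        "dirigeants_array": kept,
        # Separator between directors is assumed, not documented.
        "dirigeants": ", ".join(f"{d['nom']} ({d['role']})" for d in kept),
    }
    for position, director in enumerate(kept, start=1):
        flat[f"dirigeant_{position}"] = director["nom"]
        flat[f"role_{position}"] = director["role"]
    return flat


example = [{"nom": "Philippe PERES", "role": "PRESIDENT DE SAS", "date_naissance": ""}]
print(flatten_directors(example))
```

The numbered columns are convenient for CSV or spreadsheet exports, while the array form is easier to process in code.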

💡 Tips for Best Results

  • URL Input - The easiest way to use this Actor is to copy-paste a URL from Pappers.fr or data.gouv.fr. The Actor will automatically parse all parameters and reproduce the exact same search.
  • Date format - You can use either "YYYY-MM-DD" or "DD-MM-YYYY" format for dates. URLs from Pappers.fr use "DD-MM-YYYY" format which is automatically converted.
  • Activity codes - Use the full NAF/APE code (e.g., "58.29C") for precise filtering
  • Headquarters filtering - Use siege=true with region to get only companies with headquarters in the specified region
  • Rate limiting - The Actor automatically respects the API limit of 7 requests per second
  • Pagination - The Actor automatically fetches all pages (up to 1000) unless maxResults is specified
  • Date filtering - Creation-date filters are applied after results are fetched (post-processing), because the API doesn't support them directly
  • Empty query - You can use an empty query string ("") if you only want to filter by other parameters
  • Enrichment - Enable enrichData for comprehensive data including directors and legal information. Disable it for faster scraping if you only need basic company data.
  • Performance - Use per_page: 25 for faster scraping when dealing with large result sets

⚠️ API Limitations

  • Maximum 7 requests per second (automatically handled by the Actor)
  • Maximum 25 results per page
  • Some filters may not be available in the API and are handled post-processing
  • API may report limited total pages, but the Actor can fetch up to 1000 pages

This Actor uses the official French government API (recherche-entreprises.api.gouv.fr), which is publicly accessible, open, and free to use. The data enrichment feature uses the official annuaire-entreprises.data.gouv.fr website, which is a public service.


📞 Support

If you encounter any issues or have questions:

  • Check the Issues tab for known problems and solutions
  • Review the Input and Output tabs for detailed schema information
  • Contact support through Apify Console if you need additional help

We're always open to feedback and suggestions for improving the Actor. Don't hesitate to reach out if you have ideas or encounter any problems!

