# Pappers.fr Company Scraper (`epicscrapers/pappers-scraper`) Actor

Scrape French company data from Pappers.fr including SIREN numbers, legal forms, financial records, directors, and publications. Supports bulk searches by keyword or Pappers.fr URL filters for precise, targeted data extraction.

- **URL**: https://apify.com/epicscrapers/pappers-scraper.md
- **Developed by:** [Epic Scrapers](https://apify.com/epicscrapers) (community)
- **Categories:** Lead generation, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $2.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
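As a rough illustration of the listed rate, the sketch below estimates the cost of a run from its result count. It assumes every result is billed at the advertised starting price of $2.00 per 1,000 results; the function name `estimate_cost_usd` is hypothetical, and the real charge may differ depending on your subscription plan and on which events are billed.

```python
from decimal import Decimal

# Advertised starting price from the listing; the actual price may vary by plan.
PRICE_PER_1000_RESULTS_USD = Decimal("2.00")


def estimate_cost_usd(result_count: int) -> Decimal:
    """Estimate the cost in USD for a given number of results (assumed rate)."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return PRICE_PER_1000_RESULTS_USD * result_count / 1000


print(estimate_cost_usd(5000))
```

Using `Decimal` keeps the arithmetic exact for currency amounts.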

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Pappers.fr Company Scraper

Scrape **French company data from Pappers.fr** — the comprehensive French business directory. This Apify Actor lets you search and extract detailed company profiles including legal information, financial records, directors, and publications using either plain text queries or the same rich filters available on pappers.fr/recherche. Whether you're building a lead list, monitoring competitors, or analyzing the French business landscape, this tool turns Pappers.fr's public data into structured, downloadable datasets.

### What can the Pappers.fr Company Scraper do?

- 🔍 **Search by keyword** — Enter company names, industries, or any search term to find matching French companies
- 🌐 **Use Pappers.fr URL filters** — Build advanced searches on pappers.fr/recherche (by department, city, NAF code, legal status, and more), copy the URL, and paste it in
- 📋 **Search across multiple bases** — Find results in companies, directors, and publications simultaneously
- 📈 **Unlimited results** — Automatically paginates through all available results for comprehensive data collection
- 🎯 **Precision control** — Choose between standard and strict search modes to fine-tune your results
- 🔄 **Multiple queries in one run** — Submit several searches at once and get merged results in a single dataset

### What data can you extract from Pappers.fr?

Each company result includes a rich profile with dozens of data fields. Here are the key categories:

| Data Category | Examples |
|---|---|
| **Identifiers** | SIREN, SIRET, SIREN format, diffusable status |
| **Company Name** | Legal denomination, trade name, brand name |
| **Legal Structure** | Legal form, status (active/ceased), registration date |
| **Address** | Street address, postal code, city, country |
| **Financial Data** | Revenue, profit, employee count, financial year |
| **Management** | Directors, their roles, addresses, birth details |
| **Establishments** | All establishment locations with SIRETs, activities, statuses |
| **Publications** | Legal publications (JAL, BODACC, announcements) |
| **Activity Codes** | NAF/APE code, activity description |
| **Contact** | Phone, email, website URL |
| **Corporate Actions** | Mergers, acquisitions, transfers, dissolutions |

### How to scrape Pappers.fr with the Pappers.fr Company Scraper

Using this Actor is straightforward — no coding required.

1. **Open the Actor** on Apify Console
2. **Enter your search criteria** — either paste one or more Pappers.fr search URLs, type plain text queries, or both
3. **Configure options** — adjust search precision and choose which bases to search (companies, directors, publications)
4. **Set your result limit** — choose how many results to collect per query (or select unlimited)
5. **Click Run** — the Actor fetches all matching company profiles automatically
6. **Download your data** — export results as JSON, CSV, Excel, or HTML from the Apify dataset

#### Pro tip 💡

For the most targeted searches, use the Pappers.fr website to build your filters visually — select a department, an industry (NAF code), legal status, and more — then copy the resulting URL and paste it into the **Search URLs** field. The Actor preserves all filter parameters from the URL.

### Input

The Actor accepts two complementary input modes:

- **Search URLs** — Paste URLs copied directly from pappers.fr/recherche. Every filter parameter (departement, commune, code_naf, etat, etc.) is automatically included in the API request.
- **Search Queries** — Plain text keywords to search across company names, directors, and publications.

Three additional settings control each run:

- **Max Results** — Set a limit per query/URL, or leave at 0 for unlimited results.
- **Precision** — Standard (broader matches) or Strict (exact matches only). Applies to queries only; URLs carry their own parameters.
- **Bases** — Choose which sections to search: entreprises, dirigeants, publications. Applies to queries only.

You can use one mode or both in the same run.
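The input options above can be assembled into an Actor input object. The following is a minimal sketch, not part of the Actor itself: `build_input` is a hypothetical helper, the example search URL and its filter values are illustrative only, and the validation simply mirrors the documented option values.

```python
ALLOWED_PRECISIONS = {"standard", "strict"}
ALLOWED_BASES = {"entreprises", "dirigeants", "publications"}


def build_input(search_urls=None, queries=None, max_results=0,
                precision="standard",
                bases=("entreprises", "dirigeants", "publications")):
    """Build an Actor input dict, checking values against the documented options."""
    if max_results < 0:
        raise ValueError("maxResults must be 0 (unlimited) or a positive integer")
    if precision not in ALLOWED_PRECISIONS:
        raise ValueError(f"precision must be one of {sorted(ALLOWED_PRECISIONS)}")
    unknown = [base for base in bases if base not in ALLOWED_BASES]
    if unknown:
        raise ValueError(f"unknown bases: {unknown}")
    return {
        "searchUrls": list(search_urls or []),
        "queries": list(queries or []),
        "maxResults": max_results,
        "precision": precision,
        # The Actor expects bases as a single comma-separated string.
        "bases": ",".join(bases),
    }


# Illustrative values only: the URL filters below are hypothetical examples.
actor_input = build_input(
    search_urls=["https://www.pappers.fr/recherche?departement=75&code_naf=77.11Z"],
    queries=["boulangerie"],
    max_results=100,
    precision="strict",
    bases=["entreprises"],
)
print(actor_input)
```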

### Output

Results are stored in Apify's default dataset. Each record is a full company profile from Pappers.fr. Here's a simplified example:

```json
{
  "siren": "503826968",
  "nom_entreprise": "AMD RENTAL",
  "denomination": "AMD RENTAL",
  "forme_juridique": "SARL",
  "forme_juridique_code": "5499",
  "capital": 10000,
  "date_creation": "2008-01-15",
  "date_radiation": null,
  "etat_administratif": "A",
  "etat_administratif_libelle": "Active",
  "code_naf": "77.11Z",
  "libelle_activite": "Location et location-bail d'automobiles et d'autres véhicules automobiles légers",
  "adresse": {
    "ligne": "12 Rue de la Paix",
    "code_postal": "75002",
    "ville": "Paris",
    "pays": "France"
  },
  "dirigeants": [
    {
      "nom": "Dupont",
      "prenom": "Jean",
      "fonction": "Gérant"
    }
  ],
  "etablissements": [
    {
      "siret": "50382696800015",
      "adresse": "12 Rue de la Paix, 75002 Paris",
      "actif": true
    }
  ]
}
````

Results can be exported in **JSON, CSV, Excel, or HTML** formats directly from the Apify platform.
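For downstream use such as a lead list, a nested record can be reduced to a flat summary. The sketch below assumes records follow the simplified example above; `summarize_company` is a hypothetical helper, and treating `etat_administratif == "A"` as "active" is inferred from that example rather than from a documented code list.

```python
def summarize_company(record: dict) -> dict:
    """Reduce a company record to a few flat fields, tolerating missing keys."""
    address = record.get("adresse") or {}
    directors = record.get("dirigeants") or []
    return {
        "siren": record.get("siren"),
        "name": record.get("nom_entreprise") or record.get("denomination"),
        "city": address.get("ville"),
        # Assumption: "A" marks an active company, as in the sample record.
        "active": record.get("etat_administratif") == "A",
        "directors": [
            f"{d.get('prenom', '')} {d.get('nom', '')}".strip() for d in directors
        ],
    }


# Trimmed copy of the sample record shown above.
sample = {
    "siren": "503826968",
    "nom_entreprise": "AMD RENTAL",
    "etat_administratif": "A",
    "adresse": {"code_postal": "75002", "ville": "Paris"},
    "dirigeants": [{"nom": "Dupont", "prenom": "Jean", "fonction": "Gérant"}],
}
print(summarize_company(sample))
```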

### Is it legal to scrape Pappers.fr?

The Pappers.fr Company Scraper uses **Pappers.fr's official public API** to retrieve data. The data extracted is **publicly available company information** that is already accessible through Pappers.fr's search interface. All data is sourced from French official registries (INSEE, Registre du Commerce et des Sociétés) and is legally diffusable.

This Actor does not access private or non-diffusable information. It is designed to collect only the **publicly listed company data** that Pappers.fr makes available through its API.

As with any data collection tool, you should review Pappers.fr's terms of service and ensure your use case complies with applicable regulations, including the French data protection framework.

### Why use the Pappers.fr Company Scraper instead of the Pappers.fr website?

- **Bulk extraction** — Collect hundreds or thousands of company profiles without manual copy-pasting
- **Structured data** — Get clean JSON records ready for analysis, databases, or spreadsheets
- **Scheduled runs** — Use Apify's scheduling to monitor companies or directories on a recurring basis
- **API access** — Trigger runs programmatically via Apify's REST API and integrate with your workflows
- **Multiple queries** — Run many searches in one go with merged results
- **Unlimited pagination** — Automatically fetch all results beyond what a single page shows

### FAQ

#### Can I search for companies in a specific French city or department?

Yes. Build your search on pappers.fr/recherche with the desired filters (city, department, NAF code, etc.), copy the URL, and paste it into the **Search URLs** field. All filters are preserved.

#### Does this work for individual entrepreneurs (auto-entrepreneurs)?

Yes. The Pappers.fr API returns both legal entities (SARL, SAS, SA, etc.) and individual entrepreneurs identified by their last name, first name, and personal details.

#### Can I get financial data for French companies?

Yes. When available from Pappers.fr, the Actor returns financial data including revenue, profit, employee headcount, and financial year information for each company.

#### How many results can I get?

You can set a custom limit per query or choose unlimited results. The Actor automatically paginates through all available results using Pappers.fr's API.

#### What is a SIREN number?

A SIREN is a 9-digit unique identifier assigned to every French company by INSEE. It is the primary key for French company data and is included in every result from this Actor.
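If you post-process results, a quick format check on the SIREN can catch malformed values. The sketch below verifies the 9-digit format and, as an addition not mentioned above, applies the Luhn checksum that SIREN numbers are generally expected to satisfy; `is_valid_siren` is a hypothetical helper name.

```python
def is_valid_siren(value: str) -> bool:
    """Return True if value is 9 ASCII digits that pass the Luhn checksum."""
    if len(value) != 9 or not (value.isascii() and value.isdigit()):
        return False
    total = 0
    # Luhn: starting from the rightmost digit, double every second digit.
    for index, char in enumerate(reversed(value)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_siren("503826968"))  # SIREN from the sample output record
```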

### Related Actors

Explore more data extraction tools on the Apify Store to complement your data workflows.

### Support

If you encounter any issues, have feature requests, or need help with your specific use case, reach out through the Apify platform's contact channels. Feedback helps make this Actor better for everyone.

# Actor input Schema

## `searchUrls` (type: `array`):

One or more Pappers.fr search URLs. Build your filters on pappers.fr/recherche, copy the URL, and paste it here. All URL filter parameters (departement, commune, code\_naf, etat, etc.) are passed directly to the API.

The maxResults setting applies to each URL independently, just like queries.

URLs are processed first, then queries below.

## `queries` (type: `array`):

One or more search terms to query Pappers.fr. Each query will be searched independently and results merged. Uses the precision and bases settings below.
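Because each query is searched independently and the results are merged, overlapping queries could plausibly return the same company more than once; whether the Actor deduplicates is not stated here. A minimal client-side sketch, assuming each item carries a `siren` field as in the sample output (`dedupe_by_siren` is a hypothetical helper):

```python
def dedupe_by_siren(items):
    """Keep the first item seen for each SIREN; items without a SIREN are kept."""
    seen = set()
    unique = []
    for item in items:
        siren = item.get("siren")
        if siren is None:
            unique.append(item)
        elif siren not in seen:
            seen.add(siren)
            unique.append(item)
    return unique


merged = [
    {"siren": "503826968", "nom_entreprise": "AMD RENTAL"},
    {"siren": "503826968", "nom_entreprise": "AMD RENTAL"},
    {"nom_entreprise": "No identifier"},
]
print(len(dedupe_by_siren(merged)))
```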

## `maxResults` (type: `integer`):

Maximum results to collect per query or per URL. Set to 0 for unlimited results.

Pagination is automatic:

- 0 (unlimited): cursor-based pagination (slower but no hard limit)
- 1–400: page-based pagination (faster)
- 400+: cursor-based pagination
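The pagination rule above can be restated as a small function. This is only a sketch of the documented behavior, not the Actor's actual code; since the two ranges overlap at exactly 400, it assumes 400 falls in the page-based range.

```python
def pagination_mode(max_results: int) -> str:
    """Return the pagination strategy described for a given maxResults value."""
    if max_results < 0:
        raise ValueError("maxResults must be 0 or greater")
    # 0 means unlimited; anything above 400 also uses cursor-based pagination.
    if max_results == 0 or max_results > 400:
        return "cursor"
    # Assumption: the boundary value 400 is still page-based.
    return "page"


for value in (0, 50, 400, 1000):
    print(value, pagination_mode(value))
```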

## `precision` (type: `string`):

Search precision mode (used for queries, not URLs — URLs pass their own parameters).

## `bases` (type: `string`):

Comma-separated list of bases to search in (entreprises, dirigeants, publications). Used for queries, not URLs.

## Actor input object example

```json
{
  "searchUrls": [],
  "queries": [
    "hello"
  ],
  "maxResults": 0,
  "precision": "standard",
  "bases": "entreprises,dirigeants,publications"
}
```

# Actor output Schema

## `companies` (type: `string`):

Full company profile data from Pappers.fr, including SIREN, legal form, address, financials, directors, and more. Each item is one company result matching the search criteria.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchUrls": [],
    "queries": [
        "hello"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("epicscrapers/pappers-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchUrls": [],
    "queries": ["hello"],
}

# Run the Actor and wait for it to finish
run = client.actor("epicscrapers/pappers-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchUrls": [],
  "queries": [
    "hello"
  ]
}' |
apify call epicscrapers/pappers-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=epicscrapers/pappers-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Pappers.fr Company Scraper",
        "description": "Scrape French company data from Pappers.fr including SIREN numbers, legal forms, financial records, directors, and publications. Supports bulk searches by keyword or Pappers.fr URL filters for precise, targeted data extraction.",
        "version": "0.0",
        "x-build-id": "35Z0gbuKmOsqq3fxe"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/epicscrapers~pappers-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-epicscrapers-pappers-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/epicscrapers~pappers-scraper/runs": {
            "post": {
                "operationId": "runs-sync-epicscrapers-pappers-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/epicscrapers~pappers-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-epicscrapers-pappers-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchUrls": {
                        "title": "Search URLs (from pappers.fr)",
                        "type": "array",
                        "description": "One or more Pappers.fr search URLs. Build your filters on pappers.fr/recherche, copy the URL, and paste it here. All URL filter parameters (departement, commune, code_naf, etat, etc.) are passed directly to the API.\n\nThe maxResults setting applies to each URL independently, just like queries.\n\nURLs are processed first, then queries below.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "queries": {
                        "title": "Search Queries",
                        "type": "array",
                        "description": "One or more search terms to query Pappers.fr. Each query will be searched independently and results merged. Uses the precision and bases settings below.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxResults": {
                        "title": "Max Results Per Query/URL",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum results to collect per query or per URL. Set to 0 for unlimited results.\n\nPagination is automatic:\n- 0 (unlimited): cursor-based pagination (slower but no hard limit)\n- 1–400: page-based pagination (faster)\n- 400+: cursor-based pagination",
                        "default": 0
                    },
                    "precision": {
                        "title": "Search Precision",
                        "enum": [
                            "standard",
                            "strict"
                        ],
                        "type": "string",
                        "description": "Search precision mode (used for queries, not URLs — URLs pass their own parameters).",
                        "default": "standard"
                    },
                    "bases": {
                        "title": "Search Bases",
                        "type": "string",
                        "description": "Comma-separated list of bases to search in (entreprises, dirigeants, publications). Used for queries, not URLs.",
                        "default": "entreprises,dirigeants,publications"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
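For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be used with only the Python standard library. The sketch below builds the request without sending it; `build_run_sync_request` is a hypothetical helper, and it passes the token as a query parameter as the specification describes.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/epicscrapers~pappers-scraper/run-sync-get-dataset-items"


def build_run_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_sync_request("<YOUR_API_TOKEN>", {"queries": ["hello"]})
print(request.get_method(), request.full_url)

# To actually run the Actor, send the request with a real token:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Building the request separately from sending it makes the URL and payload easy to inspect before any billable run is started.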
