# Pages Jaunes Scraper — French Business Leads (`actose/pages-jaunes-scrapper`) Actor

Extract business leads from PagesJaunes.fr (French Yellow Pages). Get names, addresses, postal codes, ratings, categories, and more, ready for B2B prospecting in France. Pay only $0.99 per 1,000 leads. 1,000 free results to try. 40+ professions, all French cities.

- **URL**: https://apify.com/actose/pages-jaunes-scrapper.md
- **Developed by:** [Actose](https://apify.com/actose) (community)
- **Categories:** Lead generation, Automation, E-commerce
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

From $0.99 / 1,000 business leads extracted

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
The word "Actor" is always written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Pages Jaunes Scraper — Extract French Business Leads from PagesJaunes.fr

> 💰 **$0.99 per 1,000 leads** — Pay only for the data you actually extract. Cancel anytime.

Extract thousands of **French business leads** from PagesJaunes.fr — the #1 French Yellow Pages directory — in minutes. Build prospection lists, enrich your CRM, research competitors, or power your lead-gen pipeline with clean, structured business data.

Works across **all of France**: Paris, Lyon, Marseille, Toulouse, Bordeaux, Lille, Nantes, and 30,000+ cities.

---

### ⚡ Why this scraper?

- ✅ **Pay-per-result pricing** — you only pay when you get actual leads ($0.99 / 1000)
- ✅ **No setup** — enter keywords + cities, click Start, get a CSV
- ✅ **Structured address** — street, postal code, city, department split into separate fields (ready for CRM import)
- ✅ **Category codes** — unique `codeRubrique` field for strict category filtering (no other PJ scraper does this)
- ✅ **Ratings included** — both Pages Jaunes and Google ratings when available
- ✅ **Fresh data** — every run pulls live from PagesJaunes, no stale cache
- ✅ **Fault-tolerant** — automatic fallback to text search when SEO URLs are empty

---

### 🚀 Quick start

#### Input

```json
{
  "keywords": ["plombier", "electricien", "boulangerie"],
  "locations": ["Paris", "Lyon", "Marseille"],
  "maxResultsPerSearch": 200,
  "maxResults": 2000,
  "maxConcurrency": 5
}
````

This will scrape up to **200 leads per (keyword × location)** pair — that's 9 combinations × 200 = up to 1,800 leads, capped globally at 2,000.
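The arithmetic above can be checked with a small helper. This is only a sketch: `estimate_max_leads` is a hypothetical name and not part of the Actor. It assumes the per-search cap applies to every keyword × location pair and that `maxResults` limits the total.

```python
def estimate_max_leads(keywords, locations, max_results_per_search, max_results):
    """Upper bound on leads for one run: pairs x per-search cap, limited by the global cap."""
    pairs = len(keywords) * len(locations)
    return min(pairs * max_results_per_search, max_results)


# Values from the quick-start input above.
upper_bound = estimate_max_leads(
    ["plombier", "electricien", "boulangerie"],
    ["Paris", "Lyon", "Marseille"],
    max_results_per_search=200,
    max_results=2000,
)
print(upper_bound)  # 1800
```

A run can return fewer leads than this bound when a search has fewer listings than the per-search cap.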

#### Output (per lead)

| Field | Example |
|---|---|
| `name` | `Boulangerie Emmanuel Martin` |
| `address` | `18 rue Lourmel 75015 Paris` |
| `street` | `18 rue Lourmel` |
| `postalCode` | `75015` |
| `city` | `Paris` |
| `department` | `75` |
| `description` | `Notre boulangerie vous accueille...` |
| `tags` | `["pain au levain", "pâtisserie sur commande", ...]` |
| `rating` | `4.5` |
| `reviewsCount` | `27` |
| `googleRating` | `4.7` |
| `googleReviews` | `134` |
| `codeRubrique` | `102140` |
| `numClient` | `03796328` |
| `detailUrl` | `https://www.pagesjaunes.fr/pros/03796328` |

Exports as **JSON, CSV, XLSX, HTML, XML** directly from the Apify dataset.
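If you fetch items through the API rather than downloading an export, you may want to flatten them yourself. The sketch below assumes each item is a dict with the field names from the table above and that `tags` is a list. `leads_to_csv` is a hypothetical helper; how the platform's own CSV export lays out `tags` is not covered here.

```python
import csv
import io

# Column names taken from the output table above.
FIELDS = [
    "name", "street", "postalCode", "city", "department",
    "rating", "reviewsCount", "codeRubrique", "numClient", "detailUrl", "tags",
]


def leads_to_csv(leads):
    """Serialize lead dicts to CSV text, joining the `tags` list into one cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for lead in leads:
        row = {field: lead.get(field, "") for field in FIELDS}
        row["tags"] = "; ".join(lead.get("tags") or [])
        writer.writerow(row)
    return buffer.getvalue()


# Sample record built from the example values in the table above.
sample = {
    "name": "Boulangerie Emmanuel Martin",
    "street": "18 rue Lourmel",
    "postalCode": "75015",
    "city": "Paris",
    "department": "75",
    "rating": 4.5,
    "reviewsCount": 27,
    "codeRubrique": "102140",
    "numClient": "03796328",
    "detailUrl": "https://www.pagesjaunes.fr/pros/03796328",
    "tags": ["pain au levain", "pâtisserie sur commande"],
}
print(leads_to_csv([sample]))
```

Keeping `postalCode`, `codeRubrique`, and `numClient` as strings preserves leading zeros when the file is later imported into a CRM.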

***

### 💡 Best practices

#### Use precise keywords for cleaner results

✅ Good keywords (return actual businesses):

- `boulangerie`, `plombier`, `electricien`, `chauffagiste`, `serrurier`
- `avocat`, `notaire`, `medecin-generaliste`, `dentiste`
- `restaurant`, `coiffeur`, `garage-automobile`, `agence-immobiliere`

❌ Avoid ambiguous keywords like `boulanger`, which also matches the electronics chain "Boulanger" and people named Boulanger.

#### Locations

Cities, arrondissements, departments, and regions all work:

- `Paris`, `Lyon`, `Marseille 7e`
- `75`, `69`, `13` (department codes)
- `Ile-de-France`, `Provence-Alpes-Cote-d-Azur`

#### Filter by strict category (advanced)

Every lead includes a `codeRubrique` field when available (official Pages Jaunes category code). Use it to filter in Excel/Python/SQL after extraction:

| Code | Category |
|---|---|
| `102140` | Boulangerie-Pâtisserie |
| `629620` | Plombier |
| `304040` | Électricien |
| `167560` | Chauffagiste |
| `722480` | Serrurier |
| `490040` | Maçon |
| `598270` | Peintre en bâtiment |
| `521410` | Menuisier |
| `048380` | Couvreur |
| `199080` | Climatisation |
| `518370` | Médecin généraliste |
| `850158` | Plomberie-dépannage |
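Post-extraction filtering by category code could look like the following sketch. It assumes the codes are compared as strings, so that a leading zero such as in `048380` survives. `filter_by_category` and the sample records are hypothetical; only the codes come from the table above.

```python
def filter_by_category(leads, allowed_codes):
    """Keep only leads whose codeRubrique is in allowed_codes (compared as strings)."""
    allowed = {str(code) for code in allowed_codes}
    return [lead for lead in leads if str(lead.get("codeRubrique", "")) in allowed]


# Hypothetical records for illustration.
leads = [
    {"name": "Lead A", "codeRubrique": "629620"},  # Plombier
    {"name": "Lead B", "codeRubrique": "850158"},  # Plomberie-dépannage
    {"name": "Lead C", "codeRubrique": "102140"},  # Boulangerie-Pâtisserie
    {"name": "Lead D"},                            # no category code available
]
plumbers = filter_by_category(leads, ["629620", "850158"])
print([lead["name"] for lead in plumbers])  # ['Lead A', 'Lead B']
```

Leads without a `codeRubrique` are dropped by this filter, so count them separately if you need to know how many were uncategorized.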

***

### 💰 Pricing

**$0.99 per 1,000 leads extracted.** That's it. No monthly fee, no hidden cost.

| Volume | Cost |
|---|---|
| 1,000 leads | $0.99 |
| 5,000 leads | $4.95 |
| 10,000 leads | $9.90 |
| 50,000 leads | $49.50 |
| 100,000 leads | $99.00 |

Test with free Apify platform credits before committing.
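The table follows directly from the per-1,000 price, which the sketch below reproduces. `estimate_cost_usd` is a hypothetical helper, and rounding to whole cents is an assumption: how partial thousands are billed is not stated here.

```python
from decimal import Decimal

PRICE_PER_1000_LEADS = Decimal("0.99")  # USD, from the pricing table above


def estimate_cost_usd(lead_count):
    """Estimated cost in USD for a number of leads, rounded to cents (assumed rounding)."""
    cost = Decimal(lead_count) * PRICE_PER_1000_LEADS / Decimal(1000)
    return cost.quantize(Decimal("0.01"))


for volume in (1_000, 5_000, 10_000, 50_000, 100_000):
    print(f"{volume:>7,} leads -> ${estimate_cost_usd(volume)}")
```

Combined with the `maxResults` cap, this gives an upper bound on what a single run can cost.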

***

### ❓ FAQ

**Does this include phone numbers and emails?**
The current version extracts all data visible on Pages Jaunes search result pages (name, address, ratings, tags, description, category code). Phone numbers are loaded via AJAX on individual profile pages and require a separate premium extraction, which is planned for a future version.

**How many results per city?**
Pages Jaunes shows up to ~1,000 results per (keyword × city) combination. You control the cap with `maxResultsPerSearch`.

**What if my keyword returns zero results?**
The scraper has a built-in fallback: if the direct SEO URL returns empty, it automatically switches to Pages Jaunes' text search engine to recover results.
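
The general shape of such a fallback is "try the primary source, and query the secondary one only if the first returns nothing". The sketch below illustrates that pattern only. The Actor's actual implementation is not shown in this document, and `search_with_fallback`, `seo_url_search`, and `text_search` are hypothetical stand-ins that never contact PagesJaunes.

```python
def search_with_fallback(primary_search, fallback_search, keyword, location):
    """Return (results, source): try the primary search, use the fallback only if it is empty."""
    results = primary_search(keyword, location)
    if results:
        return results, "primary"
    return fallback_search(keyword, location), "fallback"


# Hypothetical stand-ins for the two lookups.
def seo_url_search(keyword, location):
    return []  # simulate an empty SEO URL page


def text_search(keyword, location):
    return [{"name": f"{keyword} example", "city": location}]


results, source = search_with_fallback(seo_url_search, text_search, "plombier", "Paris")
print(source, len(results))  # fallback 1
```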

**Are proxies included?**
Yes — the scraper uses Apify's residential proxy network (France) to avoid blocks. Proxy costs are included in the $0.99/1000 pricing.

**Is this legal / GDPR-compliant?**
This scraper collects publicly available business data (not personal data under GDPR). You remain responsible for GDPR compliance when using the data for outreach — we recommend respecting opt-out requests and honoring the CNIL's B2B prospection guidelines.

**Can you build a custom scraper for another directory?**
Yes — we build similar scrapers for PagineGialle (Italy), Gouden Gids (Belgium), Páginas Amarillas (Spain), Yellow Pages (UK/US), and others. Contact us through Apify messaging.

***

### 🔧 Technical notes

- Runtime: ~2-5 seconds per result (varies with proxy)
- Deduplication: automatic across all searches (same `numClient` = one result)
- Error handling: 3 automatic retries with session rotation on DataDome blocks
- Concurrency: up to 10 parallel workers, configurable with `maxConcurrency` (the input schema default is 2)
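
The Actor deduplicates within a run. If you merge datasets from several runs yourself, the same rule (one result per `numClient`) can be applied afterwards. A minimal sketch, assuming each lead is a dict; `deduplicate_leads` is a hypothetical helper, and keeping records that lack a `numClient` is a design choice of this example.

```python
def deduplicate_leads(leads):
    """Keep the first occurrence of each numClient, preserving input order."""
    seen = set()
    unique = []
    for lead in leads:
        key = lead.get("numClient")
        if key is None:
            unique.append(lead)  # no identifier: keep the record rather than guess
            continue
        if key in seen:
            continue
        seen.add(key)
        unique.append(lead)
    return unique


# Hypothetical merged results from two runs.
merged = [
    {"name": "Lead A", "numClient": "03796328"},
    {"name": "Lead B", "numClient": "00000001"},
    {"name": "Lead A (second run)", "numClient": "03796328"},
]
print([lead["name"] for lead in deduplicate_leads(merged)])  # ['Lead A', 'Lead B']
```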

***

### 📬 Built by Actose

**Actose** builds reliable, affordable scrapers for European business directories. More scrapers coming:

- 🇮🇹 PagineGialle Italia
- 🇧🇪 Gouden Gids Belgium
- 🇪🇸 Páginas Amarillas España
- 🇳🇱 Gouden Gids Nederland

Follow our Apify profile for updates.

# Actor input Schema

## `keywords` (type: `array`):

Professions or business types to search (ex: plombier, boulangerie, avocat). One entry per line.

## `locations` (type: `array`):

Cities, departments or regions (ex: Paris, Lyon, 75, Île-de-France). One entry per line.

## `maxResultsPerSearch` (type: `integer`):

Maximum number of leads to extract per keyword × location combination.

## `maxResults` (type: `integer`):

Hard cap on the total number of results across all searches. Protects you from unexpected billing.

## `maxConcurrency` (type: `integer`):

Parallel requests. Keep low (1-3) to avoid blocking.

## Actor input object example

```json
{
  "keywords": [
    "plombier",
    "boulangerie"
  ],
  "locations": [
    "Paris",
    "Lyon"
  ],
  "maxResultsPerSearch": 100,
  "maxResults": 1000,
  "maxConcurrency": 2
}
```
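
Before starting a paid run, the input can be checked locally against the bounds declared in the OpenAPI specification further down (`maxResultsPerSearch` 1 to 5000, `maxResults` 1 to 100000, `maxConcurrency` 1 to 10, with `keywords` and `locations` required). The sketch below is a hypothetical helper, not something the platform requires; Apify validates the input itself when the run starts.

```python
# Integer bounds copied from the inputSchema in the OpenAPI specification.
LIMITS = {
    "maxResultsPerSearch": (1, 5000),
    "maxResults": (1, 100000),
    "maxConcurrency": (1, 10),
}
REQUIRED_LISTS = ("keywords", "locations")


def validate_input(run_input):
    """Return a list of human-readable problems; an empty list means the input looks valid."""
    errors = []
    for key in REQUIRED_LISTS:
        value = run_input.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            errors.append(f"{key} is required and must be a list of strings")
    for key, (low, high) in LIMITS.items():
        if key not in run_input:
            continue  # optional field: the Actor falls back to its default
        value = run_input[key]
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            errors.append(f"{key} must be an integer between {low} and {high}")
    return errors


example = {
    "keywords": ["plombier", "boulangerie"],
    "locations": ["Paris", "Lyon"],
    "maxResultsPerSearch": 100,
    "maxResults": 1000,
    "maxConcurrency": 2,
}
print(validate_input(example))  # []
```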

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keywords": [
        "plombier",
        "boulangerie"
    ],
    "locations": [
        "Paris",
        "Lyon"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("actose/pages-jaunes-scrapper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "keywords": [
        "plombier",
        "boulangerie",
    ],
    "locations": [
        "Paris",
        "Lyon",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("actose/pages-jaunes-scrapper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "keywords": [
    "plombier",
    "boulangerie"
  ],
  "locations": [
    "Paris",
    "Lyon"
  ]
}' |
apify call actose/pages-jaunes-scrapper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=actose/pages-jaunes-scrapper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Pages Jaunes Scraper — French Business Leads",
        "description": "Extract business leads from PagesJaunes.fr (French Yellow Pages). Get names, addresses, postal codes, ratings, categories & more perfect for B2B prospecting in France. Pay only $0.99 per 1000 leads. 1000 free results to try. 40+ professions, all French cities.",
        "version": "0.0",
        "x-build-id": "okoXOROS3N5ELM2Lh"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/actose~pages-jaunes-scrapper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-actose-pages-jaunes-scrapper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/actose~pages-jaunes-scrapper/runs": {
            "post": {
                "operationId": "runs-sync-actose-pages-jaunes-scrapper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/actose~pages-jaunes-scrapper/run-sync": {
            "post": {
                "operationId": "run-sync-actose-pages-jaunes-scrapper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "keywords",
                    "locations"
                ],
                "properties": {
                    "keywords": {
                        "title": "Keywords / Mots-clés",
                        "type": "array",
                        "description": "Professions or business types to search (ex: plombier, boulangerie, avocat). One entry per line.",
                        "default": [
                            "plombier"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "locations": {
                        "title": "Locations / Localités",
                        "type": "array",
                        "description": "Cities, departments or regions (ex: Paris, Lyon, 75, Île-de-France). One entry per line.",
                        "default": [
                            "Paris"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxResultsPerSearch": {
                        "title": "Max results per search",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Maximum number of leads to extract per keyword × location combination.",
                        "default": 100
                    },
                    "maxResults": {
                        "title": "Max results (total safety cap)",
                        "minimum": 1,
                        "maximum": 100000,
                        "type": "integer",
                        "description": "Hard cap on the total number of results across all searches. Protects you from unexpected billing.",
                        "default": 1000
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "Parallel requests. Keep low (1-3) to avoid blocking.",
                        "default": 2
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
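
For projects that cannot install `apify-client`, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a sketch under assumptions: `build_request` and `run_actor` are hypothetical helper names, the response body is assumed to be JSON, and the 300-second timeout is an arbitrary choice. The token is sent as a query parameter because that is how the specification declares it.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/actose~pages-jaunes-scrapper/run-sync-get-dataset-items"


def build_request(token, run_input):
    """Build the POST request described by the OpenAPI specification."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token, run_input, timeout_secs=300):
    """Run the Actor synchronously and return the parsed response body (assumed JSON)."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout_secs) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_actor("<YOUR_API_TOKEN>", {"keywords": ["plombier"], "locations": ["Paris"]})` starts a billable run. For long searches, the asynchronous `/runs` path combined with polling is likely a better fit than a synchronous call.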
