Pagesjaunes Reviews Scraper
Pricing
from $1.00 / 1,000 records scraped
Scrapes reviews from pagesjaunes.fr with full pagination, date filtering, review cap, and residential proxy support.
Developer
Reviewly
Maintained by Community
PagesJaunes Reviews Scraper — Extract Customer Reviews at Scale
Turn PagesJaunes customer reviews into structured data in minutes — no coding required.
Monitor your brand reputation, research competitors, generate qualified leads, or feed review data into your platform. This Apify Actor scrapes all public reviews from any PagesJaunes business page and delivers clean, structured JSON — automatically.
- ✅ Scrapes all reviews with full pagination
- ✅ Filter by date to get only recent reviews
- ✅ Cap the number of reviews per company
- ✅ Built-in French residential proxies — no blocks
- ✅ Handles multiple companies in a single run
📌 What This Actor Does
PagesJaunes is France's largest local business directory, with millions of verified customer reviews across every industry. This Actor lets you extract those reviews programmatically — without writing a single line of code.
Give it one or more PagesJaunes business URLs, configure your filters, and it returns a structured dataset containing company information and all matching reviews, ready for export to JSON, CSV, or any downstream tool.
Who is this for?
- Marketers & agencies tracking brand reputation for French clients
- Businesses monitoring what customers say about them or their competitors
- Developers building review aggregation platforms or dashboards
- Sales teams identifying leads based on competitor review patterns
- Researchers & analysts studying consumer sentiment in the French market
✨ Key Features
- Full pagination: scrapes every review page automatically, not just the first 25
- Date filtering: set a `targetDate` and only reviews published on or after that date are collected
- Review cap: set `maxNumberOfReviews` to limit how many reviews are collected per company (useful for large-scale runs)
- Rich review data: captures rating, review body, author, location, experience date, and business responses
- Rating breakdown: extracts the 1–5 star distribution for each company
- Response detection: captures the business's written reply to each review, including the response date
- Platform attribution: each review includes a `platform` field set to `"custplace"` or `"pagesjaunes"` so you know the origin of each review
- Reliable retries: exponential backoff with up to 10 retries per request, so transient errors don't stop your run
- Anti-block design: uses residential French proxies and Chrome browser fingerprinting to stay undetected
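To illustrate the retry behaviour described above, here is a minimal sketch of exponential backoff. It is not the Actor's actual implementation: the base delay and the delay cap are hypothetical values chosen for the example, and `backoffDelay` and `withRetries` are illustrative names. Only the limit of 10 retries comes from this README.

```javascript
// Hypothetical values: the Actor's real base delay and cap are not documented.
const BASE_DELAY_MS = 500;
const MAX_DELAY_MS = 30000;
const MAX_RETRIES = 10; // "up to 10 retries per request", as stated above

// Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped.
function backoffDelay(attempt) {
    return Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
}

// Runs `task`, retrying after each failure with an exponentially growing pause.
// `sleep` is injectable so the function can be exercised without real waiting.
async function withRetries(task, sleep = (ms) => new Promise((r) => setTimeout(r, ms))) {
    for (let attempt = 0; ; attempt++) {
        try {
            return await task();
        } catch (err) {
            if (attempt >= MAX_RETRIES) throw err; // retries exhausted
            await sleep(backoffDelay(attempt + 1));
        }
    }
}
```

With these example values the pauses grow as 500 ms, 1 s, 2 s, and so on, until they reach the 30 s cap.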
🧠 Why This Actor Is Different
Most PagesJaunes scrapers break after the first page or get blocked within minutes. This Actor was built for production use:
| Feature | This Actor | Typical scrapers |
|---|---|---|
| Full pagination | ✅ All pages | ❌ First page only |
| Proxy type | ✅ Residential FR | ⚠️ Datacenter (easily blocked) |
| Browser fingerprint | ✅ Chrome via impit | ❌ Plain HTTP |
| Retry logic | ✅ Exponential backoff (10×) | ❌ Fails on first error |
| Date filtering | ✅ Built-in | ❌ Manual post-processing |
| Business responses | ✅ Captured | ❌ Usually missed |
| Rating breakdown | ✅ Per star level | ❌ Aggregate only |
⚙️ Input Configuration
Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `startUrls` | array | ✅ Yes | none | One or more PagesJaunes company page URLs |
| `maxNumberOfReviews` | integer | No | `0` | Max reviews per company. `0` = no limit |
| `targetDate` | string | No | none | Only collect reviews on or after this date (`YYYY-MM-DD`) |
| `proxyConfiguration` | object | No | Residential FR | Proxy settings (see below) |
Example Input

    {
      "startUrls": [
        { "url": "https://www.pagesjaunes.fr/pros/53876558" },
        { "url": "https://www.pagesjaunes.fr/pros/61558803" }
      ],
      "maxNumberOfReviews": 500,
      "targetDate": "2024-01-01",
      "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "FR"
      }
    }

Input Tips
- Finding the URL: navigate to any company page on pagesjaunes.fr and copy the URL from your browser. It will look like `https://www.pagesjaunes.fr/pros/XXXXXXXX`.
- `targetDate`: use this when you only need recent reviews. For example, set it to 30 days ago for a monthly monitoring workflow. This reduces cost and runtime significantly.
- `maxNumberOfReviews`: leave it at `0` to scrape all reviews. Set a number (e.g. `100`) if you only need a sample or want to control cost.
- Proxy: the default residential French proxy is strongly recommended. PagesJaunes uses geo-based bot detection and datacenter IPs are frequently blocked.
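For the "30 days ago" tip, the date string can be computed rather than typed by hand. The helper below is a small sketch; `targetDateDaysAgo` is an illustrative name, and it assumes that a UTC calendar date is precise enough for your workflow.

```javascript
// Returns the UTC calendar date `days` days before `now`, formatted as YYYY-MM-DD.
function targetDateDaysAgo(days, now = new Date()) {
    const MS_PER_DAY = 24 * 60 * 60 * 1000;
    return new Date(now.getTime() - days * MS_PER_DAY).toISOString().slice(0, 10);
}

// Example Actor input for a monthly monitoring run.
const monthlyInput = {
    startUrls: [{ url: 'https://www.pagesjaunes.fr/pros/53876558' }],
    targetDate: targetDateDaysAgo(30),
    maxNumberOfReviews: 0, // 0 = no limit
};
```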
📤 Output Format
Each dataset item corresponds to one company URL. It contains two top-level keys: entity (company metadata) and reviews (array of review objects).
Sample Output

    {
      "entity": {
        "keyword": "Plombier Martin - Paris 11",
        "totalRating": 4.7,
        "totalReviews": 312,
        "ratingDetails": { "1": 5, "2": 8, "3": 14, "4": 42, "5": 243 },
        "url": "https://www.pagesjaunes.fr/pros/53876558"
      },
      "reviews": [
        {
          "reviewId": "Avis-12345678",
          "url": "https://www.pagesjaunes.fr/pros/53876558#Avis-12345678",
          "location": "Paris (75011)",
          "reviewBody": "Intervention rapide et travail soigné. Je recommande vivement.",
          "reviewRating": 5,
          "reviewDate": "2024-11-15T00:00:00.000Z",
          "experienceDate": "2024-11-10T00:00:00.000Z",
          "author": { "userName": "Sophie M." },
          "response": {
            "reviewResponse": "Merci beaucoup pour votre retour, Sophie ! C'est un plaisir de vous avoir aidé.",
            "responseDate": "2024-11-16T00:00:00.000Z"
          },
          "platform": "pagesjaunes"
        }
      ]
    }

Field Reference
Entity fields
| Field | Type | Description |
|---|---|---|
| `keyword` | string | Company name as displayed on PagesJaunes |
| `totalRating` | number | Overall average rating (1.0–5.0) |
| `totalReviews` | integer | Total number of reviews on the platform |
| `ratingDetails` | object | Count of reviews per star level (1–5) |
| `url` | string | The input URL that was scraped |
Review fields
| Field | Type | Description |
|---|---|---|
| `reviewId` | string | Unique review identifier (e.g. `Avis-12345678`) |
| `url` | string | Direct link to the review on PagesJaunes |
| `location` | string | Reviewer's city/region (if provided) |
| `reviewBody` | string | Full text of the review |
| `reviewRating` | number | Star rating given (1–5) |
| `reviewDate` | string (ISO 8601) | Date the review was published |
| `experienceDate` | string (ISO 8601) | Date of the customer's experience (if provided) |
| `author.userName` | string | Display name of the reviewer |
| `response.reviewResponse` | string | Business's written reply (if any) |
| `response.responseDate` | string (ISO 8601) | Date the business replied |
| `platform` | string | Origin of the review: `"custplace"` or `"pagesjaunes"` |
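Because each dataset item nests its reviews under one company, tabular tools often need one row per review instead. The following sketch flattens an item using only the fields listed in the tables above. `flattenItem` and the output column names are illustrative, and the code assumes that optional fields (`experienceDate`, `author`, `response`) may be missing or null.

```javascript
// Turns one dataset item ({ entity, reviews }) into an array of flat review rows.
function flattenItem(item) {
    return (item.reviews ?? []).map((review) => ({
        company: item.entity.keyword,
        companyUrl: item.entity.url,
        reviewId: review.reviewId,
        rating: review.reviewRating,
        reviewDate: review.reviewDate,
        experienceDate: review.experienceDate ?? null, // absent when the reviewer gave none
        author: review.author?.userName ?? null,
        body: review.reviewBody,
        hasResponse: Boolean(review.response?.reviewResponse),
        platform: review.platform,
    }));
}

// Usage with a trimmed-down version of the sample output above.
const sampleItem = {
    entity: { keyword: 'Plombier Martin - Paris 11', url: 'https://www.pagesjaunes.fr/pros/53876558' },
    reviews: [{
        reviewId: 'Avis-12345678',
        reviewBody: 'Intervention rapide et travail soigné. Je recommande vivement.',
        reviewRating: 5,
        reviewDate: '2024-11-15T00:00:00.000Z',
        author: { userName: 'Sophie M.' },
        response: { reviewResponse: 'Merci beaucoup pour votre retour, Sophie !' },
        platform: 'pagesjaunes',
    }],
};
const rows = flattenItem(sampleItem);
```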
▶️ How to Use
Option 1 — Apify Console (No Code)
- Open the Actor page in the Apify Store
- Click Try for free
- In the Input tab, paste one or more PagesJaunes URLs into the Start URLs field
- (Optional) Set a Target Date and/or Max Reviews
- Click Start — the run will begin immediately
- When finished, go to the Dataset tab to preview or download results as JSON or CSV
Option 2 — Apify API

    curl -X POST \
      "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs?token=YOUR_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "startUrls": [{ "url": "https://www.pagesjaunes.fr/pros/53876558" }],
        "targetDate": "2024-06-01",
        "maxNumberOfReviews": 200
      }'

Option 3 — Apify SDK (JavaScript)

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

    // Start the Actor and wait for the run to finish.
    const run = await client.actor('YOUR_ACTOR_ID').call({
        startUrls: [{ url: 'https://www.pagesjaunes.fr/pros/53876558' }],
        targetDate: '2024-06-01',
        maxNumberOfReviews: 200,
    });

    // Read the results from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);

📈 Use Cases
1. Brand Reputation Monitoring
Schedule the Actor to run weekly with a targetDate set to 7 days ago. Get only new reviews delivered to your dataset automatically. Feed results into Slack, email, or a dashboard to stay on top of what customers are saying.
2. Competitor Analysis
Collect reviews from competitors in your industry. Compare average ratings, star distributions, and recurring themes in review text to identify gaps in their service — and opportunities for yours.
3. Lead Generation
Find businesses with declining ratings or a high proportion of negative reviews. These companies often need reputation management services, review response tools, or operational improvements. Export results to a CRM for targeted outreach.
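As a rough way to quantify "a high proportion of negative reviews", the `ratingDetails` object can be reduced to a few summary numbers. This is a sketch under one assumption that does not come from the Actor: 1-star and 2-star reviews are counted as negative. `ratingStats` is an illustrative name.

```javascript
// Summarises a ratingDetails object such as { "1": 5, "2": 8, "3": 14, "4": 42, "5": 243 }.
function ratingStats(ratingDetails) {
    let total = 0;
    let weighted = 0;
    for (let star = 1; star <= 5; star++) {
        const count = ratingDetails[String(star)] ?? 0;
        total += count;
        weighted += star * count;
    }
    // Assumption for this example: 1 and 2 stars count as "negative".
    const negative = (ratingDetails['1'] ?? 0) + (ratingDetails['2'] ?? 0);
    return {
        total,
        average: total ? weighted / total : null,
        negativeShare: total ? negative / total : null,
    };
}

const stats = ratingStats({ 1: 5, 2: 8, 3: 14, 4: 42, 5: 243 });
```

For the sample distribution this gives 312 reviews in total, of which 13 (about 4%) are negative. Sorting companies by `negativeShare` is one simple way to shortlist outreach targets.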
4. Review Aggregation Platform
If you're building a review aggregation or comparison tool for the French market, use this Actor as your PagesJaunes data source. Results are clean JSON, ready to store in any database.
5. Customer Sentiment Research
Analyse review bodies at scale using NLP or AI tools to extract recurring topics, sentiment trends, or emerging issues across an industry vertical in France.
🛠️ Advanced Tips
Running Multiple Companies Efficiently
Pass all your URLs in a single startUrls array. The Actor processes them sequentially with a fresh proxy per company, which is more reliable than running separate Actor instances.
Incremental Scraping (Monitoring Workflows)
Use targetDate set to your last run date to fetch only new reviews on each run. This keeps costs low and avoids re-processing data you already have. Combine with Apify Scheduler for fully automated monitoring.
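Since `targetDate` is inclusive (reviews on or after the date are collected), reviews published on the boundary day can show up in two consecutive runs. One way to handle that is to deduplicate by `reviewId` on your side. The sketch below keeps a `Map` of reviews already seen; `mergeNewReviews` is an illustrative name, and where the map is persisted between runs is left to you.

```javascript
// Adds unseen reviews to `seen` (a Map keyed by reviewId) and returns only the new ones.
function mergeNewReviews(seen, reviews) {
    const added = [];
    for (const review of reviews) {
        if (!seen.has(review.reviewId)) {
            seen.set(review.reviewId, review);
            added.push(review);
        }
    }
    return added;
}

// Two consecutive runs whose results overlap on the boundary date.
const seen = new Map();
const firstRun = mergeNewReviews(seen, [{ reviewId: 'Avis-1' }, { reviewId: 'Avis-2' }]);
const secondRun = mergeNewReviews(seen, [{ reviewId: 'Avis-2' }, { reviewId: 'Avis-3' }]);
```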
Controlling Cost with maxNumberOfReviews
For companies with thousands of reviews, set maxNumberOfReviews to a reasonable cap (e.g. 500) if you don't need the entire history. Reviews are returned newest-first, so you'll always get the most recent ones.
Exporting to Google Sheets or CSV
In the Apify Console, after a run completes, click Export in the Dataset tab and choose CSV. You can also connect directly to Google Sheets using Apify's Google Sheets integration.
Proxy Country
The default proxy country is FR (France). PagesJaunes serves content based on geography, so keeping the country set to France gives you accurate data and avoids geo-based redirects.
❓ FAQ & Troubleshooting
Q: The Actor finishes but the dataset is empty.
Make sure the URL you provided is a valid PagesJaunes company page in the format https://www.pagesjaunes.fr/pros/XXXXXXXX. Category pages or search result pages will be skipped. Check the Actor log for a "URL not found" message.
Q: Can I scrape any PagesJaunes URL?
Currently the Actor supports company profile pages (/pros/...). Category search pages or map views are not supported.
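To catch unsupported URLs before starting a run, you can pre-filter your list. The check below is a sketch, not the Actor's own validation: it assumes the ID after `/pros/` is numeric, as in the example URLs in this README, and `isCompanyProfileUrl` is an illustrative name.

```javascript
// Returns true for URLs shaped like https://www.pagesjaunes.fr/pros/<digits>.
function isCompanyProfileUrl(url) {
    try {
        const { origin, pathname } = new URL(url);
        // Assumption: profile IDs are numeric, as in the examples in this README.
        return origin === 'https://www.pagesjaunes.fr' && /^\/pros\/\d+\/?$/.test(pathname);
    } catch {
        return false; // not a parseable URL at all
    }
}

const candidates = [
    'https://www.pagesjaunes.fr/pros/53876558',
    'https://www.pagesjaunes.fr/annuaire/paris-75/plombiers',
];
const startUrls = candidates.filter(isCompanyProfileUrl).map((url) => ({ url }));
```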
Q: How many reviews can it scrape?
There is no hard limit. The Actor will paginate through all available reviews unless you set maxNumberOfReviews. Runs with thousands of reviews may take several minutes.
Q: Will PagesJaunes block the scraper?
The Actor uses residential French proxies and Chrome browser fingerprinting, so its requests closely resemble those of a real browser. Exponential backoff retries (up to 10 per request) recover from transient blocks automatically.
Q: How fresh is the data?
Every run fetches live data directly from PagesJaunes; there is no caching. You get the current state of reviews at the time you run the Actor.
Q: What does the platform field mean?
Some reviews on PagesJaunes were collected via Custplace, a verified reviews platform. The platform field will be "custplace" for those reviews and "pagesjaunes" for organically written ones, so you can distinguish between verified post-transaction reviews and direct reviews.
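If you want to see how a company's reviews split between the two origins, a simple tally over the `platform` field is enough. This is a minimal sketch; `countByPlatform` is an illustrative name.

```javascript
// Counts reviews per value of the `platform` field.
function countByPlatform(reviews) {
    const counts = {};
    for (const { platform } of reviews) {
        counts[platform] = (counts[platform] ?? 0) + 1;
    }
    return counts;
}

const platformCounts = countByPlatform([
    { platform: 'pagesjaunes' },
    { platform: 'custplace' },
    { platform: 'pagesjaunes' },
]);
```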
Q: The experienceDate is null for some reviews.
PagesJaunes only shows the experience date when the reviewer explicitly provides it. When it's absent, the field will be null.
Q: Can I get notified when new reviews appear?
Yes. Use Apify Scheduler to run the Actor at a regular interval with targetDate set to your previous run date, then add a webhook to push new results to Slack, email, or any other HTTP endpoint.
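For the Slack case, the notification body can be built from the review objects. The sketch below only constructs the payload. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field, and it leaves the HTTP POST to your own integration code. `buildSlackPayload` is an illustrative name.

```javascript
// Builds a { text } payload summarising new reviews for one company.
function buildSlackPayload(companyName, reviews) {
    const lines = reviews.map((review) => {
        const filled = Math.round(review.reviewRating); // star rating, 1 to 5
        const stars = '★'.repeat(filled) + '☆'.repeat(5 - filled);
        return `${stars} ${review.author?.userName ?? 'Anonymous'}: ${review.reviewBody}`;
    });
    const header = `${reviews.length} new review(s) for ${companyName}`;
    return { text: [header, ...lines].join('\n') };
}

const payload = buildSlackPayload('Plombier Martin', [
    { reviewRating: 5, author: { userName: 'Sophie M.' }, reviewBody: 'Top' },
]);
```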
📞 Support
For questions, bug reports, or feature requests:
- 📧 Email: me@ahmedhrid.com
- 🐛 GitHub Issues: open an issue on this repository
- 💬 Apify Community: community.apify.com