Orpi Scraper
Pricing
$5.00 / actor start
Extract comprehensive data from Orpi agencies including agency information and advisor details. Scrapes all agencies from the sitemap and extracts advisor data (names, functions, phones, photos, agentIds) from HTML. Optimized for fast extraction with retry mechanism and batch processing.
Developer: Corentin Robert
Orpi Scraper - Extract Agency and Advisor Data
Last updated: January 2026
🎯 Why use this scraper?
Extract complete Orpi agency and advisor data in minutes with automated scraping across all 1300+ agencies in France. Perfect for real estate professionals, lead generation companies, and market research teams.
✅ What you get
Complete data including:
- Agency Information: Name, address, postal code, city, region, department, phone, email, website, company details, legal information
- Advisor Profiles: Full name, first name, last name, function, phone number, photo URL, agent ID
- Normalized Data: Proper case formatting, formatted phone numbers, complete addresses, numeric capital values
- Metadata: Scraping timestamp, agent count per agency
🚀 Key Features
The Orpi Scraper extracts comprehensive data from the Orpi network, covering all agencies across France. The Actor automatically:
- Parses the sitemap to discover all agency URLs from the official Orpi sitemap (https://www.orpi.com/sitemap-agences.xml)
- Extracts agency information including name, address, postal code, city, phone, email, and website
- Extracts advisor information including full name, function, phone number, photo, and agent ID
- Uses optimized extraction with HTML parsing for maximum reliability
- Exports structured data ready for analysis, CRM integration, or business intelligence
The Actor processes all agencies from the sitemap and extracts advisor data from each agency's team section using HTML parsing.
What can this Orpi Scraper do?
📊 Complete Data Extraction
- ✅ 1300+ Agencies: Complete network coverage from official sitemap
- ✅ Dual-Level Data: Both agency and advisor information in one run
- ✅ Normalized Format: Proper case names, formatted phones, clean addresses
- ✅ Legal Information: Company name, RCS, VAT, professional card, CCI
- ✅ Contact Details: Phone numbers, emails, addresses, postal codes
- ✅ Advisor Profiles: Names, functions, photos, agent IDs
⚡ Performance & Reliability
- ✅ Fast Processing: 15 agencies in parallel (configurable up to 50)
- ✅ Automatic Retries: Built-in retry mechanism for failed requests
- ✅ Error Handling: Graceful error handling that continues processing
- ✅ Progressive Output: Data saved incrementally (CSV + Apify dataset)
🎯 Platform Advantages
This Actor runs on the Apify platform, so it benefits from the platform's built-in features:
- Monitoring & Logs: Real-time execution monitoring with detailed logs
- API Access: Access your data programmatically via Apify API
- Scheduling: Set up automated runs on a schedule
- Integrations: Connect to Make.com, Zapier, Google Sheets, and more
- Proxy Rotation: Automatic proxy management for reliable scraping
- Scalability: Handle large-scale scraping with cloud infrastructure
- Data Storage: Secure dataset storage with multiple export formats (JSON, CSV, Excel, HTML)
💼 Use Cases and Client Benefits
🏢 For Real Estate Lead Generation Companies
The Problem: Manually collecting contact information for 1300+ Orpi agencies and thousands of advisors is extremely time-consuming and error-prone.
The Solution: Automate the entire data collection process with a single Actor run that extracts all agency and advisor data in minutes.
Client Benefits:
- 📊 Complete Database: Access to all Orpi agencies and advisors in one dataset
- 💰 Time Savings: Reduce data collection time from weeks to minutes
- 🎯 Accurate Data: Normalized and validated contact information
- 📈 Scalable: Easy to update regularly with scheduled runs
ROI: Save 40+ hours of manual work per data collection cycle. Update your database monthly in minutes instead of weeks.
🏠 For Real Estate Market Research
The Problem: Understanding market coverage and advisor distribution across France requires manual research across hundreds of agency pages.
The Solution: Get comprehensive geographic and organizational data automatically extracted and ready for analysis.
Client Benefits:
- 📍 Geographic Analysis: Complete coverage by region and department
- 👥 Team Analysis: Advisor count and function distribution per agency
- 📊 Market Insights: Identify coverage gaps and opportunities
- 🔄 Regular Updates: Keep data fresh with automated scheduling
ROI: Transform weeks of research into actionable insights in minutes.
📈 Concrete Results: Before vs. After
Before (without the scraper)
- ⏱️ Time required: 2-3 weeks of manual data collection
- ❌ Errors: High risk of typos and missing data
- 📉 Coverage: Limited to a few agencies due to time constraints
- 🔄 Updates: Manual process must be repeated for each update
After (with the scraper)
- ⚡ Time saved: Complete extraction in 10-15 minutes
- ✅ Accuracy: Automated extraction eliminates human errors
- 📊 Coverage: All 1300+ agencies and thousands of advisors
- 🔄 Automation: Schedule regular updates with zero manual work
Time saved: 95% reduction (weeks → minutes)
Speed: 1-2 agencies per second with default settings
Quality improvement: 100% coverage vs. selective manual collection
💰 Costs and Optimization
💵 Actual Cost Breakdown
| Service | Usage | Cost |
|---|---|---|
| Actor compute | ~15 minutes for full run | ~$0.10-0.20 |
| Dataset writes | ~7000 records | ~$0.01-0.02 |
| Total | | ~$0.11-0.22 |
Note: Costs are approximate and depend on Apify pricing plan. The Actor is optimized with Cheerio (no browser overhead) for minimal resource consumption.
💡 Optimization Tips
- Use `maxAgencies: 10` for testing (default for daily tests)
- Adjust `maxConcurrency` based on your needs (15 default, up to 50)
- Filter by department or function to reduce processing time
- Schedule runs during off-peak hours for better performance
📋 Complete Data Fields Extracted
| Category | Field Name | Description | Example |
|---|---|---|---|
| Agency | agencyName | Official agency name | "Orpi 101 Jaures" |
| Agency | agencyCompanyName | Legal company name | "MV JAURES IMMO SARL" |
| Agency | agencyFullAddress | Complete formatted address | "101 Rue Jean Jaurès, 29200 Brest" |
| Agency | agencyAddress | Street address | "101 Rue Jean Jaurès" |
| Agency | agencyPostalCode | 5-digit postal code | "29200" |
| Agency | agencyCity | City name (proper case) | "Brest" |
| Agency | agencyRegion | Region name (proper case) | "Bretagne" |
| Agency | agencyDepartment | Department name (proper case) | "Finistère" |
| Agency | agencyCountry | Country | "France" |
| Agency | agencyPhone | Raw phone number | "0298434656" |
| Agency | agencyPhoneFormatted | Formatted phone | "+33 2 98 43 46 56" |
| Agency | agencyEmail | Email address | "agencejaures@orpi.com" |
| Agency | agencyWebsite | Agency URL | "https://www.orpi.com/agencejaures/" |
| Agency | agencyCapital | Capital with symbol | "27000 €" |
| Agency | agencyCapitalNumeric | Capital numeric | "27000" |
| Agency | agencyRCS | RCS number | "880740410" |
| Agency | agencyProfessionalCard | Professional card | "CPI 2901 2020 000 044 681 T/G" |
| Agency | agencyCCI | Chamber of Commerce | "CCI de Bretagne Ouest" |
| Agency | agencyPriceScaleUrl | Price scale PDF URL | "https://static.orpi.com/files/baremes/..." |
| Agency | agencyAgentsCount | Number of advisors | 6 |
| Advisor | advisorName | Full name (proper case) | "Marc Gonzalez" |
| Advisor | advisorFirstName | First name | "Marc" |
| Advisor | advisorLastName | Last name | "Gonzalez" |
| Advisor | advisorFunction | Job title (cleaned) | "Gérant" |
| Advisor | advisorPhone | Raw phone number | "0764579335" |
| Advisor | advisorPhoneFormatted | Formatted phone | "+33 7 64 57 93 35" |
| Advisor | advisorEmail | Estimated email (pattern: first letter of first name + last name) | "mgonzalez@orpi.com" ⚠️ Assumption, not verified |
| Advisor | advisorPhoto | Photo URL | "https://..." |
| Advisor | advisorAgentId | Unique agent ID | "649342" |
| Metadata | scrapedAt | ISO timestamp | "2026-01-15T10:30:00.000Z" |
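The normalized fields in the table above can be illustrated with a short Python sketch. This is not the Actor's own code: the function names are hypothetical, and the rules (10-digit French numbers starting with 0, capital expressed in whole euros, proper case via simple title-casing) are assumptions inferred from the example values.

```python
import re
from typing import Optional


def format_french_phone(raw: str) -> str:
    """Format a 10-digit French number, e.g. 0298434656 -> +33 2 98 43 46 56."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10 or not digits.startswith("0"):
        return raw  # assumption: leave unexpected formats untouched
    national = digits[1:]  # drop the leading 0, nine digits remain
    pairs = [national[i:i + 2] for i in range(1, 9, 2)]
    return "+33 " + national[0] + " " + " ".join(pairs)


def capital_to_number(raw: str) -> Optional[int]:
    """Extract the numeric part of a value such as '27000 €' (whole euros assumed)."""
    digits = re.sub(r"\D", "", raw)
    return int(digits) if digits else None


def proper_case(name: str) -> str:
    """Title-case an upper-case name, e.g. 'MARC GONZALEZ' -> 'Marc Gonzalez'."""
    return name.strip().title()
```

Particles such as "de" or "le" and multi-part surnames may need more careful handling than plain title-casing provides.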
💡 How to Use the Data
📊 Business Intelligence
- Analyze market coverage by region and department
- Identify agencies with the most advisors
- Track advisor function distribution
- Monitor market changes over time
📧 Lead Generation
- Build targeted contact lists by location
- Filter advisors by function (Gérant, Directeur, etc.)
- Export to CRM systems (Salesforce, HubSpot, etc.)
- Create personalized outreach campaigns
🔍 Market Research
- Map agency distribution across France
- Analyze organizational structure
- Identify coverage gaps
- Competitive intelligence
📈 Data Integration
- Import into Google Sheets or Excel
- Connect to Make.com or Zapier workflows
- Feed into BI tools (Tableau, Power BI)
- API integration for custom applications
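As an illustration of the business intelligence use cases above, the following Python sketch counts advisors per department and per function. It assumes the export contains one record per advisor with the agency fields repeated on each record, which is what the field list suggests but is not confirmed here. The sample records and function names are hypothetical.

```python
from collections import Counter

# Hypothetical sample records; real data would come from the exported dataset.
records = [
    {"agencyName": "Agency A", "agencyDepartment": "Finistère", "advisorFunction": "Gérant"},
    {"agencyName": "Agency A", "agencyDepartment": "Finistère", "advisorFunction": "Conseiller(e)"},
    {"agencyName": "Agency B", "agencyDepartment": "Paris", "advisorFunction": "Conseiller(e)"},
]


def advisors_per_department(rows):
    """Count advisor records per department; empty values are grouped as 'unknown'."""
    return Counter(row.get("agencyDepartment") or "unknown" for row in rows)


def function_distribution(rows):
    """Count advisor records per job title."""
    return Counter(row.get("advisorFunction") or "unknown" for row in rows)


if __name__ == "__main__":
    print(advisors_per_department(records).most_common())
    print(function_distribution(records).most_common())
```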
What data can Orpi Scraper extract?
The Actor extracts comprehensive data from Orpi agencies and advisors. Here's what you can extract:
| Data Category | Fields Extracted | Description |
|---|---|---|
| Agency Information | agencyName, agencySlug, agencyAddress, agencyPostalCode, agencyCity, agencyPhone, agencyEmail, agencyWebsite, agencyRegion, agencyDepartment, agencyAgentsCount | Complete agency contact details, location, and advisor count |
| Advisor Information | advisorName, advisorFirstName, advisorLastName, advisorFunction, advisorPhone, advisorEmail, advisorPhoto, advisorAgentId | Full advisor profile with contact information. Note: advisorEmail is constructed based on observed pattern (first letter of first name + last name) and is not verified |
Detailed Field Description
Agency Fields:
- `agencyName`: Official name of the real estate agency (e.g., "Orpi 101 Jaures")
- `agencySlug`: URL-friendly identifier (e.g., "agencejaures")
- `agencyAddress`: Street address of the agency
- `agencyPostalCode`: French postal code (5 digits)
- `agencyCity`: City name
- `agencyPhone`: Agency phone number
- `agencyEmail`: Agency email address
- `agencyWebsite`: Full URL of the agency page
- `agencyRegion`: Region name (when available)
- `agencyDepartment`: Department name (when available)
- `agencyAgentsCount`: Total number of advisors in the agency
Advisor Fields:
- `advisorName`: Full name of the advisor (e.g., "MARC GONZALEZ")
- `advisorFirstName`: First name only
- `advisorLastName`: Last name only
- `advisorFunction`: Job title (e.g., "Gérant", "Directeur(ice) d'agence", "Conseiller(e)")
- `advisorPhone`: Direct phone number of the advisor
- `advisorEmail`: Estimated email address constructed from the observed Orpi pattern (first letter of first name + last name + @orpi.com). Example: Michel Borgne → mborgne@orpi.com, Véronique Gonzalez → vgonzalez@orpi.com. Note: These are assumptions based on pattern analysis and are not verified.
- `advisorPhoto`: URL of the advisor's professional photo
- `advisorAgentId`: Unique identifier of the advisor (from contact link, e.g., "649342")
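The email pattern described for `advisorEmail` can be sketched in Python as follows. This is an illustration only, not the Actor's code: the function names are hypothetical, and the handling of accents, hyphens, and spaces (stripped here) is an assumption, since the source only gives the two examples above.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Remove diacritics, e.g. 'Véronique' -> 'Veronique'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def estimate_email(first_name: str, last_name: str, domain: str = "orpi.com") -> str:
    """Build first-letter + last-name @ domain; returns '' if a name part is missing."""
    # Assumption: anything that is not a-z after accent stripping is dropped.
    first = re.sub(r"[^a-z]", "", strip_accents(first_name).lower())
    last = re.sub(r"[^a-z]", "", strip_accents(last_name).lower())
    if not first or not last:
        return ""
    return f"{first[0]}{last}@{domain}"
```

Because the result is only a guess, it is worth validating addresses before using them for outreach.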
How to scrape Orpi data?
Step-by-step guide
- Configure your input (optional): Adjust `maxConcurrency` (default: 15) and `requestTimeout` (default: 12000 ms) if needed. Set `maxAgencies` to 0 to process every agency (the default is 10)
- Run the Actor: Click "Start" and let the Actor process the agencies from the sitemap
- Monitor progress: Watch real-time logs showing agency processing status
- Download results: Once complete, download your data in JSON, CSV, Excel, or HTML format from the Dataset tab
Input Configuration
Orpi Scraper offers configuration options. Click on the Input tab for an interactive interface.
Default input (for daily testing):
```json
{
  "sitemapUrl": "https://www.orpi.com/sitemap-agences.xml",
  "maxConcurrency": 15,
  "requestTimeout": 12000,
  "maxAgencies": 10,
  "filterDepartment": "",
  "filterAdvisorFunction": ""
}
```
Input Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| sitemapUrl | string | https://www.orpi.com/sitemap-agences.xml | URL of the Orpi sitemap |
| maxConcurrency | integer | 15 | Maximum number of concurrent requests (Range: 1-50). Increase for faster scraping, decrease to reduce server load |
| requestTimeout | integer | 12000 | Request timeout in milliseconds (Range: 5000-60000) |
| maxAgencies | integer | 10 | Limit the number of agencies to process. Set to 0 to process all agencies. Default 10 for daily testing |
| filterDepartment | string | "" | Filter agencies by department (e.g., "Finistère", "Paris"). Case-insensitive. Leave empty for all departments |
| filterAdvisorFunction | string | "" | Filter advisors by function (e.g., "Gérant", "Directeur", "Conseiller"). Case-insensitive partial match. Leave empty for all functions |
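The defaults and ranges in the table can be checked before starting a run. The Python sketch below merges user input with the documented defaults and rejects out-of-range values. The `resolve_input` helper is hypothetical and only mirrors the table; how the Actor itself validates input is not documented here.

```python
DEFAULTS = {
    "sitemapUrl": "https://www.orpi.com/sitemap-agences.xml",
    "maxConcurrency": 15,
    "requestTimeout": 12000,
    "maxAgencies": 10,
    "filterDepartment": "",
    "filterAdvisorFunction": "",
}

# Ranges taken from the parameter table above.
RANGES = {"maxConcurrency": (1, 50), "requestTimeout": (5000, 60000)}


def resolve_input(user_input: dict) -> dict:
    """Merge user input with the defaults and validate the numeric ranges."""
    unknown = set(user_input) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    merged = {**DEFAULTS, **user_input}
    for name, (low, high) in RANGES.items():
        if not low <= merged[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    if merged["maxAgencies"] < 0:
        raise ValueError("maxAgencies must be 0 (all agencies) or a positive number")
    return merged
```

For example, `resolve_input({"maxAgencies": 0})` keeps every other default and requests a full run.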
Output Format
The Actor outputs data in the following format:
Agency and Advisor Record Example
```json
{
  "agencyName": "Orpi 101 Jaures",
  "agencySlug": "agencejaures",
  "agencyAddress": "",
  "agencyPostalCode": "29200",
  "agencyCity": "Brest",
  "agencyPhone": "",
  "agencyEmail": "",
  "agencyWebsite": "https://www.orpi.com/agencejaures/",
  "agencyRegion": "",
  "agencyDepartment": "",
  "agencyAgentsCount": 6,
  "advisorName": "MARC GONZALEZ",
  "advisorFirstName": "MARC",
  "advisorLastName": "GONZALEZ",
  "advisorFunction": "Gérant",
  "advisorPhone": "0764579335",
  "advisorPhoneFormatted": "+33 7 64 57 93 35",
  "advisorEmail": "mgonzalez@orpi.com",
  "advisorPhoto": "https://cutjhqvjma.cloudimg.io/https%3A%2F%2Ftelemaque.orpi.coop%2Fcontact%2F35419%2F20231013010205%2Fphoto?p=agency-team-item&ci_url_encoded=1&ci_sign=...",
  "advisorAgentId": "649342"
}
```
CSV Export
When running locally, the Actor automatically generates a CSV file (output.csv) with all extracted data, using semicolons (;) as delimiters for easy import into Excel or other tools.
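Because the file uses semicolons rather than commas, a CSV reader needs the delimiter set explicitly. A minimal Python sketch is shown below. The column order in the sample and the UTF-8 encoding are assumptions, and the helper names are hypothetical.

```python
import csv
import io


def parse_semicolon_csv(text: str) -> list:
    """Parse semicolon-delimited CSV text into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def load_output_csv(path: str = "output.csv") -> list:
    """Read the locally generated file (UTF-8 is assumed, not confirmed)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=";"))


# Small in-memory sample with a hypothetical subset of columns.
SAMPLE = "agencyName;agencyCity;advisorName\nOrpi 101 Jaures;Brest;Marc Gonzalez\n"

if __name__ == "__main__":
    for row in parse_semicolon_csv(SAMPLE):
        print(row["agencyName"], "-", row["advisorName"])
```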
Usage
Local Development
- Install dependencies: `npm install`
- Configure input in `input.json`:
```json
{
  "sitemapUrl": "https://www.orpi.com/sitemap-agences.xml",
  "maxConcurrency": 15,
  "requestTimeout": 12000,
  "maxAgencies": 0
}
```
- Run the scraper: `npm start`
- Results will be saved to:
  - `output.csv`: CSV file with all extracted data
  - `storage/datasets/default/`: Apify dataset (JSON format)
Apify Platform
- Upload the Actor to Apify
- Configure input parameters in the Actor's input
- Run the Actor
- Download results from the Actor's dataset
Technical Details
Extraction Process
- Sitemap Parsing: The Actor first parses the Orpi sitemap XML to extract all agency URLs
- Agency Page Scraping: For each agency, the Actor visits the agency page to extract agency information and advisor data
- HTML Parsing: Uses Cheerio for efficient HTML parsing and extracts data from structured HTML elements
- Parallel Processing: Processes multiple agencies simultaneously with configurable concurrency for optimal performance
- Data Normalization: All data is normalized and validated before saving
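The sitemap parsing step can be sketched in Python with the standard library. The Actor itself is built around Cheerio, so this is only an illustration of the idea: it assumes the sitemap follows the standard sitemaps.org `urlset` format, that the agency slug is the URL path (as the `agencejaures` example suggests), and the second sample URL is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> List[str]:
    """Return every <loc> value found under <url> entries of a sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


def slug_from_url(url: str) -> str:
    """Assumed rule: the slug is the URL path without surrounding slashes."""
    return urlparse(url).path.strip("/")


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.orpi.com/agencejaures/</loc></url>
  <url><loc>https://www.orpi.com/example-agency/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_urls(SAMPLE):
        print(slug_from_url(url), url)
```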
Error Handling
- Automatic retry mechanism for failed requests (up to 1 retry)
- Timeout handling for slow-loading pages
- Graceful error handling that continues processing other agencies
- Fallback extraction methods for missing data fields
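The retry-and-continue behavior listed above can be sketched as follows in Python. This is a simplified model, not the Actor's implementation: the `fetch` callable stands in for the real HTTP request, and both helper names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple


def fetch_with_retry(fetch: Callable[[str], str], url: str, max_retries: int = 1) -> str:
    """Call fetch(url), retrying up to max_retries times before giving up."""
    last_error = None
    for _attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except Exception as error:  # a real scraper would catch narrower errors
            last_error = error
    raise last_error


def scrape_all(
    fetch: Callable[[str], str], urls: Iterable[str], max_retries: int = 1
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Fetch every URL; record failures and keep going instead of aborting the run."""
    pages, failures = {}, {}
    for url in urls:
        try:
            pages[url] = fetch_with_retry(fetch, url, max_retries)
        except Exception as error:
            failures[url] = str(error)
    return pages, failures
```

Returning the failures separately makes it straightforward to log them or to retry only those agencies in a later run.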
Performance Tips
- Adjust Concurrency: Increase `maxConcurrency` (up to 50) for faster scraping if the website can handle it
- Timeout Settings: Use default timeout (12000ms) for most cases. Increase only if experiencing timeout errors
- Monitor Progress: Check logs regularly to ensure smooth operation
- Test First: Use `maxAgencies` to limit the number of agencies for testing before running the full scrape
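To show what a concurrency cap means in practice, here is a small Python sketch that processes URLs with a bounded worker pool. It only illustrates the concept: the Actor itself is Node-based, and the `process_in_parallel` helper and its `process` callable are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def process_in_parallel(
    process: Callable[[str], object], urls: Iterable[str], max_concurrency: int = 15
) -> List[object]:
    """Run process(url) for each URL with at most max_concurrency running at once."""
    if not 1 <= max_concurrency <= 50:
        raise ValueError("max_concurrency must be between 1 and 50")
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map returns results in the same order as the input URLs.
        return list(pool.map(process, urls))
```

Raising the cap shortens the run only while the target site keeps responding quickly. Past that point, extra workers mostly add timeouts and retries.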
Limitations
- The scraper depends on the structure of the Orpi website. If the website structure changes significantly, the Actor may need updates
- Some agencies may not have complete information publicly available
- Advisor Email addresses: The `advisorEmail` field contains estimated emails constructed based on the observed pattern (first letter of first name + last name + @orpi.com). These are assumptions based on pattern analysis and are not verified. Actual email addresses may differ, especially for advisors with special characters, compound names, or non-standard naming conventions
- Rate limiting may apply if scraping too aggressively. Use default concurrency settings for best results
How much will it cost to scrape Orpi?
The Orpi Scraper uses consumption-based pricing (Compute Units). The cost depends on:
- Number of agencies: Each agency page requires one request (1300+ agencies)
- Concurrency level: Higher concurrency processes more agencies simultaneously but uses more resources
- Request timeouts: Longer timeouts may consume more CUs if pages load slowly
Estimated costs:
- Free plan: Test with a small number of agencies (use `maxAgencies: 10`)
- Starter plan: Scrape hundreds of agencies efficiently
- Professional plan: Handle the full network (1300+ agencies) with optimal performance
The Actor is optimized to minimize CU consumption by using efficient Cheerio parsing (no browser overhead) and configurable concurrency. Small test runs complete in a few minutes, while a full run of the network typically takes approximately 10-15 minutes with default settings.
Is it legal to scrape Orpi?
Our scrapers are ethical and do not extract any private user data. They only extract publicly available information that is displayed on the website. We therefore believe that our scrapers, when used for ethical purposes by Apify users, are safe.
However, you should be aware that your results could contain personal data. Personal data is protected by the GDPR in the European Union and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so. If you're unsure whether your reason is legitimate, consult your lawyers.
You can also read our blog post on the legality of web scraping.
FAQ
How many agencies can I scrape?
The Actor scrapes all available agencies from the Orpi sitemap, typically 1300+ agencies across France.
How fast is the scraping?
With default settings (15 agencies in parallel), the Actor typically processes 1-2 agencies per second, completing the full network in approximately 10-15 minutes.
Can I filter agencies by location?
Yes! Use the filterDepartment parameter to filter agencies by department (e.g., "Finistère", "Paris", "Hauts-de-Seine"). The Actor will only process agencies matching the specified department. You can also filter advisors by function using filterAdvisorFunction (e.g., "Gérant", "Directeur", "Conseiller").
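The two filters can be modeled in Python as shown below. The sketch assumes the department filter is a case-insensitive exact match and the function filter a case-insensitive substring match, as the parameter descriptions suggest. Whether the Actor also ignores accents is not documented, so accents are compared as-is here. The helper names are hypothetical.

```python
def department_matches(agency_department: str, filter_department: str) -> bool:
    """Empty filter keeps everything; otherwise compare ignoring case."""
    if not filter_department:
        return True
    return agency_department.strip().casefold() == filter_department.strip().casefold()


def function_matches(advisor_function: str, filter_function: str) -> bool:
    """Empty filter keeps everything; otherwise look for a case-insensitive substring."""
    if not filter_function:
        return True
    return filter_function.strip().casefold() in advisor_function.casefold()
```

With these rules, a filter of "directeur" keeps an advisor whose function is "Directeur(ice) d'agence".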
What if the scraping fails?
The Actor includes automatic error handling and retry mechanisms. If an agency page fails, the Actor will log the error and continue with other agencies. Check the Actor logs for detailed error information.
Can I schedule regular runs?
Yes! Use Apify's scheduling feature to automatically run the Actor on a schedule (daily, weekly, etc.) to keep your data up-to-date.
How do I access the data via API?
Once the Actor completes, you can access the dataset via the Apify API. Check the API tab on the Actor detail page for code examples in JavaScript and Python.
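As a rough Python sketch of programmatic access, the helper below builds a request to the Apify dataset items endpoint and downloads the records as JSON. The dataset ID and token are placeholders you must supply, and the helper names are hypothetical. Check the API tab for the exact options available to your run.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the URL for the dataset items endpoint."""
    params = {"format": fmt, "clean": "true"}
    if limit is not None:
        params["limit"] = str(limit)
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id: str, token: str, limit: Optional[int] = None) -> list:
    """Download dataset items as a list of dicts (performs a network request)."""
    request = Request(
        dataset_items_url(dataset_id, "json", limit),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response:
        return json.load(response)
```

Passing the token in the `Authorization` header keeps it out of URLs that might end up in logs.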
Can I integrate this with other tools?
Yes! Apify supports integrations with Make.com, Zapier, Google Sheets, and many other platforms. Check the Integrations section in your Apify account.
I need a custom solution
If you need additional features or custom modifications to this Actor, feel free to reach out. We're open to creating custom solutions based on the current one.
📞 Support
For issues, questions, or feedback:
- Email: corentin@outreacher.fr
- LinkedIn: https://www.linkedin.com/in/robertcorentin/
- Issues Tab: Report bugs or request features in the Actor's Issues tab
- Actor Support: Contact support through the Apify platform
- Documentation: Check the Apify Academy for tutorials and guides
We're always open to feedback and suggestions to improve the Actor!
Related Actors
Looking for other real estate data? Check out our other real estate scrapers:
- ERA Immobilier Scraper - Extract agency and agent data from ERA Immobilier
- IAD Scraper - Extract advisor data from IAD l'agence
- Capifrance Scraper - Extract advisor data from Capifrance
- Reseau Expertimo Scraper - Extract agency data from Reseau Expertimo
- LGM Immobilier Scraper - Extract advisor data from LGM Immobilier
Ready to extract Orpi advisor data? Start the Actor and get comprehensive real estate advisor information in minutes!