Espaces Atypiques · Full profiles infos (emails, phones)
Pricing
from $9.99 / 1,000 results
Under maintenance
Export Espaces Atypiques team members from public agency pages: names, emails, phones, photos, job titles, and bios. Director vs staff grouping for CRM. Run all French agencies from the sitemap or paste specific agency URLs. No login or API key.
Developer: Corentin Robert
Last modified: 5 days ago
Espaces Atypiques — agency team export
Who it’s for: sales, recruiters, and analysts who need a single CRM-ready table of everyone published on Espaces Atypiques agency “team” pages — no login, no API key (public website only).
What you get per person: name, job title, director vs staff band, email, phone, photo URL, long biography text, office label, and the agency page URL you can cite or revisit. Rows without an email are not exported (dataset / CSV), so you only get contactable leads.
Why one Actor: the site lists many agencies; this scraper starts from the public agency sitemap (same kind of list search engines use), fetches each French agency page, and reads the team block. You can also paste a handful of agency URLs for a quick test or a sub-region.
How to run (Apify Console)
- Choose Sitemap (full France) or URLs (your list).
- Optional: in Input JSON, set `maxAgencies` to a small number (e.g. `5`) for a quick test; or pass `maxResults` (API) when `maxAgencies` is 0 to mirror the France realty hub (sitemap only). By default there is no cap on agencies and no cap on listing pages in fallback mode.
- Start the run; open the Dataset or Run log when it finishes.
Input
| Key | Type | Default | Purpose |
|---|---|---|---|
| `mode` | `sitemap` \| `urls` | `sitemap` | Full sitemap vs pasted agency pages |
| `startUrls` | array of `{ "url": "…" }` | `[]` | French agency page URLs (URLs mode) |
| `maxAgencies` | integer | `0` | Sitemap: cap office pages; 0 = use `maxResults` if set, else no cap. URLs: 0 = all pasted URLs; 1+ caps the list. |
| `maxResults` | integer | (none) | API / automation only (not in Console schema). Sitemap: when `maxAgencies` is 0, caps how many agency pages to load (same as hub Max rows). Ignored in URLs mode. |
| `maxListingPages` | integer | `0` | Listings fallback: max `ventes/…` pages per agency (0 = no cap; 1–200 to limit) |
| `listingConcurrency` | integer | `6` | Parallel `ventes/…` fetches per agency in fallback |
| `concurrency` | integer | `25` | Parallel agency page fetches (bounded) |
| `fetchTimeout` | integer | `60000` | Per-request timeout (ms) |
| `proxyConfiguration` | object | (Apify) | Optional proxy; same pattern as other Actors on the Apify platform |
additionalProperties: true — you can pass extra keys from the API if you automate runs.
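
The interaction between `maxAgencies` and `maxResults` described in the table can be sketched as a small helper. This is an illustration only, not the Actor's code: the function name `effective_agency_cap` is hypothetical, and it assumes that a missing or zero `maxResults` counts as "not set".

```python
from typing import Optional


def effective_agency_cap(mode: str, max_agencies: int = 0,
                         max_results: Optional[int] = None) -> Optional[int]:
    """Return how many agency pages to load, or None for no cap.

    Hypothetical helper mirroring the input table:
    - maxAgencies >= 1 caps the list in both modes.
    - In sitemap mode, maxAgencies == 0 falls back to maxResults if set.
    - In URLs mode, maxResults is ignored.
    """
    if mode not in ("sitemap", "urls"):
        raise ValueError(f"unknown mode: {mode!r}")
    if max_agencies > 0:
        return max_agencies
    # Assumption: None or 0 means maxResults was not provided.
    if mode == "sitemap" and max_results:
        return max_results
    return None


if __name__ == "__main__":
    print(effective_agency_cap("sitemap", 5))        # explicit cap
    print(effective_agency_cap("sitemap", 0, 100))   # falls back to maxResults
    print(effective_agency_cap("urls", 0, 100))      # maxResults ignored
```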
Example (quick test, Input JSON):
{"mode": "sitemap","maxAgencies": 5,"concurrency": 10}
Example (specific agencies):
{"mode": "urls","startUrls": [{ "url": "https://www.espaces-atypiques.com/paris-rive-gauche/" }],"concurrency": 5}
Output fields (main)
| Field | Meaning |
|---|---|
| `sourceKey` | Stable id (email-based, or path + name if email missing) |
| `firstName` / `lastName` / `fullName` | Normalized for export: first name in title case (fr-FR), last name in upper case; `fullName` is `Prénom NOM` (first name, then LAST NAME) |
| `jobTitle` | Raw line under the name on the site |
| `roleCategory` | `director`, `associate_director`, `advisor`, `support`, `other`, `unknown` (heuristic) |
| `band` | `director` (management) vs `employee` (includes advisors and unknown titles) |
| `isDirector` / `isEmployee` | Booleans derived from `band` |
| `email` / `phone` | When shown on the card; phone is normalized to `+33 X XX XX XX XX` (French numbers, `0…` → `+33 …`) |
| `agencyName` | Office label to use in exports: from the page `h1` when specific; otherwise from the last URL segment (city slug) when the heading is generic (e.g. Nos collaborateurs, "Our team") |
| `agencyNameRaw` | First `h1` / header text used before resolution (for debugging) |
| `agencyPath` | Path on the site, e.g. `/grand-est/strasbourg/` |
| `agencyUrlRegion` | First path segment (area). On single-segment agency URLs (`/blois/`), it is the same slug as the city column so the field is never empty. |
| `agencyUrlCitySlug` | Last path segment (city slug as on the site) |
| `agencyLabel` | Sub-label on the card (e.g. office name) |
| `agencyPageUrl` | Sitemap / input agency URL (stable office id) |
| `dataFetchedFrom` | Set when cards were read from `…/collaborateurs/` instead of the home page |
| `photoUrl` / `bio` | Image and biography text when present |
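
To make the phone and name formats in the table concrete, here is a minimal sketch of that kind of normalization. The function names are hypothetical and the rules are assumptions inferred from the field descriptions (for example, `str.title()` stands in for fr-FR title casing); the Actor's real logic may differ.

```python
import re
from typing import Optional, Tuple


def normalize_french_phone(raw: str) -> Optional[str]:
    """Format a French number as '+33 X XX XX XX XX', or return None."""
    digits = re.sub(r"\D", "", raw)
    # Drop an international prefix (0033 or 33) if present.
    if digits.startswith("0033"):
        digits = digits[4:]
    elif digits.startswith("33") and len(digits) > 10:
        digits = digits[2:]
    # Drop the national trunk zero, including the '(0)' in '+33 (0)6 ...'.
    if digits.startswith("0"):
        digits = digits[1:]
    if len(digits) != 9:
        return None  # not a recognisable French number
    pairs = [digits[i:i + 2] for i in range(1, 9, 2)]
    return "+33 " + digits[0] + " " + " ".join(pairs)


def normalize_name(first: str, last: str) -> Tuple[str, str, str]:
    """Return (firstName, lastName, fullName) with 'Prénom NOM' casing."""
    first_name = " ".join(first.split()).title()   # simplification of fr-FR casing
    last_name = " ".join(last.split()).upper()
    return first_name, last_name, f"{first_name} {last_name}"


if __name__ == "__main__":
    print(normalize_french_phone("06 12 34 56 78"))
    print(normalize_name("jean-pierre", "lefèvre"))
```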
Role detection uses French titles (normalized): e.g. Directeur d’agence → director band; Conseiller → staff. Titles the parser does not recognise are kept in jobTitle and classified as other.
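
A heuristic of this kind could look like the sketch below. The keyword lists, the function names, and the choice to place `associate_director` in the `director` band are all assumptions made for illustration; only the category and band names come from the output table.

```python
import unicodedata

# Hypothetical keyword rules, checked in order (most specific first).
ROLE_RULES = [
    ("associate_director", ("directeur associe", "directrice associe")),
    ("director", ("directeur", "directrice")),
    ("advisor", ("conseiller", "negociateur", "negociatrice")),
    ("support", ("assistant",)),
]


def normalize_title(title: str) -> str:
    """Lowercase, strip accents, unify apostrophes, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title.replace("\u2019", "'"))
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def classify_role(job_title: str) -> str:
    """Map a raw French job title to a roleCategory value."""
    normalized = normalize_title(job_title)
    if not normalized:
        return "unknown"
    for category, keywords in ROLE_RULES:
        if any(keyword in normalized for keyword in keywords):
            return category
    return "other"  # unrecognised titles stay in jobTitle as-is


def band_for(role_category: str) -> str:
    """Assumption: both director categories fall in the 'director' band."""
    return "director" if role_category in ("director", "associate_director") else "employee"


if __name__ == "__main__":
    for title in ("Directeur d\u2019agence", "Conseillère immobilier", "Photographe"):
        category = classify_role(title)
        print(title, "->", category, "/", band_for(category))
```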
Local development
cd espaces-atypiques-agents-scraper
npm install
echo '{"mode":"sitemap","maxAgencies":2,"concurrency":3}' > input.json
apify run --input-file=./input.json
If the CLI expects storage/key_value_stores/default/INPUT.json, use --input-file as above, or add a valid INPUT.json that matches the schema. After a run with a non-empty dataset, a semicolon CSV (UTF-8 BOM) is written to output.csv in the project root. Rows are sorted by agency path, then last name, first name, email. Columns are fixed (no dynamic Extra: fields); sourceKey is the last column; roleCategory remains on the dataset record but is omitted from the CSV to avoid overlap with band / jobTitle.
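
The CSV conventions described above (semicolon delimiter, UTF-8 BOM, fixed columns with `sourceKey` last, `roleCategory` omitted, fixed sort order) can be reproduced with a short sketch. The column list and the function name `rows_to_csv_bytes` are hypothetical; the real `output.csv` may contain more columns in a different order.

```python
import csv
import io
from typing import Dict, Iterable, List

# Hypothetical fixed column order; sourceKey is last, roleCategory is left out.
CSV_COLUMNS: List[str] = [
    "agencyPath", "agencyName", "lastName", "firstName",
    "jobTitle", "band", "email", "phone", "sourceKey",
]

SORT_KEYS = ("agencyPath", "lastName", "firstName", "email")


def rows_to_csv_bytes(rows: Iterable[Dict[str, str]]) -> bytes:
    """Serialise dataset rows as semicolon-delimited CSV with a UTF-8 BOM."""
    ordered = sorted(rows, key=lambda r: tuple((r.get(k) or "") for k in SORT_KEYS))
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer,
        fieldnames=CSV_COLUMNS,
        delimiter=";",
        extrasaction="ignore",  # drops roleCategory and any other extra keys
    )
    writer.writeheader()
    writer.writerows(ordered)
    # 'utf-8-sig' prepends the BOM that spreadsheet tools use to detect UTF-8.
    return buffer.getvalue().encode("utf-8-sig")


if __name__ == "__main__":
    sample = [
        {"agencyPath": "/paris-rive-gauche/", "lastName": "MARTIN", "firstName": "Zoé",
         "email": "zoe@example.com", "sourceKey": "zoe@example.com", "roleCategory": "advisor"},
        {"agencyPath": "/blois/", "lastName": "DUPONT", "firstName": "Jean",
         "email": "jean@example.com", "sourceKey": "jean@example.com", "roleCategory": "director"},
    ]
    print(rows_to_csv_bytes(sample).decode("utf-8-sig"))
```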
How it works (technical)
- Sitemap mode: `GET /sitemap_index.xml` → find `agence-sitemap.xml` → `GET` that file → keep `<loc>` URLs that are not under `/en/` (French pages only).
- Fetch each agency home page (bounded concurrency, retries; redirects are followed).
- Parse the team from `.preview-collaborateur` (prefer `#equipe-trombi`) when a `mailto` is present.
- If the home page has no team block, do not call `/collaborateurs/` (it often redirects to a national page). Instead, collect `ventes/…` URLs from the home grid (`#annonces-localisations`, `#biens`, `.biens-localisation`) and from the CTA that points to `/ventes/?pl=…` (hidden `pl` in the form): the Actor follows paginated search results (`/ventes/page/2/?pl=…`, etc.) until it has enough links or there is no next page. Then it opens each listing and parses the negotiator card (name, phone, email, including `mailto` in HTML comments). Total opened listings are capped by `maxListingPages` (default `0` = no cap; set `1`–`200` to limit each office, which is useful for smoke tests). Large full runs use more time and Apify resources. An internal safety limit still stops after many WordPress list pages per `pl` filter. `listingConcurrency` controls parallel listing fetches per agency.
- Dedupe rows by `sourceKey` (email) and push to the default dataset. `agencyPageUrl` is always the sitemap office URL; `dataFetchedFrom` points at the listing URL when the row came from the listings fallback.
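
Two of the steps above, filtering the sitemap to French pages and deduplicating by `sourceKey`, can be sketched with the standard library. The function names are hypothetical, the sample XML is made up for illustration, and the sketch assumes the sitemap uses the standard sitemaps.org namespace and that the first row seen for a given `sourceKey` wins.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List
from urllib.parse import urlsplit

# Standard sitemap protocol namespace (assumed to be what the site uses).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def french_agency_urls(sitemap_xml: str) -> List[str]:
    """Return <loc> URLs from a sitemap, skipping anything under /en/."""
    root = ET.fromstring(sitemap_xml)
    urls: List[str] = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        path = urlsplit(url).path
        if url and not (path == "/en" or path.startswith("/en/")):
            urls.append(url)
    return urls


def dedupe_by_source_key(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first row for each sourceKey (assumption: first one wins)."""
    seen = set()
    unique: List[Dict[str, str]] = []
    for row in rows:
        key = row["sourceKey"]
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


if __name__ == "__main__":
    # Made-up sample; the second URL only illustrates the /en/ filter.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://www.espaces-atypiques.com/paris-rive-gauche/</loc></url>"
        "<url><loc>https://www.espaces-atypiques.com/en/paris-rive-gauche/</loc></url>"
        "</urlset>"
    )
    print(french_agency_urls(sample))
```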
Compliance
Data is taken from public pages. You are responsible for using it in line with applicable law, the site’s terms, and GDPR (prospecting rules, contact preferences).