Pricing

$10.00 / 1,000 results

Website Email, Phone & Social Data Extract

Extract emails, phone numbers, and social media profiles from websites. Automatic normalization (E.164), deduplication, and smart filtering. Intelligent crawling with adaptive depth (8-15 pages). Fast and efficient with Cheerio/HTTP and Playwright fallback.

Rating

0.0

(0)

Developer

My Smart Digital

Last modified

16 days ago

Description

This Actor processes a list of domains and automatically extracts:

✅ Emails: Detection from mailto links, raw text, and JSON-LD schemas. Automatic normalization, filtering, and deduplication.
✅ Phone Numbers: Extraction from tel: links and raw text. E.164 normalization with automatic country detection.
✅ Social Media: LinkedIn, Facebook, Instagram, Twitter/X, TikTok, YouTube, Pinterest, Google Maps. Filtering of share links and service links.

Features

✅ Intelligent Crawling: Automatic detection of key pages (contact, about, legal, privacy). Adaptive crawl (8-15 pages, depending on site structure).
✅ Fast and Efficient: Uses Cheerio/HTTP by default, Playwright only as fallback for dynamic pages
✅ Deterministic: Stable and traceable results (sourceUrl + snippet for each extraction)
✅ Deduplication: Emails and phones automatically deduplicated
✅ Smart Selection: Primary email and phone selected according to precise rules
✅ E.164 Normalization: Phone numbers normalized to international format with automatic country detection
✅ Smart Filtering: Automatic exclusion of test emails, public authorities, invalid numbers
✅ Resilience: Automatic error handling, retry on timeout, attempts on URL variants (http/https, www/non-www)

Input

Required Parameters

startUrls (array or string): List of URLs to process. Array format [{ url: "https://example.com" }] or multi-line text (one URL per line).

Optional Parameters

timeoutSecs (number, default: 30): Request timeout in seconds (5-120)
usePlaywrightFallback (boolean, default: true): Use Playwright for dynamic pages if HTTP fails
includeContacts (boolean, default: true): Extract emails and phones
includeSocials (boolean, default: true): Extract social media links
keyPaths (array, default: []): Custom paths to override default key paths

Input Example

{
  "startUrls": [
    { "url": "https://example.com" },
    { "url": "https://another-domain.com" }
  ],
  "timeoutSecs": 30,
  "includeContacts": true,
  "includeSocials": true
}
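
Because startUrls accepts two shapes, it can help to normalize the input before submitting a run. The following is a minimal sketch, not the Actor's own code: normalize_input is a hypothetical helper, and the defaults and the 5-120 range are copied from the parameter list above.

```python
from typing import Any, Dict

# Defaults as documented in the parameter list above.
DEFAULTS: Dict[str, Any] = {
    "timeoutSecs": 30,
    "usePlaywrightFallback": True,
    "includeContacts": True,
    "includeSocials": True,
    "keyPaths": [],
}


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: coerce startUrls to a list of URL strings and apply defaults."""
    start = raw.get("startUrls")
    if isinstance(start, str):
        # Multi-line text: one URL per line, blank lines ignored.
        urls = [line.strip() for line in start.splitlines() if line.strip()]
    elif isinstance(start, list):
        # Array format: [{ "url": "https://example.com" }, ...]
        urls = [item["url"].strip() if isinstance(item, dict) else str(item).strip()
                for item in start]
    else:
        urls = []
    if not urls:
        raise ValueError("startUrls is required")

    cfg = {**DEFAULTS, **{k: v for k, v in raw.items() if k in DEFAULTS}}
    cfg["keyPaths"] = list(cfg["keyPaths"])  # avoid sharing the default list
    if not 5 <= cfg["timeoutSecs"] <= 120:
        raise ValueError("timeoutSecs must be between 5 and 120")
    cfg["startUrls"] = urls
    return cfg
```

Both input shapes produce the same normalized list, so downstream code only has to handle one format.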

Output

The Actor writes a single JSON record per domain to the default dataset.

Record Structure

{
  "domain": "example.com",
  "finalUrl": "https://example.com",
  "keyPages": {
    "contact": "https://example.com/contact",
    "about": "https://example.com/about",
    "legal": "https://example.com/legal",
    "privacy": "https://example.com/privacy"
  },
  "pagesVisited": [
    "https://example.com",
    "https://example.com/contact",
    "https://example.com/about"
  ],
  "emails": [
    {
      "value": "contact@example.com",
      "type": "general",
      "priority": "primary",
      "signals": ["mailto", "same_domain"],
      "sourceUrl": "https://example.com/contact",
      "snippet": "Contact us at contact@example.com",
      "foundIn": "mailto"
    }
  ],
  "primaryEmail": "contact@example.com",
  "phones": [
    {
      "valueRaw": "+33 1 23 45 67 89",
      "valueE164": "+33123456789",
      "priority": "primary",
      "signals": ["tel", "footer_or_contact"],
      "sourceUrl": "https://example.com/contact",
      "snippet": "Call us: +33 1 23 45 67 89"
    }
  ],
  "primaryPhone": "+33123456789",
  "socials": {
    "linkedin": [
      {
        "url": "https://www.linkedin.com/company/example-corp",
        "handle": "example-corp",
        "sourceUrl": "https://example.com"
      }
    ],
    "facebook": [
      {
        "url": "https://www.facebook.com/examplecorp",
        "handle": "examplecorp",
        "sourceUrl": "https://example.com"
      }
    ]
  },
  "errors": []
}

Main Fields

domain: Registrable domain (e.g., "example.com")
finalUrl: Final URL after redirects
keyPages: Detected key pages (contact, about, legal, privacy)
pagesVisited: List of crawled pages for this domain
emails: List of extracted emails with metadata
primaryEmail: Primary email selected (same-domain > mailto > contact page > first valid)
phones: List of extracted phones with E.164 normalization
primaryPhone: Primary phone selected (footer/contact > tel: > E.164 > first valid)
socials: Social media by platform
errors: Errors encountered during crawl (if present)
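
As an illustration of how a consumer might read these fields, the sketch below flattens one record into a compact summary. summarize_record is a hypothetical helper; it assumes only the field names shown in the record structure above.

```python
from typing import Any, Dict


def summarize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: reduce one dataset record to its primary contacts."""
    socials = record.get("socials", {})
    return {
        "domain": record["domain"],
        "email": record.get("primaryEmail"),
        "phone": record.get("primaryPhone"),
        # Keep only the profile URLs, grouped by platform.
        "social_urls": {
            platform: [entry["url"] for entry in entries]
            for platform, entries in socials.items()
        },
        # An empty or missing errors list means the crawl reported no problems.
        "ok": not record.get("errors"),
    }
```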

Crawl Strategy

Priority Key Pages

The Actor automatically detects and visits the following key pages:

Contact: /contact, /contact-us, /nous-contacter
About: /about, /about-us, /a-propos
Legal: /legal, /mentions-legales, /imprint
Privacy: /privacy, /politique-de-confidentialite
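
The path lists above can be read as a simple lookup from URL path to page category. The sketch below assumes exact, case-insensitive path matching, which is a simplification; the Actor's actual detection logic is not documented and may also inspect link text or other signals.

```python
from typing import Optional
from urllib.parse import urlparse

# Paths copied from the list above.
KEY_PATHS = {
    "contact": ["/contact", "/contact-us", "/nous-contacter"],
    "about": ["/about", "/about-us", "/a-propos"],
    "legal": ["/legal", "/mentions-legales", "/imprint"],
    "privacy": ["/privacy", "/politique-de-confidentialite"],
}


def classify_key_page(url: str) -> Optional[str]:
    """Return the key-page category for a URL, or None if the path is not listed."""
    # Assumption: compare the lowercased path without its trailing slash.
    path = urlparse(url).path.lower().rstrip("/")
    for category, paths in KEY_PATHS.items():
        if path in paths:
            return category
    return None
```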

Crawl Tiers (Internal)

The Actor uses two internal crawl tiers (non-configurable):

Standard: Maximum 8 pages per domain (default)
Deep: Maximum 15 pages per domain (automatic activation)

Deep mode is automatically activated if:

The site is highly structured (4+ relevant key pages)
A Playwright fallback is required for dynamic pages

Important: The crawl tier does not change the output structure. A single record is always produced per domain.
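
The tier rule above amounts to a small decision function. This sketch only restates the two documented conditions; page_budget and its parameter names are hypothetical.

```python
STANDARD_MAX_PAGES = 8   # Standard tier
DEEP_MAX_PAGES = 15      # Deep tier


def page_budget(key_pages_found: int, needs_playwright: bool) -> int:
    """Return the maximum number of pages to crawl for one domain."""
    # Deep mode: 4+ relevant key pages, or a Playwright fallback was required.
    if key_pages_found >= 4 or needs_playwright:
        return DEEP_MAX_PAGES
    return STANDARD_MAX_PAGES
```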

Extraction

Emails

Detection: mailto: links, raw text (regex), JSON-LD schema.org
Normalization: Lowercase, trim, final punctuation removal
Filtering: Excludes noreply, donotreply, example, and test addresses, public authorities (agpd.es, cnil.fr, etc.), and test-email domains (mail.com, example.com, etc.)
Deduplication: On normalized email (lowercase)
Primary selection: Same-domain > mailto > contact page > first valid
Validation: Exclusion of emails concatenated with phone numbers
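
The email rules above can be sketched as a normalize, filter, deduplicate, and rank pipeline. This is one possible reading, not the Actor's implementation: the exclusion lists contain only the examples named above, the local-part match (exact, after removing separators) is an assumption, and the ranking treats "same-domain > mailto > contact page" as a lexicographic ordering with first-seen as the tie-breaker.

```python
import re
from typing import Dict, List, Optional

# Assumed lists, limited to the examples named in the filtering rule.
EXCLUDED_LOCAL_PARTS = {"noreply", "donotreply", "example", "test"}
EXCLUDED_DOMAINS = {"agpd.es", "cnil.fr", "mail.com", "example.com"}


def normalize_email(raw: str) -> str:
    """Trim, lowercase, and drop trailing punctuation."""
    return raw.strip().lower().rstrip(".,;:!?")


def is_excluded(email: str) -> bool:
    local, _, domain = email.partition("@")
    squashed = re.sub(r"[-_.]", "", local)  # "no-reply" -> "noreply"
    return domain in EXCLUDED_DOMAINS or squashed in EXCLUDED_LOCAL_PARTS


def clean_emails(candidates: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Normalize, filter, and deduplicate while preserving discovery order."""
    seen, cleaned = set(), []
    for cand in candidates:
        email = normalize_email(cand["value"])
        if email in seen or is_excluded(email):
            continue
        seen.add(email)
        cleaned.append({**cand, "value": email})
    return cleaned


def pick_primary(cleaned: List[Dict[str, str]], site_domain: str) -> Optional[str]:
    """Same-domain > mailto > contact page > first valid (False sorts first)."""
    if not cleaned:
        return None

    def rank(cand: Dict[str, str]):
        return (
            not cand["value"].endswith("@" + site_domain),
            cand.get("foundIn") != "mailto",
            "/contact" not in cand.get("sourceUrl", ""),
        )

    # min() returns the first minimal element, which gives the "first valid" tie-break.
    return min(cleaned, key=rank)["value"]
```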

Phone Numbers

Detection: tel: links, raw text (international regex)
Normalization: valueRaw (original) + valueE164 (if possible via libphonenumber-js)
Country detection: Automatic from URL (TLD, subdomain) and context
Filtering: Excludes SIRET, VAT, non-phone numbers, fax, GPS coordinates, dates
Deduplication: On valueE164 if available, otherwise digitsOnly(valueRaw)
Primary selection: Footer/contact > tel: > E.164 > first valid
Validation: Exclusion of invalid numbers (>15 digits, incorrect formats)
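
The deduplication and length rules above can be sketched as follows. The code assumes phone entries shaped like the output record (valueRaw, optional valueE164) and does not perform E.164 normalization itself, which the Actor delegates to libphonenumber-js. digits_only mirrors the digitsOnly(valueRaw) notation above; the other function names are hypothetical.

```python
import re
from typing import Dict, List


def digits_only(raw: str) -> str:
    """Strip everything except digits."""
    return re.sub(r"\D", "", raw)


def dedup_key(phone: Dict[str, str]) -> str:
    # Prefer the normalized value; fall back to the raw digits.
    return phone.get("valueE164") or digits_only(phone["valueRaw"])


def is_plausible(phone: Dict[str, str]) -> bool:
    # E.164 numbers have at most 15 digits; longer strings are rejected.
    count = len(digits_only(phone.get("valueE164") or phone["valueRaw"]))
    return 0 < count <= 15


def dedupe_phones(phones: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop implausible numbers, then keep the first entry for each key."""
    seen, unique = set(), []
    for phone in phones:
        if not is_plausible(phone):
            continue
        key = dedup_key(phone)
        if key not in seen:
            seen.add(key)
            unique.append(phone)
    return unique
```

Note that a number that could not be normalized and the same number in E.164 form produce different keys in this sketch, so they would not be merged.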

Social Media

Platforms: LinkedIn (company), Facebook, Instagram, Twitter/X, TikTok, YouTube, Pinterest, Google Maps
Filtering: Excludes share links, settings/policies, services (Wix, Dropbox, Google Drive, OneDrive)
Deduplication: By normalized URL and handle
Validation: Exclusion of individual Instagram posts, internal links
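
To make the social link rules concrete, the sketch below classifies a URL and extracts its handle for four of the listed platforms. The host map and the set of share and post path segments are assumptions based on commonly seen URL shapes (for example /sharer/sharer.php or /intent/tweet); the Actor's real rule set is not published.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Assumed subset of platforms; x.com is treated as Twitter.
PLATFORM_HOSTS = {
    "linkedin.com": "linkedin",
    "facebook.com": "facebook",
    "instagram.com": "instagram",
    "twitter.com": "twitter",
    "x.com": "twitter",
}
# First path segments that indicate a share widget rather than a profile.
SHARE_SEGMENTS = {"sharer", "sharer.php", "share", "sharearticle", "sharing", "intent"}
# First path segments for individual Instagram posts.
INSTAGRAM_POST_SEGMENTS = {"p", "reel"}


def parse_social(url: str) -> Optional[Tuple[str, str]]:
    """Return (platform, handle) for a profile URL, or None if it should be filtered."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    platform = PLATFORM_HOSTS.get(host)
    if platform is None:
        return None
    segments = [s for s in parsed.path.split("/") if s]
    if not segments or segments[0].lower() in SHARE_SEGMENTS:
        return None
    if platform == "instagram" and segments[0].lower() in INSTAGRAM_POST_SEGMENTS:
        return None
    if platform == "linkedin":
        # Only company pages are kept: /company/<handle>
        if len(segments) < 2 or segments[0].lower() != "company":
            return None
        return platform, segments[1]
    return platform, segments[0]
```

Deduplicating on the returned (platform, handle) pair, lowercased, would then collapse www/non-www and trailing-slash variants of the same profile.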

Error Handling

Retry: Automatic retries on timeouts, network errors, and HTTP 429/5xx responses only
No retry: On 404 (page not found)
Timeout: Per request (timeoutSecs), no global timeout per domain
Resilience: Errors are recorded in errors[] without blocking processing
URL Variants: Automatic attempts on variants (http/https, www/non-www, hyphens)
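
The retry and URL-variant rules above can be sketched as two small functions. should_retry and url_variants are hypothetical names; the variant order is an assumption, query strings are dropped for simplicity, and the "hyphens" variant is left out because its exact meaning is not specified.

```python
from typing import List, Optional
from urllib.parse import urlparse


def should_retry(status: Optional[int], timed_out: bool = False,
                 network_error: bool = False) -> bool:
    """Retry on timeout, network error, HTTP 429, or any 5xx; never on 404."""
    if timed_out or network_error:
        return True
    return status == 429 or (status is not None and 500 <= status <= 599)


def url_variants(url: str) -> List[str]:
    """List scheme and www/non-www variants, starting with the URL as given."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    bare = host[4:] if host.startswith("www.") else host
    hosts = [host] + [h for h in (bare, "www." + bare) if h != host]
    schemes = [parsed.scheme] + [s for s in ("https", "http") if s != parsed.scheme]
    path = parsed.path or "/"
    return [f"{scheme}://{h}{path}" for scheme in schemes for h in hosts]
```

For https://example.com this yields four candidates: the https and http forms of example.com and www.example.com.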

Limitations

Maximum 200 domains per execution
No proxy (direct crawl)
No option to configure robots.txt handling
No OCR or image scraping
Single result per domain (www/non-www canonicalization)
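
Given the 200-domain cap and the one-record-per-domain rule, a caller with a longer list might deduplicate and batch it before starting runs. The sketch below is an assumption about client-side usage, not part of the Actor. canonical_host only strips a leading "www."; it does not compute a true registrable domain, which would require the Public Suffix List.

```python
from typing import Dict, List
from urllib.parse import urlparse

MAX_DOMAINS_PER_RUN = 200  # limit stated above


def canonical_host(url: str) -> str:
    """Lowercase the host and drop a leading 'www.' (simplified canonicalization)."""
    host = urlparse(url if "://" in url else "https://" + url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def plan_runs(urls: List[str], batch_size: int = MAX_DOMAINS_PER_RUN) -> List[List[str]]:
    """Keep the first URL per canonical host, then split into run-sized batches."""
    first_by_host: Dict[str, str] = {}
    for url in urls:
        first_by_host.setdefault(canonical_host(url), url)
    unique = list(first_by_host.values())
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]
```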
