📧Extract Emails, Socials & Contacts from Any Website✨

Pricing

from $1.00 / 1,000 websites processed

Instantly extract emails, social media profiles, phone numbers, and contact details from any website. Save hours of manual research and build targeted lead lists effortlessly. Handles bulk lists of 1000+ websites. Extracts from contact pages, about pages, and homepage automatically.

Developer: Sept Solutions (maintained by community)

Website Contact & Social Extractor

Apify Actor that extracts contact information, social media links, and key page URLs from websites. Built with Crawlee PlaywrightCrawler and migrated from a production Puppeteer extraction pipeline.

Features

  • Email extraction — scans visible page text and mailto: links, deduplicates and normalizes addresses
  • Phone extraction — matches US-style numbers in body text and tel: links, formats as (AAA) BBB-CCCC with optional +1 prefix when explicitly present
  • Social links — finds the first link for LinkedIn, Facebook, Instagram, Twitter/X, YouTube, TikTok, Pinterest, Snapchat, WhatsApp, Telegram, and Skype
  • Contact & about pages — discovers and records the first contact and about page URLs on the homepage
  • Sub-page crawling — follows same-origin links matching configurable keywords (default: contact, about, locations) and merges data from up to maxLinkPages sub-pages
  • Concurrency — processes multiple websites in parallel via Apify/Crawlee autoscaled pool
  • Anti-bot handling — optional stealth plugin, browser hardening, Cloudflare challenge wait, and Crawlee handleCloudflareChallenge
  • Resource optimization — blocks images, media, fonts, and stylesheets on the main page (safe for text/href extraction)
  • Per-URL error isolation — a failed URL does not stop the rest of the run
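
The email and phone features above can be illustrated with a minimal sketch. The helper names (extractEmails, formatUsPhone), the regular expression, and the exact "+1 (AAA) BBB-CCCC" output shape are assumptions made for illustration; the Actor's real extractors may differ.

```javascript
// Hypothetical helpers: names, regex, and output shapes are illustrative assumptions.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Collect emails from visible text and mailto: hrefs, then normalize and deduplicate.
function extractEmails(pageText, hrefs) {
  const found = pageText.match(EMAIL_RE) ?? [];
  for (const href of hrefs) {
    if (href.toLowerCase().startsWith('mailto:')) {
      // Drop the scheme and any "?subject=..." query part.
      found.push(href.slice('mailto:'.length).split('?')[0]);
    }
  }
  // Trim and lowercase, then deduplicate while keeping first-seen order.
  const normalized = found.map((email) => email.trim().toLowerCase()).filter(Boolean);
  return [...new Set(normalized)];
}

// Format a US-style number as (AAA) BBB-CCCC, keeping +1 only if it was written explicitly.
function formatUsPhone(raw) {
  const hasPlusOne = /^\s*\+\s*1/.test(raw);
  let digits = raw.replace(/\D/g, '');
  if (digits.length === 11 && digits.startsWith('1')) digits = digits.slice(1);
  if (digits.length !== 10) return null; // not a 10-digit US number
  const formatted = `(${digits.slice(0, 3)}) ${digits.slice(3, 6)}-${digits.slice(6)}`;
  return hasPlusOne ? `+1 ${formatted}` : formatted;
}
```

In this sketch, tel: and mailto: hrefs would be collected from the page's anchors before being passed in; numbers that do not reduce to 10 digits are dropped rather than guessed at.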

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| websiteUrls | string[] | (required) | Websites to analyze. https:// is added automatically if missing. |
| maxConcurrency | integer | 5 | Max parallel browser tabs |
| maxLinkPages | integer | 5 | Max contact/about/location sub-pages per site |
| requestTimeoutSecs | integer | 30 | Main page navigation timeout (seconds) |
| stealth | boolean | true | Enable stealth plugin and browser hardening |
| blockHeavyResources | boolean | true | Block images, media, fonts, stylesheets on main page |
| retries | integer | 2 | Retries after first attempt (2 = up to 3 total tries) |
| retryDelayMs | integer | 2000 | Delay between retries (milliseconds) |
| finderKeywords | string[] | ["contact","about","locations"] | Keywords matched in sub-page link hrefs |
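
Two of the input behaviors described above, the automatic https:// prefix and the retries/retryDelayMs semantics, can be sketched as follows. This is a sketch under assumptions: normalizeUrl and withRetries are hypothetical names, and the Actor may well delegate retries to Crawlee instead of a loop like this.

```javascript
// Hypothetical helpers illustrating the documented input semantics.

// Add https:// when the input has no http(s) scheme.
function normalizeUrl(input) {
  const trimmed = input.trim();
  return /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
}

// Run `task` once, then retry up to `retries` more times (retries = 2 means up to 3 tries).
async function withRetries(task, { retries = 2, retryDelayMs = 2000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Wait before the next attempt, but not after the final failure.
        await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
      }
    }
  }
  throw lastError;
}
```

With the defaults, a URL that always fails would be attempted three times with two 2000 ms pauses in between.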

Example input

{
  "websiteUrls": [
    "https://example.com",
    "https://example.org"
  ],
  "maxConcurrency": 5,
  "maxLinkPages": 5,
  "requestTimeoutSecs": 30,
  "stealth": true,
  "blockHeavyResources": true,
  "retries": 2
}

Output

One dataset item per input URL.

Success example

{
  "url": "https://example.com",
  "title": "Example Domain",
  "phones": ["(555) 123-4567"],
  "emails": ["info@example.com"],
  "linkedin": "",
  "facebook": "",
  "instagram": "",
  "twitter": "",
  "youtube": "",
  "tiktok": "",
  "pinterest": "",
  "snapchat": "",
  "whatsapp": "",
  "telegram": "",
  "skype": "",
  "contact_page_url": "https://example.com/contact",
  "about_page_url": "https://example.com/about"
}

Failure example

{
  "url": "https://unreachable.example",
  "error": "page.goto: Timeout 30000ms exceeded."
}

Output fields

| Field | Type | Description |
| --- | --- | --- |
| url | string | Input website URL |
| title | string | HTML <title> |
| phones | string[] | US-formatted phone numbers |
| emails | string[] | Deduplicated emails |
| linkedin … skype | string | First matching social link per platform (empty if none) |
| contact_page_url | string | First contact page href found |
| about_page_url | string | First about page href found |
| error | string | Present only when extraction failed |
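
Because data from sub-pages is merged into a single item per input URL, a merge step along these lines is implied. The sketch below is an assumption about how that could work, not the contents of result-merger.js: mergeResults is a hypothetical name, list fields are unioned, and each social field keeps the first non-empty value.

```javascript
// Hypothetical merge step: the function name and precedence rules are assumptions.
const SOCIAL_KEYS = [
  'linkedin', 'facebook', 'instagram', 'twitter', 'youtube', 'tiktok',
  'pinterest', 'snapchat', 'whatsapp', 'telegram', 'skype',
];

function mergeResults(homepage, subPages) {
  const merged = {
    ...homepage,
    phones: [...(homepage.phones ?? [])],
    emails: [...(homepage.emails ?? [])],
  };
  for (const page of subPages) {
    merged.phones.push(...(page.phones ?? []));
    merged.emails.push(...(page.emails ?? []));
    for (const key of SOCIAL_KEYS) {
      // Homepage values win; sub-pages only fill fields that are still empty.
      if (!merged[key] && page[key]) merged[key] = page[key];
    }
  }
  // Deduplicate while keeping first-seen order.
  merged.phones = [...new Set(merged.phones)];
  merged.emails = [...new Set(merged.emails)];
  return merged;
}
```

Giving the homepage precedence matches the "first matching link" wording in the table, on the assumption that the homepage is processed before any sub-page.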

Usage

Apify Console

  1. Open the Actor in Apify Console.
  2. Paste your input JSON.
  3. Click Start.
  4. Download results from the Dataset tab (JSON, CSV, Excel).

Apify API

curl -X POST "https://api.apify.com/v2/acts/YOUR_USERNAME~website-contact-extractor/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"websiteUrls":["https://example.com"]}'
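
The same request can be made from Node.js 18+ using the built-in fetch. This is a minimal sketch that mirrors the curl call above; buildRunRequest and startRun are hypothetical helper names, and YOUR_USERNAME and YOUR_TOKEN remain placeholders.

```javascript
// Hypothetical helpers mirroring the curl example; only the endpoint comes from this README.

// Build the URL and fetch options without sending anything (easy to inspect or log).
function buildRunRequest(actorId, token, input) {
  const url = new URL(`https://api.apify.com/v2/acts/${actorId}/runs`);
  url.searchParams.set('token', token);
  return {
    url: url.toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(input),
    },
  };
}

// Send the request and return the parsed JSON response body as-is.
async function startRun(actorId, token, input) {
  const { url, options } = buildRunRequest(actorId, token, input);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Apify API responded with HTTP ${response.status}`);
  }
  return response.json();
}
```

For example, startRun('YOUR_USERNAME~website-contact-extractor', 'YOUR_TOKEN', { websiteUrls: ['https://example.com'] }) would start a run with the same input as the curl command.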

Apify CLI

$ apify call YOUR_USERNAME/website-contact-extractor --input '{"websiteUrls":["https://example.com"]}'

Local development

Prerequisites

  • Node.js 18+
  • Apify CLI (optional, recommended)

Setup

cd backend
npm install

Run locally

Create storage/key_value_stores/default/INPUT.json:

{
  "websiteUrls": ["https://example.com"],
  "maxConcurrency": 1,
  "stealth": true
}

Then run:

$ npm start

Or with Apify CLI:

$ apify run -p

Results are written to storage/datasets/default/.

Apify deployment

cd backend
apify login
apify push

The Actor uses the apify/actor-node-playwright-chrome:20 Docker image defined in Dockerfile.

Actor Store description

Website Contact & Social Extractor enriches lead lists and company databases by automatically collecting emails, phone numbers, social profiles, and contact/about page URLs from any website.

Ideal for:

  • Lead generation — build contact lists from company websites
  • Sales enrichment — add phones and social links to CRM records
  • Market research — collect public contact data at scale
  • Due diligence — verify how businesses present contact information online

Runs fully in the cloud on Apify with configurable concurrency, retries, and anti-bot options.

Limitations

  • US phone bias — phone formatting targets US numbers; international numbers may appear unformatted
  • Same-origin sub-pages only — contact/about/location links on external domains are not followed
  • Static extraction — reads rendered DOM text and links; does not execute custom per-site scraping logic
  • Bot-protected sites — heavily protected sites (Cloudflare, CAPTCHA) may return partial or empty results
  • No deep crawl — only the homepage plus up to maxLinkPages keyword-matched sub-pages are visited
  • First-match social links — returns the first anchor per platform, not all profiles
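
Three of the limits above (same-origin only, at most maxLinkPages sub-pages, first match per social platform) can be made concrete with a short sketch. The function names, the choice to match keywords against the URL path only, and the small host table are illustrative assumptions, not the Actor's actual code.

```javascript
// Hypothetical helpers: names, path-only matching, and the host table are assumptions.

// Pick same-origin links whose path contains a keyword, capped at maxLinkPages.
function selectSubPages(baseUrl, hrefs, options = {}) {
  const { finderKeywords = ['contact', 'about', 'locations'], maxLinkPages = 5 } = options;
  const origin = new URL(baseUrl).origin;
  const selected = new Set();
  for (const href of hrefs) {
    if (selected.size >= maxLinkPages) break;
    let url;
    try {
      url = new URL(href, baseUrl); // resolves relative hrefs against the homepage
    } catch {
      continue; // skip hrefs that cannot be parsed
    }
    if (url.origin !== origin) continue; // external domains are not followed
    url.hash = ''; // treat /about and /about#team as the same page
    const path = url.pathname.toLowerCase();
    if (finderKeywords.some((keyword) => path.includes(keyword))) {
      selected.add(url.toString());
    }
  }
  return [...selected];
}

// Partial host table for illustration; the Actor covers more platforms.
const SOCIAL_HOSTS = {
  linkedin: ['linkedin.com'],
  facebook: ['facebook.com'],
  twitter: ['twitter.com', 'x.com'],
};

// Keep only the first absolute link seen for each platform.
function firstSocialLinks(hrefs) {
  const result = Object.fromEntries(Object.keys(SOCIAL_HOSTS).map((key) => [key, '']));
  for (const href of hrefs) {
    let host;
    try {
      host = new URL(href).hostname.toLowerCase();
    } catch {
      continue; // relative or malformed hrefs cannot be social links here
    }
    for (const [key, hosts] of Object.entries(SOCIAL_HOSTS)) {
      const matches = hosts.some((h) => host === h || host.endsWith(`.${h}`));
      if (!result[key] && matches) result[key] = href;
    }
  }
  return result;
}
```

A site that lists two LinkedIn profiles would therefore report only the one that appears first in the page's link order.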

Project structure

backend/
├── .actor/ # Apify Actor definition and schemas
├── src/
│ ├── main.js # Actor entry point
│ ├── crawler.js # PlaywrightCrawler setup
│ ├── extractors.js # Page-level extraction
│ ├── link-pages.js # Sub-page discovery and extraction
│ ├── result-merger.js
│ ├── browser-hooks.js
│ ├── constants.js
│ ├── utils.js
│ └── config.js
├── Dockerfile
├── package.json
└── README.md

License

ISC