📧Extract Emails, Socials & Contacts from Any Website✨
Pricing
from $1.00 / 1,000 websites processed
Instantly extract emails, social media profiles, phone numbers, and contact details from any website. Save hours of manual research and build targeted lead lists effortlessly. Handles bulk lists of 1000+ websites. Extracts from contact pages, about pages, and homepage automatically.
Developer
Sept Solutions
Maintained by Community
Website Contact & Social Extractor
Apify Actor that extracts contact information, social media links, and key page URLs from websites. Built with Crawlee PlaywrightCrawler and migrated from a production Puppeteer extraction pipeline.
Features
- Email extraction: scans visible page text and `mailto:` links, then deduplicates and normalizes addresses
- Phone extraction: matches US-style numbers in body text and `tel:` links, and formats them as `(AAA) BBB-CCCC` with an optional `+1` prefix when explicitly present
- Social links: finds the first link for LinkedIn, Facebook, Instagram, Twitter/X, YouTube, TikTok, Pinterest, Snapchat, WhatsApp, Telegram, and Skype
- Contact & about pages: discovers and records the first contact and about page URLs on the homepage
- Sub-page crawling: follows same-origin links matching configurable keywords (default: `contact`, `about`, `locations`) and merges data from up to `maxLinkPages` sub-pages
- Concurrency: processes multiple websites in parallel via the Apify/Crawlee autoscaled pool
- Anti-bot handling: optional stealth plugin, browser hardening, Cloudflare challenge wait, and Crawlee `handleCloudflareChallenge`
- Resource optimization: blocks images, media, fonts, and stylesheets on the main page (safe for text/href extraction)
- Per-URL error isolation: a failed URL does not stop the rest of the run
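The email and phone features above can be illustrated with a minimal sketch. The function names, the email pattern, and the exact `+1 (AAA) BBB-CCCC` output shape are assumptions made for illustration; the Actor's real `extractors.js` may work differently.

```javascript
// Hypothetical helpers for illustration only; not the Actor's actual code.

// Loose email pattern (an assumption, not a full RFC 5322 validator).
const EMAIL_SOURCE = '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}';

// Collect emails from visible text and mailto: hrefs, lowercase them, dedupe.
function extractEmails(text, hrefs = []) {
  const candidates = text.match(new RegExp(EMAIL_SOURCE, 'g')) || [];
  const wholeAddress = new RegExp(`^${EMAIL_SOURCE}$`);
  for (const href of hrefs) {
    if (!/^mailto:/i.test(href)) continue;
    // Drop the scheme and any ?subject=... query string.
    const address = href.slice('mailto:'.length).split('?')[0].trim();
    if (wholeAddress.test(address)) candidates.push(address);
  }
  return [...new Set(candidates.map((email) => email.toLowerCase()))];
}

// Format a raw US number as (AAA) BBB-CCCC, keeping +1 only if the source had it.
function formatUsPhone(raw) {
  const hasPlusOne = /^\s*\+\s*1/.test(raw);
  let digits = raw.replace(/\D/g, '');
  // Strip a leading country code so exactly 10 digits remain.
  if (digits.length === 11 && digits.startsWith('1')) digits = digits.slice(1);
  if (digits.length !== 10) return null; // not a US-style number
  const formatted = `(${digits.slice(0, 3)}) ${digits.slice(3, 6)}-${digits.slice(6)}`;
  return hasPlusOne ? `+1 ${formatted}` : formatted;
}
```

Lowercasing before deduplication is one plausible reading of "normalizes addresses"; a stricter normalizer might also drop addresses that are really asset filenames containing an `@`.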
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `websiteUrls` | string[] | (required) | Websites to analyze. `https://` is added automatically if missing. |
| `maxConcurrency` | integer | 5 | Max parallel browser tabs |
| `maxLinkPages` | integer | 5 | Max contact/about/location sub-pages per site |
| `requestTimeoutSecs` | integer | 30 | Main page navigation timeout (seconds) |
| `stealth` | boolean | true | Enable stealth plugin and browser hardening |
| `blockHeavyResources` | boolean | true | Block images, media, fonts, stylesheets on main page |
| `retries` | integer | 2 | Retries after first attempt (2 = up to 3 total tries) |
| `retryDelayMs` | integer | 2000 | Delay between retries (milliseconds) |
| `finderKeywords` | string[] | `["contact","about","locations"]` | Keywords matched in sub-page link hrefs |
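To make the defaults and the automatic `https://` prefix concrete, here is a hedged sketch of how an input object could be normalized before crawling. `normalizeInput` and `DEFAULTS` are hypothetical names; the default values are copied from the table above, and the Actor's own `config.js` may do this differently.

```javascript
// Hypothetical input normalizer; defaults mirror the input table.
const DEFAULTS = {
  maxConcurrency: 5,
  maxLinkPages: 5,
  requestTimeoutSecs: 30,
  stealth: true,
  blockHeavyResources: true,
  retries: 2,
  retryDelayMs: 2000,
  finderKeywords: ['contact', 'about', 'locations'],
};

function normalizeInput(input) {
  if (!input || !Array.isArray(input.websiteUrls) || input.websiteUrls.length === 0) {
    throw new Error('websiteUrls is required and must be a non-empty array');
  }
  const websiteUrls = input.websiteUrls
    .map((url) => String(url).trim())
    .filter(Boolean)
    // Add https:// only when no http(s) scheme is present.
    .map((url) => (/^https?:\/\//i.test(url) ? url : `https://${url}`));
  // Caller-supplied fields override the defaults.
  return { ...DEFAULTS, ...input, websiteUrls };
}
```

With this sketch, `normalizeInput({ websiteUrls: ['example.com'] })` yields `https://example.com` plus every default from the table.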
Example input
{
  "websiteUrls": ["https://example.com", "https://example.org"],
  "maxConcurrency": 5,
  "maxLinkPages": 5,
  "requestTimeoutSecs": 30,
  "stealth": true,
  "blockHeavyResources": true,
  "retries": 2
}
Output
One dataset item per input URL.
Success example
{
  "url": "https://example.com",
  "title": "Example Domain",
  "phones": ["(555) 123-4567"],
  "emails": ["info@example.com"],
  "linkedin": "",
  "facebook": "",
  "instagram": "",
  "twitter": "",
  "youtube": "",
  "tiktok": "",
  "pinterest": "",
  "snapchat": "",
  "whatsapp": "",
  "telegram": "",
  "skype": "",
  "contact_page_url": "https://example.com/contact",
  "about_page_url": "https://example.com/about"
}
Failure example
{"url": "https://unreachable.example","error": "page.goto: Timeout 30000ms exceeded."}
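The failure item above reflects two documented behaviors: `retries` counts attempts after the first one, and a URL that still fails becomes an `error` item rather than aborting the run. The sketch below shows that logic under stated assumptions. `processUrl`, `processAll`, and the `extract` callback are hypothetical, and `Promise.all` stands in for the Crawlee autoscaled pool, so it does not cap concurrency.

```javascript
// Hypothetical sketch of retry and per-URL error isolation; not the Actor's code.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function processUrl(url, extract, { retries = 2, retryDelayMs = 2000 } = {}) {
  let lastError;
  // retries = 2 means one first attempt plus up to two retries (3 tries total).
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return { url, ...(await extract(url)) };
    } catch (err) {
      lastError = err;
      if (attempt < retries) await sleep(retryDelayMs);
    }
  }
  // Turn the final failure into a dataset item instead of throwing.
  const message = lastError instanceof Error ? lastError.message : String(lastError);
  return { url, error: message };
}

// One result per input URL; a failing URL never rejects the whole batch.
async function processAll(urls, extract, options) {
  return Promise.all(urls.map((url) => processUrl(url, extract, options)));
}
```

Because `processUrl` always resolves, the caller gets exactly one item per input URL, which matches the "one dataset item per input URL" contract.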
Output fields
| Field | Type | Description |
|---|---|---|
| `url` | string | Input website URL |
| `title` | string | HTML `<title>` |
| `phones` | string[] | US-formatted phone numbers |
| `emails` | string[] | Deduplicated emails |
| `linkedin` … `skype` | string | First matching social link (empty if none) |
| `contact_page_url` | string | First contact page href found |
| `about_page_url` | string | First about page href found |
| `error` | string | Present only when extraction failed |
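Since `error` is present only on failed items, downstream code can separate successes from failures by checking that one field. The following sketch assumes the item shape documented above; `splitResults` and `toLeadRows` are hypothetical helper names, and the one-row-per-email layout is just one possible lead-list format.

```javascript
// Hypothetical post-processing helpers for downloaded dataset items.

// Separate items by the presence of the `error` field.
function splitResults(items) {
  const succeeded = [];
  const failed = [];
  for (const item of items) {
    (typeof item.error === 'string' ? failed : succeeded).push(item);
  }
  return { succeeded, failed };
}

// Flatten one success item into one row per email address.
function toLeadRows(item) {
  return (item.emails || []).map((email) => ({
    url: item.url,
    email,
    phone: (item.phones || [])[0] || '', // first phone, if any
    linkedin: item.linkedin || '',
  }));
}
```

The failed list can be fed back in as `websiteUrls` for a second run, for example with a longer `requestTimeoutSecs`.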
Usage
Apify Console
- Open the Actor in Apify Console.
- Paste your input JSON.
- Click Start.
- Download results from the Dataset tab (JSON, CSV, Excel).
Apify API
curl -X POST "https://api.apify.com/v2/acts/YOUR_USERNAME~website-contact-extractor/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"websiteUrls":["https://example.com"]}'
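The same request can be prepared from Node.js. This is a minimal sketch that only builds the URL and request options shown in the curl command; `buildRunRequest` is a hypothetical helper, and `YOUR_USERNAME` and `YOUR_TOKEN` remain placeholders.

```javascript
// Hypothetical helper mirroring the curl call above; it performs no network I/O.
function buildRunRequest(actorId, token, input) {
  // actorId uses the "username~actor-name" form from the curl example.
  const url = new URL(`https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`);
  url.searchParams.set('token', token);
  return {
    url: url.toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(input),
    },
  };
}
```

On Node.js 18+, where `fetch` is available globally, the result can be passed straight through: `const { url, options } = buildRunRequest(...)` followed by `await fetch(url, options)`.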
Apify CLI
apify call YOUR_USERNAME/website-contact-extractor --input '{"websiteUrls":["https://example.com"]}'
Local development
Prerequisites
- Node.js 18+
- Apify CLI (optional, recommended)
Setup
cd backend
npm install
Run locally
Create storage/key_value_stores/default/INPUT.json:
{"websiteUrls": ["https://example.com"],"maxConcurrency": 1,"stealth": true}
Then run:
npm start
Or with Apify CLI:
apify run -p
Results are written to storage/datasets/default/.
Apify deployment
cd backend
apify login
apify push
The Actor uses the apify/actor-node-playwright-chrome:20 Docker image defined in Dockerfile.
Actor Store description
Website Contact & Social Extractor enriches lead lists and company databases by automatically collecting emails, phone numbers, social profiles, and contact/about page URLs from any website.
Ideal for:
- Lead generation — build contact lists from company websites
- Sales enrichment — add phones and social links to CRM records
- Market research — collect public contact data at scale
- Due diligence — verify how businesses present contact information online
Runs fully in the cloud on Apify with configurable concurrency, retries, and anti-bot options.
Limitations
- US phone bias: phone formatting targets US numbers; international numbers may appear unformatted
- Same-origin sub-pages only: contact/about/location links on external domains are not followed
- Static extraction: reads rendered DOM text and links; does not execute custom per-site scraping logic
- Bot-protected sites: heavily protected sites (Cloudflare, CAPTCHA) may return partial or empty results
- No deep crawl: only the homepage plus up to `maxLinkPages` keyword-matched sub-pages are visited
- First-match social links: returns the first anchor per platform, not all profiles
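The same-origin and no-deep-crawl limits above amount to a simple link filter. The sketch below is an assumption about how such a filter could look, not the Actor's `link-pages.js`: `pickSubPages` is a hypothetical name, and matching keywords against the path and query string only is an illustrative choice.

```javascript
// Hypothetical sub-page selector: same-origin, keyword-matched, capped.
function pickSubPages(
  baseUrl,
  hrefs,
  keywords = ['contact', 'about', 'locations'],
  maxLinkPages = 5
) {
  const base = new URL(baseUrl);
  const picked = new Set();
  for (const href of hrefs) {
    let target;
    try {
      target = new URL(href, base); // resolves relative links against the homepage
    } catch {
      continue; // skip malformed hrefs
    }
    // External domains (and schemes like mailto:) are never followed.
    if (target.origin !== base.origin) continue;
    const haystack = (target.pathname + target.search).toLowerCase();
    if (!keywords.some((keyword) => haystack.includes(keyword.toLowerCase()))) continue;
    target.hash = ''; // treat /about#team and /about as the same page
    picked.add(target.toString());
    if (picked.size >= maxLinkPages) break;
  }
  return [...picked];
}
```

Using a `Set` keeps repeated navigation and footer links from consuming the `maxLinkPages` budget more than once.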
Project structure
backend/
├── .actor/                # Apify Actor definition and schemas
├── src/
│   ├── main.js            # Actor entry point
│   ├── crawler.js         # PlaywrightCrawler setup
│   ├── extractors.js      # Page-level extraction
│   ├── link-pages.js      # Sub-page discovery and extraction
│   ├── result-merger.js
│   ├── browser-hooks.js
│   ├── constants.js
│   ├── utils.js
│   └── config.js
├── Dockerfile
├── package.json
└── README.md
License
ISC