Pricing

$9.99/month + usage

Website Email & Contact Extractor: Lead Generation Tool

Extract emails, phone numbers and social media links from any website. Auto-scans homepage plus contact and about pages. Returns verified leads with LinkedIn, Twitter, Instagram profiles. Perfect for B2B outreach and lead generation.

Rating: 3.0 (1)
Developer: Scrape Pilot
Issues response: 16 hours
Last modified: 2 months ago

📧 Website Email & Contact Extractor: Lead Generation Tool

Extract emails, phone numbers, social media links, and addresses from public websites. Built for lead generation, B2B prospecting, and sales outreach at scale, with no login or API keys required. Each run processes up to max_results websites (default 50) and returns at most 20 emails and 10 phone numbers per domain.

📌 Table of Contents

What Is This Actor?
Why Choose This Website Email & Contact Extractor?
Use Cases
What Data Is Extracted?
How It Scans Each Website
Output Fields (Full Reference)
Input Parameters
Example Inputs & Outputs
Keyword Filtering
Social Media Link Extraction
Address Extraction
Status Codes Explained
Proxy Configuration
Performance & Rate Limits
FAQ
Changelog
Legal & Terms of Use

🔍 What Is This Actor?

Website Email & Contact Extractor is an Apify actor that crawls a website and extracts publicly available contact information (emails, phone numbers, social media profile links, and physical addresses) from the homepage and up to 4 sub-pages per domain, such as contact, about, team, and impressum pages. With the default pages_to_scan of 5, that is 5 pages in total.

This Website Email & Contact Extractor is designed for sales teams, marketing agencies, recruiters, and developers who need to build targeted lead lists at scale without manually visiting each website. Provide a list of URLs and receive back a clean, structured dataset of contact data — ready for CRM import, outreach campaigns, or market research.

Whether you need to extract 10 emails from a niche competitor list or scrape contact data from 1,000 company websites for a B2B lead generation campaign, this actor handles it reliably with smart deduplication, false-positive filtering, and residential proxy support.

🚀 Why Choose This Website Email & Contact Extractor?

Feature	This Actor	Manual Research	Generic Scrapers
Email extraction from any website	✅ mailto + text + HTML	❌	⚠️ Partial
Phone number extraction	✅ tel: links + text	❌	⚠️
Social media profile links	✅ 7 platforms	❌	⚠️
Physical address extraction	✅ Schema.org + CSS	❌	❌
Multi-page scan per domain	✅ Up to 5 pages	❌	❌
False positive filtering	✅ Smart skip-list	❌	❌
Bulk URL processing	✅ 1–1000 sites	❌	⚠️
Keyword filtering	✅ Built-in	❌	❌
Residential proxy support	✅ Built-in	❌	⚠️
Duplicate removal	✅ Smart dedup	❌	❌

Bottom line: This Website Email & Contact Extractor gives you deduplicated contact data from the homepage and the discovered contact pages of each website, far faster than manual research.

🎯 Use Cases

📬 B2B Lead Generation

Use this Website Email & Contact Extractor to build targeted prospect lists for outbound sales campaigns. Feed in a list of company websites in your target industry and extract the publicly listed contact emails, phone numbers, and LinkedIn profile links automatically.

🏢 Sales Prospecting & CRM Building

Extract contact data from hundreds of business websites in your target segment. Enrich your CRM with emails, phones, and social links without manual data entry. This Website Email & Contact Extractor turns raw URL lists into actionable sales intelligence.

📣 Marketing & Outreach Campaigns

Build cold email lists, identify influencer contacts, or find PR and media contacts from publisher websites. This contact extractor finds every publicly listed email address across the homepage, contact page, and about page.

🔍 Competitor Research

Scrape contact pages from competitor websites to understand their support structure, regional offices, and social media presence. The Website Email & Contact Extractor maps out the full contact footprint of any business.

🤝 Partnership & Sponsorship Outreach

Extract partnership or sponsorship contact emails from brand websites and media companies. Find the right contact quickly without browsing through multiple pages manually.

💼 Recruitment & HR

Extract contact emails from company career and team pages for headhunting and direct outreach. This Website Email & Contact Extractor finds emails even when they are not in an obvious mailto: link — including plain-text and obfuscated addresses in page content.

🌍 Local Business Data Collection

Collect phone numbers, addresses, and email addresses from local business websites for directory building, market mapping, or regional sales coverage analysis.

📊 Market Research & Data Enrichment

Enrich a dataset of company domains with contact data. Match extracted emails with LinkedIn profiles via social_links output, verify physical addresses, and cross-reference phone numbers — all in one run.

📊 What Data Is Extracted?

This Website Email & Contact Extractor pulls the following contact data from each website:

✉️ Email Addresses

Extracted from mailto: links (highest accuracy)
Extracted from visible page text using regex pattern matching
Extracted from raw HTML source including obfuscated patterns
False positives filtered out: no-reply, example.com, asset files, CDN domains, tracking pixels
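
The email rules above could be sketched roughly as follows. This is a minimal illustration and not the actor's actual code: the regex, the `SKIP_SUBSTRINGS` and `ASSET_SUFFIXES` lists, and the function name `extract_emails` are all assumptions made for the example.

```python
import re

# Hypothetical pattern and skip-lists; the actor's real rules are not published here.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SKIP_SUBSTRINGS = ("no-reply", "noreply", "example.com", "schema.org")
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".css", ".js")


def extract_emails(html: str, limit: int = 20) -> list[str]:
    """Return unique, lowercased emails: mailto: links first, then the rest of the HTML."""
    mailto = re.findall(r"mailto:([^\"'?\s>]+)", html, flags=re.IGNORECASE)
    candidates = mailto + EMAIL_RE.findall(html)
    seen: list[str] = []
    for raw in candidates:
        email = raw.strip().lower()
        if not EMAIL_RE.fullmatch(email):
            continue
        # Drop no-reply/example addresses and asset file names such as "logo@2x.png".
        if any(s in email for s in SKIP_SUBSTRINGS) or email.endswith(ASSET_SUFFIXES):
            continue
        if email not in seen:
            seen.append(email)
    return seen[:limit]
```

The default `limit` of 20 mirrors the documented cap on the `emails` output field.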

📞 Phone Numbers

Extracted from tel: links (highest accuracy — exact format preserved)
Extracted from visible page text supporting all international formats
Validated by digit count (7–15 digits — rejects ZIP codes and short strings)
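
A digit-count check like the one described above might look like this. The candidate regex `PHONE_CANDIDATE_RE` and both function names are hypothetical; only the 7 to 15 digit rule and the cap of 10 numbers come from this page.

```python
import re

# Loose, hypothetical candidate pattern: an optional "+", then digits and common separators.
PHONE_CANDIDATE_RE = re.compile(r"\+?\d[\d\s().-]{5,}\d")


def is_plausible_phone(candidate: str) -> bool:
    """Accept a candidate only if it contains 7 to 15 digits."""
    digits = re.sub(r"\D", "", candidate)
    return 7 <= len(digits) <= 15


def extract_phones(text: str, limit: int = 10) -> list[str]:
    """Collect unique phone-like strings, keeping their original formatting."""
    found: list[str] = []
    for match in PHONE_CANDIDATE_RE.finditer(text):
        candidate = match.group().strip()
        if is_plausible_phone(candidate) and candidate not in found:
            found.append(candidate)
    return found[:limit]
```

A five-digit ZIP code such as `10001` fails the digit-count check, while `+1 (800) 555-0100` (11 digits) passes.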

🔗 Social Media Profile Links

Extracted from all pages for these 7 platforms:

LinkedIn — company pages and personal profiles
Facebook — business pages
Twitter / X — brand accounts
Instagram — business profiles
YouTube — channels
TikTok — brand accounts
GitHub — organization and user profiles

🏠 Physical Address

Extracted via Schema.org PostalAddress microdata (highest accuracy)
Falls back to CSS class/ID patterns like .address, .location

🏷️ Company Name

Extracted from og:site_name meta tag (most accurate)
Falls back to <title> tag with common suffixes stripped
Falls back to domain name if nothing else is found
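
The three-step fallback above can be illustrated with a small sketch. It assumes a simplified `<meta>` layout (`property` before `content`) and treats the part of the title before the first separator as the name; the separator set and the function name `extract_company_name` are illustrative assumptions, not the actor's implementation.

```python
import re
from urllib.parse import urlparse

# Hypothetical separators that often divide a site name from a page label or tagline.
TITLE_SEPARATORS = re.compile(r"\s+[|\-:]\s+")


def extract_company_name(html: str, url: str) -> str:
    """Try og:site_name, then <title>, then the domain name."""
    og = re.search(
        r'<meta[^>]+property=["\']og:site_name["\'][^>]+content=["\']([^"\']+)["\']',
        html, flags=re.IGNORECASE)
    if og:
        return og.group(1).strip()
    title = re.search(r"<title[^>]*>(.*?)</title>", html, flags=re.IGNORECASE | re.DOTALL)
    if title and title.group(1).strip():
        # Keep only the part before the first separator, e.g. "Acme | Home" -> "Acme".
        return TITLE_SEPARATORS.split(title.group(1).strip())[0].strip()
    host = urlparse(url).hostname or url
    label = host.removeprefix("www.").split(".")[0]
    return label.replace("-", " ").title()
```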

⚙️ How It Scans Each Website

This Website Email & Contact Extractor uses a smart multi-page crawl strategy to maximize contact data coverage:

Phase 1 — Homepage Scan

The actor fetches the homepage and immediately extracts all emails, phones, social links, and address data. Most business websites list at least their social profiles and support email on the homepage.

Phase 2 — Contact Page Discovery

Using keyword matching on link text and href attributes, the actor finds internal links to contact-related pages such as: contact, about, team, reach, connect, get-in-touch, support, imprint, impressum, legal.
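
As a rough sketch of this discovery step, the following uses only the standard library. The keyword list is copied from the paragraph above; the class `_LinkCollector`, the function `discover_contact_pages`, and the same-host restriction are assumptions for illustration.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin, urlparse

# Keywords taken from the list above.
CONTACT_KEYWORDS = ("contact", "about", "team", "reach", "connect",
                    "get-in-touch", "support", "imprint", "impressum", "legal")


class _LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from anchor tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def discover_contact_pages(html: str, base_url: str, max_pages: int = 4) -> list[str]:
    """Return up to max_pages same-host URLs whose href or link text contains a keyword."""
    collector = _LinkCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    found: list[str] = []
    for href, text in collector.links:
        url = urljoin(base_url, href).split("#")[0]
        haystack = f"{href} {text}".lower()
        if urlparse(url).netloc != base_host or url in found:
            continue
        if any(keyword in haystack for keyword in CONTACT_KEYWORDS):
            found.append(url)
    return found[:max_pages]
```

The default `max_pages` of 4 corresponds to `pages_to_scan - 1` with the default setting of 5.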

Phase 3 — Sub-Page Scanning

The actor visits up to pages_to_scan - 1 discovered contact pages (default: 4 additional pages after homepage). Contact pages almost always contain the most complete and accurate contact data.

Phase 4 — Deduplication & Assembly

All extracted emails, phones, socials, and addresses are deduplicated across all scanned pages and assembled into one clean record per domain. A status flag (Verified, Partial, No Data) is assigned based on completeness.
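
One way to picture the merge step is the sketch below. The per-page dictionary shape and the function name `merge_page_results` are hypothetical; the field names and the caps of 20 emails and 10 phone numbers follow the output reference on this page.

```python
def merge_page_results(pages: list[dict]) -> dict:
    """Merge per-page results into one record, keeping first-seen order."""
    emails: list[str] = []
    phones: list[str] = []
    socials: dict[str, str] = {}
    address = ""
    for page in pages:
        for email in page.get("emails", []):
            normalized = email.strip().lower()
            if normalized not in emails:
                emails.append(normalized)
        for phone in page.get("phone_numbers", []):
            if phone not in phones:
                phones.append(phone)
        for platform, link in page.get("social_links", {}).items():
            socials.setdefault(platform, link)  # keep the first link per platform
        address = address or page.get("address", "")
    return {
        "emails": emails[:20],         # documented cap: max 20
        "phone_numbers": phones[:10],  # documented cap: max 10
        "social_links": socials,
        "address": address,
        "pages_scanned": len(pages),
    }
```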

Smart Filtering

Email false positives removed: CDN domains, asset domains, schema.org, no-reply addresses, and common test/example addresses are all excluded automatically
Phone false positives removed: Strings shorter than 7 digits or longer than 15 digits are rejected
Social link false positives removed: Share buttons, login URLs, dialog intents, and plugin embeds are all excluded

📋 Output Fields (Full Reference)

Each record produced by this Website Email & Contact Extractor contains:

Field	Type	Description	Example
`domain`	string	Website domain	`"www.example.com"`
`company_name`	string	Extracted company name	`"Acme Corporation"`
`emails`	array	All unique emails found (max 20)	`["info@example.com", "sales@example.com"]`
`phone_numbers`	array	All unique phone numbers found (max 10)	`["+1 (800) 555-0100", "+44 20 7946 0958"]`
`social_links`	object	Social media profile URLs by platform	`{"linkedin": "https://linkedin.com/company/...", "twitter": "https://x.com/..."}`
`address`	string	Physical address (if found)	`"123 Main St, New York, NY 10001, US"`
`source_url`	string	The input URL that was scraped	`"https://www.example.com"`
`pages_scanned`	integer	Number of pages crawled for this domain	`4`
`status`	string	Data completeness status	`"Verified"`, `"Partial"`, `"No Data"`
`extracted_at`	string	ISO timestamp of extraction	`"2024-11-01T10:30:00Z"`

Social Links Object Structure

{
  "linkedin":  "https://www.linkedin.com/company/example-corp",
  "facebook":  "https://www.facebook.com/ExampleCorp",
  "twitter":   "https://x.com/ExampleCorp",
  "instagram": "https://www.instagram.com/examplecorp",
  "youtube":   "https://www.youtube.com/@ExampleCorp",
  "tiktok":    "https://www.tiktok.com/@examplecorp",
  "github":    "https://github.com/example-corp"
}

Only platforms where a profile link is found are included. Missing platforms are simply absent from the object.

⚙️ Input Parameters

{
  "target_urls": [
    "https://www.hubspot.com",
    "https://www.salesforce.com",
    "https://www.mailchimp.com"
  ],
  "target_url":    "",
  "keyword":       "",
  "pages_to_scan": 5,
  "max_results":   50,
  "proxyConfiguration": {
    "useApifyProxy":    true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Parameter	Type	Default	Description
`target_urls`	array or string	`[]`	List of website URLs to extract contact data from. One URL per item or newline-separated string.
`target_url`	string	`""`	Single website URL shortcut — automatically added to `target_urls`
`keyword`	string	`""`	Optional filter — only return results where this keyword appears in domain, company name, email, or address
`pages_to_scan`	integer	`5`	Number of pages to scan per website (1 = homepage only; 5 = homepage + 4 contact sub-pages)
`max_results`	integer	`50`	Maximum number of websites to process in one run
`proxyConfiguration`	object	Off	Apify proxy config — recommended for large-scale extraction
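
To make the interplay of `target_urls`, `target_url`, and `max_results` concrete, here is a small sketch of how such input could be normalized. The function name `normalize_targets`, the choice to append `target_url` at the end, and the HTTPS default for bare domains are assumptions, not documented behavior.

```python
def normalize_targets(actor_input: dict) -> list[str]:
    """Combine target_urls (list or newline-separated string) and target_url into one list."""
    raw = actor_input.get("target_urls") or []
    if isinstance(raw, str):
        raw = raw.splitlines()
    single = (actor_input.get("target_url") or "").strip()
    if single:
        raw = list(raw) + [single]
    urls: list[str] = []
    for item in raw:
        url = item.strip()
        if not url:
            continue
        if not url.startswith(("http://", "https://")):
            url = "https://" + url  # assumption: bare domains default to HTTPS
        if url not in urls:
            urls.append(url)
    # Respect the documented max_results cap (default 50).
    return urls[: int(actor_input.get("max_results", 50))]
```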

📦 Example Inputs & Outputs

Example 1: Extract Contact from a Single Website

Input:

{
  "target_url": "https://www.hubspot.com",
  "pages_to_scan": 5
}

Output:

{
  "domain":       "www.hubspot.com",
  "company_name": "HubSpot",
  "emails": [
    "press@hubspot.com",
    "legal@hubspot.com",
    "privacy@hubspot.com"
  ],
  "phone_numbers": [
    "+1 (888) 482-7768"
  ],
  "social_links": {
    "linkedin":  "https://www.linkedin.com/company/hubspot",
    "facebook":  "https://www.facebook.com/hubspot",
    "twitter":   "https://x.com/HubSpot",
    "instagram": "https://www.instagram.com/hubspot",
    "youtube":   "https://www.youtube.com/@HubSpot"
  },
  "address":       "25 First Street, 2nd Floor, Cambridge, MA 02141, United States",
  "source_url":    "https://www.hubspot.com",
  "pages_scanned": 4,
  "status":        "Verified",
  "extracted_at":  "2024-11-01T10:30:00Z"
}

Example 2: Bulk Lead Generation — B2B List

Input:

{
  "target_urls": [
    "https://www.company1.com",
    "https://www.company2.io",
    "https://www.agency3.co.uk",
    "https://www.startup4.com",
    "https://www.firm5.de"
  ],
  "pages_to_scan": 4,
  "max_results": 5
}

Output: 5 records — one per domain — each containing deduplicated emails, phones, social links, addresses, company names, and status codes. Ready for direct CRM import.
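
For CRM import, nested fields usually need flattening first. The sketch below shows one possible approach with the standard `csv` module; the column selection, the `"; "` joiner, and the function name `records_to_csv` are illustrative choices rather than anything the actor prescribes.

```python
import csv
import io

# Hypothetical column layout for a CRM import file.
COLUMNS = ["domain", "company_name", "emails", "phone_numbers", "linkedin", "address", "status"]


def records_to_csv(records: list[dict]) -> str:
    """Flatten actor records into CSV text, joining list fields with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({
            "domain": record.get("domain", ""),
            "company_name": record.get("company_name", ""),
            "emails": "; ".join(record.get("emails", [])),
            "phone_numbers": "; ".join(record.get("phone_numbers", [])),
            "linkedin": record.get("social_links", {}).get("linkedin", ""),
            "address": record.get("address", ""),
            "status": record.get("status", ""),
        })
    return buffer.getvalue()
```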

Example 3: Keyword-Filtered Extraction

Input:

{
  "target_urls": ["https://www.agency1.com", "https://www.agency2.com", "https://www.tech3.com"],
  "keyword": "marketing",
  "pages_to_scan": 3
}

Behavior: Only returns records where the word "marketing" appears in the domain name, company name, email addresses, or physical address. Websites that do not match are skipped with a log message.

Example 4: Homepage-Only Quick Scan

Input:

{
  "target_urls": ["https://site1.com", "https://site2.com", "https://site3.com"],
  "pages_to_scan": 1
}

Use this for: Large lists where speed matters more than completeness. Setting pages_to_scan: 1 only scans the homepage and skips contact/about page discovery entirely.

🔍 Keyword Filtering

The keyword parameter lets you filter results at extraction time — saving you the need to filter the dataset manually after the run.

When keyword is set, the actor checks whether the keyword appears (case-insensitive) in any of these fields for each record:

domain
company_name
emails (joined as a string)
address

If the keyword is not found in any of these fields, the record is skipped and not included in the output.

Example use cases for keyword filtering:

keyword: "sales" → only return companies with "sales" in their domain or email
keyword: "london" → only return companies with London in their address
keyword: "@agency.com" → only return records with a specific domain email format
keyword: "recruiting" → filter for HR-related websites
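
The matching rule described above amounts to a case-insensitive substring test over four fields. A minimal sketch, with the hypothetical function name `matches_keyword`:

```python
def matches_keyword(record: dict, keyword: str) -> bool:
    """Case-insensitive substring check over domain, company_name, emails and address."""
    if not keyword:
        return True  # an empty keyword disables filtering
    haystack = " ".join([
        record.get("domain", ""),
        record.get("company_name", ""),
        " ".join(record.get("emails", [])),  # emails joined as one string
        record.get("address", "") or "",
    ]).lower()
    return keyword.lower() in haystack
```

Because this is a plain substring test, `keyword: "sales"` would also match a domain such as `wholesales.com`.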

🔗 Social Media Link Extraction

This Website Email & Contact Extractor searches the full raw HTML of every scanned page for social media profile URLs matching patterns for 7 platforms. Extracted links are cleaned and validated before inclusion:

Tracking parameters removed from social URLs
Share buttons filtered out — /share, /sharer, /intent/tweet, /dialog/feed paths are all excluded
Login and signup pages excluded — /login, /signup paths filtered
Only one link per platform — the first valid profile URL found per platform is returned
Trailing slashes normalized — for consistent formatting

Social links are returned as a flat object with platform names as keys, making them easy to map into CRM fields or LinkedIn enrichment workflows.
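
The cleaning rules above could be approximated as shown below. The host-to-platform map, the excluded path fragments (taken from the list above), and the function name `clean_social_links` are assumptions for this sketch; it drops the whole query string instead of detecting individual tracking parameters.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host-to-platform map for the 7 supported platforms.
PLATFORM_HOSTS = {
    "linkedin.com": "linkedin", "facebook.com": "facebook", "twitter.com": "twitter",
    "x.com": "twitter", "instagram.com": "instagram", "youtube.com": "youtube",
    "tiktok.com": "tiktok", "github.com": "github",
}
EXCLUDED_PATH_PARTS = ("/share", "/sharer", "/intent/tweet", "/dialog/feed", "/login", "/signup")


def clean_social_links(urls: list[str]) -> dict[str, str]:
    """Return at most one cleaned profile URL per platform, in first-seen order."""
    links: dict[str, str] = {}
    for url in urls:
        parts = urlsplit(url)
        host = (parts.hostname or "").removeprefix("www.")
        platform = PLATFORM_HOSTS.get(host)
        path = parts.path.rstrip("/")  # normalize trailing slashes
        if platform is None or not path or platform in links:
            continue
        if any(part in path.lower() for part in EXCLUDED_PATH_PARTS):
            continue
        # Drop the query string and fragment, which is where tracking parameters live.
        links[platform] = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return links
```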

🏠 Address Extraction

Physical address extraction uses a two-tier approach:

Tier 1 — Schema.org Structured Data (Highest Accuracy)

Websites using PostalAddress or LocalBusiness microdata markup have structured address fields that the actor reads directly: streetAddress, addressLocality, addressRegion, postalCode, addressCountry. This is common on e-commerce sites, service businesses, and sites built with modern CMS platforms.

Tier 2 — CSS Pattern Matching (Fallback)

For sites without Schema.org markup, the actor searches for common address containers using selectors like .address, .location, [class*='address'], [id*='address']. Text is extracted and length-validated (10–200 characters) to filter out noise.

Address extraction works best on business websites that display their address in a footer, contact page, or sidebar. It may not work for websites that intentionally hide or obfuscate their physical location.
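
To illustrate Tier 1, the sketch below reads `itemprop` values for the five PostalAddress fields using only the standard library. The parser class, the function name `extract_schema_address`, and the comma-joined output format are assumptions; the actor's actual formatting may differ.

```python
from html.parser import HTMLParser

ADDRESS_PROPS = ("streetAddress", "addressLocality", "addressRegion",
                 "postalCode", "addressCountry")


class _PostalAddressParser(HTMLParser):
    """Collect the first value seen for each PostalAddress itemprop."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: dict[str, str] = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        prop = attr.get("itemprop")
        if prop in ADDRESS_PROPS and prop not in self.fields:
            if attr.get("content"):
                # e.g. <meta itemprop="addressCountry" content="US">
                self.fields[prop] = attr["content"].strip()
            else:
                self._current = prop  # value is the element's text

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None


def extract_schema_address(html: str) -> str:
    """Join the PostalAddress fields found in the HTML, or return an empty string."""
    parser = _PostalAddressParser()
    parser.feed(html)
    return ", ".join(parser.fields[p] for p in ADDRESS_PROPS if p in parser.fields)
```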

🏷️ Status Codes Explained

Every record from this Website Email & Contact Extractor includes a status field:

Status	Meaning
`"Verified"`	At least one email and at least one phone number were found
`"Partial"`	Some contact data was found, but not both an email and a phone number (for example emails only, phones only, or social links only)
`"No Data"`	No emails, phones, or social links were found on any scanned page

Use the status field to prioritize outreach: "Verified" records are the highest-confidence leads. "Partial" records may still be useful for social outreach even without a direct email.
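
Read literally, the status table reduces to a few lines of logic. A sketch with the hypothetical function name `assign_status`:

```python
def assign_status(emails: list, phone_numbers: list, social_links: dict) -> str:
    """Map extracted contact data to the documented status values."""
    if emails and phone_numbers:
        return "Verified"
    if emails or phone_numbers or social_links:
        return "Partial"
    return "No Data"
```

Note that the address field plays no part in the status as described in the table.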

🌐 Proxy Configuration

Recommended Setup

{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

When to Use Proxy

Running bulk extraction across 50+ websites
Scraping websites that block known datacenter IPs (media sites, legal firms, financial services)
Scraping websites in regions with geo-restricted access
Any production-scale lead generation run

When Proxy Is Optional

Small runs of under 10 websites
Scraping small business or startup websites
Quick one-off lookups on a single domain

The actor uses curl_cffi with Chrome 110 browser impersonation, which already bypasses most bot detection for small volumes without requiring a proxy.

⚡ Performance & Rate Limits

Speed Benchmarks

Mode	Websites	Pages/Site	Estimated Time
Single site, full scan	1	5	~15–30 seconds
Small batch	10	5	~2–4 minutes
Medium batch	50	3	~8–12 minutes
Large batch	200	2	~25–35 minutes
Homepage only, large batch	100	1	~8–12 minutes

Reliability Features

1.5–3 second delay between each website request to avoid rate limiting
0.5–1.5 second delay between sub-pages within the same domain
Chrome 110 browser impersonation via curl_cffi to minimize bot detection
Graceful failure handling — failed websites are logged and skipped, never crashing the run
Smart false-positive filtering on both emails and phone numbers for clean output
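
The delay and failure-handling behavior listed above might be structured like the loop below. This is only a sketch: `process_sites` and the `scrape_site` callback are hypothetical names, and the real actor's control flow is not shown on this page.

```python
import logging
import random
import time
from typing import Callable


def process_sites(urls: list[str],
                  scrape_site: Callable[[str], dict],
                  sleep: Callable[[float], None] = time.sleep) -> list[dict]:
    """Scrape each URL in turn, pausing between sites and skipping failures."""
    records: list[dict] = []
    for index, url in enumerate(urls):
        if index:
            # Random 1.5 to 3 second pause between websites.
            sleep(random.uniform(1.5, 3.0))
        try:
            records.append(scrape_site(url))
        except Exception as exc:
            # A failed website is logged and skipped so the run continues.
            logging.warning("Skipping %s: %s", url, exc)
    return records
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.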

❓ FAQ

Q: What types of email addresses are extracted? A: All publicly visible emails — from mailto: links (highest accuracy), from page text (regex), and from raw HTML. False positives like CDN domains, asset file paths, tracking pixel emails, example.com addresses, and no-reply addresses are automatically filtered out.

Q: Can this extract emails hidden behind JavaScript or contact forms? A: No. This actor extracts emails that are present in the HTML source code. Emails that only appear after JavaScript execution or that are submitted via contact forms are not accessible without a full browser renderer.

Q: Why are some phone numbers showing unusual formats? A: Phone numbers are extracted as-is from page text to preserve their original formatting (which varies by country). You can normalize them post-extraction using a phone parsing library like libphonenumber.

Q: What happens if a website has no contact page? A: The actor scans the homepage only. If no contact sub-pages are discovered via link analysis, only homepage data is extracted. The record is returned with pages_scanned: 1.

Q: Can I use this for Gmail, Outlook, or other webmail login pages? A: No. This Website Email & Contact Extractor only works with public-facing business websites. Login-gated pages, authenticated pages, and private intranets are not accessible.

Q: Is extracted data deduplicated? A: Yes. Emails and phone numbers are deduplicated across all pages scanned for a given domain — you will never see the same email twice in a single record.

Q: How accurate is the company name extraction? A: Very high for websites using the og:site_name meta tag (most modern sites). For older sites, it falls back to the <title> tag with common separators stripped. As a last resort, it uses the domain name formatted as a title.

Q: Can I process 1,000 websites in one run? A: Yes. Set max_results: 1000 and provide 1,000 URLs. For very large runs, residential proxy is strongly recommended. Estimated time: 3–5 hours depending on pages_to_scan setting.

Q: Does this work on non-English websites? A: Yes. Email and phone extraction uses language-agnostic regex patterns. Contact page discovery includes German keywords (kontakt, impressum) in addition to English ones. Address extraction via Schema.org works regardless of the page language.

Q: What is the pages_to_scan parameter and what should I set it to? A: It controls how many pages are scanned per website. 1 = homepage only (fastest). 5 = homepage + up to 4 sub-pages (most complete). For lead generation, 3–5 is recommended. For large bulk runs where speed matters, 1–2 is better.

📜 Changelog

v1.0.0 (Current)

✅ Email extraction from mailto: links, page text, and raw HTML
✅ Phone extraction from tel: links and international format text patterns
✅ Social media link extraction for 7 platforms (LinkedIn, Facebook, Twitter/X, Instagram, YouTube, TikTok, GitHub)
✅ Physical address extraction via Schema.org and CSS fallback
✅ Company name extraction via og:site_name, <title>, and domain fallback
✅ Multi-page scan per domain with automatic contact page discovery
✅ Smart false-positive filtering for emails and phones
✅ Social link cleaning (removes share buttons, login redirects, tracking params)
✅ Keyword filtering support
✅ Verified / Partial / No Data status scoring per record
✅ Residential proxy support, plus Chrome 110 impersonation via curl_cffi
✅ 1.5–3 second random delay between requests for safe rate limiting
✅ Bulk processing up to 1,000 websites per run

⚖️ Legal & Terms of Use

This Website Email & Contact Extractor collects contact data that is publicly visible on business websites — the same information a person would see when visiting the site in a browser.

Please follow these guidelines:

Only extract contact data from websites you have a legitimate reason to access
Use extracted emails only for outreach that applicable law permits, and only to opted-in recipients where consent is required
Comply with CAN-SPAM, GDPR, CASL, and other applicable email and data protection regulations in your jurisdiction
Do not use extracted data to send spam, unsolicited bulk emails, or harassing messages
Respect the robots.txt file and Terms of Service of each website you scrape
Do not use this tool for unauthorized competitive intelligence or data resale without consent

GDPR Note: In the EU, publicly listed business contact information (company emails, phone numbers) may be processed for legitimate business purposes. Personal email addresses require a valid legal basis under GDPR Article 6. Always consult a legal professional for your specific use case.

🤝 Support & Feedback

Bug report? Open a GitHub issue or contact via the Apify actor page
Feature request? Drop a suggestion in the Apify Community forum
Works great? Please leave a ⭐ review on the Apify Store — it helps others find this Website Email & Contact Extractor!

Built with ❤️ on Apify · Website Email & Contact Extractor for Lead Generation
Extract emails, phones, social links & addresses from any website — fast, clean, and at scale
