Website Contact Scraper
Pricing
from $150.00 / 1,000 websites scanned
Website Contact Scraper
Extract emails, phone numbers, team member names, job titles, and social media links (LinkedIn, Twitter, Facebook) from any business website. Crawls contact, about, and team pages automatically. Batch process hundreds of URLs.
Rating
5.0
(1)
Developer

ryan clinton
Actor stats
1
Bookmarked
217
Total users
122
Monthly active users
2 days ago
Last modified
Extract emails, phone numbers, team member names, job titles, and social media links from any business website. Provide a list of URLs and get back structured contact data ready for CRM import, lead generation, or outreach campaigns.
Website Contact Scraper crawls each website's homepage plus its contact, about, and team pages to discover every available piece of contact information. Results are deduplicated and aggregated per domain, so you get one clean record per website regardless of how many pages were scraped.
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| Email addresses | mailto links, page body, anchor hrefs | sales@acmecorp.com |
| Phone numbers | tel: links, contact areas (footer, address blocks) | +1 (415) 555-0192 |
| Team member names | Schema.org Person, team cards, heading pairs | Jane Smith |
| Job titles | Adjacent text to names, structured data | Chief Executive Officer |
| LinkedIn profiles | Company and personal profile links | linkedin.com/company/acme |
| Twitter/X profiles | Profile links | twitter.com/acmecorp |
| Facebook pages | Page links | facebook.com/acmecorp |
| Instagram profiles | Profile links | instagram.com/acmecorp |
| YouTube channels | Channel, user, and @ links | youtube.com/@acmecorp |
Why use Website Contact Scraper?
Building prospect lists from company websites by hand is slow, error-prone, and doesn't scale. This actor automates the entire process: give it a list of URLs and it visits each site's homepage, discovers contact/about/team pages, extracts emails, phones, names, titles, and social profiles, deduplicates everything, and returns one clean structured record per domain.
It handles batch processing of hundreds of websites concurrently, filters junk addresses automatically, and outputs data ready for direct CRM import.
Built on the Apify platform, Website Contact Scraper gives you capabilities you won't get from a simple script:
- Scheduling — run daily or weekly to keep your contact database fresh
- API access — trigger runs programmatically from Python, JavaScript, or any HTTP client
- Proxy rotation — scrape at scale without IP blocks using Apify's built-in proxy infrastructure
- Monitoring — get notified when runs fail or produce unexpected results
- Integrations — connect directly to Zapier, Make, Google Sheets, or webhooks
Features
- Email extraction from mailto links, page body text, and anchor href attributes with automatic deduplication and junk email filtering (removes noreply, test, admin, and placeholder addresses)
- Phone number extraction from tel: links and formatted numbers in contact areas (footer, address blocks, elements with contact/phone classes), with validation to reject fake or sequential numbers
- Social media profile discovery for LinkedIn, Twitter/X, Facebook, Instagram, and YouTube with URL normalization
- Team member identification using three strategies: Schema.org Person markup, common team card CSS patterns (.team-member, .staff-member, .bio-card, etc.), and heading-paragraph pairs matching name + job title patterns
- Smart page discovery that automatically follows links to contact, about, team, leadership, and company pages within the same domain
- Configurable crawl depth with per-domain page limits (1 to 20 pages) to balance thoroughness against cost
- Proxy support for scraping at scale without getting blocked
- Batch processing of multiple websites in a single run with concurrent crawling (up to 10 simultaneous requests)
Use cases for scraping website contacts
Sales prospecting
SDRs building targeted prospect lists scrape company websites for decision-maker emails, direct phone numbers, and LinkedIn profiles before launching outreach sequences.
Marketing agency lead lists
Compile client contact databases from lists of business websites to support email marketing, cold calling, or ABM campaigns.
Recruiting and talent sourcing
Extract team pages from target companies to identify hiring managers, department heads, and their direct contact details.
Business research
Gather structured contact data from hundreds of company websites for market mapping, competitive analysis, or industry reports.
Freelancer outreach
Find the right person to pitch at prospective client companies by scraping about and leadership pages for names, titles, and email addresses.
Data enrichment
Augment existing company records with fresh contact details, social profiles, and team member information scraped directly from company websites.
How to scrape website contact information
- Provide website URLs — Enter one or more business website URLs in the input form. Each URL should be the homepage of a website you want to extract contacts from (e.g., https://example.com).
- Configure options — Set the maximum pages per domain (default is 5, which covers most contact/about/team pages). Enable or disable name extraction and social link extraction depending on your needs.
- Run the actor — Click "Start" to begin the scrape. The actor will crawl each website's homepage, discover contact-related subpages, and extract all available contact information.
- Download results — Once finished, download your data as JSON, CSV, or Excel from the Dataset tab. Each row contains one website's complete contact profile with emails, phones, team members, and social links.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| urls | string[] | Yes | — | List of business website URLs to scrape. One result per domain. |
| maxPagesPerDomain | integer | No | 5 | Maximum pages to crawl per website (1-20). Higher values find more contacts but use more compute. |
| includeNames | boolean | No | true | Extract people's names and job titles from team and about pages. |
| includeSocials | boolean | No | true | Extract social media profile links (LinkedIn, Twitter, Facebook, Instagram, YouTube). |
| proxyConfiguration | object | No | Apify Proxy | Proxy settings for the crawl. Recommended when scraping many sites. |
Input examples
Scrape a single website with defaults:
```json
{ "urls": ["https://acmecorp.com"] }
```
Batch scrape with deep crawl:
```json
{
  "urls": [
    "https://acmecorp.com",
    "https://betaindustries.com",
    "https://gammatech.io"
  ],
  "maxPagesPerDomain": 15,
  "includeNames": true,
  "includeSocials": true
}
```
Emails and phones only (no names or socials):
```json
{
  "urls": ["https://example.com", "https://competitor.com"],
  "maxPagesPerDomain": 3,
  "includeNames": false,
  "includeSocials": false
}
```
Input tips
- Start with 5 pages per domain — this covers homepage + contact + about + team in most cases. Only increase if sites have deeply nested staff directories.
- Enable proxies for batches over 20 sites — Apify Proxy prevents rate limiting and IP blocks automatically.
- Use exact homepages — provide https://acmecorp.com, not deep URLs. The actor discovers subpages automatically.
Output example
Each item in the output dataset represents one website domain:
```json
{
  "url": "https://acmecorp.com",
  "domain": "acmecorp.com",
  "emails": [
    "hello@acmecorp.com",
    "sales@acmecorp.com",
    "j.smith@acmecorp.com"
  ],
  "phones": ["+1 (415) 555-0192", "+1 800-555-0134"],
  "contacts": [
    { "name": "Jane Smith", "title": "Chief Executive Officer", "email": "j.smith@acmecorp.com" },
    { "name": "David Chen", "title": "VP of Sales" },
    { "name": "Maria Rodriguez", "title": "Head of Marketing" }
  ],
  "socialLinks": {
    "linkedin": "https://www.linkedin.com/company/acmecorp",
    "twitter": "https://twitter.com/acmecorp",
    "facebook": "https://www.facebook.com/acmecorp",
    "youtube": "https://www.youtube.com/@acmecorp"
  },
  "pagesScraped": 4,
  "scrapedAt": "2026-02-10T14:32:18.456Z"
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| url | string | The normalized input URL (https, no trailing slash) |
| domain | string | Domain name with www. stripped (e.g., acmecorp.com) |
| emails | string[] | Deduplicated email addresses found across all crawled pages, with junk addresses filtered out |
| phones | string[] | Deduplicated phone numbers from tel: links and formatted numbers in contact areas |
| contacts | object[] | Named contacts with name (string), optional title (string), and optional email (string) |
| socialLinks | object | Social media profile URLs keyed by platform: linkedin, twitter, facebook, instagram, youtube |
| pagesScraped | number | Total pages processed for this domain (homepage + subpages) |
| scrapedAt | string | ISO 8601 timestamp of when the scrape completed |
How much does it cost to scrape website contacts?
Website Contact Scraper uses pay-per-event pricing — you pay $0.15 per website scanned. Platform usage costs are included in the price.
| Scenario | Websites | Cost per website | Total cost |
|---|---|---|---|
| Quick test | 1 | $0.15 | $0.15 |
| Small batch | 10 | $0.15 | $1.50 |
| Medium batch | 50 | $0.15 | $7.50 |
| Large batch | 200 | $0.15 | $30.00 |
| Enterprise | 1,000 | $0.15 | $150.00 |
You can set a maximum spending limit per run to control costs. The actor will stop delivering results once your budget is reached, so you never pay more than you expect.
Extract website contacts using the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/website-contact-scraper").call(
    run_input={
        "urls": [
            "https://acmecorp.com",
            "https://betaindustries.com",
        ],
        "maxPagesPerDomain": 5,
        "includeNames": True,
        "includeSocials": True,
    }
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['domain']}: {len(item['emails'])} emails, {len(item['phones'])} phones")
    for contact in item.get("contacts", []):
        print(f"  - {contact['name']}: {contact.get('title', 'N/A')}")
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/website-contact-scraper").call({
    urls: [
        "https://acmecorp.com",
        "https://betaindustries.com",
    ],
    maxPagesPerDomain: 5,
    includeNames: true,
    includeSocials: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`${item.domain}: ${item.emails.length} emails, ${item.phones.length} phones`);
}
```
cURL
```shell
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~website-contact-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://acmecorp.com"], "maxPagesPerDomain": 5, "includeNames": true, "includeSocials": true}'

# Fetch results (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```
How the website contact scraper works
The actor runs a three-phase pipeline for each input URL:
Phase 1: Homepage crawl and link discovery
For each URL, the actor normalizes it (forces HTTPS, strips trailing slashes and www.), creates an empty result object, and crawls the homepage. During the homepage visit it immediately extracts emails, phones, socials, and contacts from the page HTML. It also scans every <a> link on the page for same-domain URLs matching 15 contact-related keywords (contact, about, team, leadership, management, etc.) and enqueues those subpages for crawling.
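The normalization step described above can be sketched in a few lines. This is an illustrative reimplementation, not the actor's actual code:

```python
from urllib.parse import urlparse

def normalize_url(url: str) -> str:
    """Force HTTPS, lowercase the host, strip a leading www. and any trailing slash."""
    parsed = urlparse(url if "://" in url else f"https://{url}")
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parsed.path.rstrip("/")
    return f"https://{host}{path}"

print(normalize_url("http://www.AcmeCorp.com/"))  # https://acmecorp.com
```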
Phase 2: Subpage crawling
Discovered contact, about, team, and leadership pages are crawled up to the maxPagesPerDomain limit. Each subpage goes through the same extraction pipeline. A per-domain page counter prevents exceeding the limit. The CheerioCrawler runs at up to 10 concurrent connections with a 120 requests/minute rate limit, 30-second timeout per page, and 2 automatic retries on failure.
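The per-domain page counter works like a simple budget: each enqueued page claims one slot until the limit is reached. A sketch of the idea (`try_claim_page` is a hypothetical helper, not the actor's API):

```python
from collections import defaultdict

max_pages_per_domain = 5           # mirrors the maxPagesPerDomain input
pages_crawled = defaultdict(int)   # domain -> pages fetched so far

def try_claim_page(domain: str) -> bool:
    """Claim one page of the domain's budget; False once the limit is hit."""
    if pages_crawled[domain] >= max_pages_per_domain:
        return False
    pages_crawled[domain] += 1
    return True

# The first 5 claims succeed, the 6th is rejected:
results = [try_claim_page("acmecorp.com") for _ in range(6)]
print(results)  # [True, True, True, True, True, False]
```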
Phase 3: Result aggregation
After all pages are crawled, the actor produces one output record per domain. All extracted data is deduplicated during the merge step: emails by exact string match, phones by exact string match, contacts by case-insensitive name match, and social links by platform (first match per platform wins). The aggregated record is pushed to the Apify dataset.
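The merge rules above (exact-string dedup for emails and phones, case-insensitive dedup for contact names, first-match-wins for social links) can be sketched as follows. Field names follow the output schema; the actor's real implementation may differ:

```python
def merge_records(pages: list[dict]) -> dict:
    """Aggregate per-page extractions into one record per domain."""
    emails, phones, contacts = [], [], []
    seen_names, socials = set(), {}
    for page in pages:
        for e in page.get("emails", []):
            if e not in emails:                 # exact-string dedup
                emails.append(e)
        for p in page.get("phones", []):
            if p not in phones:
                phones.append(p)
        for c in page.get("contacts", []):
            key = c["name"].lower()             # case-insensitive name dedup
            if key not in seen_names:
                seen_names.add(key)
                contacts.append(c)
        for platform, url in page.get("socialLinks", {}).items():
            socials.setdefault(platform, url)   # first match per platform wins
    return {"emails": emails, "phones": phones,
            "contacts": contacts, "socialLinks": socials}
```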
Extraction strategies
Emails are found from three sources: (1) mailto: link hrefs, (2) regex matches in the full page body text, and (3) regex matches in all anchor href attributes. A junk filter removes noreply, test, admin, webmaster, postmaster, mailer-daemon, root addresses, and emails ending in image/CSS/JS file extensions or from known placeholder domains.
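The junk filter rules translate directly into a predicate. This sketch uses only the prefixes, extensions, and placeholder domains named in this document; the actor's full lists may be longer:

```python
JUNK_PREFIXES = ("noreply", "no-reply", "test", "example", "admin",
                 "webmaster", "postmaster", "mailer-daemon", "root")
JUNK_SUFFIXES = (".png", ".jpg", ".css", ".js")
PLACEHOLDER_DOMAINS = {"example.com", "sentry.io"}

def is_junk_email(email: str) -> bool:
    """True if the address matches any of the junk-filter rules above."""
    email = email.lower()
    local, _, domain = email.partition("@")
    return (local.startswith(JUNK_PREFIXES)
            or email.endswith(JUNK_SUFFIXES)
            or domain in PLACEHOLDER_DOMAINS)

print(is_junk_email("sales@acmecorp.com"))    # False
print(is_junk_email("noreply@acmecorp.com"))  # True
print(is_junk_email("logo@company.png"))      # True
```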
Phones come from two sources: (1) tel: link hrefs (most reliable), and (2) regex patterns in contact-specific page areas (footer, address, elements with contact/phone classes). Numbers must have 7-15 digits, proper formatting (international prefix, parentheses, or dash/dot separators), and pass validation (not all-same-digit, not sequential like 1234567).
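The digit-count and fake-number checks can be sketched like this (the formatting requirement is omitted; this is illustrative, not the actor's exact logic):

```python
import re

def is_plausible_phone(raw: str) -> bool:
    """Reject numbers outside 7-15 digits, all-same-digit, or sequential runs."""
    digits = re.sub(r"\D", "", raw)
    if not 7 <= len(digits) <= 15:
        return False
    if len(set(digits)) == 1:            # e.g. 1111111
        return False
    if digits in "1234567890123456":     # sequential, e.g. 1234567
        return False
    return True

print(is_plausible_phone("+1 (415) 555-0192"))  # True
print(is_plausible_phone("1234567"))            # False
```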
Social links are extracted by scanning all anchor hrefs against platform-specific URL patterns. Only the first match per platform is kept.
Contact names use three strategies in order: (1) Schema.org Person structured data ([itemtype*="schema.org/Person"]), (2) 11 team card CSS class selectors (.team-member, .team-card, .staff-member, .person-card, .member-card, .leadership-card, .employee, .bio-card, .team-item, .people-card, .about-member), and (3) heading-paragraph pairs where the heading matches a strict name regex (/^[A-Z][a-z]+(?:\s[A-Z][a-z]+){1,3}$/) and the following paragraph contains a job title keyword.
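Strategy 3 is easy to demonstrate with the name regex quoted above. The keyword set here is a small subset of the 35+ job-title keywords listed later in this document, chosen for illustration:

```python
import re

NAME_RE = re.compile(r"^[A-Z][a-z]+(?:\s[A-Z][a-z]+){1,3}$")
TITLE_KEYWORDS = {"ceo", "cto", "chief", "vp", "director", "manager",
                  "founder", "head", "president"}

def heading_pair_match(heading: str, paragraph: str) -> bool:
    """A heading that looks like a name, followed by a paragraph
    containing a job-title keyword."""
    if not NAME_RE.match(heading.strip()):
        return False
    words = set(re.findall(r"[a-z]+", paragraph.lower()))
    return not words.isdisjoint(TITLE_KEYWORDS)

print(heading_pair_match("Jane Smith", "Chief Executive Officer"))  # True
print(heading_pair_match("Our Services", "We build websites"))      # False
```

Note how "Our Services" passes the capitalization regex but fails the title-keyword check; this is why the pairing requirement matters.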
Extraction reference
Junk email filters
| Pattern | Examples blocked |
|---|---|
| noreply@, no-reply@ | noreply@company.com |
| test@, example@ | test@company.com |
| admin@, webmaster@ | admin@company.com |
| postmaster@, mailer-daemon@, root@ | postmaster@company.com |
| File extensions (.png, .jpg, .css, .js) | logo@company.png |
| Placeholder domains | user@example.com, user@sentry.io |
Social media platforms detected
| Platform | URL pattern matched |
|---|---|
| LinkedIn | linkedin.com/company/* or linkedin.com/in/* |
| Twitter/X | twitter.com/* or x.com/* |
| Facebook | facebook.com/* |
| Instagram | instagram.com/* |
| YouTube | youtube.com/c/*, /channel/*, /user/*, /@* |
Contact page keywords
The actor follows same-domain links containing any of these path segments: contact, contact-us, get-in-touch, reach-us, reach-out, about, about-us, who-we-are, team, our-team, the-team, people, staff, leadership, management, executives, company, our-company.
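One way to interpret "containing any of these path segments" is an exact match against individual URL path segments, as sketched here (the actor may use looser substring matching):

```python
from urllib.parse import urlparse

# The keyword list from the paragraph above
CONTACT_KEYWORDS = {
    "contact", "contact-us", "get-in-touch", "reach-us", "reach-out",
    "about", "about-us", "who-we-are", "team", "our-team", "the-team",
    "people", "staff", "leadership", "management", "executives",
    "company", "our-company",
}

def is_contact_link(url: str) -> bool:
    """True if any path segment matches a contact-related keyword."""
    segments = (s.lower() for s in urlparse(url).path.split("/") if s)
    return any(s in CONTACT_KEYWORDS for s in segments)

print(is_contact_link("https://acmecorp.com/about-us"))   # True
print(is_contact_link("https://acmecorp.com/blog/post"))  # False
```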
Job title keywords (35+)
Used to identify text next to names as job titles: CEO, CTO, CFO, COO, CMO, VP, President, Director, Manager, Head, Lead, Chief, Founder, Co-Founder, Owner, Partner, Principal, Engineer, Developer, Designer, Analyst, Consultant, Specialist, Coordinator, Administrator, Officer, Executive, Supervisor, Associate, Assistant, Intern, Sales, Marketing, Operations, Finance, Human Resources.
Tips for best results
- Start with a small maxPagesPerDomain value. The default of 5 pages covers most contact, about, and team pages. Only increase it if you know the site has deeply nested team directories.
- Enable proxies for large batches. When scraping more than 20 websites, use Apify Proxy to avoid rate limiting and IP blocks. The built-in proxy rotation handles this automatically.
- Filter results by email domain. The output may include third-party emails (e.g., from embedded forms or partner mentions). Post-process results to keep only emails matching the scraped domain.
- Combine with Email Pattern Finder for deeper coverage. If the scraper finds team member names but not their emails, use Email Pattern Finder to predict email addresses based on the company's naming conventions.
- Use CSV export for CRM import. Download results as CSV and import directly into HubSpot, Salesforce, or other CRMs. The flat structure of emails and phones maps cleanly to CRM contact fields.
- Disable name extraction for speed. If you only need emails and phones, set includeNames to false. Name extraction adds processing time per page for the DOM traversal.
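Filtering by email domain takes only a few lines of post-processing on the downloaded JSON. A sketch, assuming the record layout shown in the output example:

```python
def keep_first_party_emails(record: dict) -> dict:
    """Return a copy of the record with third-party emails removed."""
    site = record["domain"].lower()
    filtered = dict(record)
    filtered["emails"] = [
        e for e in record.get("emails", [])
        if e.lower().rsplit("@", 1)[-1] == site
    ]
    return filtered

record = {"domain": "acmecorp.com",
          "emails": ["sales@acmecorp.com", "agent@partner.io"]}
print(keep_first_party_emails(record)["emails"])  # ['sales@acmecorp.com']
```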
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Email Pattern Finder | Find team member names but no emails? Use Email Pattern Finder to predict addresses from the company's naming convention |
| B2B Lead Qualifier | Score and qualify the contacts you find based on company data and tech stack signals |
| B2B Lead Gen Suite | Orchestrate Website Contact Scraper with multiple enrichment sources for comprehensive lead generation |
| Bulk Email Verifier | Verify extracted emails before outreach to reduce bounce rates |
| HubSpot Lead Pusher | Push scraped contacts directly into HubSpot CRM as new contacts |
| Website Tech Stack Detector | Identify what technologies a company uses for technographic lead scoring |
| WHOIS Domain Lookup | Look up domain registration details for additional company intel |
Limitations
- No JavaScript rendering — uses CheerioCrawler (server-side HTML parsing). SPAs that load contacts via client-side JavaScript will not have their dynamic content extracted.
- Same-domain only — only follows links within the same domain as the input URL. Cross-domain team pages or external about pages are not discovered.
- Name extraction accuracy varies — team member detection depends on structured HTML patterns (Schema.org, team cards, heading pairs). Custom or unusual page layouts may not trigger extraction.
- No authentication — only processes publicly accessible pages. Login-gated contact directories are not supported.
- One record per domain — multiple input URLs on the same domain are merged into a single output record.
- Phone extraction limited to contact areas — to minimize false positives, phone regex only runs against footer, address, and elements with contact/phone classes (not the full page body).
- First social link per platform — if a page has multiple LinkedIn profiles (e.g., company + employee), only the first match is captured.
Integrations
- Zapier — Trigger a Zap when new contact data is scraped. Push emails and phones directly to your CRM, email marketing platform, or spreadsheet.
- Make — Build automated workflows that feed scraped contacts into HubSpot, Mailchimp, Salesforce, or any of Make's 1,500+ app integrations.
- Google Sheets — Export results directly to Google Sheets for collaborative review.
- Apify API — Call the actor programmatically. Start runs, poll for completion, and download results in JSON, CSV, XML, or Excel format.
- Webhooks — Receive notifications when a run completes, then automatically process results in your backend.
- LangChain / LlamaIndex — Feed contact data into AI agent workflows for automated outreach or research.
Responsible use
- This actor only accesses publicly visible web pages.
- Respect website terms of service and robots.txt directives.
- Comply with GDPR, CAN-SPAM, and other applicable data protection laws when using scraped contact data for outreach.
- Do not use extracted personal contact information for spam, harassment, or unauthorized purposes.
- For guidance on web scraping legality, see Apify's guide.
FAQ
How many websites can I scrape in one run? There is no hard limit on the number of URLs. The actor processes them concurrently (up to 10 at a time) and enforces per-domain page limits. A run of 1,000 websites typically completes within 40-60 minutes.
Does this actor extract emails hidden behind JavaScript? No. Website Contact Scraper uses CheerioCrawler, which parses static HTML. If a website loads contact details via client-side JavaScript (e.g., React/Angular SPAs), those emails will not be found. For JavaScript-heavy sites, consider using a Playwright-based scraper.
What types of email addresses are filtered out? The actor automatically removes noreply@, test@, admin@, webmaster@, postmaster@, mailer-daemon@, and root@ addresses. It also filters out emails ending in image/CSS/JS file extensions and addresses from known junk domains like @example.com and @sentry.io.
Is it legal to scrape contact information from websites? Scraping publicly available contact information from websites is generally legal in most jurisdictions. However, laws vary by country. In the EU, GDPR restricts how you can use personal data, so ensure you have a lawful basis before processing scraped contact data. In the US, the CFAA and CAN-SPAM Act impose restrictions on automated access and unsolicited emails. Always review the target website's Terms of Service and consult legal counsel for your specific use case.
How accurate is the team member extraction? Name extraction works best on websites with structured team pages using common HTML patterns (team cards, Schema.org Person markup, heading-paragraph pairs). Accuracy varies by website design. The actor uses strict name validation to minimize false positives — it requires proper capitalization (e.g., "Jane Smith") and filters out 40+ common non-name words.
Can I schedule this actor to run periodically? Yes. Use Apify Schedules to run the actor daily, weekly, or at any custom interval. This is useful for monitoring changes to company contact pages or keeping your CRM data fresh.
What social media platforms are supported? The actor extracts links for LinkedIn (company and personal profiles), Twitter/X, Facebook, Instagram, and YouTube (channels and user profiles).
Why are some phone numbers missing from the output? Phone extraction intentionally targets only contact-specific page areas (footer, address elements, elements with contact/phone CSS classes) and requires proper formatting (international prefix, parentheses, or dash/dot separators). This strict approach minimizes false positives from random digit sequences but may miss numbers placed outside standard contact areas.
How is this different from other contact scrapers? Website Contact Scraper goes beyond simple email extraction. It combines three extraction strategies for team members, follows contact/about/team pages automatically, filters junk data, and aggregates everything into one record per domain. It also supports batch processing of hundreds of sites concurrently.
Can I use this with my own proxies? Yes. Pass your proxy configuration in the proxyConfiguration input field. The actor supports Apify Proxy (default), custom proxy URLs, and proxy groups.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom scraping solutions or enterprise integrations, reach out through the Apify platform.