Bulk Website Contact Extractor – Emails, Phones & Socials

Pricing

from $5.00 / 1,000 results found

Extract emails, phone numbers, WhatsApp, and social media links from any list of URLs. Bulk lead enrichment with A–D lead scoring. Returns one CRM-ready record per URL.

Developer

Khadin Akbar

Maintained by Community

Bulk Website Contact Extractor – Emails, Phones & Social Media Links

Extract emails, phone numbers, WhatsApp links, and social media profiles from any list of websites. Built for bulk lead enrichment, outreach automation, and CRM data pipelines. Returns one clean, structured record per URL with an A–D lead quality score.

Compatible with: Apify MCP Server (Claude, ChatGPT), LangChain, Make.com, Zapier, n8n, and direct REST API access.


What does the Bulk Website Contact Extractor do?

The Bulk Website Contact Extractor crawls any list of website URLs and automatically finds every publicly available contact detail on each site — emails, phone numbers, social media profile links, and WhatsApp contact links. It scans the homepage plus high-value sub-pages like /contact, /about, /team, and /imprint using priority-first crawling, then consolidates everything into one clean record per URL.
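
One way to picture priority-first crawling is as a link-ranking step that runs before any sub-page is fetched. The sketch below is illustrative only and is not the actor's actual code: the function name `prioritise_links`, the keyword order, and the same-host filter are assumptions. Only the sub-page names and the default of 6 pages come from this README.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical priority order, based on the sub-pages named above.
PRIORITY_KEYWORDS = ["contact", "about", "team", "imprint"]


def prioritise_links(base_url: str, hrefs: list[str], max_pages: int = 6) -> list[str]:
    """Return the homepage plus same-host links, highest-value pages first."""
    base_host = urlparse(base_url).netloc
    seen = {base_url}
    candidates = []
    for href in hrefs:
        # Resolve relative links and drop fragments so duplicates collapse.
        url = urljoin(base_url, href).split("#")[0]
        if urlparse(url).netloc != base_host or url in seen:
            continue
        seen.add(url)
        path = urlparse(url).path.lower()
        # Rank by the first matching keyword; unmatched pages sort last.
        rank = next(
            (i for i, kw in enumerate(PRIORITY_KEYWORDS) if kw in path),
            len(PRIORITY_KEYWORDS),
        )
        candidates.append((rank, url))
    # sorted() is stable, so links keep their page order within a rank.
    ranked = [url for _, url in sorted(candidates, key=lambda pair: pair[0])]
    return [base_url] + ranked[: max_pages - 1]
```

With a page budget of 6, a homepage linking to /blog, /about, and /contact would be crawled in the order homepage, /contact, /about, /blog under these assumptions.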

Unlike single-URL scrapers, this tool is designed from the ground up for bulk lead enrichment. Paste 500 domains from a spreadsheet, a CRM export, or a previous scraping run and get a fully enriched, scored contact list in minutes.

Every output record includes an A/B/C/D lead score so you can instantly triage your list: A-scored leads have both email and phone, B-scored have email only, C-scored have phone only, and D-scored had no public contact info found.


What data can you extract from websites?

| Field | Type | Example |
| --- | --- | --- |
| source_url | string | https://acme.com |
| domain | string | acme.com |
| page_title | string | ACME Corp – Contact Us |
| emails | string[] | ["hello@acme.com", "sales@acme.com"] |
| phones | string[] | ["+1-800-555-0100", "(555) 867-5309"] |
| social_links.linkedin | string | https://linkedin.com/company/acme-corp |
| social_links.twitter | string | https://x.com/acmecorp |
| social_links.facebook | string | https://facebook.com/acmecorp |
| social_links.instagram | string | https://instagram.com/acmecorp |
| social_links.youtube | string | https://youtube.com/c/acmecorp |
| social_links.tiktok | string | https://tiktok.com/@acmecorp |
| social_links.github | string | https://github.com/acmecorp |
| whatsapp_links | string[] | ["https://wa.me/15551234567"] |
| lead_score | A/B/C/D | A (email + phone) |
| pages_crawled | integer | 4 |
| used_playwright | boolean | false (true = JS-rendered site) |
| scraped_at | ISO datetime | 2026-03-30T12:00:00.000Z |
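
Put together, a single output record might look like the Python dictionary below. The values are the example values from the field list above. Nesting `social_links` as a sub-object is inferred from the dotted field names and is an assumption, as is the small `primary_contact` helper, which is not part of the actor.

```python
# Illustrative record assembled from the documented example values.
record = {
    "source_url": "https://acme.com",
    "domain": "acme.com",
    "page_title": "ACME Corp – Contact Us",
    "emails": ["hello@acme.com", "sales@acme.com"],
    "phones": ["+1-800-555-0100", "(555) 867-5309"],
    "social_links": {  # assumed nesting, based on the dotted field names
        "linkedin": "https://linkedin.com/company/acme-corp",
        "twitter": "https://x.com/acmecorp",
        "facebook": "https://facebook.com/acmecorp",
        "instagram": "https://instagram.com/acmecorp",
        "youtube": "https://youtube.com/c/acmecorp",
        "tiktok": "https://tiktok.com/@acmecorp",
        "github": "https://github.com/acmecorp",
    },
    "whatsapp_links": ["https://wa.me/15551234567"],
    "lead_score": "A",
    "pages_crawled": 4,
    "used_playwright": False,
    "scraped_at": "2026-03-30T12:00:00.000Z",
}


def primary_contact(rec: dict) -> dict:
    """Hypothetical helper: pick the first email and phone, if any."""
    return {
        "domain": rec["domain"],
        "email": rec["emails"][0] if rec["emails"] else None,
        "phone": rec["phones"][0] if rec["phones"] else None,
    }


print(primary_contact(record))
```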

How to scrape emails and phone numbers from websites in bulk

  1. Click "Try for free" to open the actor in Apify Console
  2. Add your list of website URLs in the "Website URLs to process" field — paste one URL per line or upload a CSV
  3. Set "Max pages per URL" (default 6 — the crawler prioritises contact/about pages automatically)
  4. Click "Start" and wait for results (about 40 seconds to 2 minutes for 100 URLs at the default settings, depending on site response times)
  5. Download your enriched contact list as JSON, CSV, or Excel

How to use the Bulk Contact Extractor with AI agents (Claude, ChatGPT)

Connect via the Apify MCP Server and ask naturally:

  • "Extract all emails and phone numbers from these 50 company websites"
  • "Enrich this URL list with contact info and lead scores"
  • "Get the LinkedIn and Twitter links for every URL in my list"

The AI agent reads the actor's schema and runs it automatically, returning structured results you can feed directly into your workflow.


Lead scoring explained

Every record gets an automatic A–D quality score based on what contact information was found:

| Score | Meaning | Recommended action |
| --- | --- | --- |
| A | Email + phone found | Highest priority: add to outreach sequence immediately |
| B | Email only | Strong lead: include in email campaigns |
| C | Phone only | Include in calling campaigns |
| D | No contact info | Low priority: site may block crawlers or have no public contact page |

Filter your exported results by lead_score = "A" to instantly isolate your best leads.
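
The scoring rule is simple enough to restate as code. The following is a minimal sketch of the A–D rule as described here, not the actor's source. The function names `lead_score` and `best_leads` are made up for illustration.

```python
def lead_score(emails: list[str], phones: list[str]) -> str:
    """Apply the documented rule: A = email + phone, B = email only,
    C = phone only, D = nothing found."""
    if emails and phones:
        return "A"
    if emails:
        return "B"
    if phones:
        return "C"
    return "D"


def best_leads(records: list[dict], score: str = "A") -> list[dict]:
    """Keep only the records carrying the requested lead_score."""
    return [r for r in records if r.get("lead_score") == score]
```

For example, `best_leads(items)` on a downloaded dataset keeps only the records where both an email and a phone number were found.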


How much does bulk email and phone extraction cost?

This actor uses pay-per-event pricing — you pay only for URLs actually processed, never a flat fee.

| Plan | Price per URL | 100 URLs | 1,000 URLs | 10,000 URLs |
| --- | --- | --- | --- | --- |
| Free tier | $0.005 | $0.50 | $5.00 | $50.00 |
| Bronze+ | $0.004 | $0.40 | $4.00 | $40.00 |
| Gold+ | $0.002 | $0.20 | $2.00 | $20.00 |

Every new Apify account includes $5 in free credits — enough to enrich around 1,000 URLs at no cost.
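
To budget a run, the per-URL prices can be multiplied out directly. The sketch below hard-codes the three per-URL prices listed in this section. The function names are hypothetical, and actual billing is determined by Apify, so treat the result as an estimate only.

```python
from decimal import Decimal

# Per-URL prices copied from the pricing section of this README.
PRICE_PER_URL = {
    "Free": Decimal("0.005"),
    "Bronze+": Decimal("0.004"),
    "Gold+": Decimal("0.002"),
}


def estimate_cost(url_count: int, tier: str = "Free") -> Decimal:
    """Estimated cost in USD for processing url_count URLs."""
    return PRICE_PER_URL[tier] * url_count


def urls_covered(credit_usd: str, tier: str = "Free") -> int:
    """How many URLs a given credit (as a decimal string) pays for."""
    return int(Decimal(credit_usd) / PRICE_PER_URL[tier])


print(estimate_cost(10_000, "Gold+"))  # cost of 10,000 URLs on Gold+
print(urls_covered("5"))               # URLs covered by the $5 free credit
```

`Decimal` is used instead of floats so that fractions of a cent do not pick up rounding noise.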


Use cases for bulk contact extraction

Lead Generation & Sales Prospecting: Enrich a list of target company websites with emails and phones before a cold outreach campaign. Filter by lead score to prioritise your highest-value prospects and cut manual research time from days to minutes.

CRM Data Enrichment: Take a CRM export of company domains and enrich it with publicly listed contact emails, phone numbers, and social profiles. Import the results straight back into HubSpot, Salesforce, or Pipedrive via CSV.

Agency Client Research: Build prospect lists for digital marketing agencies by extracting contact info from directories, industry lists, or competitor client rosters at scale.

AI Pipeline Automation: Feed the structured output directly into Claude, ChatGPT, or a LangChain agent for downstream tasks such as composing personalised outreach emails, scoring leads by industry signals, or routing contacts to different sequences based on social media presence.

Business Directory Building: Crawl a seed list of URLs and build a structured, searchable database of business contacts for a niche vertical, enriched with emails, phones, and social profiles.

Social Media Mapping: For each URL in your list, get the LinkedIn company page, Twitter/X handle, Instagram, YouTube channel, TikTok, and GitHub organisation URL in one pass.


API & Integration

REST API

curl -X POST "https://api.apify.com/v2/acts/khadinakbar~bulk-website-contact-extractor/runs" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "startUrls": [
      {"url": "https://acme.com"},
      {"url": "https://globex.com"}
    ],
    "maxPagesPerDomain": 6,
    "maxResults": 500
  }'

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish
const run = await client.actor('khadinakbar/bulk-website-contact-extractor').call({
    startUrls: [
        { url: 'https://acme.com' },
        { url: 'https://globex.com' },
    ],
    maxPagesPerDomain: 6,
    maxResults: 500,
});

// Read the contact records from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items); // Array of contact records

Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

# Start the actor and wait for the run to finish
run = client.actor('khadinakbar/bulk-website-contact-extractor').call(
    run_input={
        'startUrls': [
            {'url': 'https://acme.com'},
            {'url': 'https://globex.com'},
        ],
        'maxPagesPerDomain': 6,
        'maxResults': 500,
    }
)

# Read the contact records from the run's default dataset
items = list(client.dataset(run['defaultDatasetId']).iterate_items())
for item in items:
    print(item['domain'], item['lead_score'], item['emails'])

Integrations: Apify MCP Server, LangChain, Make.com, Zapier, n8n, Google Sheets, HubSpot, Salesforce, Airtable


FAQ

Q: How is this different from other contact scrapers? A: Most contact scrapers work on one URL at a time. This actor is designed specifically for bulk lists — paste 500 URLs and get 500 enriched records in a single run. It also includes lead scoring, social media link extraction, and full MCP / AI agent support out of the box.

Q: Does it work on any website? A: It works on any publicly accessible website. The actor uses a two-phase engine: Phase 1 is a fast HTML parser (CheerioCrawler) that handles most sites instantly. If Phase 1 detects a JavaScript-heavy page — Next.js, React, Vue, Angular, Gatsby SPAs — it automatically switches to Phase 2: a full Playwright (Chromium) browser that renders the page completely before extracting. The used_playwright field in each result tells you which engine was used. Sites behind login walls or CAPTCHA cannot be crawled.

Q: How fast is the bulk extraction? A: Approximately 50–150 URLs per minute with the default settings and datacenter proxies, depending on site response times.

Q: What if a website blocks the crawler? A: The actor retries failed requests twice with rotating sessions. If a site actively blocks datacenter proxies, enable residential proxies in the advanced settings. Failed URLs are logged and still produce a D-scored record so you know which sites didn't yield data.

Q: Can I use this with Claude or ChatGPT? A: Yes. Connect via the Apify MCP Server and ask in natural language. The actor's output schema is annotated field by field, which helps AI agents interpret and route each field correctly.

Q: Can I schedule recurring enrichment runs? A: Yes — use Apify's built-in scheduler to run daily, weekly, or at any custom interval. Results can be pushed to webhooks, Slack, email, or downstream apps automatically.

Q: Is it legal to scrape contact info from websites? A: This actor extracts only publicly available information that any visitor can access through a web browser. For guidance on web scraping legality, see Apify's blog post on the topic.

Q: What is the used_playwright field? A: It's a boolean that tells you whether the Chromium browser engine (Playwright) was used to render a page before extraction. Sites with used_playwright: true were JavaScript-heavy — they needed full browser rendering to expose their contact data. Sites with false were handled by the faster HTML-only parser. This is useful for debugging and understanding your input list's technical composition.


How Cloudflare email protection is handled

Many websites use Cloudflare's email obfuscation feature, which replaces email addresses in HTML with an encoded string like <a data-cfemail="hexstring">. Standard scrapers read the raw HTML and miss these emails entirely — they never see a valid address.

This actor automatically decodes Cloudflare-protected emails using the XOR algorithm Cloudflare uses internally. The first byte of the hex string is the XOR key; the remaining bytes are XOR'd against it to recover the original email address. This means emails protected by Cloudflare are extracted just as reliably as plainly written mailto: links.
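
The decoding step described above fits in a few lines. This is a minimal sketch of the XOR scheme as stated in this section, not the actor's own code. The function name `decode_cfemail` is an assumption, and the sample hex string was constructed by hand for illustration.

```python
def decode_cfemail(encoded: str) -> str:
    """Decode the hex payload of a data-cfemail attribute.

    The first byte is the XOR key; every following byte is XOR'd
    with that key to recover one character of the address.
    """
    data = bytes.fromhex(encoded)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


# Hand-built sample: key 0x42 followed by "a@b.c" XOR'd with 0x42.
print(decode_cfemail("422302206c21"))  # → a@b.c
```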


How the dual-engine crawler works

The extraction pipeline runs in two sequential phases:

Phase 1: CheerioCrawler (fast, lightweight)
Every URL is first processed by a server-side HTML parser. This is extremely fast (no browser overhead), handles static sites and server-rendered pages (WordPress, Shopify, plain HTML), and supports up to 10 concurrent requests per run. It extracts emails, phones, social links, and WhatsApp URLs from the raw HTML source.

Phase 2: Playwright fallback (full browser rendering)
If Phase 1 detects a JavaScript-heavy page (by checking for Next.js data tags, React/Angular roots, Gatsby markers, or unusually thin body text), it flags the URL for Playwright. Phase 2 launches a full headless Chromium browser, waits for the JavaScript to settle (networkidle), then runs the same extraction logic on the fully rendered DOM. This recovers contact info from SPAs and JS-rendered pages that Phase 1 would miss.

The result: you get the speed benefits of a lightweight crawler for most sites, with automatic Chromium fallback for the sites that need it — all in a single run, with no configuration required.
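
The Phase 1 decision of whether a page needs a browser can be approximated with a few string checks. The sketch below is a hypothetical heuristic in the spirit of the description above. The specific markers, the 200-character threshold, and the function names are assumptions, since the actor's real detection rules are not published here.

```python
import re

# Hypothetical markers for the frameworks named above.
JS_MARKERS = (
    "__NEXT_DATA__",   # Next.js data tag
    "data-reactroot",  # older React roots
    'id="root"',       # common React mount point
    "ng-version",      # Angular root attribute
    "___gatsby",       # Gatsby wrapper id
)
MIN_BODY_TEXT_CHARS = 200  # assumed cut-off for "unusually thin" body text


def visible_text_length(html: str) -> int:
    """Rough length of the human-readable text in an HTML document."""
    no_scripts = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", no_scripts)
    return len(" ".join(text.split()))


def needs_browser(html: str) -> bool:
    """True if the raw HTML looks JS-rendered and should go to Phase 2."""
    if any(marker in html for marker in JS_MARKERS):
        return True
    return visible_text_length(html) < MIN_BODY_TEXT_CHARS
```

A crawler built this way would run the cheap HTML pass on every URL and queue only the flagged ones for the browser, which is where the speed benefit comes from.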


Export & download options

Results are available in multiple formats immediately after the run completes:

  • JSON — structured records, ideal for API consumption and downstream automation
  • CSV — flat file, import directly into Excel, Google Sheets, HubSpot, Salesforce, or any CRM
  • Excel (.xlsx) — pre-formatted spreadsheet download from the Apify Console
  • JSONL — newline-delimited JSON, ideal for streaming to data pipelines

You can also retrieve results programmatically via the Apify Dataset API at any time after the run finishes.
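
For programmatic downloads, the export format is selected with the `format` query parameter on the dataset items endpoint. The helper below only builds the URL. Its name and the restricted format list are assumptions for illustration, and the request itself would still need the same `Authorization: Bearer` header used in the REST example above.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# The four formats listed in this section.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx"}


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the download URL for a dataset in the chosen format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


# 'abc123' is a placeholder; use run['defaultDatasetId'] from a real run.
print(dataset_items_url("abc123", "csv"))
```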