Pricing

from $0.01 / 1,000 results

Contact Details Extractor

A low-cost contact scraper on Apify. Extracts emails, phone numbers, company names, addresses, and 25+ social profiles from $0.001 per page. Smart crawling finds contact pages automatically, works around Cloudflare protection, renders JS-heavy sites in browser mode, and can discover pages through sitemaps.

Developer

kata Kuri

Last modified

2 months ago

Why this scraper

| Feature | This actor |
| --- | --- |
| Cleans output | Validates emails (TLD whitelist, blacklist, multi-`@` reject), phones (libphonenumber E.164), and social URLs (rejects share buttons) |
| Per-domain merging | One row per domain instead of one row per page |
| 25+ social platforms | LinkedIn, X/Twitter, Instagram, Facebook, YouTube, TikTok, Pinterest, GitHub, Discord, Telegram, WhatsApp, Reddit, Medium, Substack, Twitch, Snapchat, Threads, Bluesky, Mastodon, Spotify, Vimeo, Dribbble, Behance, SoundCloud, Crunchbase, AngelList |
| JS rendering on demand | Three modes: HTTP-only (cheapest), browser-only (always render), or auto (HTTP first, browser fallback when the page looks like an empty SPA shell) |
| Cloudflare email decoding | Decodes both `data-cfemail` attributes and `/cdn-cgi/l/email-protection#hex` URLs |
| Smart contact-page targeting | Crawl order is ranked by URL relevance: `/contact`, `/about`, `/team`, `/imprint` go first, blog posts last |
| Sitemap discovery | Optional `/sitemap.xml` and `/robots.txt` parsing to find contact-rich pages without crawling |
| Pay-per-event | You pay per successfully extracted page, never for failed requests (4xx, timeouts, blocks) |
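The contact-page targeting described above can be pictured with a minimal sketch. The weights and the `scoreUrl` / `rankUrls` names are assumptions made for illustration; the actor's real scoring table is not documented here.

```javascript
// Hypothetical weights: higher means "more likely to hold contact details".
const PATH_WEIGHTS = [
  [/\/contact/, 100],
  [/\/(about|team)/, 80],
  [/\/imprint/, 70],
  [/\/blog/, -50],
];

// Score a URL by the first path pattern it matches; unknown paths are neutral.
function scoreUrl(url) {
  const path = new URL(url).pathname.toLowerCase();
  for (const [pattern, weight] of PATH_WEIGHTS) {
    if (pattern.test(path)) return weight;
  }
  return 0;
}

// Return a new array ordered from most to least contact-relevant.
function rankUrls(urls) {
  return [...urls].sort((a, b) => scoreUrl(b) - scoreUrl(a));
}
```

With this ordering a crawler that stops after a fixed page budget spends it on `/contact` and `/about` before it ever reaches blog posts.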

What gets extracted

Per domain (when mergeContacts: true, default):

{
  "domain": "hubertprocess.com",
  "url": "https://www.hubertprocess.com",
  "companyName": "Hubert Process",
  "companyDescription": "Hubert Process designs optical sorting machines…",
  "logo": "https://www.hubertprocess.com/apple-touch-icon.png",
  "emails": ["admin.si@hubertprocess.com", "contact@hubertprocess.com"],
  "phones": ["+33241487578", "+33243696298", "+41228203544"],
  "phonesUncertain": [],
  "addresses": [
    {
      "full": "1 Market St, San Francisco, CA, 94105, US",
      "street": "1 Market St",
      "city": "San Francisco",
      "region": "CA",
      "postalCode": "94105",
      "country": "US"
    }
  ],
  "linkedin": "https://www.linkedin.com/company/hubert-metal",
  "twitter": null,
  "facebook": "https://www.facebook.com/Hubert-Process-Robotique-102014175737316",
  "instagram": null,
  "youtube": null,
  "github": null,
  "// ... 19 other social platforms": "...",
  "scrapedUrls": ["...12 URLs..."],
  "scrapedAt": "2026-05-03T12:34:34Z"
}
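For a spreadsheet or CRM import, a merged record like the one above can be flattened into a single row. The sketch below is one way a consumer might do that; `toCrmRow` is a hypothetical helper, not part of the actor, and it reads only fields shown in the sample record.

```javascript
// Flatten one merged domain record into a flat row with a single
// primary email and phone. Missing values become empty strings.
function toCrmRow(record) {
  return {
    domain: record.domain,
    company: record.companyName ?? '',
    primaryEmail: record.emails?.[0] ?? '',
    primaryPhone: record.phones?.[0] ?? '',
    linkedin: record.linkedin ?? '',
  };
}
```

Taking the first email and phone is a simplification; a real import may prefer to keep every value in a delimited column.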

Inputs

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `startUrls` | array | required | Websites to scrape. Plain domains (`example.com`) and full URLs both work. |
| `maxPagesPerStartUrl` | int | 20 | Pages crawled per website. Lower = cheaper, faster. |
| `maxDepth` | int | 2 | Click-depth from the start URL. `1` = homepage only, `2` = homepage + linked pages. |
| `sameDomain` | bool | true | Only follow links on the same registered domain. |
| `useSitemap` | bool | false | Discover pages via `/sitemap.xml`. |
| `browserMode` | enum | `auto` | `off` (HTTP only), `on` (always browser), `auto` (HTTP first, browser fallback for SPAs). |
| `mergeContacts` | bool | true | Combine all pages of a domain into one record. |
| `extractAddresses` | bool | true | Parse postal addresses from schema.org markup. |
| `extractCompanyInfo` | bool | true | Detect company name, description, and logo. |
| `decodeCloudflareEmails` | bool | true | Decode CF-protected emails. |
| `phoneCountryHint` | string | `null` | ISO country code (`US`, `GB`, `FR`, …) for parsing local-format phones. |
| `maxConcurrency` | int | 10 | Parallel page fetches. |
| `proxyConfiguration` | object | `{useApifyProxy: true}` | Datacenter is the default; switch to `RESIDENTIAL` for Cloudflare-blocked sites. |
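When building inputs programmatically, it can help to apply the documented defaults and check the enum up front. The following is a sketch only: `INPUT_DEFAULTS` mirrors the table above, while `buildInput` and its error messages are hypothetical and not part of the actor.

```javascript
// Defaults copied from the Inputs table.
const INPUT_DEFAULTS = {
  maxPagesPerStartUrl: 20,
  maxDepth: 2,
  sameDomain: true,
  useSitemap: false,
  browserMode: 'auto',
  mergeContacts: true,
  extractAddresses: true,
  extractCompanyInfo: true,
  decodeCloudflareEmails: true,
  phoneCountryHint: null,
  maxConcurrency: 10,
  proxyConfiguration: { useApifyProxy: true },
};

// Merge caller overrides onto the defaults and reject obviously bad input.
function buildInput(overrides = {}) {
  if (!Array.isArray(overrides.startUrls) || overrides.startUrls.length === 0) {
    throw new Error('startUrls is required');
  }
  const input = { ...INPUT_DEFAULTS, ...overrides };
  if (!['off', 'on', 'auto'].includes(input.browserMode)) {
    throw new Error(`browserMode must be off, on, or auto (got ${input.browserMode})`);
  }
  return input;
}
```

For example, `buildInput({ startUrls: [{ url: 'https://example.com' }], browserMode: 'off' })` keeps every other field at its default.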

How it works

1. Each start URL is normalized (example.com → https://example.com) and seeded into the HTTP queue.
2. Optionally, /sitemap.xml is parsed, and the highest-ranking URLs (those containing contact, about, imprint, etc.) are added to the queue.
3. The HTTP crawler (Cheerio) fetches pages with realistic headers. For each page:
   - Run all extractors against the HTML and visible text.
   - Extract outbound links, score them by contact relevance, and follow the highest-scoring ones until the per-domain budget is exhausted.
   - In browserMode: auto, if the HTML looks like an empty SPA shell (React root with no content, very low text-to-HTML ratio), push the URL into the browser queue instead.
4. After the HTTP pass, the Playwright crawler renders the URLs that were flagged for fallback.
5. Pages are merged per registered domain (so blog.acme.co.uk and www.acme.co.uk collapse into one acme.co.uk record).
6. Each successfully extracted page fires exactly one page-level charge event (pay-per-event pricing, see Pricing).
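The auto-mode fallback check described above (empty React root, very low text-to-HTML ratio) might look roughly like this. The thresholds, the mount-node ids, and the `looksLikeSpaShell` name are assumptions for illustration, not the actor's actual heuristic.

```javascript
// Decide whether fetched HTML is probably an unrendered SPA shell.
// minTextChars and minTextRatio are illustrative thresholds.
function looksLikeSpaShell(html, { minTextChars = 200, minTextRatio = 0.02 } = {}) {
  // Approximate visible text: drop script/style bodies, then all tags.
  const text = html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();

  // An empty mount node such as <div id="root"></div> (ids assumed here).
  const hasEmptyRoot = /<div[^>]+id=["'](root|app)["'][^>]*>\s*<\/div>/i.test(html);

  const ratio = html.length === 0 ? 0 : text.length / html.length;
  return hasEmptyRoot || (text.length < minTextChars && ratio < minTextRatio);
}
```

A page flagged this way would be queued for the browser pass instead of being extracted from the raw HTML.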

What makes the extraction reliable

Email TLD whitelist. A naive regex would match contact@welko.contactez because contactez looks like a TLD. We reject any TLD that is not on the IANA root zone list, which removes this class of false positive from non-English text.
Obfuscated email regex requires explicit markers. Only [at], (at), or a whitespace-isolated AT count as markers, never a bare at inside a word; otherwise automation would be read as autom@ion.
Phone validation via libphonenumber. Phone-shaped digit runs only land in phones when libphonenumber confirms they're real. Unverified candidates with separators land in phonesUncertain. Pure digit runs (SIRET numbers, tracking IDs, hashes) are dropped.
Social URLs reject share buttons. twitter.com/intent/tweet, linkedin.com/sharing/share-offsite, and facebook.com/sharer.php are all rejected. Only profile URLs make it through.
Cloudflare email decoder. Both data-cfemail="..." and /cdn-cgi/l/email-protection#... patterns are XOR-decoded inline.
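Two of the points above can be sketched in a few lines. The Cloudflare decoder follows the commonly described scheme in which the first hex byte is an XOR key for the remaining bytes; the obfuscation regex is a simplified assumption of the marker rule. Both function names are hypothetical, and neither is the actor's actual code.

```javascript
// Decode a data-cfemail value or a /cdn-cgi/l/email-protection#hex URL.
// Returns null when the input is not an even-length hex string.
function decodeCfEmail(encoded) {
  const hex = encoded.includes('#') ? encoded.slice(encoded.indexOf('#') + 1) : encoded;
  if (!/^(?:[0-9a-f]{2}){2,}$/i.test(hex)) return null;
  const key = parseInt(hex.slice(0, 2), 16); // first byte is the XOR key
  let out = '';
  for (let i = 2; i < hex.length; i += 2) {
    out += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16) ^ key);
  }
  return out;
}

// Match "name [at] domain", "name (at) domain", or "name AT domain".
// No case-insensitive flag: a lowercase " at " between words must not match.
const OBFUSCATED_EMAIL =
  /([A-Za-z0-9._%+-]+)(?:\s*\[at\]\s*|\s*\(at\)\s*|\s+AT\s+)([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,})/g;

// Rewrite every explicitly marked address in the text as a normal email.
function deobfuscateEmails(text) {
  return [...text.matchAll(OBFUSCATED_EMAIL)].map(
    ([, local, domain]) => `${local}@${domain}`.toLowerCase(),
  );
}
```

As a worked value, `decodeCfEmail('422302206c212d')` yields `a@b.co`: the key is `0x42`, and `0x23 ^ 0x42` is `0x61`, the letter `a`. Candidates produced by `deobfuscateEmails` would still need the TLD check described above.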

Pricing

Pay-per-event — you're billed per successful page extracted, never for failed requests (4xx, timeouts, blocks). Exactly one of the page events fires per page, picked by which combination of renderer × proxy was used:

| Event | Suggested price (Free) | Suggested price (Business) | When it fires |
| --- | --- | --- | --- |
| `actor-start` | $0.01 / run | $0.005 / run | Once at the start of every run |
| `page-scraped` | $1.00 / 1 000 | $0.69 / 1 000 | Plain HTTP page extracted (cheapest) |
| `page-with-browser` | $2.00 / 1 000 | $1.50 / 1 000 | Playwright-rendered page on datacenter proxy |
| `page-residential-proxy` | $3.00 / 1 000 | $2.30 / 1 000 | Any page fetched via residential proxy (overrides the two above) |

These suggested prices match the competitor (betterdevsscrape/contact-details-extractor) so users can switch without re-budgeting.

What a typical run costs

Crawling 1 000 small B2B sites with default settings (maxPagesPerStartUrl: 20, browserMode: auto, datacenter proxy) typically uses:

1 × actor-start → $0.01
~16 000 successful page-scraped events (~80% of the 20 000 page attempts succeed over plain HTTP) → $16.00
~2 000 page-with-browser events (~10% needed JS rendering) → $4.00
The remaining ~10% of attempts fail and are not billed.
Total: ~$20 per 1 000 sites, in the same ballpark as the competitor, with cleaner output.

If you switch to proxyConfiguration.apifyProxyGroups: ["RESIDENTIAL"] to bypass Cloudflare-protected sites:

All page events become page-residential-proxy → ~$54 per 1 000 sites (~18 000 pages × $3.00 / 1 000)
The per-page rate is the same as running residential through betterdevsscrape ($3 / 1 000 there too), and you get more sites unlocked thanks to per-context warm-up.
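The two estimates above can be reproduced with a small calculator. It uses the Free-tier suggested prices from the Pricing table; `estimateRunCost` and its parameter names are hypothetical, and amounts are kept in cents to avoid floating-point drift.

```javascript
// Free-tier suggested prices from the Pricing table, in cents.
const PRICES_CENTS = {
  actorStart: 1, // per run
  pageScraped: 100, // per 1 000 pages
  pageWithBrowser: 200, // per 1 000 pages
  pageResidentialProxy: 300, // per 1 000 pages
};

// Estimate the cost of one run in dollars. Only successful pages are
// counted, because failed requests are not billed.
function estimateRunCost({ httpPages, browserPages, residential = false }) {
  const pageCents = residential
    ? ((httpPages + browserPages) * PRICES_CENTS.pageResidentialProxy) / 1000
    : (httpPages * PRICES_CENTS.pageScraped +
        browserPages * PRICES_CENTS.pageWithBrowser) / 1000;
  return (PRICES_CENTS.actorStart + pageCents) / 100;
}

console.log(estimateRunCost({ httpPages: 16000, browserPages: 2000 })); // 20.01
console.log(estimateRunCost({ httpPages: 16000, browserPages: 2000, residential: true })); // 54.01
```

The residential case charges every page at the residential rate because that event overrides the other two.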

Why this model is better than per-domain billing

The previous version charged once per domain that yielded at least one piece of data. That sounds cheap until you realise it penalised small jobs (a domain with a single scraped page cost the same as a domain with 100) and made it impossible to set per-page budgets in tools like Make/n8n. The per-page model is the one other contact extractors on the Apify Store use, so it is what users are already budgeting against.

Tips

Plain HTML sites (most B2B sites): set browserMode: off, the fastest and cheapest mode.
JS-heavy SPAs (Webflow, modern React apps): use browserMode: auto, which switches to the browser only when needed.
Cloudflare-blocked sites (520, 403): switch proxyConfiguration to { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"] }.
For sales/lead-gen: enable useSitemap: true and raise maxPagesPerStartUrl to 50, which captures the full team page on most company sites.
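The tips above translate into input overrides along these lines. The preset names and the `inputFor` helper are hypothetical; only the field names and values come from this README.

```javascript
// One override object per tip; every other field keeps its default.
const TIP_PRESETS = {
  plainHtml: { browserMode: 'off' },
  jsHeavy: { browserMode: 'auto' },
  cloudflareBlocked: {
    proxyConfiguration: { useApifyProxy: true, apifyProxyGroups: ['RESIDENTIAL'] },
  },
  leadGen: { useSitemap: true, maxPagesPerStartUrl: 50 },
};

// Build an actor input for a named preset and a list of start URLs.
function inputFor(preset, startUrls) {
  if (!Object.hasOwn(TIP_PRESETS, preset)) {
    throw new Error(`Unknown preset: ${preset}`);
  }
  return { startUrls, ...TIP_PRESETS[preset] };
}
```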

Output formats

The dataset is exportable as JSON, CSV, Excel, or HTML directly from the Apify console. CSV is the fastest path into HubSpot, Salesforce, Pipedrive, or any standard CRM importer.

Local development

git clone https://your-repo/contact-details-extractor.git
cd contact-details-extractor
npm install

# Run unit tests (31 cases — extractor logic, regex correctness, regression coverage)
npm test

# Run the actor locally against a test input
# (create the local key-value store directory first, or the redirect fails)
mkdir -p apify_storage/key_value_stores/default
echo '{ "startUrls": [{"url": "https://www.apify.com"}], "maxPagesPerStartUrl": 8, "browserMode": "off", "proxyConfiguration": null }' > apify_storage/key_value_stores/default/INPUT.json
APIFY_LOCAL_STORAGE_DIR="$(pwd)/apify_storage" node src/main.js

Roadmap

Smarter address extraction from free-form text (currently relies on schema.org markup)
Person-level contact extraction (job title + email pairing)
Optional WhatsApp/Telegram deep-link extraction (wa.me/<phone> patterns)

License

ISC
