Contact Info Scraper Pro

Pricing

from $1.00 / 1,000 results

Crawl any website and extract emails, phones, and social media profiles. Pro features: domain allowlists and blocklists, role-prefix exclusion (info@, support@), per-site dedup, a first.last format filter, and prioritised crawling of /contact, /about, and /team pages. HTTP-only.

Rating

5.0 (21)

Developer

Crawler Bros

Maintained by Community

Actor stats

  • Bookmarked: 22
  • Total users: 3
  • Monthly active users: 1
  • Last modified: 11 days ago

Crawl websites and extract emails, phone numbers, and social media profiles. The actor is HTTP-first and works well on server-rendered pages, static sites, and contact pages that expose data directly in HTML.

What this actor extracts

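Each per-domain record contains:

  • recordType - contact_domain for per-domain output, contact_email for per-email output
  • websiteUrl - normalized start URL
  • domain - extracted hostname
  • emails - unique list of cleaned emails
  • phones - unique list of cleaned phone numbers
  • socialLinks - first discovered profile URL per supported platform
  • contactPageUrl - direct contact-page link when discovered
  • pagesCrawled - number of pages crawled for the domain
  • crawledAt - crawl timestamp in ISO 8601 format (UTC)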

Supported social platforms include Facebook, Instagram, LinkedIn, X/Twitter, YouTube, TikTok, Pinterest, GitHub, Threads, Telegram, Discord, Mastodon, WhatsApp, and Reddit.
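
The "first discovered profile URL per platform" rule can be illustrated with a minimal sketch. The host-to-platform map below is an assumption made for illustration and covers only a few of the listed platforms; the actor's real matching rules are not documented here.

```python
from urllib.parse import urlparse

# Hypothetical host-to-platform map (illustration only, not the actor's own table).
PLATFORM_HOSTS = {
    "facebook.com": "facebook",
    "instagram.com": "instagram",
    "linkedin.com": "linkedin",
    "x.com": "twitter",
    "twitter.com": "twitter",
    "youtube.com": "youtube",
    "tiktok.com": "tiktok",
    "github.com": "github",
}


def collect_social_links(urls):
    """Return {platform: url}, keeping only the first URL seen per platform."""
    found = {}
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        platform = PLATFORM_HOSTS.get(host)
        # Later links for an already-seen platform are ignored.
        if platform and platform not in found:
            found[platform] = url
    return found
```

Because later matches are ignored, the order in which pages are crawled decides which profile URL ends up in socialLinks.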

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| urls | string[] | ["https://apify.com"] | Websites to scan. Empty-input cloud runs use this default. |
| maxPagesPerSite | integer | 20 | Max pages to crawl per domain. |
| maxConcurrency | integer | 5 | Parallel fetches per site. |
| requestTimeoutSecs | integer | 15 | HTTP timeout per page. |
| useProxy | boolean | false | Enable Apify proxy for sites that block datacenter IPs. |
| smartPrioritise | boolean | true | Visit likely contact/about/team pages first. |
| maxDepth | integer | 2 | Maximum link depth from the start URL. |
| domainAllowlist | array | [] | Pro filter: only crawl matching hosts. |
| domainBlocklist | array | [] | Pro filter: skip matching hosts. |
| excludeRolePrefixes | array | [] | Pro filter: drop emails whose local part starts with any of these prefixes. |
| requireFirstAndLastName | boolean | false | Pro filter: keep only personal-looking dotted names such as jane.doe@example.com. |
| outputDedupBy | enum | domain | Pro dedup mode: domain, email, or none. |
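
To make the interplay of smartPrioritise and maxPagesPerSite concrete, here is a minimal sketch of a prioritised crawl queue. The keyword list and the function name are assumptions based on the paths named in the description, not the actor's actual implementation.

```python
from urllib.parse import urlparse

# Assumed keywords, taken from the /contact, /about, /team paths the description mentions.
PRIORITY_KEYWORDS = ("contact", "about", "team")


def prioritise(urls, max_pages=20):
    """Put URLs whose path mentions a priority keyword first, then cap the list."""

    def rank(url):
        path = urlparse(url).path.lower()
        return 0 if any(keyword in path for keyword in PRIORITY_KEYWORDS) else 1

    # sorted() is stable, so discovery order is preserved within each rank.
    return sorted(urls, key=rank)[:max_pages]
```

With a small page budget, ordering matters: likely contact pages are fetched before the cap is reached.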

Example input

{
  "urls": ["https://example.com", "https://another.com"],
  "maxPagesPerSite": 10,
  "smartPrioritise": true,
  "excludeRolePrefixes": ["info", "support", "hello"],
  "requireFirstAndLastName": true,
  "outputDedupBy": "email"
}
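
When inputs are generated from a script, it can help to merge overrides onto the documented defaults and reject typos before starting a run. The sketch below copies the defaults from the input table; `build_input` is a hypothetical helper, not part of the actor or of any Apify library.

```python
import copy

# Defaults copied from the input table in this README.
DEFAULTS = {
    "urls": ["https://apify.com"],
    "maxPagesPerSite": 20,
    "maxConcurrency": 5,
    "requestTimeoutSecs": 15,
    "useProxy": False,
    "smartPrioritise": True,
    "maxDepth": 2,
    "domainAllowlist": [],
    "domainBlocklist": [],
    "excludeRolePrefixes": [],
    "requireFirstAndLastName": False,
    "outputDedupBy": "domain",
}
DEDUP_MODES = {"domain", "email", "none"}


def build_input(**overrides):
    """Merge overrides onto the defaults; reject unknown fields and dedup modes."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown input fields: {unknown}")
    run_input = copy.deepcopy(DEFAULTS)  # avoid sharing the default lists
    run_input.update(overrides)
    if run_input["outputDedupBy"] not in DEDUP_MODES:
        raise ValueError(f"outputDedupBy must be one of {sorted(DEDUP_MODES)}")
    return run_input
```

For instance, `build_input(urls=["https://example.com"], outputDedupBy="email")` yields a complete input object that can be serialised with `json.dumps`.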

Output

One record per domain by default, or one record per email when outputDedupBy=email.

{
  "recordType": "contact_domain",
  "websiteUrl": "https://example.com",
  "domain": "example.com",
  "emails": ["jane.doe@example.com", "john.smith@example.com"],
  "phones": ["+1 (555) 867-5309"],
  "socialLinks": {
    "linkedin": "https://www.linkedin.com/company/example",
    "twitter": "https://x.com/example",
    "instagram": "https://www.instagram.com/example"
  },
  "contactPageUrl": "https://example.com/contact",
  "pagesCrawled": 7,
  "crawledAt": "2026-04-30T14:00:00Z"
}

Empty fields are omitted. recordType is contact_domain for per-domain output and contact_email for per-email output.
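
Because empty fields are omitted, downstream code should not index records directly. A minimal sketch of a tolerant consumer follows; `summarise` and the summary keys are hypothetical names chosen for this example.

```python
def summarise(record):
    """Summarise one output record; omitted fields fall back to empty values."""
    return {
        "domain": record.get("domain", ""),
        "recordType": record.get("recordType", ""),
        "emailCount": len(record.get("emails", [])),
        "phoneCount": len(record.get("phones", [])),
        "platforms": sorted(record.get("socialLinks", {})),
        "hasContactPage": "contactPageUrl" in record,
    }
```

Using `.get` with a default keeps the consumer working when a filter removes every email and the emails key is absent.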

Reliability notes

  • No browser is required for the standard flow.
  • The actor extracts emails from visible text, mailto: links, Cloudflare data-cfemail, and simple [at] / [dot] obfuscation.
  • Placeholder values from form fields are intentionally ignored.
  • If every discovered value is filtered out, the actor still finishes cleanly and sets a status message instead of pushing placeholder error rows.
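
Two of the extraction paths listed above can be sketched in a few lines. Cloudflare's data-cfemail attribute is commonly decoded by treating the first hex byte as an XOR key for the remaining bytes; the regular expressions for [at] / [dot] handling are simplified assumptions, not the actor's actual patterns.

```python
import re

# Simplified patterns (assumptions for illustration).
AT_RE = re.compile(r"\s*\[\s*at\s*\]\s*", re.IGNORECASE)
DOT_RE = re.compile(r"\s*\[\s*dot\s*\]\s*", re.IGNORECASE)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def decode_cfemail(encoded):
    """Decode a data-cfemail hex string: the first byte is the XOR key."""
    key = int(encoded[:2], 16)
    return "".join(
        chr(int(encoded[i:i + 2], 16) ^ key) for i in range(2, len(encoded), 2)
    )


def extract_emails(text):
    """Undo simple [at] / [dot] obfuscation, then return unique lowercase emails."""
    normalised = DOT_RE.sub(".", AT_RE.sub("@", text))
    seen = []
    for match in EMAIL_RE.findall(normalised):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen
```

Neither function fetches or renders anything, which matches the HTTP-only flow: both operate on HTML text that has already been downloaded.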

FAQ

Do I need a proxy?
Usually no. Enable useProxy for sites that return 403 or otherwise block cloud IPs.

Does it render JavaScript?
Not in the HTTP-first path. It works best on pages that expose contact data directly in HTML.

Why are some emails missing?
Some sites only reveal them after client-side rendering, login, or anti-bot challenges. Raising maxDepth or maxPagesPerSite can help on multi-page sites, but contact data that exists only after JavaScript runs will still be missed.

What does requireFirstAndLastName do?
It keeps only dotted personal-style addresses like jane.doe@company.com and drops everything else, including generic role addresses such as info@, hello@, or support@.
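
As a rough illustration of how excludeRolePrefixes and requireFirstAndLastName could combine, consider the sketch below. The first.last pattern and the function name are assumptions; the actor's exact matching rules are not documented here.

```python
import re

# Assumed shape of a "first.last" local part: two alphabetic words joined by one dot.
FIRST_LAST_RE = re.compile(r"^[a-z]+\.[a-z]+$")


def filter_emails(emails, exclude_role_prefixes=(), require_first_and_last_name=False):
    """Apply the two Pro email filters to a list of addresses."""
    prefixes = tuple(prefix.lower() for prefix in exclude_role_prefixes)
    kept = []
    for email in emails:
        local = email.split("@", 1)[0].lower()
        # Drop addresses whose local part starts with an excluded prefix.
        if prefixes and local.startswith(prefixes):
            continue
        # Optionally keep only first.last style local parts.
        if require_first_and_last_name and not FIRST_LAST_RE.match(local):
            continue
        kept.append(email)
    return kept
```

Note that a prefix match is broader than an exact match: under this sketch, the prefix info also drops an address such as information@example.com.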

What happens when emails are filtered out?
The actor can still emit phones, socials, and contactPageUrl. Fields with no data are omitted.