Contact Info Scraper Pro
Pricing
from $1.00 / 1,000 results
Crawl any website and extract emails, phones, and social media profiles. Pro: domain allow/blocklists, role-prefix exclusion (info@/support@), per-site dedup, first.last format filter, prioritises /contact, /about, /team. HTTP-only.
Rating: 5.0 (21)
Developer: Crawler Bros
Maintained by Community
Actor stats: 22 bookmarked, 3 total users, 1 monthly active user, last modified 11 days ago
Crawl websites and extract emails, phone numbers, and social media profiles. The actor is HTTP-first and works well on server-rendered pages, static sites, and contact pages that expose data directly in HTML.
What this actor extracts
Per-domain record with:
- websiteUrl - normalized start URL
- domain - extracted hostname
- emails - unique list of cleaned emails
- phones - unique list of cleaned phone numbers
- socialLinks - first discovered profile URL per supported platform
- contactPageUrl - direct contact-page link when discovered
- pagesCrawled
- crawledAt
Supported social platforms include Facebook, Instagram, LinkedIn, X/Twitter, YouTube, TikTok, Pinterest, GitHub, Threads, Telegram, Discord, Mastodon, WhatsApp, and Reddit.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| urls | string[] | ["https://apify.com"] | Websites to scan. Empty-input cloud runs use this default. |
| maxPagesPerSite | integer | 20 | Max pages to crawl per domain. |
| maxConcurrency | integer | 5 | Parallel fetches per site. |
| requestTimeoutSecs | integer | 15 | HTTP timeout per page. |
| useProxy | boolean | false | Enable Apify proxy for sites that block datacenter IPs. |
| smartPrioritise | boolean | true | Visit likely contact/about/team pages first. |
| maxDepth | integer | 2 | Maximum link depth from the start URL. |
| domainAllowlist | array | [] | Pro filter: only crawl matching hosts. |
| domainBlocklist | array | [] | Pro filter: skip matching hosts. |
| excludeRolePrefixes | array | [] | Pro filter: drop emails whose local part starts with any of these prefixes. |
| requireFirstAndLastName | boolean | false | Pro filter: keep only personal-looking dotted names such as jane.doe@example.com. |
| outputDedupBy | enum | domain | Pro dedup mode: domain, email, or none. |
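To make smartPrioritise and the allow/blocklist filters more concrete, here is a minimal Python sketch of how such logic could work. It is not the actor's actual code: the keyword list, the function names, and the "exact host or subdomain" matching rule are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumed keyword list, based on the /contact, /about, /team paths in the description.
PRIORITY_HINTS = ("contact", "about", "team")


def priority(url: str) -> int:
    """Return 0 for likely contact/about/team pages, 1 for everything else."""
    path = urlparse(url).path.lower()
    return 0 if any(hint in path for hint in PRIORITY_HINTS) else 1


def order_queue(urls: list[str], smart_prioritise: bool = True) -> list[str]:
    """Put likely contact pages first; ties keep discovery order (sorted is stable)."""
    if not smart_prioritise:
        return list(urls)
    return sorted(urls, key=priority)


def host_allowed(url: str, allowlist: list[str], blocklist: list[str]) -> bool:
    """Assumed rule: a pattern matches the host itself or any of its subdomains."""
    host = (urlparse(url).hostname or "").lower()

    def matches(pattern: str) -> bool:
        pattern = pattern.lower()
        return host == pattern or host.endswith("." + pattern)

    if any(matches(p) for p in blocklist):
        return False
    # An empty allowlist means "no restriction", mirroring the [] default.
    return not allowlist or any(matches(p) for p in allowlist)
```

In this sketch the blocklist wins over the allowlist when a host appears in both; the actor may resolve that conflict differently.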
Example input
    {
      "urls": ["https://example.com", "https://another.com"],
      "maxPagesPerSite": 10,
      "smartPrioritise": true,
      "excludeRolePrefixes": ["info", "support", "hello"],
      "requireFirstAndLastName": true,
      "outputDedupBy": "email"
    }
Output
One record per domain by default, or one record per email when outputDedupBy=email.
    {
      "recordType": "contact_domain",
      "websiteUrl": "https://example.com",
      "domain": "example.com",
      "emails": ["jane.doe@example.com", "john.smith@example.com"],
      "phones": ["+1 (555) 867-5309"],
      "socialLinks": {
        "linkedin": "https://www.linkedin.com/company/example",
        "twitter": "https://x.com/example",
        "instagram": "https://www.instagram.com/example"
      },
      "contactPageUrl": "https://example.com/contact",
      "pagesCrawled": 7,
      "crawledAt": "2026-04-30T14:00:00Z"
    }
Empty fields are omitted. recordType is contact_domain for per-domain output and contact_email for per-email output.
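Because empty fields are omitted, downstream code should not assume every key is present. The following Python sketch shows one way to flatten per-domain records into one row per email. The function name and the row layout are hypothetical, and the sketch skips contact_email records because their exact shape is not documented here.

```python
from typing import Iterable, Iterator


def flatten_domain_records(records: Iterable[dict]) -> Iterator[dict]:
    """Yield one flat row per email from contact_domain records."""
    for record in records:
        if record.get("recordType") != "contact_domain":
            continue  # per-email records are not handled in this sketch
        base = {
            "domain": record.get("domain"),
            "contactPageUrl": record.get("contactPageUrl"),
            # socialLinks may be missing entirely, so default to an empty dict.
            "linkedin": record.get("socialLinks", {}).get("linkedin"),
            "phones": "; ".join(record.get("phones", [])),
        }
        # Keep domains that have phones or socials but no emails as a single row.
        for email in record.get("emails") or [None]:
            yield {**base, "email": email}
```

Rows produced this way can be passed to csv.DictWriter, since every row has the same keys.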
Reliability notes
- No browser is required for the standard flow.
- The actor extracts emails from visible text, mailto: links, Cloudflare data-cfemail attributes, and simple [at]/[dot] obfuscation.
- Placeholder values from form fields are intentionally ignored.
- If every discovered value is filtered out, the actor still finishes cleanly and sets a status message instead of pushing placeholder error rows.
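The extraction sources listed above can be sketched in Python as follows. This is an illustrative approximation, not the actor's implementation: the regular expressions and function names are assumptions, and real pages need more careful parsing. The Cloudflare decoding uses the well-known scheme in which the first byte of the hex string is an XOR key for the remaining bytes.

```python
import re
from html import unescape

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CFEMAIL_RE = re.compile(r'data-cfemail="([0-9a-fA-F]+)"')
PLACEHOLDER_RE = re.compile(r'placeholder\s*=\s*(["\']).*?\1', re.I | re.S)


def decode_cfemail(hex_string: str) -> str:
    """Decode a data-cfemail value: byte 0 is the XOR key for the rest."""
    data = bytes.fromhex(hex_string)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8", errors="replace")


def deobfuscate(text: str) -> str:
    """Rewrite simple 'name [at] host [dot] tld' patterns into plain addresses."""
    text = re.sub(r"\s*\[\s*at\s*\]\s*", "@", text, flags=re.I)
    return re.sub(r"\s*\[\s*dot\s*\]\s*", ".", text, flags=re.I)


def extract_emails(html: str) -> list[str]:
    """Collect unique, lowercased emails in first-seen order."""
    found = [decode_cfemail(h) for h in CFEMAIL_RE.findall(html)]
    # Drop placeholder attributes so form hints are not reported as contacts.
    cleaned = PLACEHOLDER_RE.sub("", html)
    # mailto: addresses are caught here too, since ':' is not part of the pattern.
    found += EMAIL_RE.findall(deobfuscate(unescape(cleaned)))
    seen, result = set(), []
    for email in (e.lower() for e in found):
        if email not in seen:
            seen.add(email)
            result.append(email)
    return result
```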
FAQ
Do I need a proxy?
Usually no. Enable useProxy for sites that return 403 or otherwise block cloud IPs.
Does it render JavaScript?
Not in the HTTP-first path. It works best on pages that expose contact data directly in HTML.
Why are some emails missing?
Some sites only reveal them after client-side rendering, login, or anti-bot challenges. Increasing crawl depth can help for multi-page sites, but JS-only contact data may still stay hidden.
What does requireFirstAndLastName do?
It keeps dotted personal-style addresses like jane.doe@company.com and drops generic role addresses such as info@, hello@, or support@.
What happens when emails are filtered out?
The actor can still emit phones, socials, and contactPageUrl. Fields with no data are omitted.