Lead Finder: Email + Name Extraction

Pricing

from $5.00 / 1,000 lead results

Lead Finder: Email + Name Extraction is a fast, lightweight Apify actor that extracts emails and related names from websites. It supports single URLs or domain crawling, handles obfuscated and protected emails, and offers flexible controls for deduplication, validation, and crawl behaviour.

Developer

Datavault

Maintained by Community

Lead Finder - Email + Name Extraction Actor

Lead Finder is a lightweight Dart actor for Apify that extracts emails (and associated names when available) from one or more URLs. You can keep it strictly to the provided URLs or enable full domain crawling. The actor charges per page load and per lead result.

What It Extracts

  • Emails from page text and HTML.
  • mailto: links (with name extraction).
  • Obfuscated emails like name (at) domain (dot) com.
  • Cloudflare email protection (data-cfemail + email-protection scripts).
  • vCard blocks with EMAIL, FN, and N.
  • Optional name detection from nearby DOM context (parent/sibling elements).
  • Next.js data API extraction for the current page (build ID resolved from page or homepage).
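Two of the patterns above are simple enough to sketch. The Python below is an illustration only, not the actor's own code (the actor is written in Dart), and the helper names deobfuscate_email and decode_cfemail are hypothetical. The Cloudflare part follows the commonly described scheme in which the data-cfemail value is a hex string whose first byte is an XOR key for the remaining bytes.

```python
import re

# Match "(at)" / "[at]" and "(dot)" / "[dot]" with optional surrounding spaces.
_AT = re.compile(r"\s*[\(\[]\s*at\s*[\)\]]\s*", re.IGNORECASE)
_DOT = re.compile(r"\s*[\(\[]\s*dot\s*[\)\]]\s*", re.IGNORECASE)


def deobfuscate_email(text: str) -> str:
    """Rewrite 'name (at) domain (dot) com' style text as 'name@domain.com'."""
    return _DOT.sub(".", _AT.sub("@", text)).strip()


def decode_cfemail(hex_value: str) -> str:
    """Decode a data-cfemail value: the first byte XORs every later byte."""
    data = bytes.fromhex(hex_value)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


print(deobfuscate_email("jane (at) example (dot) com"))  # jane@example.com
```

A real extractor would still need to validate the result, since free text such as "meet (at) noon" also matches the first pattern.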

Input

  • startUrls: Array of URLs to start from.
  • crawlDomain: If true, follow internal links on the same domain. Default: false.
  • maxPagesPerCrawl: Maximum pages to visit. Default: 100.
  • maxConcurrency: Parallel workers. Default: 5.
  • maxRetries: Retries per failed request. Default: 3.
  • minRequestDelay: Delay between requests in ms. Default: 1000.
  • allowSubdomains: If true, allow subdomains. Default: false.
  • enableSkipping: Enable skip patterns. Default: true.
  • skipPatterns: URL substrings to skip. Default: cart/checkout/login/etc.
  • dedupeByDomain: Keep only one lead per email domain across the run. Default: false.
  • dedupeByEmail: Keep only one lead per email across the run. Default: false.
  • validateEmails: Apply stricter email validation rules. Default: true.
  • followExternalLinks: Follow external links labeled as a homepage or website (e.g., "Hemsida", Swedish for "homepage"). Default: false.
  • maxExternalLinksPerPage: Max external links to follow per page. Default: 2.
  • followHomepageOnly: If true, only follow links labeled as homepage/website. Default: true.
  • fetchNextDataApi: Fetch Next.js data API for the current page. Default: true.
  • nextDataDeepMode: Try additional Next.js data routes derived from the URL path. Default: false.
  • maxNextDataCandidates: Max Next.js data URL candidates to try. Default: 4.
  • proxyConfiguration: Apify Proxy configuration.
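To make the crawl-scope options concrete, here is a rough Python sketch of how crawlDomain, allowSubdomains, and skipPatterns could combine when deciding whether to visit a link. This is an assumption about the behaviour based on the descriptions above, not the actor's actual logic. The function name should_visit is hypothetical, and hostnames are compared literally, so "www.example.com" and "example.com" count as different hosts here.

```python
from urllib.parse import urlparse


def should_visit(url, start_url, allow_subdomains=False, skip_patterns=()):
    """Return True if `url` is in scope for a crawl that began at `start_url`."""
    host = (urlparse(url).hostname or "").lower()
    start_host = (urlparse(start_url).hostname or "").lower()
    if not host or not start_host:
        return False

    # skipPatterns: drop any URL that contains one of the substrings.
    if any(pattern in url for pattern in skip_patterns):
        return False

    # crawlDomain: links on the same host are always in scope.
    if host == start_host:
        return True

    # allowSubdomains: also accept hosts ending in ".<start host>".
    return bool(allow_subdomains) and host.endswith("." + start_host)


print(should_visit("https://example.com/cart", "https://example.com",
                   skip_patterns=("cart", "checkout")))  # False
```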

Sample Input

{
  "startUrls": [
    { "url": "https://www.example.com" },
    "https://www.example.org/contact"
  ],
  "crawlDomain": true,
  "maxPagesPerCrawl": 200,
  "maxConcurrency": 5,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
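The sample input mixes two startUrls forms: an object with a url key and a bare string. If you build inputs programmatically, a small helper can flatten both forms into plain strings before you inspect or log them. This is a sketch under that assumption, and normalize_start_urls is a hypothetical name, not part of the actor.

```python
def normalize_start_urls(start_urls):
    """Flatten startUrls entries ({"url": ...} objects or strings) to strings."""
    urls = []
    for entry in start_urls:
        url = entry.get("url") if isinstance(entry, dict) else entry
        # Skip entries that are missing, non-string, or blank.
        if isinstance(url, str) and url.strip():
            urls.append(url.strip())
    return urls


sample = [{"url": "https://www.example.com"}, "https://www.example.org/contact"]
print(normalize_start_urls(sample))
# ['https://www.example.com', 'https://www.example.org/contact']
```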

Output

Each dataset item is a lead:

  • email: Extracted email.
  • name: Optional associated name.
  • domain: Email domain (e.g., example.com).
  • sourceUrl: Page where the email was found.
  • sourceDomain: Domain of the source URL.
  • sourceType: mailto, page, data-attr, cloudflare, vcard, or script.
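If a run was made without dedupeByEmail or dedupeByDomain, similar filtering can be applied to the exported dataset afterwards. The sketch below works on lead items shaped like the fields above. It assumes the first occurrence is the one kept and that comparison is case-insensitive. Both are guesses about the actor's behaviour, and dedupe_leads is a hypothetical helper.

```python
def dedupe_leads(leads, by_email=False, by_domain=False):
    """Keep the first lead per email and/or per email domain."""
    seen_emails, seen_domains = set(), set()
    kept = []
    for lead in leads:
        email = lead["email"].lower()
        # Fall back to the part after "@" if the domain field is missing.
        domain = (lead.get("domain") or email.rsplit("@", 1)[-1]).lower()
        if by_email and email in seen_emails:
            continue
        if by_domain and domain in seen_domains:
            continue
        seen_emails.add(email)
        seen_domains.add(domain)
        kept.append(lead)
    return kept


leads = [
    {"email": "jane@example.com", "domain": "example.com", "sourceType": "mailto"},
    {"email": "Jane@example.com", "domain": "example.com", "sourceType": "page"},
    {"email": "bob@example.com", "domain": "example.com", "sourceType": "vcard"},
]
print(len(dedupe_leads(leads, by_email=True)))   # 2
print(len(dedupe_leads(leads, by_domain=True)))  # 1
```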

Tips

  • Start with crawlDomain: false and a single URL to validate results quickly.
  • Use maxPagesPerCrawl and dedupeByDomain to control costs and output size.