Pricing

Pay per usage

Leadslogix Email Discovery

4-layer email discovery pipeline: Layer 0 (DNS/OSINT), Layer 1 (Site Crawl), Layer 2 (Multi-Engine Search), Layer 3 (Google Playwright). Plus 8-pattern email prediction engine.

Developer

Leadslogix LLC

Why This Actor

Most email finder tools rely on a single data source — typically a purchased database or a single search engine. LeadsLogix Email Discovery runs up to four independent discovery layers per domain, cross-references results, and assigns quality tiers so you know which emails are crawl-verified and which are predicted. The pipeline is designed for accuracy over volume: every discovered email is tagged with its source, confidence score, and quality grade.

Key Features

4-layer discovery pipeline -- passive DNS, site crawl, search engines, and Google Playwright (optional)
8-pattern email prediction engine -- generates likely email addresses from person names using common corporate patterns (firstname.lastname, f.last, first, etc.)
Noise filtering -- automatically removes noreply, postmaster, webmaster, abuse, mailer-daemon, and addresses from known noise domains (sentry.io, wixpress.com, googleapis.com, etc.)
Quality tiering -- every email is graded A (crawl-verified), B (DNS/search-discovered), or C (predicted) so you can prioritize outreach
Domain deduplication -- strips duplicate emails across layers and normalizes domains from URLs
Flexible input -- accepts CSV upload, Excel upload, public file URL, or inline JSON array
Dual output -- results pushed to Apify Dataset (queryable via API) and exported as CSV to Key-Value Store
Proxy support -- integrates with Apify Proxy (residential or datacenter) for search and crawl operations
Human-like behavior -- 1-second delays between page requests, standard browser user agents, respectful crawl patterns
Free tier included -- process up to 20 domains per run at no charge beyond Apify platform compute

How the Discovery Layers Work

The pipeline processes each domain sequentially through up to four layers. Later layers only run if earlier layers did not find sufficient results. Each layer has a different speed, risk, and confidence profile.
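The sequential, stop-early behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actor's code: the `Layer` signature, the `run_pipeline` name, and the `min_emails` threshold (one possible reading of "sufficient results") are all assumptions.

```python
from typing import Callable, Dict, List

# Hypothetical layer signature: takes a domain, returns the emails it found.
Layer = Callable[[str], List[str]]

def run_pipeline(domain: str, layers: Dict[int, Layer], min_emails: int = 1) -> List[str]:
    """Run layers in ascending ID order and stop once enough emails are found."""
    found: List[str] = []
    for layer_id in sorted(layers):
        for email in layers[layer_id](domain):
            email = email.lower()
            if email not in found:  # de-duplicate across layers
                found.append(email)
        if len(found) >= min_emails:  # assumed meaning of "sufficient results"
            break
    return found

# Stub layers standing in for the real DNS, crawl, and search steps.
stub_layers: Dict[int, Layer] = {
    0: lambda d: [],
    1: lambda d: [f"info@{d}", f"INFO@{d}"],
    2: lambda d: [f"sales@{d}"],
}
print(run_pipeline("acme.com", stub_layers))
```

With the stubs above, Layer 2 never runs because Layer 1 already meets the threshold; raising `min_emails` to 2 lets the search layer contribute as well.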

Layer 0 -- DNS/OSINT (Passive)

No HTTP requests to the target domain. Queries public DNS records only.

| Check | What It Finds |
| --- | --- |
| DMARC `rua`/`ruf` records | Reporting email addresses published in `_dmarc.{domain}` TXT records |
| SPF `include` directives | Email addresses embedded in SPF TXT records |
| MX records | Mail exchange hosts (used for domain validation, not email extraction) |

Speed: Near-instant (DNS queries only). Confidence: 60-70. Quality tier: B.
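As an illustration of the DMARC check, the sketch below pulls `mailto:` addresses out of the `rua` and `ruf` tags of a TXT record that has already been fetched. The function name and the sample record are hypothetical, and the DNS lookup itself is left out (the actor's resolver is not documented here).

```python
import re
from typing import List

# Matches the address part of a DMARC report URI, stopping before an
# optional "!10m"-style size limit, a comma, or a semicolon.
MAILTO_RE = re.compile(r"mailto:([^,;!\s]+)", re.IGNORECASE)

def dmarc_report_addresses(txt_record: str) -> List[str]:
    """Return unique, lowercased addresses from the rua/ruf tags of a DMARC record."""
    addresses: List[str] = []
    for tag in txt_record.split(";"):
        name, _, value = tag.strip().partition("=")
        if name.strip().lower() in ("rua", "ruf"):
            for addr in MAILTO_RE.findall(value):
                addr = addr.lower()
                if addr not in addresses:
                    addresses.append(addr)
    return addresses

# Hypothetical record for illustration only.
record = (
    "v=DMARC1; p=reject; "
    "rua=mailto:dmarc@acme.com,mailto:agg@reports.example.net!10m; "
    "ruf=mailto:dmarc@acme.com"
)
print(dmarc_report_addresses(record))
```

Reporting addresses often belong to a third-party DMARC processor rather than the target domain, which is one reason this layer is graded B rather than A.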

Layer 1 -- Site Crawl (Direct)

Crawls the target domain via httpx with a standard browser user agent. Visits these paths:

/  (homepage)
/contact
/contact-us
/about
/about-us
/team
/impressum

Both the root domain (https://domain.com) and the www subdomain (https://www.domain.com) are checked. Emails are extracted via regex and filtered to match the target domain only. A 1-second delay is inserted between each page request.

Speed: 8-15 seconds per domain (7 paths with delays). Confidence: 85. Quality tier: A.
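The regex extraction and target-domain filter might look something like the sketch below. The pattern and the `extract_domain_emails` helper are assumptions chosen for illustration; the actor's actual regex is not published here.

```python
import re
from typing import List

# A deliberately simple email pattern; real-world crawlers often use stricter ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_domain_emails(html: str, domain: str) -> List[str]:
    """Return unique, lowercased emails from html whose domain equals the target domain."""
    domain = domain.lower()
    emails: List[str] = []
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        if email.split("@", 1)[1] == domain and email not in emails:
            emails.append(email)
    return emails

# Hypothetical page fragment.
page = '<a href="mailto:Sales@acme.com">Sales</a> partner@otherdomain.com info@acme.com.'
print(extract_domain_emails(page, "acme.com"))
```

Exact matching on the part after `@` means addresses on subdomains or other domains are dropped, in line with the domain-matching rule described for this layer.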

Layer 2 -- Search Engine Discovery

Searches DuckDuckGo for the domain's published email addresses using two queries:

"{domain}" email contact
site:{domain} "@{domain}"

Emails are extracted from search result titles and snippets. Only emails containing the target domain are kept. A 2-second delay is inserted between queries to avoid rate limiting.

Speed: 5-10 seconds per domain. Confidence: 75. Quality tier: B.

Layer 3 -- Google Playwright (Optional, Skipped by Default)

Uses a headless Playwright browser to search Google directly. This layer is disabled by default because it carries higher CAPTCHA risk and requires more compute resources. Enable it only when Layers 0-2 return insufficient results.

Speed: 15-30 seconds per domain. Confidence: 80. Quality tier: A. Risk: Google may serve CAPTCHAs. Requires Apify Proxy (residential recommended).

Email Prediction Engine

When the actor has contact names but no discovered emails for a domain, the 8-pattern prediction engine generates likely email addresses:

| Pattern | Example | Confidence |
| --- | --- | --- |
| `firstname.lastname@domain` | `john.doe@acme.com` | 85 |
| `flastname@domain` | `jdoe@acme.com` | 80 |
| `firstname@domain` | `john@acme.com` | 70 |
| `f.lastname@domain` | `j.doe@acme.com` | 65 |
| `firstnamelastname@domain` | `johndoe@acme.com` | 55 |
| `firstname_lastname@domain` | `john_doe@acme.com` | 40 |
| `lastname.firstname@domain` | `doe.john@acme.com` | 30 |
| `lastname@domain` | `doe@acme.com` | 25 |

Predicted emails are assigned quality tier C. Pair this actor with the LeadsLogix Email Verifier to validate predicted addresses before outreach.
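For readers who want to reproduce the eight patterns locally, here is a minimal sketch. The pattern list and confidence values are taken from the table above; the `predict_emails` function and its template syntax are hypothetical, and names with spaces, accents, or middle names are not handled.

```python
from typing import List, Tuple

# (template, confidence) pairs in the order documented for the prediction engine.
PATTERNS: List[Tuple[str, int]] = [
    ("{first}.{last}", 85),
    ("{f}{last}", 80),
    ("{first}", 70),
    ("{f}.{last}", 65),
    ("{first}{last}", 55),
    ("{first}_{last}", 40),
    ("{last}.{first}", 30),
    ("{last}", 25),
]

def predict_emails(first_name: str, last_name: str, domain: str) -> List[Tuple[str, int]]:
    """Generate candidate addresses with their pattern confidence."""
    first = first_name.strip().lower()
    last = last_name.strip().lower()
    parts = {"first": first, "last": last, "f": first[:1]}
    return [
        (f"{template.format(**parts)}@{domain.lower()}", confidence)
        for template, confidence in PATTERNS
    ]

for email, confidence in predict_emails("John", "Doe", "acme.com"):
    print(confidence, email)
```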

Input

Provide domains using one of three methods. If multiple are provided, the actor uses the first one found in this priority order: inline domains, file upload, URL.

Input Schema

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `domains` | `array[string]` | No | -- | Inline JSON array of domain strings. Example: `["acme.com", "globex.net"]` |
| `inputFile` | `string` (file) | No | -- | Upload a CSV or Excel file with a `domain`, `domains`, `website`, or `url` column |
| `inputUrl` | `string` (URL) | No | -- | Public URL to a CSV or Excel file with domains |
| `maxResults` | `integer` | No | `20` | Maximum number of domains to process. Controls pricing tier enforcement |
| `layers` | `string` | No | `"0,1,2"` | Comma-separated layer IDs to run. Options: `0` (DNS), `1` (crawl), `2` (search), `3` (Google) |
| `skipGoogle` | `boolean` | No | `true` | Skip Google Playwright layer. Overrides `layers` to exclude layer 3 |
| `includeEmailPrediction` | `boolean` | No | `true` | Generate predicted emails for contacts with names but no discovered addresses |
| `proxyConfiguration` | `object` | No | -- | Apify Proxy settings for crawl and search operations |
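The interaction between `layers` and `skipGoogle` can be illustrated with a small sketch. The `resolve_layers` helper is hypothetical; it only mirrors the documented rule that `skipGoogle: true` removes layer 3 even when `layers` lists it.

```python
from typing import List

VALID_LAYERS = (0, 1, 2, 3)

def resolve_layers(layers: str = "0,1,2", skip_google: bool = True) -> List[int]:
    """Parse the comma-separated layers string and apply the skipGoogle override."""
    ids = sorted({int(part) for part in layers.split(",") if part.strip()})
    unknown = [i for i in ids if i not in VALID_LAYERS]
    if unknown:
        raise ValueError(f"unknown layer IDs: {unknown}")
    if skip_google:
        ids = [i for i in ids if i != 3]  # skipGoogle wins over the layers string
    return ids

print(resolve_layers("0,1,2,3", skip_google=True))
print(resolve_layers("0,1,2,3", skip_google=False))
```

In practice this means enabling Layer 3 takes two settings: include `3` in `layers` and set `skipGoogle` to `false`.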

Input File Format

Your CSV or Excel file needs at least one column with domain data. The actor auto-detects these column names (case-insensitive):

domain
domains
website
url

Full URLs are accepted -- the actor extracts the domain automatically (https://www.acme.com/about becomes acme.com).

Example CSV:

```csv
domain
acme.com
globex.net
initech.io
umbrella-corp.com
```
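Column auto-detection and URL-to-domain extraction could be approximated as below. This is a sketch under assumptions: the `to_domain` and `read_domains` helpers are hypothetical, only CSV text is handled, and stripping a leading `www.` is inferred from the `https://www.acme.com/about` example.

```python
import csv
import io
from typing import List
from urllib.parse import urlparse

DOMAIN_COLUMNS = ("domain", "domains", "website", "url")

def to_domain(value: str) -> str:
    """Reduce a bare domain or full URL to a lowercase domain without 'www.'."""
    value = value.strip().lower()
    if "://" not in value:
        value = "//" + value  # lets urlparse treat a bare domain as the host
    host = urlparse(value).hostname or ""
    return host[4:] if host.startswith("www.") else host

def read_domains(csv_text: str) -> List[str]:
    """Find the first recognized domain column (case-insensitive) and return unique domains."""
    reader = csv.DictReader(io.StringIO(csv_text))
    column = next(
        (name for name in (reader.fieldnames or [])
         if name.strip().lower() in DOMAIN_COLUMNS),
        None,
    )
    if column is None:
        raise ValueError("no domain, domains, website, or url column found")
    domains: List[str] = []
    for row in reader:
        domain = to_domain(row[column] or "")
        if domain and domain not in domains:
            domains.append(domain)
    return domains

print(read_domains("Website\nhttps://www.acme.com/about\nglobex.net\nacme.com\n"))
```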

Output

Dataset Schema

Each discovered email is stored as one row in the Apify Dataset.

| Field | Type | Description |
| --- | --- | --- |
| `domain` | `string` | The input domain that was searched |
| `email` | `string` | Discovered or predicted email address (lowercase) |
| `source_layer` | `string` | Discovery method: `L0_DMARC`, `L0_SPF`, `L1_CRAWL`, `L2_SEARCH`, `NODEJS_PIPELINE`, or `PREDICTION` |
| `confidence` | `integer` | Confidence score from 0 to 100 |
| `quality_tier` | `string` | `A` (crawl-verified, >=85), `B` (DNS/search, 60-84), or `C` (predicted, <60) |

Quality Tiers Explained

| Tier | Confidence Range | Source | Recommended Action |
| --- | --- | --- | --- |
| A | 85-100 | Direct website crawl, Google search | Safe for outreach. Email was found on a live web page. |
| B | 60-84 | DNS records, search engine snippets | Likely valid. Verify before high-volume sending. |
| C | Below 60 | Pattern prediction engine | Unverified guess. Must verify before any use. |
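One way to act on the tiers downstream is to bucket dataset rows by their `quality_tier` field before outreach. The sketch below assumes rows shaped like the dataset schema; the `split_by_tier` helper and the sample rows are hypothetical.

```python
from typing import Dict, List

def split_by_tier(rows: List[dict]) -> Dict[str, List[str]]:
    """Group email addresses by the quality_tier value on each dataset row."""
    buckets: Dict[str, List[str]] = {"A": [], "B": [], "C": []}
    for row in rows:
        buckets[row["quality_tier"]].append(row["email"])
    return buckets

# Hypothetical rows for illustration.
rows = [
    {"email": "sales@acme.com", "quality_tier": "A"},
    {"email": "dmarc@acme.com", "quality_tier": "B"},
    {"email": "john.doe@acme.com", "quality_tier": "C"},
]
buckets = split_by_tier(rows)
ready_for_outreach = buckets["A"]
needs_verification = buckets["B"] + buckets["C"]
print(ready_for_outreach, needs_verification)
```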

Additional Output

CSV file -- stored in Apify Key-Value Store as output.csv (UTF-8 with BOM for Excel compatibility)
Usage summary -- stored as usage (JSON) in Key-Value Store with input/output counts and pricing info

Example Output Row

```json
{
    "domain": "acme.com",
    "email": "sales@acme.com",
    "source_layer": "L1_CRAWL",
    "confidence": 85,
    "quality_tier": "A"
}
```

Pricing

| Tier | Actor Fee | Results | Best For |
| --- | --- | --- | --- |
| Free | $0 | Up to 20 per run | Testing, evaluation |
| Pay-Per-Event | $2 per 1,000 results | Unlimited | Production workloads |

Important: Apify platform compute charges (CPU time, memory, bandwidth) are billed separately by Apify based on your usage. The prices above cover the actor software license only. See Apify pricing for platform costs.

Cost Estimation

| Scenario | Results | Actor Fee | Est. Platform Cost | Total |
| --- | --- | --- | --- | --- |
| Quick test | 20 | $0 (free) | ~$0.05 | ~$0.05 |
| Small batch | 100 | $0.16 | ~$0.15 | ~$0.31 |
| Medium batch | 500 | $0.96 | ~$0.50 | ~$1.46 |
| Large batch | 1,000 | $1.96 | ~$1.00 | ~$2.96 |
| Enterprise | 10,000 | $19.96 | ~$10.00 | ~$29.96 |

Actor fee formula: (results - 20) x $0.002 for results > 20, $0 for <= 20. Platform cost estimates assume Layers 0-2 enabled (no Google Playwright). Enabling Layer 3 increases compute time by approximately 3-5x due to browser overhead.
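The actor fee formula is easy to check with a few lines of code. The function name and constants below are illustrative; platform compute cost is excluded because it depends on your Apify plan and run time.

```python
FREE_RESULTS = 20          # results per run covered by the free tier
PRICE_PER_RESULT = 0.002   # $2 per 1,000 results

def actor_fee(results: int) -> float:
    """Actor fee in USD: (results - 20) x $0.002 above the free tier, otherwise $0."""
    billable = max(results - FREE_RESULTS, 0)
    return round(billable * PRICE_PER_RESULT, 3)

for n in (20, 100, 500, 1000, 10000):
    print(n, actor_fee(n))
```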

Usage Examples

Example 1: Quick Test with Inline Domains

```json
{
    "domains": ["iana.org", "icann.org", "ietf.org"],
    "maxResults": 20,
    "layers": "0,1,2"
}
```

Example 2: CSV File Upload

```json
{
    "inputFile": "prospects.csv",
    "maxResults": 500,
    "layers": "0,1,2",
    "includeEmailPrediction": true
}
```

Example 3: Remote CSV via URL

```json
{
    "inputUrl": "https://docs.google.com/spreadsheets/d/.../export?format=csv",
    "maxResults": 1000,
    "layers": "0,1,2",
    "skipGoogle": true,
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
}
```

Example 4: Thorough Discovery with Google (Layer 3)

```json
{
    "domains": ["hard-to-find-emails.com"],
    "maxResults": 20,
    "layers": "0,1,2,3",
    "skipGoogle": false,
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
}
```

Example 5: DNS-Only Discovery (Fastest, No Web Requests)

```json
{
    "domains": ["acme.com", "globex.net", "initech.io"],
    "maxResults": 20,
    "layers": "0"
}
```

Retrieving Results via API

After the run completes, fetch results from the Apify Dataset:

```bash
# Get all results as JSON
curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?token={API_TOKEN}"

# Get results as CSV
curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?token={API_TOKEN}&format=csv"

# Download the CSV file from Key-Value Store
curl "https://api.apify.com/v2/key-value-stores/{STORE_ID}/records/output.csv?token={API_TOKEN}"
```
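The same dataset request can be made from Python using only the standard library. This is a sketch: `dataset_items_url` and `fetch_rows` are hypothetical helper names, and the dataset ID and token are placeholders you would supply from your own run.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"

def dataset_items_url(dataset_id: str, token: str, fmt: str = "json") -> str:
    """Build the dataset items URL used by the curl commands above."""
    query = urlencode({"token": token, "format": fmt})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"

def fetch_rows(dataset_id: str, token: str) -> List[dict]:
    """Download all dataset rows as a list of dicts (performs a network request)."""
    with urlopen(dataset_items_url(dataset_id, token)) as response:
        return json.load(response)

# Placeholders only; no request is made here.
print(dataset_items_url("DATASET_ID", "API_TOKEN", fmt="csv"))
```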

Performance

Expected throughput with Layers 0-2 enabled (Layer 3 disabled):

| Metric | Value |
| --- | --- |
| Domains per minute | 4-6 (with human-like delays) |
| Average emails per domain | 1-5 (varies by industry and region) |
| Layer 0 hit rate | 10-20% of domains have emails in DNS records |
| Layer 1 hit rate | 40-60% of domains have emails on contact/about pages |
| Layer 2 hit rate | 30-50% of domains have emails indexed in search engines |
| Combined discovery rate | 50-70% of domains yield at least one email |
| Memory usage | ~256 MB (Layers 0-2), ~512 MB with Layer 3 |

Regional variation: Domains in Western Europe and North America typically have higher email discovery rates (60-80%). Asian domains (.kr, .cn, .jp) often have lower rates (10-30%) due to form-based contact pages, missing MX records, and non-Latin character email systems.

Noise Filtering

The actor automatically removes emails matching these patterns:

Filtered Local Parts

noreply, no-reply, donotreply, mailer-daemon, postmaster, hostmaster, webmaster, abuse

Filtered Domains

example.com, test.com, localhost, sentry.io, wixpress.com, w3.org, schema.org, googleapis.com, gstatic.com

Domain Matching

Layer 1 (crawl) only keeps emails that match the target domain exactly. For example, when crawling acme.com, an email like partner@otherdomain.com found on the page is discarded. Layers 0 and 2 apply domain-aware filtering to ensure relevance.
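A filter equivalent to the lists above might be sketched as follows. The two sets come straight from this section; the `is_noise` helper and its exact-match behavior are assumptions, since the actor may match these values differently (for example, as substrings).

```python
NOISE_LOCAL_PARTS = {
    "noreply", "no-reply", "donotreply", "mailer-daemon",
    "postmaster", "hostmaster", "webmaster", "abuse",
}
NOISE_DOMAINS = {
    "example.com", "test.com", "localhost", "sentry.io", "wixpress.com",
    "w3.org", "schema.org", "googleapis.com", "gstatic.com",
}

def is_noise(email: str) -> bool:
    """Return True when the local part or the domain is on a noise list (exact match)."""
    local, _, domain = email.lower().partition("@")
    return local in NOISE_LOCAL_PARTS or domain in NOISE_DOMAINS

candidates = ["noreply@acme.com", "abc123@sentry.io", "Sales@acme.com"]
print([email for email in candidates if not is_noise(email)])
```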

Proxy Configuration

For best results with Layers 1-3, use Apify Proxy:

Datacenter proxy -- sufficient for Layer 0 (DNS) and Layer 1 (site crawl). Lowest cost.
Residential proxy -- recommended for Layer 2 (search engines) and required for Layer 3 (Google). Reduces CAPTCHA risk.

If no proxy is configured, the actor runs requests from the Apify platform IP directly. This works for small batches but may trigger rate limits on search engines for larger runs.

Integrations

Chain with Other LeadsLogix Actors

This actor is part of the LeadsLogix B2B intelligence suite on Apify Store:

LeadsLogix Website Discovery -- find official websites for a list of company names
LeadsLogix Company Scraper -- crawl websites for company info and decision-maker contacts
LeadsLogix Email Discovery (this actor) -- discover emails for known domains
LeadsLogix Email Verifier -- verify discovered emails with a 6-check pipeline
LeadsLogix Pipeline -- run all stages in a single actor (end-to-end)

Recommended workflow: Website Discovery -> Company Scraper -> Email Discovery -> Email Verifier

Export Formats

Apify Dataset -- query via REST API, export as JSON, CSV, XML, or Excel
CSV -- download output.csv from Key-Value Store (UTF-8 with BOM)
Webhook -- configure Apify webhooks to POST results to your CRM or pipeline on run completion

Frequently Asked Questions

How is this different from Hunter.io, Snov.io, or Apollo? Those services query proprietary databases of previously discovered emails. This actor discovers emails in real-time by actually crawling websites, reading DNS records, and searching the public web. You get fresher results and do not pay for stale data. The tradeoff is longer run time per domain.

Will this actor find personal Gmail or Outlook addresses? No. The domain-matching filter in Layer 1 only keeps emails that match the target domain. If you crawl acme.com, only *@acme.com emails are returned. Gmail, Yahoo, and other free email provider addresses are filtered out.

What happens if I exceed the free tier limit? The actor processes up to the number of domains specified in maxResults. The default is 20 (free tier). To process more, increase maxResults for pay-per-event billing ($2 per 1,000 results beyond the free tier).

Can I run this on my own Apify account without paying the actor fee? The free tier (20 domains per run) has no actor fee -- you only pay Apify platform compute charges. For larger runs, pay-per-event pricing applies at $2 per 1,000 results beyond the free tier.

Does Layer 3 (Google Playwright) always work? Not reliably. Google actively blocks automated access and may serve CAPTCHAs. Using residential proxies improves success rates, but Layer 3 should be treated as a last resort. Layers 0-2 cover the majority of discoverable emails without Google.

How do I verify the emails this actor finds? Pair this actor with the LeadsLogix Email Verifier, which runs syntax, DNS, SMTP, catch-all, disposable, and DKIM/SPF/DMARC checks. Feed this actor's output CSV directly into the verifier.

Can I process Excel files? Yes. Both .csv and .xlsx/.xls files are supported via file upload or URL. The actor auto-detects the file format and looks for a domain column.

What if my file has URLs instead of domains? The actor extracts domains from full URLs automatically. A column containing https://www.acme.com/about will be parsed as acme.com.

Is there a rate limit? The actor inserts human-like delays (1-2 seconds between requests) to avoid triggering target website rate limits. For search engines, a 2-second delay is used between queries. There is no hard rate limit on the actor itself beyond the maxResults setting.

Does the actor respect robots.txt? No. Layer 1 (site crawl) requests pages via httpx with a standard browser user agent and follows HTTP redirects, but it does not parse robots.txt. It requests only a fixed set of common paths (/, /contact, /contact-us, /about, /about-us, /team, /impressum), which are pages intended for public access. No recursive spidering is performed.

Limitations

Email prediction (Tier C) is unverified. Predicted emails are pattern-based guesses. Always verify before sending.
Asian and non-English domains have lower discovery rates (10-30%) due to form-based contact pages and non-Latin email systems.
Layer 3 (Google) is unreliable due to CAPTCHA enforcement. Do not depend on it for production workloads.
The actor discovers published emails only. It does not access private databases, social media DMs, or login-protected pages.
No SMTP verification is performed. Use the LeadsLogix Email Verifier actor to confirm deliverability.
Search engine rate limits may reduce Layer 2 effectiveness during very large runs (1,000+ domains). Space large batches 1-2 hours apart or use residential proxies.

Changelog

v1.0.0 (2026-05-08)

Initial release on Apify Store
4-layer discovery pipeline: DNS/OSINT, site crawl, DuckDuckGo search, Google Playwright
8-pattern email prediction engine
Noise email and noise domain filtering
Quality tiering (A/B/C) with confidence scoring (0-100)
CSV and Dataset dual output
Free tier (20 domains per run), pay-per-event ($2/1,000 results beyond free tier)
Apify Proxy integration (datacenter and residential)
Auto-detection of domain column from CSV/Excel input
URL-to-domain extraction for input files with full URLs

Support

Issues and feature requests: Open an issue on the Apify Store actor page
Email: hello@leadslogix.com
Documentation: LeadsLogix on GitHub

LeadsLogix Email Discovery is a B2B email finder and domain email search tool built for sales intelligence, lead generation, email prospecting, and market research. It discovers business email addresses from company domains without relying on third-party databases.
