Email Verifier & List Cleaner — Deliverability Scoring
Pricing
$4.00 / 1,000 dataset items scraped
Developer
Harry Schoeller
Maintained by Community
Email Verifier & List Cleaner — Honest Deliverability Scoring
Email verification and bounce checking for scraped lead lists — with a transparent 0-100 score and the reasons behind it. Clean your list of bad, disposable, role, and catch-all addresses before you pay a SaaS verifier (ZeroBounce, NeverBounce, Hunter, Kickbox, Bouncer) for the survivors. Built for lead enrichment and list cleaning pipelines.
Keywords: email verification, email validation, bounce checker, email deliverability, list cleaning, lead enrichment, lead-list cleaner, disposable email detection, role account detection, catch-all detection, MX lookup.
The honesty thesis
The #1 complaint across lead-gen scrapers (LinkedIn / Apollo / Google Maps, and the Secretary-of-State and license actors in this collection) is that scraped emails bounce. The market is full of "verifiers" that return a confident `valid` that bounces anyway.
This actor is built on one rule:
It never says `deliverable` unless it can actually justify it from reliable signals. Everything else is honestly labeled `risky` or `unknown`, never a faked `valid`.
That single rule turns the #1 complaint into the #1 reason for a 5-star review: no surprise bounces.
What is reliable vs. unreliable from a datacenter
This is the spine of the design. Read it before you run anything — it explains exactly what we can and cannot promise.
| Check | Reliable from a datacenter? | Role in the score |
|---|---|---|
| RFC 5322 syntax / normalization | ✅ Fully | Hard gate (fail → undeliverable) |
| MX record lookup (DNS) | ✅ Fully | Hard gate (no MX & no A fallback → undeliverable) |
| Domain exists (A/AAAA/NS resolves) | ✅ Fully | Hard gate |
| Disposable-domain detection (maintained list) | ✅ Fully | Downgrade to risky |
| Role-account detection (info@, sales@, admin@…) | ✅ Fully | Downgrade to risky |
| Gibberish / random-string heuristics | ✅ (heuristic, never a hard fail alone) | Score penalty + reason |
| Typo / known-provider misspelling (gmial.com) | ✅ (suggestion only) | Reason + suggestion field |
| Catch-all (accept-all) detection | ⚠️ Best-effort SMTP — often inconclusive | Caps confidence; sets isCatchAll, downgrades to risky |
| Live SMTP RCPT probing of the mailbox | ❌ Unreliable — port 25 egress is blocked/greylisted from shared datacenter ranges; catch-all domains accept everything; Google/Microsoft refuse or rate-limit | Can only downgrade or yield unknown — NEVER promote to deliverable |
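A minimal sketch of how the DNS hard gates in the table could be wired up, assuming Node's built-in `node:dns/promises` resolver. The function names (`lookup`, `dnsGate`, `checkDomain`) and the returned shape are hypothetical and are not taken from the actor's code; only the A-record fallback is shown, with AAAA and NS checks left out for brevity.

```javascript
// Hypothetical sketch of the DNS hard gates; not the actor's actual code.

// Error codes Node's resolver uses for "name or record type does not exist".
const MISSING = new Set(["ENOTFOUND", "ENODATA"]);

// Run one lookup and reduce it to a state: "ok" | "missing" | "error".
async function lookup(fn) {
  try {
    const records = await fn();
    return { state: records.length > 0 ? "ok" : "missing", records };
  } catch (err) {
    // Anything other than a clean "does not exist" (timeout, SERVFAIL...) is an error.
    return { state: MISSING.has(err.code) ? "missing" : "error", records: [] };
  }
}

// Pure decision step: map the MX and A outcomes onto the gates in the table.
function dnsGate(mx, a) {
  if (mx.state === "error") return "unknown";
  if (mx.state === "ok") return "pass";
  if (a.state === "error") return "unknown";
  return a.state === "ok" ? "fallback" : "undeliverable";
}

// Async wrapper around Node's resolver. The A lookup runs only when MX is missing.
async function checkDomain(domain) {
  const dns = await import("node:dns/promises");
  const mx = await lookup(() => dns.resolveMx(domain));
  const a = mx.state === "missing"
    ? await lookup(() => dns.resolve4(domain))
    : { state: "missing", records: [] };
  // Lowest priority value first, as mail servers would try them.
  const hosts = [...mx.records]
    .sort((x, y) => x.priority - y.priority)
    .map((r) => r.exchange);
  return { gate: dnsGate(mx, a), mx: hosts };
}
```

Keeping `dnsGate` pure means the gate logic can be checked without touching the network; only `checkDomain` performs real lookups.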
Why we don't do binary valid/invalid
Most cheap "verifiers" do an SMTP RCPT probe from an Apify datacenter IP and return binary valid / invalid. That fails in two predictable ways:
- Datacenter SMTP is unreliable. Port 25 egress from shared datacenter ranges is routinely blocked, greylisted, or rate-limited. Big providers (Google, Microsoft) refuse or throttle these probes. So a "valid" from that probe is often noise.
- Catch-all domains accept everything. A catch-all (accept-all) domain returns `250 OK` for any address: `ceo@`, `asdfqwer@`, anything. A binary verifier marks them all `valid`; then they bounce.
We refuse to launder either of those into a confident `valid`. Instead:
- SMTP is strictly one-directional. A hard `5xx` rejection is trustworthy → we downgrade to `undeliverable`. A `250` acceptance is ignored for promotion: it can never make an email `deliverable`.
- Catch-all caps the result at `risky` and sets `isCatchAll: true`, because per-mailbox deliverability there is genuinely unknowable.
- Blocked / greylisted / timeout → `unknown`. An honest "we don't know" beats a false "valid."
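The one-directional SMTP rule above can be sketched as a small pure function. This is an illustration under assumptions: `applySmtp` and the `probe` shape (`{ code }`, with `null` meaning blocked, timed out, or no usable reply) are hypothetical names, not the actor's API.

```javascript
// Hypothetical sketch: fold an SMTP probe outcome into an existing status.
// The probe can only keep or lower the status; it can never raise it.
function applySmtp(status, probe) {
  const code = probe.code;
  if (code !== null && code >= 500 && code <= 599) {
    return "undeliverable"; // a hard rejection is trusted
  }
  if (code === 250) {
    return status; // acceptance is ignored: no promotion
  }
  // Blocked, greylisted (4xx), or timed out: only a would-be "deliverable"
  // drops to "unknown"; other statuses keep their existing label.
  return status === "deliverable" ? "unknown" : status;
}
```

For example, `applySmtp("risky", { code: 250 })` stays `"risky"`, while `applySmtp("deliverable", { code: 550 })` becomes `"undeliverable"`.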
Status definitions (the honest contract)
- `deliverable`: Syntax valid, MX present, NOT disposable, NOT a detected catch-all, and NOT (by policy) a role account. This is "all reliable signals are green." It is not a delivery guarantee, and we will never label something `deliverable` that we cannot justify from reliable signals. SMTP is never required to reach this status, and SMTP can never be the thing that grants it.
- `risky`: Real-looking but with a known risk: role account, catch-all domain, disposable, gibberish-leaning local part, or a typo suggestion. Send if you want, but expect a lower hit rate / spam-trap risk.
- `undeliverable`: A hard, provable failure only: bad syntax, no MX and no A-record fallback, the domain doesn't resolve, or a trustworthy SMTP `5xx` rejection. We only condemn an email when the failure is provable.
- `unknown`: We couldn't determine it: DNS timeout/SERVFAIL, or an SMTP probe that was blocked/greylisted/inconclusive.
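Two of the `risky` signals, role accounts and provider typos, come down to simple lookups. The sketch below is illustrative only: `riskFlags` is a hypothetical helper, the lists are deliberately tiny (seeded from the examples in this README), and it assumes the address has already passed the syntax gate.

```javascript
// Hypothetical, deliberately tiny lists; a real run would use maintained lists.
const ROLE_LOCAL_PARTS = new Set(["info", "sales", "admin"]);
const PROVIDER_TYPOS = new Map([["gmial.com", "gmail.com"]]);

// Assumes `email` already passed syntax validation and contains an "@".
function riskFlags(email) {
  const at = email.lastIndexOf("@");
  const local = email.slice(0, at).toLowerCase();
  const domain = email.slice(at + 1).toLowerCase();
  const fixedDomain = PROVIDER_TYPOS.get(domain);
  return {
    isRole: ROLE_LOCAL_PARTS.has(local),
    // A typo only ever produces a suggestion, never a rewrite of the input.
    suggestion: fixedDomain ? `${local}@${fixedDomain}` : null,
  };
}
```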
How the score stays honest
Deterministic, capped pipeline — not a black box. Order matters: hard gates first, then deductions, then caps that enforce the honesty rule.
    Start: score = 100, status = "deliverable"

    HARD GATES (terminal):
      invalid syntax                    -> undeliverable, 0
      domain doesn't resolve, no MX     -> undeliverable, 5
      DNS errored/timed out             -> unknown, null

    DEDUCTIONS:
      no MX but A record (fallback)     -> -30
      gibberish local part              -> -25
      possible provider typo            -> -15 (+ `suggestion`)

    DOWNGRADE FLAGS (cap status, never upgrade):
      isDisposable                      -> cap "risky", score min 40
      isRole (if treatRoleAsRisky)      -> cap "risky", score min 60
      isCatchAll                        -> cap "risky", score min 65
      isFreeProvider                    -> informational only

    SMTP (best_effort only, one-directional):
      5xx rejection                     -> undeliverable, 10 (trustworthy)
      250 acceptance                    -> NO promotion (untrustworthy)
      blocked/greylisted/4xx/timeout    -> unknown (if it was deliverable)

    FINAL CAP: status can be "deliverable" ONLY if syntax ok AND mx present AND
      !isDisposable AND !isCatchAll AND (!isRole || !treatRoleAsRisky) AND no smtp rejection.
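One way to read the listing above as code is sketched below. It is not the actor's implementation: `scoreEmail` and its input shape are hypothetical, "score min N" is assumed to mean the score is capped at N, and a would-be `deliverable` without MX is assumed to fall to `risky` under the final cap. The listing does not say whether gibberish or typo deductions change the status, so the sketch leaves status alone for them. Exact scores from real runs may differ.

```javascript
// Hypothetical sketch of the capped pipeline; input field names are assumptions.
function scoreEmail(s, { treatRoleAsRisky = true } = {}) {
  // Hard gates are terminal.
  if (!s.syntaxOk) return { status: "undeliverable", score: 0 };
  if (s.dnsError) return { status: "unknown", score: null };
  if (!s.hasMx && !s.hasA) return { status: "undeliverable", score: 5 };

  let score = 100;
  let status = "deliverable";

  // Deductions only lower the score.
  if (!s.hasMx && s.hasA) score -= 30;
  if (s.gibberish) score -= 25;
  if (s.typoSuggestion) score -= 15;

  // Downgrade flags cap the status and (assumed reading) cap the score.
  const cap = (limit) => {
    status = "risky";
    score = Math.min(score, limit);
  };
  if (s.isDisposable) cap(40);
  if (s.isRole && treatRoleAsRisky) cap(60);
  if (s.isCatchAll) cap(65);

  // SMTP is one-directional: "accepted" is deliberately ignored.
  if (s.smtp === "rejected") return { status: "undeliverable", score: 10 };
  if (s.smtp === "inconclusive" && status === "deliverable") status = "unknown";

  // Final cap: "deliverable" requires MX (assumed to fall back to "risky").
  if (status === "deliverable" && !s.hasMx) status = "risky";

  return { status, score };
}
```

Because caps are applied with `Math.min`, the order of the downgrade flags does not affect the result in this sketch: the lowest applicable cap always wins.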
Every record carries its full reasons[], so you can audit exactly why a score landed where it did.
Input
Provide exactly one of `emails` or `sourceDatasetId`.
    {
      // Option A: raw list
      "emails": ["john@acme.com", "info@acme.com"],

      // Option B: chain off another actor's dataset
      "sourceDatasetId": "<dataset id>",
      "emailField": "email", // dot paths supported (contact.email)
      "passThroughFields": ["name", "company", "phone"],

      "smtpCheck": "off", // "off" (recommended) | "best_effort"
      "detectCatchAll": true,
      "deduplicate": true,
      "treatRoleAsRisky": true,
      "concurrency": 50
    }
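Two input rules are easy to get wrong when building the input programmatically: the "exactly one of" constraint and dot-path resolution for `emailField`. A minimal sketch of both follows; `validateInput` and `getByPath` are hypothetical helper names, not part of the actor.

```javascript
// Hypothetical helpers; names are illustrative, not part of the actor's API.

// Enforce "exactly one of emails or sourceDatasetId".
function validateInput(input) {
  const hasEmails = Array.isArray(input.emails) && input.emails.length > 0;
  const hasDataset =
    typeof input.sourceDatasetId === "string" && input.sourceDatasetId !== "";
  if (hasEmails === hasDataset) {
    // Both present or both absent.
    throw new Error("Provide exactly one of emails or sourceDatasetId.");
  }
  return hasEmails ? "emails" : "dataset";
}

// Resolve a dot path such as "contact.email" against a dataset item.
// Returns undefined as soon as a segment is missing.
function getByPath(item, path) {
  return path.split(".").reduce(
    (value, key) =>
      value !== null && typeof value === "object" ? value[key] : undefined,
    item
  );
}
```

With these, `getByPath({ contact: { email: "john@acme.com" } }, "contact.email")` yields the nested address, and an item without a `contact` object yields `undefined` rather than throwing.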
Output (per email)
    {
      "email": "info@acme.com",
      "originalEmail": "Info@Acme.com ",
      "status": "risky",
      "score": 55,
      "reasons": ["valid_syntax", "mx_found:aspmx.l.google.com", "role_account:info", "domain_is_catch_all"],
      "mx": ["aspmx.l.google.com", "alt1.aspmx.l.google.com"],
      "isDisposable": false,
      "isRole": true,
      "isCatchAll": true,
      "isFreeProvider": false,
      "suggestion": null,
      "domain": "acme.com",
      "checkedAt": "2026-06-20T00:00:00.000Z"
      // ...any passThroughFields copied here
    }
A run-level `OUTPUT` key holds counts per status plus the honesty disclaimer.
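The exact shape of the run-level summary is not shown here, but per-status counts can be recomputed from the dataset items. A minimal sketch, with `summarize` and its return shape as assumptions:

```javascript
// Hypothetical sketch: tally per-email results by status.
function summarize(results) {
  const counts = { deliverable: 0, risky: 0, undeliverable: 0, unknown: 0 };
  for (const r of results) {
    // Ignore anything that is not one of the four documented statuses.
    if (Object.hasOwn(counts, r.status)) counts[r.status] += 1;
  }
  return { total: results.length, counts };
}
```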
Chaining example (lead enrichment)
Run a leads scraper from this collection — e.g. insurance-license-search or sos-entity-search — then feed its dataset straight in:
    {
      "sourceDatasetId": "<the leads run's defaultDatasetId>",
      "emailField": "email",
      "passThroughFields": ["entityName", "phone", "city", "state"]
    }
The cleaned, scored list comes back joined to each lead, ready to import into your CRM. You can also filter it down to the `deliverable` subset and hand that to an outreach tool.
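Conceptually, the join copies the selected lead fields onto each scored record. The sketch below shows what that amounts to, not how the actor does it: `joinLeads` and `deliverableOnly` are hypothetical, it handles only a flat `emailField`, and it assumes results are keyed by a trimmed, lowercased address (as the `originalEmail` / `email` pair in the output example suggests).

```javascript
// Hypothetical sketch: join verification results back onto source leads.
function joinLeads(leads, results, passThroughFields, emailField = "email") {
  // Index results by normalized email for constant-time lookups.
  const byEmail = new Map(results.map((r) => [r.email, r]));
  const joined = [];
  for (const lead of leads) {
    const raw = lead[emailField];
    if (typeof raw !== "string") continue; // lead has no usable email
    const result = byEmail.get(raw.trim().toLowerCase());
    if (!result) continue; // no verification record for this lead
    const extra = {};
    for (const field of passThroughFields) {
      if (Object.hasOwn(lead, field)) extra[field] = lead[field];
    }
    joined.push({ ...result, ...extra });
  }
  return joined;
}

// Keep only the records whose reliable signals were all green.
const deliverableOnly = (records) =>
  records.filter((r) => r.status === "deliverable");
```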
Pricing (pay per event)
- $0.004 per email verified (charged once per unique email after dedup).
- $0.002 per best-effort SMTP probe (only when `smtpCheck: "best_effort"` and a probe is actually made).
Cost is decoupled from compute — you pay for results, not runtime. We never charge for inputs rejected at validation or for duplicates collapsed by dedup.
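A rough upper bound on a run's cost can be worked out from the two prices above. The sketch assumes dedup compares trimmed, lowercased addresses and that, in `best_effort` mode, at most one probe is made per unique email; both are assumptions, and `estimateCost` is a hypothetical helper. Amounts are tracked in tenths of a cent to avoid floating-point drift.

```javascript
// Hypothetical cost estimator based on the listed per-event prices.
const MILLS_PER_EMAIL = 4; // $0.004 per email verified
const MILLS_PER_PROBE = 2; // $0.002 per best-effort SMTP probe

function estimateCost(emails, smtpCheck = "off") {
  // Duplicates are collapsed before charging (normalization is assumed).
  const unique = new Set(emails.map((e) => e.trim().toLowerCase()));
  // Upper bound: probes are charged only when actually made.
  const maxProbes = smtpCheck === "best_effort" ? unique.size : 0;
  const mills = unique.size * MILLS_PER_EMAIL + maxProbes * MILLS_PER_PROBE;
  return { uniqueEmails: unique.size, maxProbes, maxUsd: mills / 1000 };
}
```

For 1,000 unique addresses with `smtpCheck: "off"` this gives $4.00, matching the headline price; inputs rejected at validation would bring the real charge below the estimate.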
Notes
- No browser, no proxy, no anti-bot, no login. Pure DNS + string heuristics + maintained lists, with an optional `node:net` SMTP socket.
- The disposable-domain list is vendored in-repo so runs are offline-deterministic.
- Competitor SaaS (ZeroBounce, NeverBounce, Hunter, Kickbox, Bouncer) use warmed dedicated IPs and accept-all intelligence an Apify run cannot replicate — we don't claim parity. Use this to pre-clean cheaply, then pay a SaaS for the survivors if you need deeper SMTP accuracy. (Competitor positioning is from general Apify Store knowledge; verify current ratings before relying on any comparison.)