# Email Verifier (`labrat011/email-verifier`) Actor

Bulk email verification with real SMTP handshakes, DNS MX checks, and catch-all detection. Verify emails before outreach to cut bounces and protect sender reputation.

- **URL**: https://apify.com/labrat011/email-verifier.md
- **Developed by:** [mick\_](https://apify.com/labrat011) (community)
- **Categories:** Agents, Automation, Lead generation
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per event + usage

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Email Verifier & Deliverability Checker

Bulk email verification Actor for Apify.

### Understanding your results (read this first)

Every email returns one of four statuses. Here is what each means and what to do:

| status | Meaning | What to do | Billed? |
|--------|---------|-----------|---------|
| **valid** | Mailbox exists and accepts mail (confirmed by SMTP). | Safe to send. | ✅ Yes |
| **invalid** | Address is bad — syntax error, no mail server, or the mailbox was rejected. | Do not send. Remove it. | ✅ Yes |
| **risky** | Real domain, but the mailbox can't be confirmed — it's a catch-all, disposable, or hosted on a provider that blocks verification (see below). | Your call. Often deliverable, but lower confidence. | ✅ Yes |
| **unknown** | We couldn't reach the mail server to get an answer (timeout/blocked). | Re-run later, or treat as unverified. | ❌ **Free** |

**You are only charged for `valid`, `invalid`, and `risky` (plus one `actor-start` per run). `unknown` is always free.**
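
As a rough illustration of the table above, the sketch below sorts result rows into follow-up buckets by `status`. The bucket names (`send`, `remove`, `review`, `retry`) and the `triage` helper are invented for this example and are not part of the Actor. The rows are hand-written samples using the `email` and `status` output fields, not real run output.

```python
from collections import defaultdict

# Hypothetical mapping from the four documented statuses to follow-up actions.
ACTION_BY_STATUS = {
    "valid": "send",      # safe to send
    "invalid": "remove",  # do not send
    "risky": "review",    # your call, lower confidence
    "unknown": "retry",   # re-run later (not billed)
}


def triage(rows):
    """Group email addresses by the action suggested for their status."""
    buckets = defaultdict(list)
    for row in rows:
        # Any status outside the four documented ones falls back to review.
        action = ACTION_BY_STATUS.get(row.get("status"), "review")
        buckets[action].append(row["email"])
    return dict(buckets)


# Hand-written sample rows, for illustration only.
sample_rows = [
    {"email": "a@example.com", "status": "valid"},
    {"email": "b@example.org", "status": "invalid"},
    {"email": "c@example.net", "status": "risky"},
    {"email": "d@example.com", "status": "unknown"},
]

for action, emails in triage(sample_rows).items():
    print(action, emails)
```

The `retry` bucket can be fed back in as the `emails` input of a later run.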

#### "My real Gmail / Outlook / company address came back `risky` — is that a bug?"

No. Large providers — **Gmail, Outlook/Microsoft 365, Yahoo, iCloud, Proton, Tutanota, Zoho, and any custom domain hosted on them** — deliberately refuse external mailbox checks. They answer "maybe" to every probe to stop spammers from harvesting valid addresses. **No verification service on earth can return a definitive `valid`/`invalid` for these** — anyone claiming otherwise is guessing. We return `risky` with reason `provider_unverifiable`, which is the honest answer: the domain is real and likely deliverable, we just can't confirm the individual mailbox. These addresses are generally safe to email.

#### "Why did I get `unknown`? Did the tool fail?"

`unknown` means the destination mail server didn't give us a usable answer in time (it timed out, greylisted us, or blocked the connection). It is **not** a failure or an error, and it is **never charged**. Re-running later sometimes resolves it. It's most common for small self-hosted mail servers.

### Features

- Syntax validation via Python email module
- Disposable domain detection (embedded list, zero network)
- Role account flagging (info, sales, admin, etc.)
- MX record resolution, cached per domain
- Catch-all detection via SMTP probe
- SMTP RCPT TO handshake
- Honesty rule for Gmail, Outlook, Yahoo, iCloud, ProtonMail: mailboxes are reported as `risky` (`provider_unverifiable`) instead of guessed
- MX-host honesty rule: detects custom domains hosted on Google Workspace,
  Microsoft 365, Tutanota, Proton, Zoho, Proofpoint, Mimecast, NetEase, Yandex,
  Mail.ru, Tencent QQ, etc.
- Fail-fast SMTP: short connect timeout and single primary-MX probe so blocked
  or slow servers return `unknown` quickly instead of burning the full timeout
- Bounded concurrency (1-50)
- Per-email timeout
- Pay-per-event pricing

### Input

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| emails | array[string] | - | List of emails (1-100,000) |
| concurrency | int | 10 | Parallel emails (1-50) |
| perEmailTimeoutMs | int | 8000 | Hard timeout per email in ms (2000-30000) |
| verifyCatchAll | bool | true | Probe for catch-all domains |

### Output

| Field | Type | Description |
|-------|------|-------------|
| email | string | Verified email address |
| status | string | valid, invalid, risky, or unknown |
| reason | string | Machine-readable reason code |
| score | int | Deliverability score 0-100 |
| mxFound | bool | MX lookup ran and found records (see Notes & limitations) |
| isDisposable | bool | Known disposable provider |
| isRoleAccount | bool | Role/function address |
| isFreeProvider | bool | Free consumer provider |
| isCatchAll | bool | Domain accepts all mail |

`score` is a 0–100 deliverability confidence: roughly 90+ confirmed valid, 45–85
deliverable-but-unconfirmed (`risky`), 0 invalid/undeliverable. Use it to rank or
threshold a list; the `status` field is the headline verdict.
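
One way to use `score` for ranking and thresholding is sketched below. The `rank_by_score` helper and its default cutoff of 45 (the bottom of the `risky` band described above) are assumptions for illustration, and the rows are hand-written samples rather than real output.

```python
def rank_by_score(rows, min_score=45):
    """Keep rows scoring at least min_score, highest score first."""
    kept = [row for row in rows if row.get("score", 0) >= min_score]
    return sorted(kept, key=lambda row: row["score"], reverse=True)


# Hand-written sample rows, for illustration only.
sample_rows = [
    {"email": "c@example.net", "status": "risky", "score": 60},
    {"email": "b@example.org", "status": "invalid", "score": 0},
    {"email": "a@example.com", "status": "valid", "score": 95},
]

for row in rank_by_score(sample_rows):
    print(row["score"], row["email"])
```

Raising `min_score` to 90 would keep only rows in the confirmed-valid band.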

#### Reason codes

- bad_syntax - email fails RFC syntax
- disposable - domain is a known disposable provider
- no_mx - domain has no MX records
- mailbox_not_found - SMTP rejected (5xx)
- catch_all - domain accepts any address
- provider_unverifiable - major provider or MX hosted on one (Google Workspace,
  M365, Tutanota, Proton, etc.); mailbox can't be verified externally
- smtp_timeout - connection timed out or blocked
- smtp_blocked - connection refused or blocked

### Notes & limitations

- **SMTP from cloud IPs**: outbound port 25 is often blocked and many MX servers
  greylist or refuse unknown senders, so SMTP probes may return `unknown`
  (`smtp_timeout`). Unknown results are not charged. Domains hosted on major
  providers (Google Workspace, M365, Tutanota, Proton, Zoho, Proofpoint,
  Mimecast, NetEase, Yandex, Mail.ru, Tencent QQ) are short-circuited to
  `risky/provider_unverifiable` before the SMTP step, since those providers block
  external mailbox verification regardless of source IP — no provider, IP, or
  tool can confirm those mailboxes externally.
- **Fail-fast & cost**: blocked/slow SMTP aborts at a short connect timeout (~3s)
  and only the primary MX is probed, so `unknown` results resolve quickly. The
  MX-host list is intentionally conservative — when a provider can't be confirmed
  to block verification, the address is left to a real SMTP probe and may come
  back `unknown` (free) rather than being wrongly charged as `risky`.
- **mxFound semantics**: reflects whether an MX lookup ran and succeeded. Stages that
  short-circuit before the MX lookup (disposable) report `false` even though the domain
  may have MX. Major providers report `true` (known to have MX; not SMTP-probed).
- **Catch-all detection**: a single random-address RCPT probe. A server that accepts
  then later bounces will be reported as catch-all.

### Pricing

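Pay-per-event (PPE): the Actor's own charges are per result, not per unit of runtime (Apify platform usage is billed separately, as noted in the Pricing section at the top of this page):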

- **actor-start** — charged once per run.
- **email-verified** — charged once per email that returns `valid`, `invalid`, or
  `risky` (any conclusive answer, including `provider_unverifiable` and `catch_all`).
- **`unknown` results are always free** — if we can't reach the server to get an
  answer, you pay nothing for that email.

So a run of 1,000 emails where 50 come back `unknown` bills `actor-start` + 950
`email-verified` events, not 1,000.
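
The event arithmetic above can be written as a small helper. This is a sketch that only counts events: `count_billable_events` is a hypothetical name, the split between `valid`, `invalid`, and `risky` in the sample is arbitrary, and per-event prices are not listed in this README, so no dollar amounts are computed.

```python
from collections import Counter

# Statuses that trigger an email-verified event, per the list above.
BILLABLE_STATUSES = ("valid", "invalid", "risky")


def count_billable_events(statuses):
    """Return the number of events charged for one run."""
    counts = Counter(statuses)
    verified = sum(counts[status] for status in BILLABLE_STATUSES)
    # actor-start is charged once per run, regardless of results.
    return {"actor-start": 1, "email-verified": verified}


# 1,000 results, 50 of them unknown (the other counts are arbitrary).
statuses = ["valid"] * 700 + ["invalid"] * 150 + ["risky"] * 100 + ["unknown"] * 50
print(count_billable_events(statuses))
# prints {'actor-start': 1, 'email-verified': 950}
```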

### Development

````

## Install dependencies

pip install -r requirements.txt

## Run locally

ACTOR_TEST_PAY_PER_EVENT=1 python3 -m src

## Build Docker image

docker build -t email-verifier .

````

### Architecture

`src/` layout:

- `__init__.py` - package marker
- `__main__.py` - entry point + logging
- `main.py` - Actor orchestration
- `models.py` - Pydantic input/output models
- `verifier.py` - `verify_email()` chains all checks
- `checks.py` - pure helpers

#### Verification stages

1. Syntax check - RFC validation, no network
2. Disposable domain - in-memory set lookup
3. Role account - prefix match (flag only)
4. Major provider - honesty rule (risky)
5. MX DNS lookup - cached per domain
   - 5b: MX-host honesty rule - if the MX host is a verification-blocking provider, stop (risky)
6. Catch-all probe - cached per domain
7. SMTP RCPT TO - definitive result
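
To make the short-circuit ordering concrete, here is a minimal sketch of the offline stages only (syntax, disposable, role, major provider). It is not the Actor's code: `check_offline`, the tiny domain sets, the loose regular expression, and the exact-match role check are all stand-ins chosen for illustration, and the network stages (MX lookup, catch-all probe, SMTP) are left out.

```python
import re

# Tiny illustrative stand-ins; the Actor's real lists are not shown in this README.
DISPOSABLE_DOMAINS = {"mailinator.com"}
MAJOR_PROVIDERS = {"gmail.com", "outlook.com", "yahoo.com", "icloud.com"}
ROLE_LOCAL_PARTS = {"info", "sales", "admin"}
# Deliberately loose pattern, not full RFC validation.
SIMPLE_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_offline(email):
    """Run the no-network stages in order, stopping at the first verdict."""
    result = {"email": email, "status": None, "reason": None, "isRoleAccount": False}

    # Syntax check
    if not SIMPLE_EMAIL.match(email):
        result.update(status="invalid", reason="bad_syntax")
        return result

    local, domain = email.lower().rsplit("@", 1)

    # Disposable domain: in-memory set lookup
    if domain in DISPOSABLE_DOMAINS:
        result.update(status="risky", reason="disposable")
        return result

    # Role account: flag only, never a verdict (exact match as a simplification)
    result["isRoleAccount"] = local in ROLE_LOCAL_PARTS

    # Major provider: honesty rule
    if domain in MAJOR_PROVIDERS:
        result.update(status="risky", reason="provider_unverifiable")
        return result

    # status stays None: a full verifier would continue to the MX and SMTP stages.
    return result


for address in ["not-an-email", "test@gmail.com", "info@apify.com"]:
    print(check_offline(address))
```

Ordering the cheap, network-free checks first means many addresses get a verdict before any DNS or SMTP traffic is needed.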

# Actor input Schema

## `emails` (type: `array`):

List of email addresses to verify. One result row is produced per address. Duplicates are de-duplicated automatically (a domain is resolved only once per run).
## `concurrency` (type: `integer`):

How many emails to verify in parallel. Higher is faster but uses more compute and is more likely to trip SMTP rate limits. Leave at default unless you know you need more.
## `perEmailTimeoutMs` (type: `integer`):

Hard timeout for each email's network checks (MX + SMTP). On timeout the email is returned as "unknown" rather than hanging the run. You are NOT charged for unknown results.
## `verifyCatchAll` (type: `boolean`):

When on, the actor sends one extra SMTP probe per domain to detect accept-all (catch-all) servers and flags those addresses as "risky". Turn off to run slightly cheaper if you don't care about catch-all detection.

## Actor input object example

```json
{
  "emails": [
    "john@example.com",
    "info@apify.com",
    "not-an-email",
    "test@gmail.com"
  ],
  "concurrency": 10,
  "perEmailTimeoutMs": 8000,
  "verifyCatchAll": true
}
````
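
The input limits can also be checked client-side before a run is started. The sketch below is one possible way to do that; `build_input` is a hypothetical helper, and the bounds (1 to 100,000 emails, concurrency 1 to 50, timeout 2000 to 30000 ms) are copied from the input schema in the OpenAPI specification further down this page.

```python
def build_input(emails, concurrency=10, per_email_timeout_ms=8000, verify_catch_all=True):
    """Return an Actor input dict, raising ValueError if a documented limit is violated."""
    emails = list(emails)
    if not 1 <= len(emails) <= 100_000:
        raise ValueError("emails must contain between 1 and 100,000 addresses")
    if not 1 <= concurrency <= 50:
        raise ValueError("concurrency must be between 1 and 50")
    if not 2000 <= per_email_timeout_ms <= 30_000:
        raise ValueError("perEmailTimeoutMs must be between 2000 and 30000")
    return {
        "emails": emails,
        "concurrency": concurrency,
        "perEmailTimeoutMs": per_email_timeout_ms,
        "verifyCatchAll": bool(verify_catch_all),
    }


print(build_input(["john@example.com", "not-an-email"], concurrency=5))
```

The returned dict can be passed as the run input in the client examples in the API section.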

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "emails": [
        "john@example.com",
        "info@apify.com",
        "not-an-email",
        "test@gmail.com"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("labrat011/email-verifier").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "emails": [
        "john@example.com",
        "info@apify.com",
        "not-an-email",
        "test@gmail.com",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("labrat011/email-verifier").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "emails": [
    "john@example.com",
    "info@apify.com",
    "not-an-email",
    "test@gmail.com"
  ]
}' |
apify call labrat011/email-verifier --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=labrat011/email-verifier",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Email Verifier",
        "description": "Bulk email verification with real SMTP handshakes, DNS MX checks, and catch-all detection. Verify emails before outreach to cut bounces and protect sender reputation.",
        "version": "0.2",
        "x-build-id": "mQcRLywYDJFoN6uAE"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/labrat011~email-verifier/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-labrat011-email-verifier",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/labrat011~email-verifier/runs": {
            "post": {
                "operationId": "runs-sync-labrat011-email-verifier",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/labrat011~email-verifier/run-sync": {
            "post": {
                "operationId": "run-sync-labrat011-email-verifier",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "emails"
                ],
                "properties": {
                    "emails": {
                        "title": "Emails to verify",
                        "minItems": 1,
                        "maxItems": 100000,
                        "type": "array",
                        "description": "List of email addresses to verify. One result row is produced per address. Duplicates are de-duplicated automatically (a domain is resolved only once per run).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "concurrency": {
                        "title": "Concurrency",
                        "minimum": 1,
                        "maximum": 50,
                        "type": "integer",
                        "description": "How many emails to verify in parallel. Higher is faster but uses more compute and is more likely to trip SMTP rate limits. Leave at default unless you know you need more.",
                        "default": 10
                    },
                    "perEmailTimeoutMs": {
                        "title": "Per-email timeout (ms)",
                        "minimum": 2000,
                        "maximum": 30000,
                        "type": "integer",
                        "description": "Hard timeout for each email's network checks (MX + SMTP). On timeout the email is returned as \"unknown\" rather than hanging the run. You are NOT charged for unknown results.",
                        "default": 8000
                    },
                    "verifyCatchAll": {
                        "title": "Probe catch-all domains",
                        "type": "boolean",
                        "description": "When on, the actor sends one extra SMTP probe per domain to detect accept-all (catch-all) servers and flags those addresses as \"risky\". Turn off to run slightly cheaper if you don't care about catch-all detection.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
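
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be assembled with the Python standard library alone. This is a sketch under a few assumptions: `build_request` is a hypothetical helper, the token is passed as the `token` query parameter as the spec describes, and the actual send is left commented out because it would start a real, billable run.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/labrat011~email-verifier/run-sync-get-dataset-items"


def build_request(token, actor_input):
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("<YOUR_API_TOKEN>", {"emails": ["john@example.com"]})
print(request.get_method(), request.full_url)

# Uncomment to send; this starts a real run on your account.
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
#     print(items)
```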
