Local Business Leads — Verified Emails & Contacts

Pricing

$4.00 / 1,000 dataset items scraped

Local lead generation that crawls SMB websites and returns verified business emails, phones and socials. Bring a niche + city (optional Maps discovery) or your own site list. Honest email verification: never marks an email 'deliverable' when it's catch-all or unknown. Outreach-ready leads.

Developer

Harry Schoeller

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 8 days ago

Local lead generation that actually stays working. Give this Actor a niche + city (or your own list of business websites) and it returns outreach-ready leads: a verified business email, the contact name, phone, and social profiles for each company — one clean row per business.

It is a focused business email finder and local lead generation engine built around one durable idea: the website of a plumber, dentist, or law firm rarely has anti-bot defenses, so crawling it for contact details is cheap, fast, and likely to keep working over time. Every email then runs through honest, in-process email verification (syntax + MX + disposable/role/catch-all + score), so you get verified emails, not a raw dump.

Keywords: business leads, email finder, local lead generation, verified emails, B2B leads, contact scraper, lead enrichment, small business email list, sales prospecting.


Why this Actor (the honesty wedge)

Most local-leads scrapers dump raw Google Maps rows and leave you to guess which emails are real. This Actor does the opposite:

  • Verified, not guessed. Each email gets a status (deliverable / risky / undeliverable / unknown) and a transparent 0–100 score with reasons. An email is labeled deliverable only when every reliable signal is green. Catch-all domains and SMTP-blocked checks are honestly marked risky or unknown, never laundered into deliverable.
  • Real contacts, never fabricated. Every email, phone and social URL is extracted verbatim from the business's own site (mailto: links, JSON-LD, Cloudflare-obfuscated addresses, visible text). Nothing is hallucinated.
  • Role mailboxes kept, not dropped. For small businesses, info@ / office@ is frequently the only reachable address. We keep it and label it (primaryEmailType: "role", downgraded to risky) so you decide. Honest beats empty.
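
The labeling rule described in these bullets can be pictured as a small decision function. What follows is a minimal Python sketch, not the Actor's actual code: the `Signals` fields, the `classify_status` name, the four `smtp` states, and the precedence between checks are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical bundle of per-email checks (field names are illustrative)."""
    valid_syntax: bool
    has_mx: bool
    is_disposable: bool = False
    is_role: bool = False
    is_catch_all: bool = False
    smtp: str = "off"  # assumed states: "off" | "ok" | "blocked" | "rejected"


def classify_status(s: Signals, treat_role_as_risky: bool = True) -> str:
    """Return deliverable | risky | undeliverable | unknown."""
    # Hard failures: the address cannot receive mail at all.
    if not s.valid_syntax or not s.has_mx:
        return "undeliverable"
    # Assumption: an explicit SMTP rejection is treated as undeliverable.
    if s.smtp == "rejected":
        return "undeliverable"
    # A blocked probe tells us nothing, so the honest answer is unknown.
    if s.smtp == "blocked":
        return "unknown"
    # Catch-all domains accept every address, so they are capped at risky.
    # Assumption: disposable addresses are capped the same way.
    if s.is_catch_all or s.is_disposable:
        return "risky"
    # Role mailboxes are kept but downgraded when treatRoleAsRisky is on.
    if s.is_role and treat_role_as_risky:
        return "risky"
    # Only reached when every signal is green; smtp == "ok" never promotes.
    return "deliverable"
```

Note that a successful SMTP probe has no branch of its own: it can only fail to downgrade, which mirrors the "never promotes" contract.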

Three ways to feed it (all converge on the same reliable core)

You can use any one — or combine them.

1. Bring your own websites (default, zero extra cost)

{
  "websites": ["acme-plumbing.com", "https://brightsmiledental.com"]
}

Bare domains or full URLs both work.
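
To illustrate how mixed input like this might be normalized before crawling, here is a minimal Python sketch. It is not the Actor's implementation: `normalize_site` and `dedup_by_domain` are hypothetical helpers, and defaulting to `https` and folding `www.` into the bare domain are assumptions.

```python
from urllib.parse import urlsplit


def normalize_site(entry: str) -> tuple[str, str]:
    """Return (url, domain) for a bare domain or a full URL."""
    entry = entry.strip()
    # Bare domains have no scheme; assume https so urlsplit can find the host.
    if "://" not in entry:
        entry = "https://" + entry
    parts = urlsplit(entry)
    host = (parts.hostname or "").lower()
    # Assumption: www.example.com and example.com are the same business.
    domain = host[4:] if host.startswith("www.") else host
    return f"{parts.scheme}://{host}{parts.path}", domain


def dedup_by_domain(entries: list[str]) -> dict[str, str]:
    """Keep the first URL seen for each domain, preserving input order."""
    seen: dict[str, str] = {}
    for entry in entries:
        url, domain = normalize_site(entry)
        if domain and domain not in seen:
            seen[domain] = url
    return seen
```

For example, `dedup_by_domain(["acme-plumbing.com", "https://www.acme-plumbing.com/contact"])` keeps a single entry keyed by `acme-plumbing.com`, so each business is crawled once.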

2. Chain off an existing dataset (Maps scraper output, a CRM export, etc.)

{
  "sourceDatasetId": "<DATASET_ID>",
  "websiteField": "website",
  "passThroughFields": ["categoryName", "totalScore", "address"]
}

This lets the Actor sit downstream of any Google Maps scraper and upstream of email-verifier or a CRM importer.
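
Conceptually, websiteField picks the URL out of each upstream row and passThroughFields carries selected columns onto the lead. The Python sketch below shows that mapping on plain dictionaries. It is an illustration under assumptions: `rows_to_targets` is a hypothetical helper, and skipping rows that lack a website is assumed behavior.

```python
from typing import Any, Iterable


def rows_to_targets(
    rows: Iterable[dict[str, Any]],
    website_field: str = "website",
    pass_through_fields: tuple[str, ...] = (),
) -> list[dict[str, Any]]:
    """Turn upstream dataset rows into crawl targets with carried fields."""
    targets = []
    for row in rows:
        website = row.get(website_field)
        if not website:
            continue  # assumption: rows without a website are skipped
        # Carry only the requested fields that actually exist on the row.
        carried = {k: row[k] for k in pass_through_fields if k in row}
        targets.append({"website": website, "passThrough": carried})
    return targets
```

Fields listed in `pass_through_fields` but absent from a row are simply omitted rather than filled with placeholders.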

3. Discover by category + location (optional Google Maps discovery)

{
  "categorySearch": { "category": "dentist", "location": "Austin, TX", "maxPlaces": 50 },
  "enableMapsDiscovery": true
}

When enableMapsDiscovery is on, the Actor calls a maintained Google Maps scraper (mapsActorId, default compass/crawler-google-places) via Actor.call, then enriches + verifies its results.

⚠️ Cost disclosure: Maps discovery runs a separate upstream Actor that bills your Apify account independently at its own rate. This Actor depends only on that scraper's dataset shape (name, address, phone, website) — never on Google Maps internals — so if a provider degrades you can swap mapsActorId in one line, and a Maps outage simply degrades to "bring your own list" instead of breaking the product.


Output: one outreach-ready row per business

{
  "businessName": "Bright Smile Dental",
  "category": "dentist",
  "website": "https://brightsmile.com",
  "domain": "brightsmile.com",
  "address": "123 Main St, Austin, TX",
  "location": "Austin, TX",
  "primaryEmail": "dr.lee@brightsmile.com",
  "primaryEmailStatus": "deliverable", // deliverable | risky | undeliverable | unknown
  "primaryEmailScore": 92, // 0–100, honest
  "primaryEmailType": "personal", // personal | role | unknown
  "contactName": "Dr. Susan Lee", // best-effort, null if not found
  "phone": "+1-512-555-0100",
  "emails": [
    {
      "email": "dr.lee@brightsmile.com", "status": "deliverable", "score": 92,
      "isRole": false, "isCatchAll": false, "isDisposable": false,
      "isFreeProvider": false, "source": "mailto", "reasons": ["valid_syntax", "mx_found:..."]
    }
  ],
  "phones": ["+1-512-555-0100"],
  "socials": {
    "facebook": "https://facebook.com/...", "instagram": null,
    "linkedin": null, "twitter": null, "youtube": null, "tiktok": null
  },
  "leadQuality": "high", // high | medium | low
  "emailFound": true,
  "verifiedDeliverable": true,
  "websiteReachable": true,
  "pagesCrawled": 3,
  "discoverySource": "websites_input", // maps_call | websites_input | source_dataset
  "crawledAt": "2026-06-20T18:00:00Z",
  "passThrough": { "totalScore": 4.6 }
}

leadQuality rubric (deterministic)

  • high — personal email that is deliverable/risky and a phone is present.
  • medium — a role email that is deliverable/risky, OR a personal email that is unknown, OR a usable personal email with no phone.
  • low — email undeliverable, or no email found (still emitted with emailFound:false unless requireDeliverable is on).
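
Because the rubric is deterministic, it can be written as a pure function. This is a minimal Python sketch, not the Actor's source: `lead_quality` is a hypothetical name, and the rubric does not say what happens to combinations it leaves out (for example a role email with status unknown), so the fallback to "low" is an assumption.

```python
from typing import Optional

USABLE = {"deliverable", "risky"}


def lead_quality(email_type: Optional[str], status: Optional[str], has_phone: bool) -> str:
    """Map (email type, email status, phone presence) to high | medium | low."""
    # No email found, or the email is undeliverable.
    if status is None or status == "undeliverable":
        return "low"
    if email_type == "personal":
        if status in USABLE:
            # A usable personal email is high only when a phone is present.
            return "high" if has_phone else "medium"
        if status == "unknown":
            return "medium"
    if email_type == "role" and status in USABLE:
        return "medium"
    # Assumption: anything the rubric does not name falls back to low.
    return "low"
```

For instance, `lead_quality("personal", "risky", True)` gives `"high"`, while the same email without a phone gives `"medium"`.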

A run-level OUTPUT summary records { totalBusinesses, withEmail, deliverable, risky, unknown, undeliverable, sitesUnreachable, discoverySource, disclaimer }.
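
The counting fields of that summary can be reproduced from the dataset rows if you want to sanity-check a run. The Python sketch below is an assumption-laden illustration: `summarize` is a hypothetical helper, it counts statuses by primaryEmailStatus only, and it omits the discoverySource and disclaimer fields.

```python
from collections import Counter
from typing import Any


def summarize(rows: list[dict[str, Any]]) -> dict[str, int]:
    """Recompute the numeric summary fields from output rows."""
    with_email = [r for r in rows if r.get("emailFound")]
    # Assumption: each business is counted once, by its primary email status.
    statuses = Counter(r.get("primaryEmailStatus") for r in with_email)
    summary = {"totalBusinesses": len(rows), "withEmail": len(with_email)}
    for status in ("deliverable", "risky", "unknown", "undeliverable"):
        summary[status] = statuses[status]
    summary["sitesUnreachable"] = sum(
        1 for r in rows if r.get("websiteReachable") is False
    )
    return summary
```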


How it works

  1. Discover → resolve websites from websites[], a sourceDatasetId, and/or optional Maps discovery; dedup by domain so each business is crawled once.
  2. Crawl → a fast HTTP (Cheerio) crawl of each site: the homepage plus only the same-domain pages whose path/anchor text signals contact intent (/contact, /about, /team, impressum, …), capped by maxPagesPerSite. A polite, focused crawl — not a full site walk. Set renderJs: true only for the rare JS-only site.
  3. Extract → emails (mailto: › JSON-LD › Cloudflare cfemail › visible-text regex), phones (tel: › JSON-LD › strict regex), socials (profile URLs, tracking pixels filtered out), and a best-effort contact name.
  4. Dedup & rank → per business: rank emails by source confidence and personal-over-role; globally: flag emails shared across businesses.
  5. Verify (in-process) → syntax → MX/DNS → catch-all → optional best-effort SMTP → honest score. Same engine and honesty contract as the Email Verifier Actor, copied in-process so verification adds no second run.
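
Two of the steps above lend themselves to a short illustration: decoding a Cloudflare-obfuscated address (step 3) and ranking candidates (step 4). The Python sketch below is not the Actor's code. `decode_cfemail` follows the widely documented scheme in which the first byte of the hex string is an XOR key for the remaining bytes. `rank_emails`, the `ROLE_LOCAL_PARTS` set, and the choice to sort personal-over-role before source priority are assumptions.

```python
# Lower number = more trusted source, following the order given in step 3.
SOURCE_PRIORITY = {"mailto": 0, "jsonld": 1, "cfemail": 2, "text": 3}
# Illustrative, not exhaustive: local parts treated as role mailboxes.
ROLE_LOCAL_PARTS = {"info", "office", "contact", "admin", "sales", "support", "hello"}


def decode_cfemail(encoded: str) -> str:
    """Decode the hex payload of a Cloudflare data-cfemail attribute."""
    data = bytes.fromhex(encoded)
    key = data[0]  # the first byte is the XOR key
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


def is_role(email: str) -> bool:
    """True when the local part looks like a shared mailbox."""
    return email.split("@", 1)[0].lower() in ROLE_LOCAL_PARTS


def rank_emails(candidates: list[tuple[str, str]]) -> list[str]:
    """Dedup (email, source) pairs and order them best-first."""
    best: dict[str, int] = {}
    for email, source in candidates:
        email = email.strip().lower()
        prio = SOURCE_PRIORITY.get(source, len(SOURCE_PRIORITY))
        # Keep the most trusted source seen for each address.
        if email not in best or prio < best[email]:
            best[email] = prio
    # Assumption: personal beats role first, then source confidence.
    return sorted(best, key=lambda e: (is_role(e), best[e], e))
```

With this ordering, a personal address found in visible text still outranks an info@ address found in a mailto: link. Whether the real ranking weighs the two criteria this way is not stated in the description.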

Chaining with Email Verifier

This Actor verifies emails itself, so a single run gives you verified leads. For a deeper re-scoring pass, its output feeds directly into the Email Verifier Actor with no glue code:

  • Set the verifier's sourceDatasetId to this Actor's default dataset ID,
  • emailField: "primaryEmail",
  • passThroughFields: ["businessName", "website", "phone"].
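
Put together, the three settings above amount to a small input object for the verifier. The Python sketch below only assembles that object and prints it as JSON. `verifier_input` is a hypothetical helper, and the dataset ID stays a placeholder that you must supply from your own run.

```python
import json


def verifier_input(leads_dataset_id: str) -> dict:
    """Build the Email Verifier input described in the list above."""
    return {
        "sourceDatasetId": leads_dataset_id,
        "emailField": "primaryEmail",
        "passThroughFields": ["businessName", "website", "phone"],
    }


if __name__ == "__main__":
    # "<DATASET_ID>" is a placeholder for this Actor's default dataset ID.
    print(json.dumps(verifier_input("<DATASET_ID>"), indent=2))
```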

The shared passThroughFields convention means Maps scraper → Local Business Leads → Email Verifier forms a clean pipeline where each link is independently reliable.


Pricing (pay-per-event)

| Event | When | Price |
| --- | --- | --- |
| Lead enriched & verified | Once per unique business after dedup (site crawled + contacts extracted + emails verified) | $0.003 (= $3 / 1,000 leads) |
| Best-effort SMTP probe | Only when smtpCheck = best_effort and a probe is actually attempted | $0.002 |

Charging stops gracefully at your PPE budget cap (budgetStopped: true) — it never overcharges and never crashes. When enableMapsDiscovery is on, the upstream Maps Actor bills you separately.
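
To budget a run, the two event prices above can be combined into a quick estimate. This Python sketch is a back-of-the-envelope aid, not a billing API: `estimate_cost` and `leads_within_budget` are hypothetical helpers, the `probe_each` option assumes exactly one SMTP probe per lead, and any upstream Maps Actor charges are excluded.

```python
from decimal import Decimal

LEAD_PRICE = Decimal("0.003")        # per enriched and verified lead
SMTP_PROBE_PRICE = Decimal("0.002")  # per attempted best-effort SMTP probe


def estimate_cost(leads: int, smtp_probes: int = 0) -> Decimal:
    """Estimated charge in USD for this Actor's own events."""
    return leads * LEAD_PRICE + smtp_probes * SMTP_PROBE_PRICE


def leads_within_budget(budget_usd: Decimal, probe_each: bool = False) -> int:
    """How many whole leads fit under a budget cap."""
    # Assumption: with probe_each, every lead triggers exactly one probe.
    per_lead = LEAD_PRICE + (SMTP_PROBE_PRICE if probe_each else Decimal(0))
    return int(budget_usd // per_lead)
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly. As a check against the table, `estimate_cost(1000)` equals 3 dollars, matching the quoted $3 / 1,000 leads.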


Key inputs

| Input | Default | Notes |
| --- | --- | --- |
| websites | | Business sites or bare domains. |
| categorySearch + enableMapsDiscovery | off | Optional Maps discovery (upstream Actor billed separately). |
| mapsActorId | compass/crawler-google-places | Swappable upstream provider. |
| sourceDatasetId / websiteField | – / website | Chain off another dataset. |
| passThroughFields | [] | Carry source fields onto each lead. |
| maxPagesPerSite | 5 | Focused crawl budget per business. |
| renderJs | false | Playwright for JS-only sites (slower). |
| smtpCheck | off | best_effort may downgrade or mark unknown; it never promotes. |
| detectCatchAll | true | Catch-all domains capped at risky. |
| treatRoleAsRisky | true | Role mailboxes kept + flagged. |
| requireDeliverable | false | Emit only leads with a deliverable/risky email. |
| maxLeads | 0 | 0 = unlimited. |
| proxyConfiguration | Apify proxy | Datacenter proxy is fine for SMB sites. |

Honest limitations

  • We extract only what a site publishes. A business that hides its email behind a JS contact form may yield no email (emailFound: false) — we say so rather than invent one.
  • SMTP from datacenter IPs is unreliable (port 25 is often blocked). best_effort SMTP can only downgrade confidence or return unknown; it will never mark an email deliverable.
  • Catch-all domains accept every address, so per-mailbox deliverability is unknowable — those are honestly capped at risky.