B2B Lead Generation Suite - Find Emails, Score & Qualify Leads
Pricing
from $250.00 / 1,000 enriched leads
All-in-one B2B lead pipeline. Enter company URLs, get enriched leads with emails, phone numbers, contacts, email patterns, quality scores (0-100), grades, and business signals from a 3-step automated pipeline.
Rating: 0.0 (0)
Developer: Ryan Clinton
Maintained by Community
Actor stats: 0 bookmarked, 47 total users, 5 monthly active users, last modified 5 days ago
B2B Lead Generation Suite
An all-in-one B2B lead pipeline that turns a list of company websites into a send-ready outreach list. The suite runs a 3-step pipeline by default — Website Lead Intelligence (formerly Website Contact Scraper), Email Pattern Finder, B2B Lead Qualifier — and an opt-in 4th step (Bulk Email Verifier) for cold-outreach-grade deliverability decisions. From a single run you get send actions, verified emails, named decision-makers, buying-committee classification, and a 0–100 quality score per domain.
Provide one or more company URLs (e.g., stripe.com, https://buffer.com) and receive a unified dataset where every lead includes:
- A send action: `SEND_NOW` / `VERIFY_FIRST` / `SKIP` / `ENRICH_MORE` per domain
- A buying committee: decision-makers, influencers, champions, and blockers grouped by role
- A first-touch opening line stem — generated deterministically from job title + company type
- A pipeline-value rank — relative priority within the batch (1 = best)
- A plain-English summary — one-sentence takeaway you can paste into Slack or an email
- Verified emails, phone numbers, named contacts, social media links, addresses
- Email pattern + generated team emails (Step 2)
- 0–100 quality score + letter grade (Step 3)
- Per-email deliverability decisions (Step 4 — optional)
The entire pipeline runs automatically with no manual intervention between steps.
Which actor should I use?
This suite isn't always the right starting point. If you only need one step, run the dedicated sub-actor and skip orchestration overhead:
| You have… | You want… | Run this | Cost |
|---|---|---|---|
| Company URLs | Send-ready leads + decisions | Website Lead Intelligence (Step 1 standalone) | $0.20/domain |
| Domains + names | Pattern-detected emails for given names | Email Pattern Finder (Step 2 standalone) | $0.10/domain |
| Domains | 0–100 lead quality score + grade only | B2B Lead Qualifier (Step 3 standalone) | $0.15/lead |
| Business names or marketing phrases (no URLs) | Resolve to deduped website URLs | SERP Name Resolver | $0.002/query |
| Company URLs | All of the above merged into one record per domain — contacts, send action, pattern emails, lead score | B2B Lead Generation Suite (this actor) | $0.30–$0.45/lead |
Use this suite when you want a single dataset with the full picture per lead. Use a standalone sub-actor when you only need one layer.
Why Use B2B Lead Generation Suite?
Running four actors manually means configuring inputs four times, waiting for each run, downloading intermediate datasets, and writing code to merge results. This actor eliminates that overhead. Configure once, click Start, and get a merged dataset ready for your CRM, outreach tool, or spreadsheet.
The orchestrator also handles data flow between steps intelligently:
- Step 1 (Website Lead Intelligence) is a send-decision engine. Every domain ships with `sendDecision`, `sendPlan`, `pipelineValue`, `firstTouch`, `buyingCommittee`, and `plainEnglishSummary`, surfaced verbatim in the suite's output with no transformation.
- Emails discovered by the Contact Scraper are automatically fed into the Pattern Finder as known samples, improving pattern detection accuracy without re-scraping.
- Contact names from website scraping are passed to the Pattern Finder for email generation, so team members without public emails still get predicted addresses.
- The Qualifier receives all upstream data (emails, phones, contacts, social links, detected patterns) via a `pipelineData` parameter, eliminating redundant extraction work.
- Website scraping is disabled in Step 2 since the Contact Scraper already crawled the sites; only GitHub commit search runs as an additional email source.
- Error handling is built in: if the Pattern Finder, Qualifier, or Verifier fails on a particular domain, the pipeline continues with the data it has rather than aborting.
- Step 1 v2.0 inputs pass through. Set `goal`, `preset`, `confidenceMode`, `enableProFallback`, `compareToPrevRun`, `crmWebhookUrl`, `autoFilter`, or `exportFormats` at suite level and they forward to the scraper unchanged. Default behaviour matches the scraper's `auto` preset.
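The "continue with the data it has" behaviour can be pictured with a small sketch. This is not the suite's actual code: `run_pipeline`, the stand-in step functions, and the `stepErrors` key are hypothetical names used only for illustration. The step labels are the ones that appear in the `pipelineSteps` output field.

```python
from typing import Callable

# A step takes the record gathered so far and returns new fields to merge in.
Step = Callable[[dict], dict]


def run_pipeline(domain: str, scrape: Step, optional_steps: list[tuple[str, Step]]) -> dict:
    """Run the required scrape, then optional steps that may fail without aborting."""
    record = {"domain": domain, "pipelineSteps": []}
    record.update(scrape(record))  # Step 1 is required; let its errors propagate
    record["pipelineSteps"].append("contact-scraper")
    for name, step in optional_steps:
        try:
            # Each later step sees everything gathered upstream.
            record.update(step(record))
            record["pipelineSteps"].append(name)
        except Exception as exc:
            # Hypothetical bookkeeping: remember the failure and keep going.
            record.setdefault("stepErrors", {})[name] = str(exc)
    return record


# Stand-in steps for the demonstration (all hypothetical).
def fake_scraper(record: dict) -> dict:
    return {"emails": ["jan@apify.com"]}


def failing_pattern_finder(record: dict) -> dict:
    raise TimeoutError("pattern finder timed out")


def fake_qualifier(record: dict) -> dict:
    return {"score": 50 + 10 * len(record["emails"])}


lead = run_pipeline(
    "apify.com",
    fake_scraper,
    [("email-pattern-finder", failing_pattern_finder), ("lead-qualifier", fake_qualifier)],
)
```

Here the pattern step fails, yet the record still carries the scraped emails and the qualifier's score, and `pipelineSteps` lists only the steps that completed.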
Key Features
- Four-step pipeline in one click: Contact scraping + send-decision, email pattern detection, lead qualification, and email verification run sequentially without manual handoffs.
- Send-decision per lead: `SEND_NOW` / `VERIFY_FIRST` / `SKIP` / `ENRICH_MORE` action with risk level and plain-English reasons. Branch automation on the action enum, never on the prose.
- Buying-committee classification: Contacts grouped into `decisionMakers` (CEO/founder/C-suite), `influencers` (VPs/Directors), `champions` (Sales/BD/Partnerships; most reachable), and `blockers` (Legal/Finance/Procurement; email last).
- Pipeline-value rank: Relative priority within the batch (`rankInBatch: 1` = top lead). Use to order outreach.
- First-touch opening lines: Deterministic opening-sentence stems per lead (`angle`, `hook`, `line`) generated from job title + company type. Not LLM-generated copy.
- Merged and deduplicated output: Emails, phones, and social links from multiple steps combined into a single record per domain with no duplicates.
- Lead scoring and grading: Every lead gets a 0–100 quality score and letter grade across five categories: contact reachability, business legitimacy, online presence, website quality, and team transparency.
- Email pattern detection: Identifies naming conventions like `first.last@`, `flast@`, or `first@` and generates predicted emails for team members.
- Email verification (optional): Step 4 verifies every discovered + pattern-generated email and emits per-email decisions (`send` / `send-monitor` / `hold` / `replace` / `suppress`).
- Scheduled monitoring + change detection: Set `compareToPrevRun: true` and every domain gets `changeFlags[]` (NEW_TEAM_HIRE / TIER_UPGRADED / etc.) plus a per-domain delta block.
- CRM auto-push: Set `crmWebhookUrl` and Step 1 POSTs each enriched lead directly to HubSpot, Salesforce, Zapier, Make.com, or n8n. No glue code.
- Outreach-tool CSV exports: Set `exportFormats: ["instantly", "smartlead", "apollo"]` to drop ready-to-import CSVs into the run's key-value store.
- Auto-filter: Set `autoFilter: "send-now-only"` and the dataset only contains green-light leads.
- Configurable pipeline: Skip the Pattern Finder or Lead Qualifier to save time and cost when you only need contact data.
- Minimum score + personal-email filtering: Set thresholds to exclude low-quality leads. Filtered domains are excluded from PPE billing.
- Proxy support: Pass proxy settings through to all pipeline steps for reliable scraping across many domains.
- Graceful error handling: If an optional step fails, the pipeline continues and outputs whatever data it successfully gathered.
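As a rough illustration of branching on the action enum rather than on the prose, the sketch below buckets output records by `sendDecision.action` and orders each bucket by `pipelineValue.rankInBatch`. The function name `route_leads` and the `UNKNOWN` bucket are assumptions made for this example; only the field names and enum values come from this README.

```python
from collections import defaultdict

# Action enum values as documented for sendDecision.action.
ACTIONS = ("SEND_NOW", "VERIFY_FIRST", "ENRICH_MORE", "SKIP")


def _rank(lead: dict) -> float:
    """Return rankInBatch, pushing leads without a rank to the end."""
    rank = (lead.get("pipelineValue") or {}).get("rankInBatch")
    return rank if rank is not None else float("inf")


def route_leads(leads: list[dict]) -> dict[str, list[dict]]:
    """Group leads by send action, best pipeline rank first within each group."""
    buckets: dict[str, list[dict]] = defaultdict(list)
    for lead in leads:
        action = (lead.get("sendDecision") or {}).get("action")
        # Records with a null sendDecision land in a hypothetical UNKNOWN bucket.
        buckets[action if action in ACTIONS else "UNKNOWN"].append(lead)
    for bucket in buckets.values():
        bucket.sort(key=_rank)
    return dict(buckets)


sample = [
    {"domain": "b.com", "sendDecision": {"action": "SEND_NOW"}, "pipelineValue": {"rankInBatch": 2}},
    {"domain": "a.com", "sendDecision": {"action": "SEND_NOW"}, "pipelineValue": {"rankInBatch": 1}},
    {"domain": "c.com", "sendDecision": {"action": "SKIP"}, "pipelineValue": None},
    {"domain": "d.com", "sendDecision": None},
]
routed = route_leads(sample)
```

An outreach tool would then consume `routed["SEND_NOW"]` in order, while `VERIFY_FIRST` leads go to a verification queue.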
Don't have URLs? Run the suite from business names or footer phrases
If you only have a list of business names (CRM export, prospect spreadsheet) — or a niche identified by distinctive footer/marketing phrases — leave urls empty and use the knownNames and footerPhrases inputs instead. The suite resolves names + phrases to websites via Google before kicking off the contact-scrape → email-pattern → lead-qualify pipeline, all in one run.
- `knownNames`: list of business names. Each is searched on Google as `{name} {nameSuffix}` and the top organic result is treated as that company's website.
- `footerPhrases`: distinctive marketing phrases that identify a niche (e.g. `"we buy land in any state"`). Each runs as an exact-match Google query and every organic result is collected.
- `nameSuffix`: appended to every name query to disambiguate (e.g. `"we buy land"`, `"real estate"`, `"plumber"`). Strongly recommended, since generic names without a suffix can resolve to the wrong entities.
The discovery layer calls our Business Name & Phrase to URL Resolver sub-actor ($0.002 per Google query). After resolution, every found domain flows through the standard suite pipeline.
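To estimate what a discovery run will search for and cost, the query construction described above can be sketched as follows. This assumes that an exact-match query simply means wrapping the phrase in double quotes; the resolver's real query format is not documented here, and `build_queries` and `estimate_resolution_cost` are hypothetical helper names.

```python
def build_queries(known_names: list[str], footer_phrases: list[str], name_suffix: str = "") -> list[str]:
    """Build one Google query per business name and per footer phrase."""
    queries = []
    for name in known_names:
        # "{name} {nameSuffix}", with no trailing space when the suffix is empty.
        queries.append(f"{name} {name_suffix}".strip())
    for phrase in footer_phrases:
        # Assumption: exact match is expressed with surrounding double quotes.
        queries.append(f'"{phrase}"')
    return queries


def estimate_resolution_cost(queries: list[str], price_per_query: float = 0.002) -> float:
    """Resolver cost in USD at the per-query price quoted in this README."""
    return round(len(queries) * price_per_query, 6)


queries = build_queries(
    known_names=["Acme Land Co"],
    footer_phrases=["we buy land in any state"],
    name_suffix="we buy land",
)
cost = estimate_resolution_cost(queries)
```

The cost covers only the name and phrase resolution; every resolved domain is then billed through the normal suite pipeline.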
How to Use
1. Add company URLs -- Enter one or more website URLs or bare domains in the "Website URLs or Domains" field. For example: `stripe.com`, `https://buffer.com`, `hubspot.com`. Each domain produces one enriched lead in the output. Or leave URLs empty and provide `knownNames` / `footerPhrases` instead (see section above).
2. Configure crawl depth -- Set "Max pages per domain" for both the contact scraping step and the lead qualification step. Higher values discover more contacts and signals but increase run time and cost. The default of 5 pages per step works well for most company websites.
3. Choose pipeline steps -- By default, all three steps run. Check "Skip email pattern detection" or "Skip lead qualification" if you only need basic contact data. Skipping both optional steps makes the run roughly 3x faster.
4. Set a minimum score -- If lead qualification is enabled, set a minimum score (0-100) to filter out low-quality leads. Set to 0 to include all leads regardless of score.
5. Configure proxy -- For scraping more than a handful of domains, enable Apify Proxy to avoid rate limiting. The proxy settings are forwarded to all sub-actors automatically.
6. Run and export -- Click "Start" and wait for the pipeline to complete. Download the dataset as JSON, CSV, or Excel, or access it via the Apify API for integration into your workflow.
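For API access, a minimal sketch using the official `apify-client` Python package might look like the following. The actor ID is not stated in this README, so `<username>/<actor-name>` is a placeholder you must replace, and `run_suite` is a hypothetical helper name.

```python
def run_suite(client, actor_id: str, urls: list[str], **options) -> list[dict]:
    """Start the actor, wait for it to finish, and return the dataset items."""
    run_input = {"urls": list(urls), **options}
    # .call() blocks until the run finishes and returns the run record.
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and an APIFY_TOKEN env variable):
#
#   import os
#   from apify_client import ApifyClient
#
#   client = ApifyClient(os.environ["APIFY_TOKEN"])
#   leads = run_suite(client, "<username>/<actor-name>", ["stripe.com"], minScore=50)
#   for lead in leads:
#       print(lead["domain"], lead.get("score"), lead.get("grade"))
```

Passing the client in as an argument keeps the helper easy to test with a stub and avoids hard-coding the token.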
Input Parameters
Core pipeline
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `urls` | String[] | Yes* | - | List of company website URLs or bare domains to process |
| `knownNames` | String[] | No | `[]` | Business names; resolved to domains via Google when `urls` is empty |
| `footerPhrases` | String[] | No | `[]` | Distinctive phrases that identify a niche; each runs as an exact-match Google query |
| `maxPagesPerDomain` | Integer | No | 5 | Max pages to crawl per site during Step 1 contact scraping (1–20) |
| `maxQualifierPagesPerDomain` | Integer | No | 5 | Max pages to crawl per site during Step 3 lead qualification (1–15) |
| `minScore` | Integer | No | 0 | Minimum Step 3 lead-score threshold; leads below this are excluded |
| `skipEmailPatternFinder` | Boolean | No | false | Skip Step 2 (email pattern detection) |
| `skipLeadQualifier` | Boolean | No | false | Skip Step 3 (lead qualification + scoring) |
| `verifyEmails` | Boolean | No | false | Run Step 4 (Bulk Email Verifier); adds per-email send/hold/replace decisions |
| `proxyConfiguration` | Object | No | Apify Proxy | Proxy settings forwarded to all pipeline steps |
*Either urls or knownNames/footerPhrases is required.
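A pre-flight check along these lines can catch bad input before a run is started and billed. It is only a sketch based on the ranges and the either/or rule documented above; the actor's own validation may differ, and `validate_input` is a hypothetical helper.

```python
# Documented integer ranges for the core pipeline inputs.
RANGES = {
    "maxPagesPerDomain": (1, 20),
    "maxQualifierPagesPerDomain": (1, 15),
    "minScore": (0, 100),
}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    has_urls = bool(run_input.get("urls"))
    has_discovery = bool(run_input.get("knownNames") or run_input.get("footerPhrases"))
    if not (has_urls or has_discovery):
        errors.append("Provide urls, or knownNames / footerPhrases")
    for key, (low, high) in RANGES.items():
        value = run_input.get(key)
        if value is None:
            continue  # omitted inputs fall back to the actor's defaults
        if not isinstance(value, int) or not low <= value <= high:
            errors.append(f"{key} must be an integer between {low} and {high}")
    return errors


ok = validate_input({"urls": ["stripe.com"], "maxPagesPerDomain": 8, "minScore": 50})
bad = validate_input({"maxPagesPerDomain": 25})
```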
Step 1 + Step 2 shared passthrough
These four inputs forward to BOTH Step 1 (Website Lead Intelligence) and Step 2 (Email Pattern Finder). Same enum values, matching intent at each step.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `goal` | String | sub-actor default | `quick-outreach` / `high-deliverability` / `max-coverage`; sets sensible defaults at every step |
| `autoFilter` | String | none | `send-now-only` / `safe-only` / `max-leads`; drops records that don't pass the filter before they reach the dataset or billing. `max-leads` is normalized to `max-coverage` for Step 2 |
| `compareToPrevRun` | Boolean | false | Monitoring mode. Step 1 emits contact-side `changeFlags[]` + delta block. Step 2 emits pattern-side `changeSinceLastRun` + `driftState` + `patternStabilityScore`. |
| `monitorStateKey` | String | auto | Key-value store name; auto-derived from input domains when blank |
Step 1 — Website Lead Intelligence passthrough
| Parameter | Type | Default | Description |
|---|---|---|---|
| `preset` | String | scraper default | `auto` / `fast` / `balanced` / `maximum`; execution depth |
| `confidenceMode` | String | scraper default | `safe` / `balanced` / `aggressive`; risk appetite for which emails ship |
| `deepScan` | Boolean | scraper default | Probe hidden pages (/imprint, /privacy-policy, etc.) |
| `enableProFallback` | Boolean | false | Auto-retry JS-heavy / Cloudflare-protected sites in real browser ($0.35/site) |
| `requirePersonalEmail` | Boolean | false | Drop domains without a personal email; filtered domains not billed |
| `companyTypes` | String[] | `[]` | Filter by classified company type (saas, agency, legal, etc.) |
| `crmWebhookUrl` | String (secret) | - | HTTPS endpoint that receives one POST per enriched lead |
| `crmFormat` | String | generic-json | `generic-json` / `hubspot` / `salesforce` |
| `crmOnlyTierA` | Boolean | false | Only push tier-A leads (verified personal email + senior contact) to the CRM |
| `exportFormats` | String[] | `[]` | `instantly` / `smartlead` / `apollo`; generates ready-to-import CSVs in the run's key-value store |
Step 2 — Email Pattern Finder passthrough
Step 2 already gets discovered emails + contact names from Step 1 automatically. These inputs add extra data sources or change behaviour.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `searchWhois` | Boolean | false | Look up domain registration data for registrant emails. Best for smaller companies where the owner's email is in the WHOIS record |
| `hunterApiKey` | String (secret) | - | Hunter.io API key for additional Step 2 email discovery. Free tier gives 25 searches/month |
Step 3 — B2B Lead Qualifier passthrough
Step 3 already gets all upstream data (emails, phones, contacts, social links, detected patterns) from Steps 1+2 via pipelineData automatically. These inputs change how Step 3 scores or surfaces results.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `scoringProfile` | String | default | `default` / `sales` / `marketing` / `recruiting`. Adjusts category weights: Sales emphasizes contact reachability + decision makers; Marketing emphasizes online presence + website quality; Recruiting emphasizes team transparency |
| `watchlistName` | String | - | Name this run as a separate watchlist. Score history is stored per-watchlist, so you can run the suite as N independent watchlists (e.g. tier-1-prospects, churn-risk-accounts) |
| `qualifierWebhookUrl` | String (secret) | - | Slack or Discord incoming webhook URL. On run completion, Step 3 posts a rich embed with the top scored leads + a link to the Apify run. Distinct from Step 1's `crmWebhookUrl` (per-record CRM push) |
| `qualifierCircuitBreakerThreshold` | Integer | 0 | Abort the Step 3 sub-actor if this many consecutive domains fail to fetch (e.g. proxy outage). 0 disables the breaker. Recommended: 5–10 for large batches |
Input Examples
Full pipeline with quality filter (most common use case):
```json
{
  "urls": ["stripe.com", "hubspot.com", "notion.so", "linear.app", "cal.com"],
  "maxPagesPerDomain": 8,
  "maxQualifierPagesPerDomain": 5,
  "minScore": 50
}
```
Cold-outreach mode — only ship send-ready leads:
```json
{
  "urls": ["stripe.com", "hubspot.com", "linear.app"],
  "goal": "high-deliverability",
  "autoFilter": "send-now-only",
  "verifyEmails": true
}
```
Maximum coverage — every possible lead including pattern-generated:
```json
{
  "urls": ["buffer.com", "zapier.com", "cal.com"],
  "goal": "max-coverage",
  "confidenceMode": "aggressive"
}
```
JS-heavy / Cloudflare-protected sites — auto-retry with browser rendering:
```json
{
  "urls": ["fancy-spa-site.com", "cloudflare-protected.com"],
  "preset": "auto",
  "enableProFallback": true
}
```
Scheduled monitoring — diff against last week's run:
```json
{
  "urls": ["acmecorp.com", "globex.io", "initech.com"],
  "compareToPrevRun": true,
  "monitorStateKey": "us-saas-watchlist-2026"
}
```
Sales-tuned scoring profile + per-list watchlist history:
```json
{
  "urls": ["acmecorp.com", "globex.io", "initech.com"],
  "scoringProfile": "sales",
  "watchlistName": "tier-1-prospects-q2",
  "qualifierWebhookUrl": "https://hooks.slack.com/services/..."
}
```
Auto-push to HubSpot — hands-off lead routing:
```json
{
  "urls": ["stripe.com", "hubspot.com"],
  "crmWebhookUrl": "https://api.hubapi.com/crm/v3/objects/contacts?hapikey=YOUR_KEY",
  "crmFormat": "hubspot",
  "crmOnlyTierA": true
}
```
Outreach-tool CSV exports — drop straight into Instantly + Smartlead:
```json
{
  "urls": ["stripe.com", "hubspot.com", "notion.so"],
  "exportFormats": ["instantly", "smartlead"]
}
```
Contact scraping only (fastest, cheapest):
```json
{
  "urls": ["https://example.com", "https://acme.co"],
  "maxPagesPerDomain": 10,
  "skipEmailPatternFinder": true,
  "skipLeadQualifier": true
}
```
Contacts + patterns, no scoring:
```json
{
  "urls": ["buffer.com", "zapier.com"],
  "skipLeadQualifier": true
}
```
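Because each domain produces one billed lead, it can be worth de-duplicating a URL list before submitting it. The sketch below is a client-side convenience, not the actor's own normalizer: it assumes that `www.` prefixes and paths can be dropped when comparing entries, and both helper names are hypothetical.

```python
from urllib.parse import urlsplit


def to_domain(value: str) -> str:
    """Reduce a bare domain or full URL to a lowercase hostname without 'www.'."""
    value = value.strip()
    if "://" not in value:
        value = "//" + value  # lets urlsplit treat a bare domain as a host
    host = urlsplit(value).hostname or ""
    return host[4:] if host.startswith("www.") else host


def dedupe_domains(values: list[str]) -> list[str]:
    """Keep the first occurrence of each domain, preserving input order."""
    seen: dict[str, None] = {}
    for value in values:
        domain = to_domain(value)
        if domain:
            seen.setdefault(domain, None)
    return list(seen)


urls = dedupe_domains(["stripe.com", "https://www.stripe.com/pricing", "https://buffer.com"])
```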
Input Tips
- Bare domains like `stripe.com` and full URLs like `https://stripe.com` both work -- the actor normalizes them automatically.
- For SaaS prospecting, use 5-8 pages per domain to catch team/about pages where contacts are listed.
- For large enterprise sites, increase `maxPagesPerDomain` to 15-20 to reach contacts buried in deep navigation.
- Set `minScore` to 60+ when feeding results to outreach tools -- this filters out placeholder sites and parked domains.
- Enable proxy when processing more than 10 domains in a single run to avoid rate limiting.
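If you would rather keep every lead in the dataset and apply the threshold afterwards, the same cut can be made client-side. The sketch below uses the grade bands listed for the Step 3 `grade` output field (A 90–100, B 75–89, C 60–74, D 40–59, F 0–39). Treating leads with a null `score` as excluded is an assumption of this example, and both function names are hypothetical.

```python
# (minimum score, grade) pairs, highest band first.
GRADE_BANDS = [(90, "A"), (75, "B"), (60, "C"), (40, "D"), (0, "F")]


def grade_for(score: float) -> str:
    """Map a 0-100 score to its letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for floor, grade in GRADE_BANDS:
        if score >= floor:
            return grade
    raise AssertionError("unreachable: the last band starts at 0")


def filter_by_score(leads: list[dict], min_score: int = 60) -> list[dict]:
    """Keep leads at or above min_score; unscored leads are dropped."""
    return [lead for lead in leads if lead.get("score") is not None and lead["score"] >= min_score]


leads = [{"domain": "a.com", "score": 92}, {"domain": "b.com", "score": 41}, {"domain": "c.com", "score": None}]
kept = filter_by_score(leads)
```

With the default threshold of 60, everything kept is grade C or better.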
Output Example
Each domain produces one enriched lead record. Here is a representative example with all pipeline steps enabled:
```json
{
  "domain": "apify.com",
  "url": "https://apify.com",
  "emails": ["info@apify.com", "support@apify.com", "jan@apify.com"],
  "personalEmails": ["jan@apify.com"],
  "genericEmails": ["info@apify.com", "support@apify.com"],
  "phones": ["+420 255 000 222"],
  "contacts": [
    { "name": "Jan Curn", "title": "CEO & Co-founder", "email": "jan@apify.com" },
    { "name": "Ondra Urban", "title": "CTO & Co-founder" }
  ],
  "socialLinks": {
    "twitter": "https://twitter.com/apify",
    "linkedin": "https://www.linkedin.com/company/apifytech",
    "github": "https://github.com/apify",
    "youtube": "https://www.youtube.com/c/Apify"
  },
  "addresses": ["Vodickova 704/36, 110 00 Prague, Czech Republic"],
  "companyMeta": { "name": "Apify", "industry": "Software", "language": "en" },
  "companyType": "saas",
  "sendDecision": {
    "action": "SEND_NOW",
    "riskLevel": "low",
    "reasons": ["Verified personal email present", "Senior contact identified", "No catch-all flag"]
  },
  "sendPlan": {
    "status": "ready",
    "channel": "email-first",
    "safeToAutomate": true,
    "openingAngle": "product/platform — pitch the developer-tooling angle",
    "followUpStrategy": "2 follow-ups, 3 days apart, then mark not interested"
  },
  "pipelineValue": { "relativeScore": 1.0, "rankInBatch": 1 },
  "firstTouch": {
    "angle": "product-side",
    "hook": "Apify operates a developer platform — partnerships likely exposed to external pipeline",
    "line": "Saw Jan's CEO role at Apify — quick idea on the developer-platform / partnerships side"
  },
  "buyingCommittee": {
    "decisionMakers": [
      { "name": "Jan Curn", "title": "CEO & Co-founder", "email": "jan@apify.com", "seniority": 100, "reachable": true }
    ],
    "influencers": [
      { "name": "Ondra Urban", "title": "CTO & Co-founder", "seniority": 100, "reachable": false }
    ],
    "champions": [],
    "blockers": [],
    "size": 2
  },
  "topContacts": [
    { "name": "Jan Curn", "title": "CEO & Co-founder", "email": "jan@apify.com", "score": 95, "reasons": ["CEO seniority", "Personal email verified"] }
  ],
  "bestContact": { "name": "Jan Curn", "title": "CEO & Co-founder", "email": "jan@apify.com" },
  "decision": { "tier": "A", "reason": "Verified personal email + senior contact" },
  "leadScore": 92,
  "dataQuality": "high",
  "isContactable": true,
  "contactFormDetected": true,
  "catchAllDetected": false,
  "domainPurity": 100,
  "plainEnglishSummary": "Best person to email at Apify is Jan Curn (CEO). Email is verified and safe — you can reach out now.",
  "whyThisLead": ["Founder accessible via personal email", "Active social media presence indicates outbound posture"],
  "confidence": { "emailConfidence": 95, "contactConfidence": 90, "overallConfidence": 92, "riskFlags": [] },
  "coverage": { "emails": "complete", "contacts": "complete", "phones": "found", "socials": "complete", "addresses": "found", "contactForm": true },
  "emailPattern": "first@apify.com",
  "emailPatternConfidence": 0.85,
  "alternateEmailPatterns": [
    { "pattern": "first.last@apify.com", "confidence": 0.42 },
    { "pattern": "flast@apify.com", "confidence": 0.28 }
  ],
  "generatedEmails": [
    { "name": "Ondra Urban", "email": "ondra@apify.com", "pattern": "first@apify.com", "confidence": 0.85 }
  ],
  "patternAnalysis": {
    "confidenceLevel": "high",
    "isSendable": true,
    "isContactable": true,
    "bounceRiskBucket": "low",
    "isCatchAll": false,
    "mxValid": true,
    "mxRecord": ["10 aspmx.l.google.com", "20 alt1.aspmx.l.google.com"],
    "sendDecision": {
      "action": "SEND_NOW",
      "riskLevel": "low",
      "reasons": ["6 emails analyzed", "single dominant pattern", "MX valid", "not catch-all"]
    },
    "recommendedSequence": ["first@apify.com", "first.last@apify.com", "flast@apify.com"],
    "emailCulture": "strict-format",
    "patternStabilityScore": 1.0,
    "decisionSignals": ["high-confidence", "sample-rich", "multi-source", "strict-format", "stable-pattern", "mx-valid"],
    "negativeSignals": [],
    "plainEnglishSummary": "Pattern is `first@apify.com` (85% confidence, 6 samples). Safe to send to generated emails."
  },
  "score": 88,
  "grade": "A",
  "scoreBreakdown": {
    "contactReachability": 22,
    "businessLegitimacy": 20,
    "onlinePresence": 18,
    "websiteQuality": 16,
    "teamTransparency": 12
  },
  "signals": [
    { "signal": "Multiple email addresses found", "category": "contactReachability", "points": 10, "detail": "3 emails discovered" },
    { "signal": "Phone number present", "category": "contactReachability", "points": 8, "detail": "+420 255 000 222" },
    { "signal": "Social media profiles found", "category": "onlinePresence", "points": 8, "detail": "4 platforms linked" },
    { "signal": "Team members listed", "category": "teamTransparency", "points": 10, "detail": "2 named contacts with titles" }
  ],
  "address": "Vodickova 704/36, 110 00 Prague, Czech Republic",
  "cmsDetected": "Next.js",
  "techSignals": ["React", "Next.js", "Google Analytics", "Intercom"],
  "industry": "Software / Developer Tools",
  "jobCount": 12,
  "qualifierAnalysis": {
    "summary": "Strong B2B SaaS lead — verified personal contact, full team transparency, hiring actively. Outreach immediately.",
    "scoreExplanation": "Tier-A signals: 3 personal emails, 2 named contacts with senior titles, full social presence, modern tech stack, and 12 open roles indicating active growth.",
    "confidence": {
      "score": 92,
      "level": "high",
      "components": [
        { "name": "signal-breadth", "weight": 0.4, "value": 95 },
        { "name": "crawl-depth", "weight": 0.3, "value": 90 },
        { "name": "source-integrity", "weight": 0.3, "value": 90 }
      ]
    },
    "recommendedAction": "outreach-immediately",
    "previousScore": null,
    "scoreChange": null,
    "changeFlag": "NEW",
    "jsWarning": null,
    "botProtection": { "detected": false, "vendor": null },
    "dataGaps": [],
    "agentContract": { "decision": "qualified-A", "confidence": 92, "nextAction": "outreach-immediately", "costToAct": 0 },
    "recordType": "result",
    "schemaVersion": "2.0.0",
    "eventId": "sha256-abc123...",
    "failureType": null,
    "shouldOutreach": true,
    "decisionSignals": ["grade-a", "high-score", "outreach-immediately", "new", "high-confidence", "has-contact-signals", "has-legitimacy-signals", "has-presence-signals", "has-quality-signals", "has-team-signals", "multi-source", "has-emails", "has-phones", "has-contacts", "rich-socials", "hiring-active", "industry-classified"],
    "negativeSignals": [],
    "confidenceConflict": null,
    "failureContext": null,
    "methodology": "Lead score and recommendedAction are heuristic-derived from observable website signals — not produced by a trained model."
  },
  "verifiedEmails": [
    { "email": "jan@apify.com", "status": "valid", "confidence": 0.92, "decision": "send", "actionId": "send", "failureCategory": null }
  ],
  "topVerifiedEmail": "jan@apify.com",
  "topEmailDecision": "send",
  "sendableEmailCount": 1,
  "pipelineSteps": ["contact-scraper", "email-pattern-finder", "lead-qualifier", "bulk-email-verifier"],
  "processedAt": "2026-05-06T14:30:00.000Z"
}
```
Output Fields
Step 1 — Website Lead Intelligence (send-decision engine)
| Field | Type | Description |
|---|---|---|
| `domain` | String | Normalized company domain |
| `url` | String | Full URL of the website |
| `emails` | String[] | All discovered email addresses (Step 1 + Step 3 union) |
| `personalEmails` | String[] | Personal addresses (not info@/hello@) |
| `genericEmails` | String[] | Role-based addresses (info@, hello@, contact@) |
| `phones` | String[] | All discovered phone numbers |
| `contacts` | Object[] | Named contacts with name, title, optional email |
| `socialLinks` | Object | Social profile URLs keyed by platform |
| `addresses` | Array | Physical addresses (schema.org PostalAddress, JSON-LD, `<address>` elements) |
| `companyMeta` | Object\|null | Company name, description, industry, logo, employee count, founding date |
| `companyType` | String\|null | Classified type: saas, agency, consulting, legal, accounting, ecommerce, healthcare, real_estate, etc. |
| `sendDecision` | Object\|null | `{ action: 'SEND_NOW' / 'VERIFY_FIRST' / 'SKIP' / 'ENRICH_MORE', riskLevel, reasons }`. Branch automation on `action`. |
| `sendPlan` | Object\|null | `{ status, channel, safeToAutomate, openingAngle, followUpStrategy, ... }`: sequence-ready execution plan |
| `pipelineValue` | Object\|null | `{ relativeScore (0–1), rankInBatch (1 = best), ... }`: relative priority within this batch |
| `firstTouch` | Object\|null | Opening-line stem: `{ angle, hook, line }`. Deterministic from job-title + company-type, not LLM copy |
| `decision` | Object\|null | Outreach readiness tier: `{ tier: 'A'/'B'/'C', reason }` |
| `leadScore` | Number\|null | Step 1's own 0–100 score (distinct from Step 3's `score`) |
| `dataQuality` | String\|null | high / medium / low / no-data |
| `bestContact` | Object\|null | Highest-ranked contact (name, title, email, score) |
| `topContacts` | Object[] | Top-3 ranked contacts with reasons; backup options beyond `bestContact` |
| `buyingCommittee` | Object\|null | `{ decisionMakers, influencers, champions, blockers, size }` |
| `plainEnglishSummary` | String\|null | One-sentence takeaway, paste-ready for Slack/email |
| `whyThisLead` | String[] | Plain-English intent signals |
| `confidence` | Object\|null | `{ emailConfidence, contactConfidence, overallConfidence, riskFlags, components }`. `components` is an explainable breakdown, `{ emailEvidence, contactEvidence, verificationLift, catchAllPenalty, riskPenalty, multipleSamplesBonus, finalScore }`, that parallels Step 2's `confidenceBreakdown` for consistent debugging across the suite |
| `coverage` | Object\|null | Per-signal completeness: emails, contacts, phones, socials, addresses, contactForm |
| `summary` | Object\|null | Flat scanning block: primaryEmail, primaryContact, title, decision, confidence, leadScore |
| `isContactable` | Boolean\|null | True when domain has at least one personal email or `bestContact` with email |
| `contactFormDetected` | Boolean\|null | True when an inquiry form was found |
| `catchAllDetected` | Boolean\|null | True when domain accepts mail to any address |
| `catchAllImplication` | String\|null | Plain-English consequence of catch-all flag |
| `domainPurity` | Number\|null | % of emails that match the website's root domain (0–100) |
| `botProtection` | Object\|null | Detected anti-bot service (cloudflare/datadome/akamai/etc.) with recommendation |
| `failureType` | String\|null | no-data / blocked / timeout / js-required / parse-error. Null on success |
| `scrapeError` | String\|null | Error message if all retries failed |
| `jsWarning` | String\|null | Warning when a JavaScript-heavy site was detected |
| `recommendation` | String\|null | Actionable next step (e.g., 'Try deepScan=true', 'Use Pro fallback for JS sites') |
| `recoveryPlan` | Object\|null | `{ nextBestTool, nextBestActorSlug, method, confidence }` for failed/thin records |
| `bounceRiskBucket` | String\|null | Explicit low / medium / high band; filter directly instead of composing from confidence + riskFlags + catchAllDetected. Matches Step 2's same-named field for cohesive multi-step filtering |
| `decisionSignals` | String[] | Stable, additive-only enum tokens for SQL/agent filters (high-confidence / multi-source / personal-email-found / tier-a / send-now / catch-all / etc.). Distinct from `signals[]`, which is scoring evidence with points |
| `negativeSignals` | String[] | Concrete reasons this lead might bounce or burn sender reputation. Empty array = no concerns. Distinct from `confidence.riskFlags`, which mixes positive + negative concepts |
| `confidenceConflict` | Object\|null | `{ exists, reason }` when signals disagree (high confidence + catch-all, single-sample inflated confidence, senior contact with no email, etc.) |
| `failureContext` | Object\|null | `{ confidenceLossReason, retryLikelihood }` when extraction failed or confidence is low; indicates whether re-running would help |
| `methodology` | String\|null | Disclosure: scoring is heuristic-derived, not produced by a trained model. Surfaced for AI/agent buyers auditing for hallucination risk |
| `isSendable` | Boolean\|null | Convenience boolean: true when `sendDecision.action === 'SEND_NOW'`. Filter on this in spreadsheets without parsing the `sendDecision` object |
| `changeFlags` | String[] | Stable change codes when monitoring is on (NEW_TEAM_HIRE, TIER_UPGRADED, etc.) |
| `changeSinceLastRun` | Object\|null | Per-domain delta block: addedEmails, removedContacts, leadScoreDelta, decisionTierBefore/After, daysSinceLastSeen |
| `firstSeenAt` | String\|null | ISO timestamp of the first time the domain was observed across monitor runs |
| `lastSeenAt` | String\|null | ISO timestamp of the most recent observation |
| `crmPushResult` | Object\|null | Per-record outcome of CRM auto-push when `crmWebhookUrl` is set |
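Two of the derived fields above are simple enough to recompute when auditing a dataset. The sketch below follows the table's wording: `domainPurity` is the share of emails on the website's root domain, and `isSendable` mirrors `sendDecision.action`. Counting subdomain addresses as matches and rounding to a whole number are assumptions of this example, since the actor's exact rule is not documented here.

```python
from typing import Optional


def domain_purity(emails: list[str], domain: str) -> Optional[int]:
    """Percentage (0-100) of emails whose host is the given domain."""
    if not emails:
        return None  # nothing to measure
    domain = domain.lower()

    def matches(email: str) -> bool:
        host = email.rsplit("@", 1)[-1].lower()
        # Assumption: mail.example.com counts as matching example.com.
        return host == domain or host.endswith("." + domain)

    return round(100 * sum(matches(e) for e in emails) / len(emails))


def is_sendable(lead: dict) -> bool:
    """True when the lead's Step 1 send action is SEND_NOW."""
    return (lead.get("sendDecision") or {}).get("action") == "SEND_NOW"


purity = domain_purity(["info@apify.com", "jan@apify.com"], "apify.com")
mixed = domain_purity(["jan@apify.com", "jan.curn@gmail.com"], "apify.com")
```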
Step 2 — Email Pattern Finder
| Field | Type | Description |
|---|---|---|
emailPattern | String|null | Detected email naming convention (e.g., first.last@domain.com). Null if Step 2 skipped |
emailPatternConfidence | Number|null | Pattern confidence from 0 to 1 |
alternateEmailPatterns | Object[] | Other plausible patterns with lower confidence — { pattern, confidence }. Useful for cold-email tools that retry on bounce |
generatedEmails | Object[] | Predicted emails for contacts whose addresses were not publicly found |
patternAnalysis | Object|null | Step 2's full v2 decision-engine output, namespaced to avoid collision with Step 1's identically-named fields. Null when Step 2 skipped or no record found for this domain. |
The patternAnalysis block contains:
Field (under patternAnalysis) | Type | Description |
|---|---|---|
confidenceLevel | String|null | Banded label: high (≥ 0.75), medium (≥ 0.5), low (< 0.5) |
isSendable | Boolean|null | True when Step 2's sendDecision.action is SEND_NOW |
isContactable | Boolean|null | True when domain has valid MX AND at least one real or generated email |
bounceRiskBucket | String|null | low / medium / high — derived from confidence + catch-all + MX |
isCatchAll | Boolean|null | True if the domain accepts mail to any address (SMTP verification unreliable) |
mxValid | Boolean|null | True if the domain has valid MX records |
mxRecord | String[] | DNS MX records sorted by priority |
sendDecision | Object|null | { action, riskLevel, reasons } — Step 2's own action enum (different scope from Step 1's: "trust the pattern?" vs Step 1's "email this domain?") |
recoveryPlan | Object|null | When pattern detection fails: next-best Apify actor to chain into |
confidenceBreakdown | Object|null | Explainable components: samplesContribution, sourceDiversity, patternConsistency, catchAllPenalty, temporalStability, finalScore |
recommendedSequence | Array | Ranked list of pattern templates to try in order — primary first, alternates by domain match strength |
recommendedSequenceWithScores | Array | Same as recommendedSequence with per-pattern scores attached |
emailCulture | String|null | strict-format (single dominant pattern, ≥85%), loose (multiple competing, <60%), or mixed |
patternStabilityScore | Number|null | 0..1 weighted-recency score across this domain's run history. Computed only when compareToPrevRun is on |
catchAllStrategy | Object|null | Non-null only on catch-all domains. Provides rankedPatterns + recommendedSendOrder + rationale + coverage hint |
decisionSignals | String[] | Stable, additive-only enum tokens summarising why the decision landed where it did (high-confidence / sample-rich / multi-source / stable-pattern / volatile-pattern / strict-format / catch-all / no-mx / single-source / etc.) |
negativeSignals | Array | Concrete reasons this record might bounce or burn sender reputation |
confidenceConflict | Object|null | Surfaces when signals disagree (high pattern confidence + low stability, single-sample high confidence on a catch-all, etc.) |
failureContext | Object|null | When confidence is low or pattern detection fails: confidenceLossReason + retryLikelihood |
sequenceStrategy | Object|null | How to use recommendedSequence — single-shot / fallback / progressive |
driftState | Object|null | Cross-run drift summary: status (stable/emerging/unstable/unknown), volatilityScore, lastChangeType |
plainEnglishSummary | String|null | Step 2's one-line Slack-ready summary (distinct from Step 1's plainEnglishSummary) |
methodology | String|null | Disclosure: pattern is heuristic-derived, not produced by a trained model |
failureType | String|null | Categorised failure reason from Step 2 (distinct from Step 1's failureType) |
dataQuality | String|null | Step 2's reliability indicator: high (5+ emails), medium (2–4), low (1), no-data |
jsWarning | String|null | Non-null when company website appears to be JS-rendered SPA AND contributed 0 emails to Step 2 |
blockedDetected | Boolean|null | True when company website returned anti-bot block markers AND contributed 0 emails to Step 2 |
changeSinceLastRun | Object|null | Step 2's per-domain delta block when monitoring is on (PATTERN_CHANGED, NEW_EMAILS_FOUND, CATCH_ALL_FLIPPED_ON, etc.) |
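If you only have a raw 0..1 pattern confidence and want the same bands as `confidenceLevel`, the thresholds in the table can be applied client-side. This is a minimal sketch; `confidence_band` is a hypothetical helper name, and only the thresholds come from the table above.

```python
from typing import Optional


def confidence_band(score: Optional[float]) -> Optional[str]:
    """Map a 0..1 pattern confidence to the documented band labels.

    Hypothetical helper: thresholds follow the confidenceLevel row
    (high >= 0.75, medium >= 0.5, low < 0.5).
    """
    if score is None:
        return None  # mirrors the nullable field when Step 2 was skipped
    if score >= 0.75:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"
```

Prefer the emitted `confidenceLevel` field when it is present; the helper is only useful for scores that arrive without a label.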
Step 3 — B2B Lead Qualifier
| Field | Type | Description |
|---|---|---|
score | Number|null | Lead quality score 0–100. Null if Step 3 skipped |
grade | String|null | Letter grade: A (90–100), B (75–89), C (60–74), D (40–59), F (0–39) |
scoreBreakdown | Object|null | Points per category: contactReachability (30), businessLegitimacy (25), onlinePresence (20), websiteQuality (15), teamTransparency (10) |
signals | Object[] | Individual scoring signals with signal, category, points, detail |
address | String|null | Single physical business address (Step 3's extraction). See addresses[] for Step 1's array |
cmsDetected | String|null | Detected CMS or framework (WordPress, Shopify, Next.js, etc.) |
techSignals | String[] | Technologies and tools detected on the website |
industry | String|null | Detected industry classification |
jobCount | Number|null | Number of open roles found on /careers, /jobs pages — hiring-velocity signal |
qualifierAnalysis | Object|null | Step 3's full v2 decision-engine output, namespaced to avoid collision with Step 1's identically-named fields. Null when Step 3 skipped or no record found for this domain. |
The qualifierAnalysis block contains:
| Field (under qualifierAnalysis) | Type | Description |
|---|---|---|
summary | String|null | Plain-English one-line summary (≤280 chars), LLM/CRM-friendly |
scoreExplanation | String|null | Plain-English explanation of why this lead got its score |
confidence | Object|null | { score, level: 'high' / 'medium' / 'low' / 'very-low', components[] } — captures signal breadth, crawl depth, source integrity. Distinct shape from Step 1's confidence (different scope) |
recommendedAction | String|null | outreach-immediately / add-to-nurture / enrich-then-revisit / manual-review / archive — Step 3's qualifier-axis decision. Different scope from Step 1's sendDecision.action (which is deliverability-axis). Both are useful: read sendDecision for "can I email?", read qualifierAnalysis.recommendedAction for "is this lead worth pursuing?" |
previousScore | Number|null | Score from the previous run (null if first run) |
scoreChange | Number|null | Change from previous score (positive = improved) |
changeFlag | String|null | NEW / IMPROVED / DECLINED / UNCHANGED. Based on previousScore vs current score with ±5 tolerance |
jsWarning | String|null | Step 3's JS-warning message. Distinct from Step 1's jsWarning (different scope: Step 3's is signal-extraction completeness, Step 1's is contact-extraction completeness) |
botProtection | Object|null | Step 3's bot-protection detection: { detected, vendor } |
dataGaps | Object[] | Step 3's parallel to Step 1/2's recoveryPlan: [{ field, reason, suggestedFix }] — missing fields with reasons + suggested upstream actor to fill the gap. Use as automation routing signal |
agentContract | Object|null | Flat MCP-ready surface for AI consumers: { decision: 'qualified-A' / 'qualified-B' / 'review' / 'low-priority' / 'reject', confidence, nextAction, costToAct }. AI agents read this directly without traversing the full record |
recordType | String|null | Discriminator: result for scored leads, error for failure records (error rows are filtered before reaching the merged output) |
schemaVersion | String|null | Output schema version (semver). Bumps on shape changes |
eventId | String|null | Idempotent canonical id (sha256 of watchlist+domain). Same id across re-runs of the same domain |
failureType | String|null | Step 3's failure enum on error records: transient / auth / rate_limit / not_found / schema_mismatch / bot_blocked / unknown |
shouldOutreach | Boolean|null | Convenience boolean — true when recommendedAction === 'outreach-immediately'. Filter on this in spreadsheets to grab the qualified-leads row without parsing the action enum |
decisionSignals | String[] | Step 3's stable enum tokens (parallel to Step 1's decisionSignals): grade-a/b/c/d/f, high-score/medium-score/low-score, outreach-immediately, new/improved/declined/unchanged, high-confidence/medium-confidence, multi-source, bot-protected, js-partial, has-emails/has-phones/has-contacts, hiring-active, gap-emails, etc. |
negativeSignals | String[] | Step 3's negatives-only array (parallel to Step 1's negativeSignals): concrete reasons this lead might not convert. Empty array = no concerns |
confidenceConflict | Object|null | Step 3's signal-disagreement surface (parallel to Step 1's): { exists, reason } when score and confidence disagree |
failureContext | Object|null | Step 3's structured failure context (parallel to Step 1's): { confidenceLossReason, retryLikelihood } |
methodology | String|null | Step 3's heuristic-derivation disclosure (parallel to Step 1's methodology) |
Step 4 — Bulk Email Verifier (when verifyEmails: true)
| Field | Type | Description |
|---|---|---|
verifiedEmails | Object[] | Per-email verification: email, status, confidence, decision, actionId, failureCategory |
topVerifiedEmail | String|null | Highest-confidence email graded send or send-with-monitoring |
topEmailDecision | String|null | Verifier decision for topVerifiedEmail |
sendableEmailCount | Number | Count of emails graded send/send-with-monitoring |
When Step 4 is OFF and Step 1 ran with preset: auto (or any preset that includes verification), verifiedEmails is populated from Step 1's basic verification (status + confidence only; decision/actionId/failureCategory will be null).
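The three flat fields can be recomputed from `verifiedEmails` if you post-process the dataset yourself. The sketch below assumes the decision strings `send` and `send-with-monitoring` from the `topVerifiedEmail` row and treats `confidence` as a number where higher is better; the function name is hypothetical.

```python
from typing import Dict, List, Optional

# Decisions that count as sendable, per the topVerifiedEmail description.
SENDABLE_DECISIONS = {"send", "send-with-monitoring"}


def summarize_verification(verified_emails: List[Dict]) -> Dict[str, Optional[object]]:
    """Derive topVerifiedEmail, topEmailDecision and sendableEmailCount."""
    sendable = [v for v in verified_emails if v.get("decision") in SENDABLE_DECISIONS]
    # Highest confidence wins; a missing confidence is treated as 0.
    top = max(sendable, key=lambda v: v.get("confidence") or 0, default=None)
    return {
        "topVerifiedEmail": top["email"] if top else None,
        "topEmailDecision": top["decision"] if top else None,
        "sendableEmailCount": len(sendable),
    }
```

When Step 4 is off, `decision` is null on every record, so this returns no top email and a count of zero.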
Metadata
| Field | Type | Description |
|---|---|---|
pipelineSteps | String[] | Which steps completed: contact-scraper, email-pattern-finder, lead-qualifier, bulk-email-verifier |
processedAt | String | ISO 8601 timestamp when the lead was processed |
Programmatic Access (API)
Python
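```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the suite and wait for the run to finish.
run = client.actor("ryanclinton/b2b-lead-gen-suite").call(
    run_input={
        "urls": ["stripe.com", "hubspot.com", "notion.so"],
        "maxPagesPerDomain": 8,
        "minScore": 50,
    }
)

# Iterate over the merged leads in the run's default dataset.
for lead in client.dataset(run["defaultDatasetId"]).iterate_items():
    # grade and score are null when Step 3 is skipped, so fall back to "N/A".
    grade = lead.get("grade") or "N/A"
    score = lead.get("score")
    score = "N/A" if score is None else score
    emails = ", ".join(lead.get("emails") or [])
    print(f'{lead["domain"]} [{grade} {score}] - {emails}')
```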
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Start the suite and wait for the run to finish.
const run = await client.actor("ryanclinton/b2b-lead-gen-suite").call({
    urls: ["stripe.com", "hubspot.com", "notion.so"],
    maxPagesPerDomain: 8,
    minScore: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const lead of items) {
    // emails may be missing on sparse records, so default to an empty list.
    const emails = (lead.emails ?? []).join(", ");
    console.log(`${lead.domain} [${lead.grade} ${lead.score}] - ${emails}`);
}
```
cURL
```bash
# Start a run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~b2b-lead-gen-suite/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["stripe.com", "hubspot.com"], "maxPagesPerDomain": 8, "minScore": 50}'

# Fetch results (use defaultDatasetId from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```
How It Works — Pipeline Architecture
The B2B Lead Generation Suite runs a 3-step sequential pipeline by default, plus an opt-in 4th verification step. Each step calls a dedicated Apify actor via Actor.call() and waits for it to complete before starting the next step. Steps 2 and 3 are skippable; Step 4 is off by default.
```
                   ┌────────────────────────────────────────┐
                   │       B2B Lead Generation Suite        │
                   │             (Orchestrator)             │
                   └──────────┬─────────────────────────────┘
                              │
                   ┌──────────▼─────────────────────────────┐
Step 1 (Required)  │ Website Lead Intelligence              │
                   │ $0.20 per domain with contact data     │
                   │ $0 for filtered or empty domains       │
                   │ • Extracts emails, phones, contacts    │
                   │ • Classifies buying committee          │
                   │   (decisionMakers/influencers/         │
                   │   champions/blockers)                  │
                   │ • Emits sendDecision per domain        │
                   │   (SEND_NOW / VERIFY_FIRST /           │
                   │   SKIP / ENRICH_MORE)                  │
                   │ • Generates first-touch line stem      │
                   │ • Ranks pipelineValue within batch     │
                   │ • Optional: Pro fallback for JS sites  │
                   │   ($0.35/site, only when needed)       │
                   │ • Optional: monitoring + change        │
                   │   detection (changeFlags[])            │
                   │ • Optional: CRM auto-push              │
                   │   (HubSpot / Salesforce / Make /       │
                   │   Zapier / n8n)                        │
                   └──────────┬─────────────────────────────┘
                              │ emails + contact names
                   ┌──────────▼─────────────────────────────┐
Step 2 (Optional)  │ Email Pattern Finder                   │
                   │ $0.10 per domain analyzed              │
                   │ • Receives Step 1 emails as samples    │
                   │ • Website scraping DISABLED            │
                   │   (already scraped in Step 1)          │
                   │ • GitHub commit search ENABLED         │
                   │ • Optional: WHOIS / RDAP search        │
                   │   (searchWhois)                        │
                   │ • Optional: Hunter.io API              │
                   │   (hunterApiKey)                       │
                   │ • Detects naming convention            │
                   │ • Emits sendDecision per domain        │
                   │   (own decision engine, distinct       │
                   │   from Step 1's)                       │
                   │ • bounceRiskBucket + emailCulture      │
                   │   + recommendedSequence                │
                   │ • catchAllStrategy on catch-all        │
                   │   domains (turns dead-end into         │
                   │   send sequence)                       │
                   │ • Generates predicted emails for       │
                   │   contacts without addresses           │
                   └──────────┬─────────────────────────────┘
                              │ all upstream data
                   ┌──────────▼─────────────────────────────┐
Step 3 (Optional)  │ B2B Lead Qualifier                     │
                   │ $0.15 per lead qualified               │
                   │ • Receives emails, phones, contacts,   │
                   │   social links, and pattern data via   │
                   │   pipelineData parameter               │
                   │ • Crawls website for quality signals   │
                   │ • Scores 0-100 across 5 categories     │
                   │ • Assigns letter grade A-F             │
                   │ • Emits recommendedAction enum         │
                   │   (outreach-immediately /              │
                   │   add-to-nurture /                     │
                   │   enrich-then-revisit /                │
                   │   manual-review / archive)             │
                   │ • Cross-run change detection           │
                   │   (NEW / IMPROVED / DECLINED /         │
                   │   UNCHANGED, score history)            │
                   │ • dataGaps[]: missing fields +         │
                   │   next-best actor to fill the gap      │
                   │ • agentContract: flat MCP-ready        │
                   │   surface for AI consumers             │
                   │ • Optional: scoringProfile             │
                   │   (sales/marketing/recruiting)         │
                   │ • Optional: watchlist (per-list        │
                   │   score history)                       │
                   │ • Optional: Slack/Discord webhook      │
                   │   (run-completion summary)             │
                   └──────────┬─────────────────────────────┘
                              │ all unique emails (discovered + generated)
                   ┌──────────▼─────────────────────────────┐
Step 4 (Optional)  │ Bulk Email Verifier                    │
                   │ (Outbound Control System)              │
                   │ • Verifies every discovered + pattern  │
                   │   -generated email via SMTP/MX/DNS     │
                   │ • Returns decision per email           │
                   │   (send / send-monitor / hold /        │
                   │   replace / suppress)                  │
                   │ • Adds failureAnalysis +               │
                   │   recommendedAction per address        │
                   │ • Mode: enrichment-validation          │
                   └──────────┬─────────────────────────────┘
                              │
                   ┌──────────▼─────────────────────────────┐
                   │ Merge + Deduplicate + Sort + Filter    │
                   │ • Union emails/phones across steps     │
                   │ • Merge social links (Step 1 priority) │
                   │ • Attach per-email verification +      │
                   │   decision to every lead               │
                   │ • topVerifiedEmail + topEmailDecision  │
                   │   + sendableEmailCount surfaced flat   │
                   │ • Sort by score (highest first)        │
                   │ • Remove leads below minScore          │
                   └────────────────────────────────────────┘
```
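The control flow of the pipeline (one required step, then optional steps that may fail without ending the run) can be sketched in a few lines. This is an illustration only, not the suite's source: the step callables are stand-ins for the real `Actor.call()` invocations, and `run_pipeline` is a hypothetical name. The step labels match the `pipelineSteps` values in the output.

```python
from typing import Callable, Dict, List, Tuple

Step = Callable[[dict], dict]


def run_pipeline(
    urls: List[str],
    required_step: Callable[[List[str]], dict],
    optional_steps: List[Tuple[str, Step]],
) -> Tuple[dict, List[str], Dict[str, str]]:
    """Run steps one after another; only the required step may abort the run."""
    # Step 1 is required: any exception raised here propagates and ends the run.
    data = required_step(urls)
    completed = ["contact-scraper"]
    errors: Dict[str, str] = {}

    for name, step in optional_steps:
        try:
            # Each optional step receives everything gathered so far.
            data = step(data)
            completed.append(name)
        except Exception as exc:
            # Optional steps fail gracefully: record the error and continue.
            errors[name] = str(exc)

    return data, completed, errors
```

Passing the steps in as callables keeps the sketch testable without network access; in a real orchestrator each callable would start a sub-actor run and read its dataset.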
Data Flow Between Steps
Step 1 → Step 2 (Contact Scraper → Pattern Finder):
- Scraped emails are passed as `knownEmails` with matched contact names for attribution.
- Contact names are passed as `names` for email address generation.
- Website scraping is disabled (`searchWebsite: false`) to avoid redundant crawling.
- GitHub commit search stays enabled as an additional email discovery source.
Steps 1+2 → Step 3 (Both → Lead Qualifier):
- All upstream data is packaged into a `pipelineData` array indexed by domain.
- Each entry includes: emails, phones, contacts, social links, detected pattern, and pattern confidence.
- The qualifier uses this to enrich its scoring without re-extracting data that was already found.
Steps 1+2+3 → Step 4 (All → Bulk Email Verifier):
- Every unique email across discovered (Step 1), pattern-generated (Step 2), and qualifier-extracted (Step 3) sources is collected and deduplicated by lowercase form.
- Sent to the verifier in one batch with `mode: "enrichment-validation"` (deep SMTP, accept catch-all with monitoring, deliverability simulation on).
- The verifier returns one record per email with `status`, `confidence`, `decision`, `recommendedAction`, and `failureAnalysis`.
- The orchestrator maps each verified email back to its lead by matching on the canonicalised address.
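The collect-and-map-back step described above can be sketched as follows. The function names are hypothetical, and trimming whitespace in addition to lowercasing is an assumption of this sketch; the source only states that deduplication uses the lowercase form.

```python
from typing import Dict, Iterable, List


def canonical(email: str) -> str:
    """Canonical form used for matching: trimmed (assumed) and lowercased."""
    return email.strip().lower()


def collect_unique_emails(*sources: Iterable[str]) -> List[str]:
    """Union emails from several steps, keeping first-seen order."""
    seen = set()
    unique: List[str] = []
    for source in sources:
        for email in source:
            key = canonical(email)
            if key and key not in seen:
                seen.add(key)
                unique.append(key)
    return unique


def attach_verification(lead_emails: Iterable[str], verifier_records: List[Dict]) -> List[Dict]:
    """Map verifier records back to one lead by canonical address."""
    by_address = {canonical(r["email"]): r for r in verifier_records}
    return [by_address[canonical(e)] for e in lead_emails if canonical(e) in by_address]
```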
Merge Phase:
- Emails from Steps 1 and 3 are unioned via a `Set` for deduplication.
- Phones from Steps 1 and 3 are unioned via a `Set` for deduplication.
- Social links from Step 1 take priority; Step 3 links fill in missing platforms only.
- Pattern data and scoring data are attached from their respective steps (null if skipped).
- Verification data attaches as `verifiedEmails[]` per lead, plus flat `topVerifiedEmail` / `topEmailDecision` / `sendableEmailCount` for cadence-tool branching. The top verified email is the highest-confidence address graded `send` or `send-with-monitoring`.
- Results are sorted by score descending, then filtered by `minScore`.
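The merge rules above might look like this in code. It is a sketch under stated assumptions: the `socialLinks` key name and the handling of null scores (treated as 0, so unscored leads sort last and survive only when `min_score` is 0) are choices made for this example, not confirmed behaviour of the suite. Emails and phones are sorted purely to make the output deterministic.

```python
from typing import Dict, List, Optional


def merge_lead(step1: Dict, step3: Optional[Dict]) -> Dict:
    """Merge one domain's Step 1 record with its Step 3 record (if any)."""
    step3 = step3 or {}
    merged = dict(step1)
    # Union emails and phones across both steps.
    for field in ("emails", "phones"):
        merged[field] = sorted(set(step1.get(field) or []) | set(step3.get(field) or []))
    # Step 1 social links win; Step 3 only fills platforms Step 1 lacks.
    social = {p: u for p, u in (step1.get("socialLinks") or {}).items() if u}
    for platform, url in (step3.get("socialLinks") or {}).items():
        if url:
            social.setdefault(platform, url)
    merged["socialLinks"] = social
    merged["score"] = step3.get("score")  # None when Step 3 was skipped
    return merged


def sort_and_filter(leads: List[Dict], min_score: int = 0) -> List[Dict]:
    """Sort by score descending, then drop leads below min_score."""
    ranked = sorted(leads, key=lambda lead: lead.get("score") or 0, reverse=True)
    return [lead for lead in ranked if (lead.get("score") or 0) >= min_score]
```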
Scoring Reference
When Step 3 (Lead Qualifier) runs, each lead is scored across five categories with point caps:
| Category | Max Points | What It Measures |
|---|---|---|
| Contact Reachability | 30 | Email addresses, phone numbers, contact form availability |
| Business Legitimacy | 25 | Physical address, about page, privacy policy, CMS/tech presence |
| Online Presence | 20 | Social media profiles across platforms |
| Website Quality | 15 | SSL, modern CMS, analytics, live chat tools |
| Team Transparency | 10 | Named team members with titles, team/about pages |
Grade scale: A (90-100), B (75-89), C (60-74), D (40-59), F (0-39)
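To make the caps and the grade scale concrete, here is a small sketch that totals a `scoreBreakdown` and maps the result to a letter. The category keys are the ones listed under `scoreBreakdown`; clamping each category to its cap is an assumption about how the caps apply, and both function names are hypothetical.

```python
from typing import Dict

# Point caps per category, from the table above (they sum to 100).
CATEGORY_CAPS = {
    "contactReachability": 30,
    "businessLegitimacy": 25,
    "onlinePresence": 20,
    "websiteQuality": 15,
    "teamTransparency": 10,
}


def total_score(breakdown: Dict[str, int]) -> int:
    """Sum category points, clamping each category to the range 0..cap."""
    return sum(
        min(max(breakdown.get(category, 0), 0), cap)
        for category, cap in CATEGORY_CAPS.items()
    )


def grade_for(score: int) -> str:
    """Map a 0-100 score to the documented letter grade."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, letter in ((90, "A"), (75, "B"), (60, "C"), (40, "D")):
        if score >= floor:
            return letter
    return "F"
```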
How Much Does It Cost?
The B2B Lead Generation Suite is an orchestrator that calls up to four sub-actors. Each sub-actor is billed pay-per-event by Apify directly to your account — the orchestrator itself does not charge a per-lead fee.
Step 1 — Website Lead Intelligence (always runs): $0.20 per domain with contact data. Domains where no contact data is found are not charged. Filtered domains (requirePersonalEmail, minLeadScore, autoFilter) are also not charged. Optional Pro fallback for JS-heavy sites: $0.35/site, only when triggered.
Step 2 — Email Pattern Finder (optional): $0.10 per domain analyzed. Filtered records (when autoFilter excludes them) are not charged.
Step 3 — B2B Lead Qualifier (optional): $0.15 per lead qualified. Filtered records (when minScore excludes them) are not charged.
Step 4 — Bulk Email Verifier (optional, off by default): sub-actor compute, billed against this run.
| Configuration | Per kept lead |
|---|---|
| Full pipeline with verification (all 4 steps) | $0.45 (Steps 1+2+3) + small compute from Step 4 |
| Steps 1+2+3 (skip email verification) | $0.45 |
| Steps 1+3 (skip Pattern Finder + Verifier) | $0.35 |
| Steps 1+2 (skip Qualifier + Verifier) | $0.30 |
| Step 1 only (skip Pattern + Qualifier + Verifier) | $0.20 |
Domains with no contacts found in Step 1, or records filtered out by requirePersonalEmail / minLeadScore / companyTypes / autoFilter, are not billed — you only pay for leads you actually keep. Step 2 only runs on domains that survived Step 1 filtering.
Worked example — 100 URLs through the default 3-step pipeline
Assume 100 domains in, ~70% return contact data after filtering (typical B2B):
| Step | Charged at | Domains charged | Cost |
|---|---|---|---|
| Step 1 (Website Lead Intelligence) | $0.20 / domain with contact data | 70 | $14.00 |
| Step 2 (Email Pattern Finder) | $0.10 / domain analyzed | 70 (Step 1 survivors) | $7.00 |
| Step 3 (B2B Lead Qualifier) | $0.15 / lead qualified | 70 | $10.50 |
| Total | | | $31.50 |
That's $0.45 per kept lead ($31.50 / 70), or about $0.32 per input URL ($31.50 / 100). Set verifyEmails: true and Step 4 adds verifier sub-actor compute (no pay-per-event charge). Set enableProFallback: true and JS-blocked domains add $0.35 each, typically 0–10% of a B2B batch.
The orchestrator itself uses 256 MB of memory and minimal compute. Apify's free tier includes $5 of monthly platform credits.
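For budgeting, the worked example can be turned into a small estimator. This sketch uses the per-event prices listed above and works in integer cents to avoid floating-point drift. It assumes every kept domain is charged in Steps 2 and 3 (as in the worked example) and does not model Step 4 compute; the function name is hypothetical.

```python
# Per-event prices in cents, taken from the pricing list above.
STEP1_CENTS = 20          # per domain with contact data
STEP2_CENTS = 10          # per domain analyzed
STEP3_CENTS = 15          # per lead qualified
PRO_FALLBACK_CENTS = 35   # per site, only when the Pro fallback triggers


def estimate_cost_usd(
    kept_domains: int,
    run_step2: bool = True,
    run_step3: bool = True,
    pro_fallback_sites: int = 0,
) -> float:
    """Estimate pay-per-event charges in USD for one run."""
    cents = kept_domains * STEP1_CENTS
    if run_step2:
        cents += kept_domains * STEP2_CENTS
    if run_step3:
        cents += kept_domains * STEP3_CENTS
    cents += pro_fallback_sites * PRO_FALLBACK_CENTS
    return cents / 100
```

With 100 input URLs and 70 kept domains, `estimate_cost_usd(70)` reproduces the $31.50 total from the table.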
Tips
- Start with a small batch when testing a new set of domains. Run 3–5 URLs through the full pipeline first to verify the output quality before processing hundreds.
- Use `goal` instead of `preset` + `confidenceMode` separately. Set `goal: "quick-outreach"` / `"high-deliverability"` / `"max-coverage"` and the suite picks sensible defaults for both Step 1 dials. Manual settings still override.
- Branch automation on `sendDecision.action`, not the prose. The action enum (`SEND_NOW` / `VERIFY_FIRST` / `SKIP` / `ENRICH_MORE`) is a stable contract. The plain-English `reasons[]` and `plainEnglishSummary` are for humans; never parse them.
- Set `autoFilter: "send-now-only"` and the dataset only contains green-light leads. This drops the entire SKIP / ENRICH_MORE pile so your downstream tools don't have to filter again.
- Use the minimum score filter to focus on high-quality leads. Setting `minScore` to 50 or higher eliminates domains with thin contact information or low business legitimacy signals.
- Skip the qualifier for speed when you already know the companies are legitimate (e.g., a curated list from LinkedIn Sales Navigator). Step 1's `decision.tier` (A/B/C) is already an outreach-readiness signal; Step 3's score adds website-quality on top.
- Increase pages per domain for large enterprise websites where contacts may be buried deep. Setting `maxPagesPerDomain` to 10–15 finds more email addresses on sites with complex navigation structures.
- Schedule weekly runs with `compareToPrevRun: true` to turn the suite into a self-maintaining lead database. New hires, departures, and tier upgrades surface as `changeFlags[]` automatically.
- Push directly to your CRM with `crmWebhookUrl` to skip the manual export step. Step 1 POSTs each enriched lead in HubSpot/Salesforce field shape (or generic JSON for Make/Zapier/n8n). Add `crmOnlyTierA: true` to keep only ready-to-email leads in your CRM.
- Generate outreach-tool CSVs in one step. Set `exportFormats: ["instantly", "smartlead", "apollo"]` and Step 1 writes ready-to-import CSVs to the run's key-value store. Download from the Storage tab and drop straight into your sequence.
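As an illustration of branching on the action enum rather than the prose, the sketch below buckets leads by `sendDecision.action`. The four enum values come from this README; the queue names and the `route_leads` helper are hypothetical placeholders for whatever your automation does next.

```python
from typing import Dict, List

# Enum value -> destination. Destinations are hypothetical queue names.
ROUTES = {
    "SEND_NOW": "sequence",
    "VERIFY_FIRST": "verify_queue",
    "ENRICH_MORE": "enrichment_queue",
    "SKIP": "suppressed",
}


def route_leads(leads: List[Dict]) -> Dict[str, List[str]]:
    """Group lead domains by destination, keyed off sendDecision.action only."""
    buckets: Dict[str, List[str]] = {name: [] for name in ROUTES.values()}
    buckets["unrouted"] = []  # missing or unrecognised action values land here
    for lead in leads:
        action = (lead.get("sendDecision") or {}).get("action")
        buckets[ROUTES.get(action, "unrouted")].append(lead["domain"])
    return buckets
```

Keeping an explicit `unrouted` bucket means a new enum value added in a later version shows up as a visible pile rather than being silently dropped.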
Limitations
- HTML-only by default. Steps 1 and 3 use CheerioCrawler (HTML parsing without a browser). For JavaScript-heavy or Cloudflare/DataDome/Akamai-protected sites, set `enableProFallback: true`. Step 1 then auto-retries those domains via Website Contact Scraper Pro (real-browser rendering, $0.35/site, only triggered when JS is detected AND no contacts were found on the first pass).
- Email verification is opt-in. Set `verifyEmails: true` to enable Step 4 (Bulk Email Verifier, the Outbound Control System), which validates every discovered + pattern-generated email and attaches a `decision` (`send` / `send-monitor` / `hold` / `replace` / `suppress`), `recommendedAction`, and `failureCategory` per address. Default is off to keep cost / runtime predictable for users who only want raw lead data. Note: Step 1's `preset: auto` already runs basic verification (status + confidence per email); Step 4 adds the richer routing decisions on top.
- Sequential pipeline. Steps run one after another, not in parallel within the suite. Step 1 itself processes domains in parallel internally. The default 3-step pipeline takes 30–90 seconds per domain; adding Step 4 (verification) extends this to 30–120 seconds. Large batches (500+ domains) may approach timeout limits.
- Sub-actor dependency. This actor calls four Apify actors by name. If any sub-actor is temporarily unavailable or returns unexpected output, that step may fail (optional steps fail gracefully; the Contact Scraper is required).
- Pattern detection needs samples. Email pattern detection accuracy depends on how many sample emails are found in Step 1. Domains with zero or one discovered email produce low-confidence or no pattern results.
- Scoring is deterministic. Lead scores are based on observable website signals (presence of emails, social links, team pages, etc.), not AI analysis. The score reflects data availability, not company quality.
- Monitoring requires opt-in baseline. Set `compareToPrevRun: true` and the first run establishes a baseline, with every domain flagged `NEW_DOMAIN`. From the second run on, deltas surface as `changeFlags[]` and `changeSinceLastRun`. Pair with Apify Schedules for daily/weekly monitoring.
Responsible Use
This actor collects publicly visible information from company websites. Follow these guidelines:
- Comply with applicable laws. Check GDPR, CAN-SPAM, CCPA, and local regulations before using collected data for outreach. Presence of contact information on a website does not constitute consent to receive marketing communications.
- Respect robots.txt and rate limits. The actor uses CheerioCrawler with configurable concurrency and rate limiting. Default settings are conservative, but consider lowering crawl depth for websites that explicitly restrict scraping.
- Generated emails are predictions. Emails produced by the Pattern Finder are algorithmic guesses based on detected naming conventions. They should be verified before use and should never be used for bulk unsolicited messaging.
- Do not scrape sensitive sites. Avoid using this tool on government agencies, healthcare providers, educational institutions, or other organizations where automated data collection may violate terms of service or regulations.
FAQ
How long does the full pipeline take per domain?
With default settings (5 pages per step), the default 3-step pipeline typically completes in 30–90 seconds per domain. Enabling Step 4 (verification) extends this to 30–120 seconds. Processing runs sequentially through the steps, so skipping optional steps reduces run time proportionally.
Can I process hundreds of domains in one run?
Yes. The actor passes the full list of URLs to each sub-actor, which processes them in parallel internally. Very large batches (500+ domains) may approach the default 2-hour timeout. For extremely large lists, consider splitting into batches of 100–200 domains.
What happens if one sub-actor fails?
The Contact Scraper is required and will abort the run if it fails entirely. The Email Pattern Finder, Lead Qualifier, and Bulk Email Verifier are optional. If any of them fails, the pipeline continues and outputs the data from the steps that succeeded. Individual domain failures within a step do not block other domains.
What does Step 1 actually produce?
Step 1 (Website Lead Intelligence) is a send-decision engine. Per domain it returns: a sendDecision action enum (SEND_NOW / VERIFY_FIRST / SKIP / ENRICH_MORE), a sendPlan (channel, safeToAutomate, follow-up strategy), pipelineValue (relative rank within the batch), firstTouch (opening-line stem), buyingCommittee (decisionMakers/influencers/champions/blockers), plainEnglishSummary, plus the underlying contacts/emails/phones/socials. Branch automation on sendDecision.action, never on the prose.
How is Step 1's decision tier different from Step 3's score and grade?
Step 1's decision.tier (A/B/C) measures outreach readiness — verified personal email + senior contact = A. Step 3's score (0–100) and grade (A–F) measure website quality + business legitimacy signals. They're different axes; both are useful. Filter on Step 1 to gate cold outreach, filter on Step 3 to remove parked domains and shells.
Is the email pattern detection accurate?
The Email Pattern Finder analyses discovered emails to reverse-engineer the naming convention. Confidence scores above 0.7 are generally reliable. The patternAnalysis.confidenceLevel band (high ≥ 0.75, medium ≥ 0.5, low < 0.5) and bounceRiskBucket give you stable filters. The accuracy depends on how many sample emails were found — more samples mean higher confidence. Generated emails are predictions, not verified addresses.
Why does Step 2 emit its own sendDecision? Isn't that Step 1's job?
Both sub-actors emit sendDecision, but they answer different questions. Step 1 asks "should I email this domain right now?" — based on whether a verified personal email + senior contact exist. Step 2 asks "should I trust the detected pattern enough to send to generated emails?" — based on sample count, source diversity, catch-all status, MX validity, and pattern stability. Both useful, different scopes. The suite namespaces Step 2's decision-engine output under patternAnalysis to avoid collision: read sendDecision for Step 1, patternAnalysis.sendDecision for Step 2.
What's recommendedSequence for?
When Step 2 detects a pattern but has multiple plausible candidates (e.g., first.last@ 0.6 confidence, flast@ 0.4, first@ 0.3), patternAnalysis.recommendedSequence returns them ranked. Cold-email tools like Instantly and Smartlead can use this for bounce-retry strategies: try the primary first; on bounce, try the next pattern. patternAnalysis.sequenceStrategy tells you how to use it (single-shot / fallback / progressive).
What's catchAllStrategy and when does it appear?
On catch-all domains (where SMTP verification accepts every address), patternAnalysis.catchAllStrategy is non-null and provides a ranked send order with rationale instead of just flagging the domain dead. Pair with patternAnalysis.recommendedSequenceWithScores for probabilistic fallback. This turns catch-all domains from "skip" into "actionable send sequence with real risk-adjusted ordering."
Step 3 emits its own recommendedAction. How does it differ from Step 1's sendDecision.action?
Different decision axes. Step 1's sendDecision.action (SEND_NOW / VERIFY_FIRST / SKIP / ENRICH_MORE) is the deliverability answer: "can I email this domain right now without burning sender reputation?". Step 3's recommendedAction (outreach-immediately / add-to-nurture / enrich-then-revisit / manual-review / archive) is the prioritization answer: "is this lead worth pursuing at all?" — based on website-quality + business-legitimacy signals. Both useful, different scopes. For full automation: filter on sendDecision.action === 'SEND_NOW' AND qualifierAnalysis.recommendedAction === 'outreach-immediately'.
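For Python consumers, the two-axis filter described above could be written as a small predicate. It is a sketch: `ready_for_outreach` is a hypothetical helper, and it treats a missing `sendDecision` or `qualifierAnalysis` block (for example when Step 3 was skipped) as not ready.

```python
from typing import Dict


def ready_for_outreach(lead: Dict) -> bool:
    """True only when both the deliverability and qualifier axes say go."""
    send_action = (lead.get("sendDecision") or {}).get("action")
    qualifier_action = (lead.get("qualifierAnalysis") or {}).get("recommendedAction")
    return send_action == "SEND_NOW" and qualifier_action == "outreach-immediately"
```

Usage: `[lead for lead in items if ready_for_outreach(lead)]`, where `items` is the list of dataset records.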
What's scoringProfile and when should I switch from default?
The qualifier scores leads across 5 categories with default weights. scoringProfile: 'sales' re-weights toward contact reachability + decision makers (use for outbound sales lists). scoringProfile: 'marketing' re-weights toward online presence + website quality (use for ABM / content syndication targets). scoringProfile: 'recruiting' re-weights toward team transparency + contact info (use when prospecting candidates' employers).
What's dataGaps[] and how is it different from Step 1's recoveryPlan?
Both fill the same role: "this record is incomplete, here's how to recover." Step 1's recoveryPlan is a single object pointing at a specific next-best actor. Step 3's qualifierAnalysis.dataGaps[] is an ARRAY of { field, reason, suggestedFix } entries — multiple gaps surfaced individually so automation can branch per gap (e.g., missing email → run Email Pattern Finder; missing phone → run Phone Number Finder; missing social links → run a different actor).
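One way to act on `dataGaps[]` is to group gaps by their suggested fix, so each follow-up actor gets a single batch of domains. The sketch below assumes `suggestedFix` is a string naming the follow-up; that format, the `manual-review` fallback label, and the helper name are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_gaps_by_fix(leads: List[Dict]) -> Dict[str, List[Tuple[str, str]]]:
    """Return {suggestedFix: [(domain, field), ...]} across all leads."""
    plan: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for lead in leads:
        gaps = (lead.get("qualifierAnalysis") or {}).get("dataGaps") or []
        for gap in gaps:
            # Gaps without a suggested fix go to a hypothetical manual bucket.
            fix = gap.get("suggestedFix") or "manual-review"
            plan[fix].append((lead["domain"], gap["field"]))
    return dict(plan)
```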
What's qualifierAnalysis.agentContract and how do I use it?
A flat MCP-ready surface for AI agents: { decision: 'qualified-A' / 'qualified-B' / 'review' / 'low-priority' / 'reject', confidence: 0-100, nextAction: <RecommendedAction enum>, costToAct: <USD> }. Agents read this directly without traversing score, grade, recommendedAction, scoreBreakdown separately. Pair with Step 1's sendDecision for a complete agent decision surface in two field reads.
How does the per-watchlist scoreChange + changeFlag work?
Set watchlistName: "tier-1-prospects" on a scheduled run. Step 3 stores per-domain score history under that watchlist key. On the next run, every record gets previousScore + scoreChange (delta) + changeFlag (NEW / IMPROVED / DECLINED / UNCHANGED, with ±5 tolerance). Use changeFlag === 'IMPROVED' to surface accounts that just got hotter; use changeFlag === 'DECLINED' to flag churn risk. Run separate watchlists (tier-1-prospects vs churn-risk-accounts) to maintain independent histories.
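The flag logic can be approximated as follows. This is a sketch, not the qualifier's code: it assumes the ±5 tolerance is inclusive (a change of exactly 5 points counts as UNCHANGED), which the README does not specify, and `change_flag` is a hypothetical name.

```python
from typing import Optional


def change_flag(previous: Optional[int], current: int, tolerance: int = 5) -> str:
    """Classify a score change as NEW / IMPROVED / DECLINED / UNCHANGED."""
    if previous is None:
        return "NEW"  # first run for this domain in the watchlist
    delta = current - previous
    if delta > tolerance:
        return "IMPROVED"
    if delta < -tolerance:
        return "DECLINED"
    # Assumed inclusive band: abs(delta) <= tolerance counts as no change.
    return "UNCHANGED"
```

In practice, read the emitted `changeFlag` field; a local version like this is mainly useful for replaying history with a different tolerance.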
What data is passed between pipeline steps?
The orchestrator feeds data forward intelligently. Step 1 emails become "known samples" for Step 2 pattern detection. Contact names from Step 1 become candidates for Step 2 email generation. All upstream data (emails, phones, contacts, social links, patterns) feeds into Step 3 via a pipelineData parameter so the qualifier doesn't re-extract data already found. All unique emails (discovered + generated) feed into Step 4.
What's enableProFallback and when should I use it?
By default Step 1 uses CheerioCrawler (HTML parsing, fast, cheap). When a target site is JavaScript-heavy (React/Next.js/Vue) or sits behind Cloudflare/DataDome/Akamai, Cheerio can return empty results. Set enableProFallback: true and Step 1 auto-retries those specific domains via Website Contact Scraper Pro (real-browser rendering, $0.35/site). It only triggers when JS is detected AND no contacts were found — you don't pay $0.35 on every domain.
How does monitoring mode work?
Set compareToPrevRun: true and Step 1 stores a per-domain snapshot in a key-value store. The first run establishes the baseline (every domain gets NEW_DOMAIN flag). From the second run on, Step 1 diffs against the prior baseline and emits changeFlags[] (NEW_TEAM_HIRE / TIER_UPGRADED / TEAM_DEPARTURE / etc.) plus a changeSinceLastRun delta block per domain. Pair with Apify Schedules for daily/weekly automation. Re-runs on the same input list use the same baseline automatically; override monitorStateKey if your input list shifts but you want to maintain history.
Can I run just the contact scraper without the other steps?
Yes. Set skipEmailPatternFinder: true and skipLeadQualifier: true (Step 4 is already off by default). This gives you Step 1's full send-decision output (sendDecision, buyingCommittee, firstTouch, etc.) at $0.20 per domain with contact data. Alternatively, use Website Lead Intelligence (the Step 1 sub-actor) directly.
Integrations
The B2B Lead Generation Suite works with the full Apify platform ecosystem:
- Apify API -- Trigger pipeline runs programmatically and retrieve enriched leads as JSON via https://api.apify.com/v2/acts/ryanclinton~b2b-lead-gen-suite/runs. Build automated prospecting workflows that enrich new domains nightly.
- Zapier -- Connect to 5,000+ apps. Trigger a lead enrichment run when a new company is added to your CRM, then push the scored results back into Salesforce, HubSpot, or Pipedrive automatically.
- Make (Integromat) -- Build multi-step workflows that take prospect lists from Google Sheets, run them through the pipeline, filter by score, and route qualified leads into outreach sequences.
- Google Sheets -- Export the enriched dataset directly to Google Sheets. Key fields like domain, emails, score, grade, and contacts map cleanly into spreadsheet columns for team review and collaboration.
- Webhooks -- Configure a webhook URL to receive enriched leads as soon as the pipeline completes, enabling real-time lead routing into custom applications.
- Scheduled Runs -- Set up daily or weekly schedules to process new batches of prospect domains automatically. Combine with the Apify dataset API to stream results into your data warehouse.
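For the Apify API route, a run can be started with a POST to the runs endpoint quoted above. The sketch below uses only the Python standard library and separates building the request from sending it, so the request can be inspected offline. The `urls` input field, the placeholder token, and the helper names are assumptions for illustration.

```python
import json
import urllib.request

RUNS_URL = "https://api.apify.com/v2/acts/ryanclinton~b2b-lead-gen-suite/runs"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST that starts a pipeline run."""
    return urllib.request.Request(
        RUNS_URL,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def start_run(token: str, run_input: dict) -> dict:
    """Send the request and return the parsed JSON response (network call)."""
    with urllib.request.urlopen(build_run_request(token, run_input)) as resp:
        return json.load(resp)


# Offline check: inspect the request without contacting the API.
# "urls" is a hypothetical input field name.
request = build_run_request("YOUR_APIFY_TOKEN", {"urls": ["stripe.com"]})
print(request.get_method(), request.full_url)
```

Calling `start_run` with a real token starts a billable run. The returned run object should include the ID of the run's default dataset, which can then be read through the Apify dataset API once the run finishes.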
Related Actors
Build a complete B2B sales intelligence stack by combining this actor with other tools from ryanclinton on the Apify Store:
| Actor | What It Does | How It Complements This Suite |
|---|---|---|
| Website Lead Intelligence (formerly Website Contact Scraper) | Send-decision engine: extracts emails + verifies + ranks decision-makers + classifies buying committee + generates first-touch line — SEND_NOW / VERIFY_FIRST / SKIP / ENRICH_MORE per domain | Step 1 of this pipeline. Use standalone when you only need send-ready leads without scoring or pattern detection ($0.20/domain with contact data) |
| Email Pattern Finder | Detects email naming convention + emits its own send-decision engine (sendDecision, bounceRiskBucket, catchAllStrategy, recommendedSequence, emailCulture, driftState) — $0.10/domain analyzed | Step 2 of this pipeline. Use standalone when you already have emails or want pattern intelligence without Step 1's full contact crawl |
| B2B Lead Qualifier | Scores 0-100 + A-F grade across 5 weighted categories. Emits recommendedAction enum, cross-run change detection, dataGaps[] routing, agentContract for MCP consumers, scoringProfiles (sales/marketing/recruiting), per-watchlist score history — $0.15/lead qualified | Step 3 of this pipeline. Use standalone to score pre-existing lead lists or to set up watchlist monitoring with score-change alerts |
| Phone Number Finder | Finds mobile + direct dial numbers, decides who to call (P1–P4 SLA tier), predicts call outcome | Run downstream of this suite to enrich the discovered contacts with personal phone numbers + dialler-ready routing decisions ($0.10/found) |
| Lead Scoring Engine | Scores existing leads on ICP fit + intent + economics; outputs decision (qualify / nurture / disqualify) | Run downstream to apply ICP-based scoring (different axis from this suite's website-quality grade) |
| Person Enrichment Lookup | Multi-source person enrichment (PDL + heuristics) — fills name/title/email/phone gaps per individual contact | Run downstream when this suite returns named contacts but missing email/phone |
| Bulk Email Verifier | Outbound Control System — verifies email deliverability AND emits routing decisions (send / send-monitor / hold / replace / suppress) plus SLA tier, automation triggers, and deliverability simulation | Run downstream to verify discovered + pattern-generated emails AND get cadence-tool-ready routing primitives in one call |
| HubSpot Lead Pusher | Pushes leads into HubSpot CRM | Auto-create contacts and companies from enriched pipeline output |
| Company Deep Research Agent | Generates comprehensive company intelligence reports | Deep-dive research on your highest-scoring leads |
| Google Maps Lead Enricher | Enriches Google Maps listings with contact data | Combine local business data with this pipeline for local lead gen |
| Website Tech Stack Detector | Identifies frameworks and tools used by a website | Tailor sales pitches based on prospect technology stack |
| WHOIS Domain Lookup | Domain registration, registrar, and expiration data | Verify domain age and ownership for lead qualification |
| Waterfall Contact Enrichment | Multi-source contact enrichment with fallback cascade | Supplement pipeline output with additional contact discovery |
| Lead Enrichment Pipeline | 6-step enrichment: email, phone, verify, company, score, CRM push | Deeper enrichment on leads this suite produces — the full Clay alternative |
| AI Outreach Personalizer | AI-generated personalized cold emails via BYOK OpenAI/Anthropic | Generate outreach copy for scored leads using their company context |
| Intent Signal Tracker | Tracks hiring, tech, funding, and content signals per company | Prioritize which companies to run through this pipeline by buying intent |
| Lead Data Quality Auditor | Scores email, phone, domain, and completeness quality per lead | Audit pipeline output before outreach to filter bad data |