# B2B Lead Intelligence (`ghewaretech/b2b-lead-intelligence`) Actor

Transform company URLs into actionable sales intelligence. Extract contact info, tech stack, social profiles, and business signals
from any company website.

- **URL**: https://apify.com/ghewaretech/b2b-lead-intelligence.md
- **Developed by:** [Unisuraksha Tracking Systems Pvt Ltd](https://apify.com/ghewaretech) (community)
- **Categories:** Lead generation, SEO tools
- **Stats:** 38 total users, 5 monthly users, 92.3% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $2.50 per 1,000 `url-enriched` events

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## B2B Lead Intelligence

ICP-driven lead intelligence for B2B sales teams. Describe your Ideal Customer Profile, pick the public sources to scan, and the actor will **find** matching companies, **enrich** each with contacts / tech stack / decision-makers / buying-intent signals, and **rank** them by 0-100 fit score — all in one run. No LinkedIn / Apollo / Google scraping; everything is login-free and challenge-compliant.

> **Three things business-development executives actually need: who to contact, why now, and whether the lead is worth your time.** This actor delivers all three for every lead it surfaces.

### How it works

1. **Define your ICP** — industries, size band, tech stack, geo, keywords.
2. **Pick sources** — `yc` (Y Combinator directory), `hn-hiring` (HN "Who is hiring?" thread), `techcrunch` (funding/launch articles), `prnewswire` (corporate press releases), `cb-news` (Crunchbase News).
3. **The actor discovers** matching companies, dedupes across sources, and ranks by relevance.
4. **The actor enriches** each lead — emails, phones, tech stack, social profiles, decision-makers, intent signals.
5. **The actor scores** every lead 0-100 against your ICP and (optionally) drafts 2-3 personalised outreach hooks.
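
The cross-source dedup in step 3 can be pictured roughly as follows. This is a minimal sketch, not the Actor's own code: `normalize_domain` and `dedupe_leads` are hypothetical helpers, and keying on the hostname is an assumption about how duplicates might be detected.

```python
from urllib.parse import urlparse


def normalize_domain(url: str) -> str:
    """Reduce a URL to a lowercase hostname without a leading 'www.'."""
    host = urlparse(url if "://" in url else f"https://{url}").hostname or ""
    return host.lower().removeprefix("www.")


def dedupe_leads(candidates: list[dict]) -> list[dict]:
    """Merge candidates that share a domain and record which sources found them."""
    merged: dict[str, dict] = {}
    for candidate in candidates:
        domain = normalize_domain(candidate["url"])
        entry = merged.setdefault(domain, {"domain": domain, "sources": []})
        if candidate["source"] not in entry["sources"]:
            entry["sources"].append(candidate["source"])
    # Companies seen by more sources come first (assumed tie-break: domain name).
    return sorted(merged.values(), key=lambda e: (-len(e["sources"]), e["domain"]))


candidates = [
    {"url": "https://www.example.com/jobs", "source": "hn-hiring"},
    {"url": "https://example.com", "source": "yc"},
    {"url": "https://other.example.org", "source": "cb-news"},
]
leads = dedupe_leads(candidates)
```

Here `example.com` is reported once with both `hn-hiring` and `yc` as sources, which is the kind of multi-source overlap the relevance ranking rewards.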

Already have a target list? Skip sourcing and feed URLs directly — the same enrichment + scoring layer applies.

#### Two run modes

| Mode | Per-lead price | What you get | Best for |
|---|---|---|---|
| **`feed`** (fast) | **$0.0005** | Sourced lead + ICP relevance + signal-derived intent (funding amount, hiring signal, launch). No per-site crawl. ~1s per lead. | High-volume daily watchlists, CRM ingestion, dashboards |
| **`enriched`** (deep, default) | **$0.0025** | Everything in feed + full crawl: contacts, tech stack, decision-makers, ICP fit score with reasons, optional LLM outreach hooks. ~30-60s per lead. | Sales-ready records, account-level outreach |

Pick `feed` when you want a steady fire-hose of qualified leads to triage; pick `enriched` when you've narrowed to specific targets and need everything required to run outreach.
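
To budget a run, the per-lead prices in the table can be turned into a quick estimate. A minimal sketch: `estimate_cost` is a hypothetical helper, and it covers only the per-lead event price, not the separately billed LLM usage for outreach hooks.

```python
# Per-lead prices taken from the run-mode table above (USD).
PRICE_PER_LEAD = {"feed": 0.0005, "enriched": 0.0025}


def estimate_cost(leads: int, mode: str = "enriched") -> float:
    """Return the estimated event cost in USD for a run producing `leads` leads."""
    if mode not in PRICE_PER_LEAD:
        raise ValueError(f"unknown mode: {mode!r}")
    return round(leads * PRICE_PER_LEAD[mode], 4)


daily_watchlist = estimate_cost(10_000, "feed")      # high-volume triage
account_shortlist = estimate_cost(200, "enriched")   # deep crawl on a short list
```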

### Features

#### Lead sourcing (login-free, no scraping of blocked platforms)
- **`yc`** — Y Combinator companies directory via the public Algolia-backed search. ICP-fit signal via batch + industry filters.
- **`hn-hiring`** — Hacker News "Ask HN: Who is hiring?" monthly thread. Hiring-trigger signal, top-level comments parsed for company + URL + summary.
- **`techcrunch`** — TechCrunch RSS. Funding rounds, launches, M&A, leadership news.
- **`prnewswire`** — PRNewswire RSS. Corporate press releases — funding announcements, leadership appointments, product launches.
- **`cb-news`** — Crunchbase News (the journalism site, *not* the database). Funding & launch triggers parsed from the public RSS feed.

Output rows include a `discovery` block with sources, signal-typed events, `firstSeenAt`, and a 0-100 `relevanceScore` (multi-source overlap + signal-type weighting + ICP keyword/industry hits at sourcing time). Companies surfaced by multiple sources get bumped up.

#### BD intelligence (per lead)
- **Decision-Maker Discovery**: Crawls team / leadership / about pages and returns each person's name, title, role category (founder, executive, sales, marketing, tech, product, finance, people), LinkedIn URL, and profile photo. Uses JSON-LD `Person` schema, personal LinkedIn anchors, and card-block heuristics for breadth.
- **Buying-Intent / Trigger Events**: Surfaces *why now* — recent funding mentions (Series A/B/C, $X raised), hiring surges (count of open roles, departments breakdown), leadership changes ("X joins as CRO"), product launches, and recent press headlines.
- **ICP Fit Score**: 0-100 score with reasons and disqualifiers based on your `idealCustomerProfile` (industries, size band, required/preferred tech, geo, keywords). Intent signals contribute a bonus, so a strong trigger event can lift a borderline-fit account.
- **Outreach Hooks (LLM)**: Optional 2-3 ready-to-send personalisation openers per lead, generated by Apify's built-in LLM (OpenRouter proxy) and grounded in the extracted signals — never invented. Off by default; flip `generateOutreachHooks` to `true` to enable. Billed as Apify platform usage; no third-party API keys needed.
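
Once a run finishes, the `fitScore` block can drive a simple shortlist on the consumer side. A minimal sketch, assuming dataset items shaped like the example in the Output section; `shortlist_leads` and the threshold of 70 are illustrative choices, not part of the Actor.

```python
def shortlist_leads(leads: list[dict], min_score: int = 70) -> list[dict]:
    """Keep leads scoring at least `min_score` with no disqualifiers, best first."""
    kept = []
    for lead in leads:
        fit = lead.get("fitScore") or {}
        if fit.get("score", 0) >= min_score and not fit.get("disqualifiers"):
            kept.append(lead)
    return sorted(kept, key=lambda lead: lead["fitScore"]["score"], reverse=True)


# Made-up sample records that only carry the fields used here.
sample = [
    {"companyName": "A", "fitScore": {"score": 72, "reasons": [], "disqualifiers": []}},
    {"companyName": "B", "fitScore": {"score": 91, "reasons": [], "disqualifiers": ["Outside geo"]}},
    {"companyName": "C", "fitScore": {"score": 84, "reasons": [], "disqualifiers": []}},
    {"companyName": "D"},  # e.g. a row produced without an ICP
]
top = [lead["companyName"] for lead in shortlist_leads(sample)]
```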

#### Core enrichment
- **Contact Extraction**: Emails, phone numbers, contact form URLs
- **Tech Stack Detection**: CMS, analytics, chat widgets, payment, frameworks (50+ technologies)
- **Social Profiles**: LinkedIn, Twitter, Facebook, GitHub, YouTube, Instagram, Crunchbase
- **Business Signals**: Career pages, pricing, blog activity, customer logos, company size estimation
- **Metadata**: Page titles, descriptions, favicons, OG images

### Use Cases

- **ICP-Driven Prospecting**: Describe your ICP once, run the actor on a schedule, get a fresh ranked list every morning.
- **Trigger-Event Outreach**: Surface companies with fresh funding, new VPs, or hiring surges so you reach out *while the buying window is open*.
- **CRM Enrichment**: Pass an existing target list (URLs) and auto-populate company + key-people data in HubSpot, Salesforce, etc.
- **Competitive Analysis**: Understand competitor tech stacks and business signals.
- **Account-Based Marketing**: Identify decision-makers per account and generate per-account personalisation hooks at scale.

### How It Compares

Quick read on where this actor fits in the B2B lead-intelligence landscape.

#### vs Apollo.io
Apollo's strength is a massive seeded contact database; the trade-off is a per-contact subscription ($0.10–$0.30/lead), credit caps that bite when you scale, and a sourcing model that scrapes platforms (LinkedIn, Google) the Apify $1M Challenge explicitly excludes. **B2B Lead Intelligence** uses only login-free public sources (YC, HN hiring, Crunchbase News), charges per-event ($0.0025/lead — 40-120× cheaper), and adds *trigger-event* signals (funding, hiring surges, leadership changes) that Apollo treats as a separate paid module.

#### vs Clearbit (now HubSpot Breeze Intelligence)
Clearbit's API enriches a domain you already have. It does *not* discover new accounts for you. **B2B Lead Intelligence** does both — ICP-driven discovery *and* enrichment — in a single run, and lets you bring your own URL list when you don't need sourcing. At $0.0025/lead vs Clearbit's $0.36/lookup, you can run a full account list weekly for the cost of a single Clearbit query.

#### vs Lusha
Lusha leans heavily on personal contact data (email/phone for individuals at target accounts) — useful, but legally fraught in many jurisdictions and increasingly throttled. **B2B Lead Intelligence** focuses on company-level intelligence + decision-maker discovery from publicly available pages (team / leadership / about), so the data is grounded and reproducible. If Lusha is your contact-finder, this actor is your account-prioritiser.

#### vs Hunter.io
Hunter is purpose-built for finding email addresses by domain. It's narrower in scope. **B2B Lead Intelligence** also extracts emails — and adds tech stack, decision-makers, intent signals, and ICP scoring on top. Use Hunter when you need verified email discovery at depth; use this when you need to *qualify* the account first.

#### When this actor isn't the right fit
- You need verified personal phone numbers — Lusha or ZoomInfo serve that better.
- You need a contact-level credit-based API matching individual records — Apollo / Clearbit are stronger there.
- You need ten-million-row enterprise database licences — this is a per-run actor, not a data warehouse.

### Input

The primary mode is **ICP + sourcing** (let the actor find your leads):

```json
{
  "idealCustomerProfile": {
    "industries": ["SaaS", "B2B"],
    "sizeMin": 11,
    "sizeMax": 200,
    "preferredTech": ["HubSpot", "Stripe"],
    "keywords": ["sales", "automation"]
  },
  "sourcing": {
    "sources": ["yc", "hn-hiring", "cb-news"],
    "maxResults": 20,
    "recencyDays": 30,
    "triggerEventTypes": ["funding", "hiring", "leadership", "launch"]
  },
  "extractKeyPeople": true,
  "detectIntentSignals": true,
  "generateOutreachHooks": true
}
````

Returns up to 20 enriched leads (capped by `maxResults`), ranked by ICP relevance, each with decision-makers, intent signals, fit score, and outreach hooks.

#### Alternative: enrich a specific URL list

When you already have target accounts (e.g. from an existing CRM):

```json
{
  "urls": ["https://stripe.com", "https://hubspot.com"],
  "idealCustomerProfile": {
    "industries": ["SaaS"],
    "sizeMin": 50
  },
  "extractKeyPeople": true,
  "detectIntentSignals": true
}
```

#### Input Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `mode` | string | `enriched` | `feed` for fast/cheap sourcing-only output, `enriched` for full crawl + extraction |
| `idealCustomerProfile` | object | - | Your ICP. Drives sourcing relevance and per-lead 0-100 fit-score. Fields: `industries[]`, `sizeMin`, `sizeMax`, `requiredTech[]`, `preferredTech[]`, `geo[]`, `keywords[]` |
| `sourcing` | object | - | Sourcing config. Fields: `sources` (`'yc'\|'hn-hiring'\|'techcrunch'\|'prnewswire'\|'cb-news'[]`), `maxResults`, `recencyDays`, `triggerEventTypes` (`'funding'\|'hiring'\|'launch'\|'leadership'\|'directory'[]`), `industries[]`, `keywords[]`. The actor discovers leads first, then enriches them. |
| `urls` | array | - | Optional direct URL list. Use when you already have target accounts. Combinable with sourcing. Required *only* when `sourcing.sources` is empty. |
| `extractKeyPeople` | boolean | true | Extract decision-makers from team / leadership pages |
| `detectIntentSignals` | boolean | true | Detect funding, hiring surges, leadership changes, product launches |
| `generateOutreachHooks` | boolean | false | Generate 2-3 LLM-written outreach openers per lead (uses Apify built-in LLM) |
| `extractEmails` | boolean | true | Extract email addresses |
| `extractPhones` | boolean | true | Extract phone numbers |
| `detectTechStack` | boolean | true | Detect technologies used |
| `includeSocialProfiles` | boolean | true | Extract social media links |
| `detectBusinessSignals` | boolean | true | Detect business signals |
| `maxPagesPerDomain` | integer | 10 | Maximum pages to crawl per company (1-50) |
| `proxyConfiguration` | object | - | Proxy settings (recommended for large-scale) |
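
The constraints in this table can be checked client-side before starting a paid run. A minimal sketch: `check_run_input` is a hypothetical helper that mirrors only the rules stated above (allowed modes and sources, `urls` required when no source is selected, `maxPagesPerDomain` between 1 and 50); the Actor's own validation may differ.

```python
VALID_MODES = {"feed", "enriched"}
VALID_SOURCES = {"yc", "hn-hiring", "techcrunch", "prnewswire", "cb-news"}


def check_run_input(run_input: dict) -> list[str]:
    """Return a list of problems found in an Actor input dict (empty if none)."""
    problems = []
    if run_input.get("mode", "enriched") not in VALID_MODES:
        problems.append("mode must be 'feed' or 'enriched'")
    sources = (run_input.get("sourcing") or {}).get("sources") or []
    unknown = sorted(set(sources) - VALID_SOURCES)
    if unknown:
        problems.append(f"unknown sources: {unknown}")
    # `urls` is required only when no sourcing source is selected.
    if not sources and not run_input.get("urls"):
        problems.append("provide urls or at least one sourcing source")
    pages = run_input.get("maxPagesPerDomain", 10)
    if not 1 <= pages <= 50:
        problems.append("maxPagesPerDomain must be between 1 and 50")
    return problems


ok = check_run_input({"urls": ["https://stripe.com"], "maxPagesPerDomain": 20})
bad = check_run_input({"sourcing": {"sources": ["yc", "linkedin"]}, "maxPagesPerDomain": 80})
```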

### Output

```json
{
  "inputUrl": "https://apify.com",
  "companyUrl": "https://apify.com",
  "companyName": "Apify",
  "description": "Apify is a platform for web scraping and automation.",
  "contact": {
    "emails": ["hello@apify.com", "support@apify.com"],
    "phones": ["+1-555-123-4567"],
    "contactFormUrl": "https://apify.com/contact"
  },
  "socialProfiles": {
    "linkedin": "https://linkedin.com/company/apify",
    "twitter": "https://twitter.com/apaborsky",
    "facebook": null,
    "youtube": "https://youtube.com/c/apify",
    "github": "https://github.com/apify",
    "instagram": null,
    "crunchbase": "https://crunchbase.com/organization/apify"
  },
  "techStack": {
    "cms": null,
    "analytics": ["Google Analytics", "Mixpanel"],
    "chat": "Intercom",
    "payment": ["Stripe"],
    "hosting": "AWS",
    "frameworks": ["React", "Next.js"]
  },
  "businessSignals": {
    "hasCareerPage": true,
    "hasBlog": true,
    "hasPricingPage": true,
    "hasContactPage": true,
    "hasAboutPage": true,
    "hasCustomerLogos": true,
    "estimatedSize": "51-200"
  },
  "keyPeople": [
    {
      "name": "Jan Curn",
      "title": "Founder & CEO",
      "category": "founder",
      "linkedinUrl": "https://linkedin.com/in/jancurn",
      "profileImageUrl": "https://apify.com/img/team/jan.jpg",
      "sourceUrl": "https://apify.com/about"
    }
  ],
  "intentSignals": {
    "recentFundingMention": {
      "text": "Series A: raised $25M led by Insight Partners",
      "amount": "$25M",
      "round": "series a",
      "sourceUrl": "https://apify.com/blog/series-a"
    },
    "hiringSurge": { "openRoles": 18, "departments": ["engineering", "sales", "marketing"] },
    "leadershipChange": null,
    "productLaunch": { "title": "Introducing the Apify Agentic Browser", "sourceUrl": "https://apify.com/blog/agentic-browser" },
    "recentPressItems": ["Apify named to Forbes Cloud 100", "Apify launches MCP server"]
  },
  "fitScore": {
    "score": 84,
    "reasons": ["Estimated size 51-200 fits target range", "Funding signal: Series A: raised $25M", "Hiring surge: 18 open roles in engineering, sales, marketing"],
    "disqualifiers": []
  },
  "outreachHooks": [
    "Saw the Series A close — congrats; curious how the new $25M is shaping the GTM hire plan.",
    "Noticed you're spinning up an SDR team — happy to share what's working with our other Apify-stack users."
  ],
  "metadata": {
    "title": "Apify - Web Scraping and Automation Platform",
    "metaDescription": "Build and run web scrapers, data pipelines...",
    "ogImage": "https://apify.com/og-image.png",
    "favicon": "https://apify.com/favicon.ico",
    "language": "en"
  },
  "crawlStats": {
    "pagesCrawled": 10,
    "crawlDurationMs": 15000,
    "timestamp": "2025-01-15T10:30:00Z"
  }
}
```
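
For CRM import, the nested record above usually needs flattening. A minimal sketch using only the standard library: `to_crm_row`, `to_csv`, and the column names are hypothetical, and the field paths assume the output shape shown above.

```python
import csv
import io

CRM_COLUMNS = ["company", "url", "primary_email", "top_contact", "top_contact_title", "fit_score"]


def to_crm_row(lead: dict) -> dict:
    """Flatten one enriched lead into a single-level row, tolerating missing blocks."""
    contact = lead.get("contact") or {}
    people = lead.get("keyPeople") or []
    fit = lead.get("fitScore") or {}
    first = people[0] if people else {}
    return {
        "company": lead.get("companyName", ""),
        "url": lead.get("companyUrl", ""),
        "primary_email": (contact.get("emails") or [""])[0],
        "top_contact": first.get("name", ""),
        "top_contact_title": first.get("title", ""),
        "fit_score": fit.get("score", ""),
    }


def to_csv(leads: list[dict]) -> str:
    """Render leads as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CRM_COLUMNS)
    writer.writeheader()
    writer.writerows(to_crm_row(lead) for lead in leads)
    return buffer.getvalue()


# Trimmed copy of the example record above.
sample = {
    "companyUrl": "https://apify.com",
    "companyName": "Apify",
    "contact": {"emails": ["hello@apify.com", "support@apify.com"]},
    "keyPeople": [{"name": "Jan Curn", "title": "Founder & CEO"}],
    "fitScore": {"score": 84},
}
csv_text = to_csv([sample])
```

Rows produced in `feed` mode may lack `contact` or `keyPeople`; the helper leaves those columns empty rather than failing.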

### Technologies Detected

#### CMS / Platforms

WordPress, Shopify, Webflow, Wix, Squarespace, Drupal, Joomla, Ghost, HubSpot, Contentful

#### Analytics

Google Analytics, Google Tag Manager, Mixpanel, Amplitude, Segment, Hotjar, FullStory, Heap, Plausible, Fathom

#### Chat / Support

Intercom, Drift, Zendesk, Freshdesk, Crisp, Tawk.to, HubSpot Chat, LiveChat

#### Payment

Stripe, PayPal, Braintree, Square, Shopify Payments

#### Frameworks

React, Vue, Angular, Next.js, Nuxt.js, Svelte, jQuery, Bootstrap, Tailwind

#### Hosting / CDN

AWS, Cloudflare, Vercel, Netlify, Heroku, Google Cloud, Azure, DigitalOcean, Fastly

### API Usage

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('ghewaretech/b2b-lead-intelligence').call({
  idealCustomerProfile: {
    industries: ['SaaS', 'B2B'],
    sizeMin: 11,
    sizeMax: 200,
    keywords: ['sales', 'automation'],
  },
  sourcing: {
    sources: ['yc', 'hn-hiring', 'cb-news'],
    maxResults: 20,
    recencyDays: 30,
  },
  generateOutreachHooks: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```

### Performance

- **Speed**: ~10-30 seconds per lead (depending on site size and `maxPagesPerDomain`)
- **Accuracy**: 90%+ on core fields
- **Recommended**: Use proxy configuration for large-scale runs

### Pricing

**Simple, transparent pricing: $2.50 per 1,000 leads in the default `enriched` mode ($0.0025 per lead, whether sourced or directly input)**

#### How It Compares

| Service | Price per lead | This Actor is |
|---------|----------------|---------------|
| Clearbit | $0.36 | **144x cheaper** |
| Lusha | $0.31 | **124x cheaper** |
| Apollo.io | $0.10-0.30 | **40-120x cheaper** |
| Hunter.io | $0.05 | **20x cheaper** |
| **This Actor** | **$0.0025** | (baseline) |
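
The multipliers in the table follow directly from dividing each listed price by $0.0025. A short check, taking the competitor prices exactly as quoted in this README (they are not independently verified here):

```python
ACTOR_PRICE = 0.0025  # USD per lead in `enriched` mode

# Per-lead prices as quoted in the comparison table.
quoted_prices = {
    "Clearbit": 0.36,
    "Lusha": 0.31,
    "Apollo.io (low)": 0.10,
    "Apollo.io (high)": 0.30,
    "Hunter.io": 0.05,
}

ratios = {name: round(price / ACTOR_PRICE) for name, price in quoted_prices.items()}
```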

#### Cost Examples

| Leads | Total Cost |
|-------|------------|
| 100 | $0.25 |
| 500 | $1.25 |
| 1,000 | $2.50 |
| 10,000 | $25.00 |
| 100,000 | $250.00 |

#### What You Get for $0.0025 per Lead

- ICP-driven sourcing from public directories (no LinkedIn / Apollo / Google scraping)
- Contact info (emails, phones, contact forms)
- Tech stack detection (50+ technologies)
- Social profiles (LinkedIn, Twitter, Facebook, YouTube, GitHub, Instagram, Crunchbase)
- Business signals (career page, blog, pricing, company size)
- Decision-makers with role categorisation + LinkedIn URLs
- Buying-intent signals (funding, hiring surge, leadership change, launches)
- 0-100 ICP fit score with reasons + disqualifiers
- Optional LLM-generated outreach hooks
- Up to 10 pages crawled per domain (configurable)

#### Free to Try

Test with a small `maxResults` to see the quality before running at scale. Pay-per-event billing means you only pay for what you use.

### Limitations

- Does not extract data that requires a login
- Some websites may block crawlers (use proxy configuration)
- Email and phone extraction depends on what is publicly visible on the website
- Tech stack detection is based on client-side patterns only

### Development

For contributors and forks:

```bash
npm install
npx playwright install chromium     # first run only
npm run start:dev                   # run locally with tsx
npm run build                       # tsc → dist/
npm test                            # offline regression pack
npm run test:integration            # live HTTP smokes (HN / CB News / YC)
npm run typecheck:test              # tsc on test sources
```

The regression pack (`test/`) covers the parser logic for every external data shape (HN comments, Crunchbase RSS, YC Algolia), the source orchestrator's dedup / scoring / failure-isolation contract, the ICP fit-score, the BD intel extractors, and the Apify input/dataset schemas. Run before every `apify push`.

### Changelog

#### 2026-04-30 — Regression pack & key-people fix

- Vitest regression suite (~74 tests, 8 files) covering parsers, source orchestrator, ICP fit-score, key-people / intent-signal extractors, and Apify input/dataset schemas. Live-network smokes for HN / Crunchbase News / YC are gated behind `RUN_INTEGRATION=1`.
- Bug fix: `extractKeyPeople` no longer mis-parses JSON-LD `jobTitle` as a person's name (`<script>` / `<style>` blocks are stripped before the LinkedIn-anchor and card-block heuristics run).
- Refactor: input-gate validation extracted to `validate-input.ts` and unit-tested.
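
The idea behind the key-people fix, removing `<script>` and `<style>` blocks before text heuristics run, can be illustrated as follows. This is a simplified sketch and not the Actor's code: `strip_script_and_style` is a hypothetical name, and a regex is used for brevity where a real implementation would more likely rely on an HTML parser.

```python
import re

# Matches a whole <script>...</script> or <style>...</style> element, case-insensitively.
_SCRIPT_OR_STYLE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)


def strip_script_and_style(html: str) -> str:
    """Drop script and style elements so their contents are not read as page text."""
    return _SCRIPT_OR_STYLE.sub("", html)


page = (
    '<div class="card"><h3>Jane Doe</h3></div>'
    '<script type="application/ld+json">{"jobTitle": "Chief Revenue Officer"}</script>'
    "<style>.card { color: red; }</style>"
)
visible = strip_script_and_style(page)
```

After stripping, the card markup with the (placeholder) name survives, while the JSON-LD `jobTitle` string can no longer be mistaken for a person's name.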

#### 2026-04-30 — Lead sourcing & BD intelligence

- **Lead sourcing** added: `yc`, `hn-hiring`, `cb-news` modules plus orchestrator with cross-source dedup, signal-typed relevance scoring, and ICP-driven ranking.
- **BD intelligence** added: decision-maker extraction (JSON-LD + LinkedIn-anchor + card-block heuristics), buying-intent signals (funding / hiring surge / leadership change / product launch / press), 0-100 ICP fit score with reasons + disqualifiers, optional LLM-generated outreach hooks via Apify's built-in OpenRouter Standby actor.
- Single-actor design: `sourcing` block triggers discovery, then enrichment runs against the discovered URLs in the same run; sourced rows carry a `discovery` block with sources/signals/relevance.
- Bug fix: HN Algolia HTML-encodes URL slashes as `&#x2F;`; URL extraction now decodes entities before regex matching.
- Bug fix: `ProxyConfiguration` now constructed via `Actor.createProxyConfiguration()` (Crawlee's class rejected the Apify input shape).
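
The HN entity-decoding fix above can be reproduced in a few lines. A minimal sketch, assuming a simple URL regex; `extract_urls` and `URL_PATTERN` are hypothetical names, and the sample comment is made up.

```python
import html
import re

# Deliberately simple URL matcher; stops at whitespace, quotes, angle brackets, or ')'.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


def extract_urls(comment_html: str) -> list[str]:
    """Decode HTML entities first, then match URLs in the decoded text."""
    decoded = html.unescape(comment_html)
    return URL_PATTERN.findall(decoded)


raw = "Acme (https:&#x2F;&#x2F;acme.example&#x2F;careers) is hiring"
without_decoding = URL_PATTERN.findall(raw)  # the regex never sees '://'
with_decoding = extract_urls(raw)
```

Matching against the raw text finds nothing because the slashes arrive as `&#x2F;`; decoding first yields the expected URL.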

#### Initial release

- Contact extraction (emails, phones, forms)
- Tech stack detection (50+ technologies)
- Social profile extraction
- Business signals detection
- Company size estimation

### Support

- **GitHub**: [brainupgrade-in](https://github.com/brainupgrade-in)
- **LinkedIn**: [company/brains-upgrade](https://linkedin.com/company/brains-upgrade)
- **YouTube**: [@GhewareDevOpsAI](https://youtube.com/@GhewareDevOpsAI)

***

Built by [Gheware](https://gheware.com) for the [Apify $1M Challenge](https://apify.com/million-dollar-challenge)

### License

ISC

# Actor input Schema

## `mode` (type: `string`):

How much work to do per lead. 'feed' = fast trigger feed: source the leads, attach signals + ICP relevance, no per-site crawl. 5x cheaper, ~30s for 20 leads, thinner data. 'enriched' = full crawl: contacts, tech stack, decision-makers, intent signals — everything.

## `idealCustomerProfile` (type: `object`):

Describe your target customer. Drives both sourcing relevance (which discovered companies surface to the top) and the per-lead 0-100 fit score with reasons / disqualifiers. Leave empty only if you are running URL-only enrichment without scoring.

## `sourcing` (type: `object`):

Pick the public sources to scan. The actor discovers companies matching your ICP, dedupes across sources, ranks by relevance (signal-type weighting + multi-source overlap + ICP keyword/industry hits), then enriches the top N in the same run.

## `urls` (type: `array`):

Skip the sourcing block and feed URLs directly when you already have a target list (e.g. existing CRM accounts you want enriched + scored). Combinable with sourcing — both sets get enriched in the same run.

## `extractKeyPeople` (type: `boolean`):

Crawl team / leadership / about pages and extract decision-makers (name, title, role category, LinkedIn URL). Categories: founder, executive, sales, marketing, tech, product, finance, people.

## `detectIntentSignals` (type: `boolean`):

Surface trigger events worth reaching out about: recent funding mentions, hiring surges (open roles by department), leadership changes, product launches, and recent press headlines.

## `generateOutreachHooks` (type: `boolean`):

Use Apify's built-in LLM (OpenRouter proxy) to generate 2-3 ready-to-send personalisation hooks per lead, grounded in the extracted signals. Adds an LLM cost (billed as Apify platform usage). Off by default.

## `extractEmails` (type: `boolean`):

Find and extract business email addresses (info@, sales@, contact@, etc.). Filters out automated addresses such as noreply@.
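
The extract-then-filter behaviour described here might look like the sketch below. It is illustrative only: the regex, the `AUTOMATED_PREFIXES` list, and `extract_business_emails` are assumptions, not the Actor's actual rules.

```python
import re

EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumed list of local-part prefixes that mark non-monitored mailboxes.
AUTOMATED_PREFIXES = ("noreply", "no-reply", "donotreply")


def extract_business_emails(text: str) -> list[str]:
    """Return unique, lowercased addresses, skipping automated senders."""
    found = {match.lower() for match in EMAIL_PATTERN.findall(text)}
    return sorted(
        email for email in found
        if not email.split("@")[0].startswith(AUTOMATED_PREFIXES)
    )


text = "Contact sales@example.com or Info@example.com. Sent from noreply@example.com."
emails = extract_business_emails(text)
```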

## `extractPhones` (type: `boolean`):

Find and extract phone numbers in various formats (US, international, formatted).

## `detectTechStack` (type: `boolean`):

Identify 50+ technologies including CMS (WordPress, Shopify), analytics (GA, Mixpanel), chat widgets (Intercom, Drift), payment (Stripe), and frameworks (React, Vue).

## `includeSocialProfiles` (type: `boolean`):

Extract links to social media profiles: LinkedIn, Twitter/X, Facebook, YouTube, GitHub, Instagram, Crunchbase.

## `detectBusinessSignals` (type: `boolean`):

Identify business signals: careers page (hiring), pricing page, blog activity, customer logos, and estimate company size (1-10, 11-50, 51-200, 200+).

## `maxPagesPerDomain` (type: `integer`):

Maximum pages to crawl per company. More pages = better decision-maker and intent-signal coverage but longer runtime and higher cost. Recommended: 10 for basic enrichment, 20+ for full BD analysis.

## `proxyConfiguration` (type: `object`):

Use Apify Proxy to avoid being blocked by websites. Highly recommended for large-scale enrichment or sites with bot protection.

## Actor input object example

```json
{
  "mode": "enriched",
  "idealCustomerProfile": {
    "industries": [
      "SaaS",
      "B2B"
    ],
    "sizeMin": 11,
    "sizeMax": 200,
    "preferredTech": [
      "HubSpot",
      "Stripe"
    ],
    "keywords": [
      "sales",
      "marketing",
      "automation"
    ]
  },
  "sourcing": {
    "sources": [
      "yc",
      "hn-hiring",
      "techcrunch",
      "prnewswire",
      "cb-news"
    ],
    "maxResults": 20,
    "recencyDays": 30,
    "triggerEventTypes": [
      "funding",
      "hiring",
      "leadership",
      "launch"
    ]
  },
  "extractKeyPeople": true,
  "detectIntentSignals": true,
  "generateOutreachHooks": false,
  "extractEmails": true,
  "extractPhones": true,
  "detectTechStack": true,
  "includeSocialProfiles": true,
  "detectBusinessSignals": true,
  "maxPagesPerDomain": 10,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# Actor output Schema

## `enrichedLeads` (type: `string`):

Complete company intelligence data including emails, phones, tech stack (50+ technologies), social profiles, and business signals.

## `csvExport` (type: `string`):

Download results as CSV for spreadsheet import.

## `excelExport` (type: `string`):

Download results as Excel file.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "idealCustomerProfile": {
        "industries": [
            "SaaS",
            "B2B"
        ],
        "sizeMin": 11,
        "sizeMax": 200,
        "preferredTech": [
            "HubSpot",
            "Stripe"
        ],
        "keywords": [
            "sales",
            "marketing",
            "automation"
        ]
    },
    "sourcing": {
        "sources": [
            "yc",
            "hn-hiring",
            "techcrunch",
            "prnewswire",
            "cb-news"
        ],
        "maxResults": 20,
        "recencyDays": 30,
        "triggerEventTypes": [
            "funding",
            "hiring",
            "leadership",
            "launch"
        ]
    },
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("ghewaretech/b2b-lead-intelligence").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "idealCustomerProfile": {
        "industries": [
            "SaaS",
            "B2B",
        ],
        "sizeMin": 11,
        "sizeMax": 200,
        "preferredTech": [
            "HubSpot",
            "Stripe",
        ],
        "keywords": [
            "sales",
            "marketing",
            "automation",
        ],
    },
    "sourcing": {
        "sources": [
            "yc",
            "hn-hiring",
            "techcrunch",
            "prnewswire",
            "cb-news",
        ],
        "maxResults": 20,
        "recencyDays": 30,
        "triggerEventTypes": [
            "funding",
            "hiring",
            "leadership",
            "launch",
        ],
    },
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("ghewaretech/b2b-lead-intelligence").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "idealCustomerProfile": {
    "industries": [
      "SaaS",
      "B2B"
    ],
    "sizeMin": 11,
    "sizeMax": 200,
    "preferredTech": [
      "HubSpot",
      "Stripe"
    ],
    "keywords": [
      "sales",
      "marketing",
      "automation"
    ]
  },
  "sourcing": {
    "sources": [
      "yc",
      "hn-hiring",
      "techcrunch",
      "prnewswire",
      "cb-news"
    ],
    "maxResults": 20,
    "recencyDays": 30,
    "triggerEventTypes": [
      "funding",
      "hiring",
      "leadership",
      "launch"
    ]
  },
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call ghewaretech/b2b-lead-intelligence --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=ghewaretech/b2b-lead-intelligence",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "B2B Lead Intelligence",
        "description": "Transform company URLs into actionable sales intelligence. Extract contact info, tech stack, social profiles, and business signals\n  from any company website.",
        "version": "1.0",
        "x-build-id": "WrBTd0TtNord5svEt"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/ghewaretech~b2b-lead-intelligence/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-ghewaretech-b2b-lead-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/ghewaretech~b2b-lead-intelligence/runs": {
            "post": {
                "operationId": "runs-sync-ghewaretech-b2b-lead-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/ghewaretech~b2b-lead-intelligence/run-sync": {
            "post": {
                "operationId": "run-sync-ghewaretech-b2b-lead-intelligence",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "mode": {
                        "title": "Run Mode",
                        "enum": [
                            "feed",
                            "enriched"
                        ],
                        "type": "string",
                        "description": "How much work to do per lead. 'feed' = fast trigger feed: source the leads, attach signals + ICP relevance, no per-site crawl. 5x cheaper, ~30s for 20 leads, thinner data. 'enriched' = full crawl: contacts, tech stack, decision-makers, intent signals — everything.",
                        "default": "enriched"
                    },
                    "idealCustomerProfile": {
                        "title": "Ideal Customer Profile (ICP)",
                        "type": "object",
                        "description": "Describe your target customer. Drives both sourcing relevance (which discovered companies surface to the top) and the per-lead 0-100 fit score with reasons / disqualifiers. Leave empty only if you are running URL-only enrichment without scoring."
                    },
                    "sourcing": {
                        "title": "Lead Sourcing",
                        "type": "object",
                        "description": "Pick the public sources to scan. The Actor discovers companies matching your ICP, deduplicates across sources, ranks by relevance (signal-type weighting + multi-source overlap + ICP keyword/industry hits), then enriches the top N in the same run."
                    },
                    "urls": {
                        "title": "Direct Company URLs (optional)",
                        "type": "array",
                        "description": "Skip the sourcing block and feed URLs directly when you already have a target list (e.g. existing CRM accounts you want enriched + scored). Combinable with sourcing — both sets get enriched in the same run.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "extractKeyPeople": {
                        "title": "Extract Decision Makers",
                        "type": "boolean",
                        "description": "Crawl team / leadership / about pages and extract decision-makers (name, title, role category, LinkedIn URL). Categories: founder, executive, sales, marketing, tech, product, finance, people.",
                        "default": true
                    },
                    "detectIntentSignals": {
                        "title": "Detect Buying-Intent Signals",
                        "type": "boolean",
                        "description": "Surface trigger events worth reaching out about: recent funding mentions, hiring surges (open roles by department), leadership changes, product launches, and recent press headlines.",
                        "default": true
                    },
                    "generateOutreachHooks": {
                        "title": "Generate Outreach Hooks (LLM)",
                        "type": "boolean",
                        "description": "Use Apify's built-in LLM (OpenRouter proxy) to generate 2-3 ready-to-send personalisation hooks per lead, grounded in the extracted signals. Adds an LLM cost (billed as Apify platform usage). Off by default.",
                        "default": false
                    },
                    "extractEmails": {
                        "title": "Extract Emails",
                        "type": "boolean",
                        "description": "Find and extract business email addresses (info@, sales@, contact@, etc.). Filters out automated addresses such as noreply@.",
                        "default": true
                    },
                    "extractPhones": {
                        "title": "Extract Phone Numbers",
                        "type": "boolean",
                        "description": "Find and extract phone numbers in various formats (US, international, formatted).",
                        "default": true
                    },
                    "detectTechStack": {
                        "title": "Detect Tech Stack",
                        "type": "boolean",
                        "description": "Identify 50+ technologies including CMS (WordPress, Shopify), analytics (GA, Mixpanel), chat widgets (Intercom, Drift), payment (Stripe), and frameworks (React, Vue).",
                        "default": true
                    },
                    "includeSocialProfiles": {
                        "title": "Include Social Profiles",
                        "type": "boolean",
                        "description": "Extract links to social media profiles: LinkedIn, Twitter/X, Facebook, YouTube, GitHub, Instagram, Crunchbase.",
                        "default": true
                    },
                    "detectBusinessSignals": {
                        "title": "Detect Business Signals",
                        "type": "boolean",
                        "description": "Identify business signals: careers page (hiring), pricing page, blog activity, customer logos, and estimated company size (1-10, 11-50, 51-200, 200+).",
                        "default": true
                    },
                    "maxPagesPerDomain": {
                        "title": "Max Pages Per Domain",
                        "minimum": 1,
                        "maximum": 50,
                        "type": "integer",
                        "description": "Maximum pages to crawl per company. More pages = better decision-maker and intent-signal coverage but longer runtime and higher cost. Recommended: 10 for basic enrichment, 20+ for full BD analysis.",
                        "default": 10
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Use Apify Proxy to avoid being blocked by websites. Highly recommended for large-scale enrichment or sites with bot protection."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
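
As a rough sketch of how the `run-sync-get-dataset-items` path in the specification could be called without the client library, the snippet below builds the POST request using only the Python standard library. It checks the two constrained input fields (`mode` and `maxPagesPerDomain`) against the limits in `inputSchema` before serialising. The helper names `validate_input` and `build_request` and the `https://example.com` URL are illustrative assumptions, and the request is constructed but not sent.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path taken from the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/ghewaretech~b2b-lead-intelligence/run-sync-get-dataset-items"
ALLOWED_MODES = {"feed", "enriched"}


def validate_input(actor_input: dict) -> dict:
    """Check the constrained fields from inputSchema before sending."""
    mode = actor_input.get("mode", "enriched")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    pages = actor_input.get("maxPagesPerDomain", 10)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(pages, bool) or not isinstance(pages, int) or not 1 <= pages <= 50:
        raise ValueError("maxPagesPerDomain must be an integer between 1 and 50")
    return actor_input


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the spec."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(validate_input(actor_input)).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder token and URL; replace both before sending.
request = build_request(
    "<YOUR_API_TOKEN>",
    {"urls": ["https://example.com"], "maxPagesPerDomain": 5},
)
```

Passing `request` to `urllib.request.urlopen` would start the run and block until it finishes; per the endpoint summary, the response carries the Actor's dataset items. For long runs, the asynchronous `/runs` path in the same specification may be the better fit.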
