# Scout — Lead Enrichment + OSINT (`logical_vivacity/scout`) Actor

Email finder + lead enrichment + OSINT from public sources. Pass any fragment (name, email, or domain) and get a verified dossier: 700+ identity sites, SMTP-validated emails, document mining, sanctions screening, domain→team discovery. $0.05 per person, $0.15 per domain. No API keys.

- **URL**: https://apify.com/logical_vivacity/scout.md
- **Developed by:** [Logical Vivacity](https://apify.com/logical_vivacity) (community)
- **Categories:** Lead generation, AI, News
- **Stats:** 5 total users, 4 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

From $50.00 per 1,000 `person_enriched` events

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

Integrations should adapt to the developer's stack and be safe, well-documented, and production-ready.
The recommended ways to integrate Actors are as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Scout - OSINT + Lead Enrichment from Public Sources

> **Send Scout out with any fragment - a name, an email, a domain, a handle - and it brings back a verified, audit-ready dossier: cross-platform identity, work emails, employer firmographics, document evidence, and the org's actual team. No customer API keys. No per-seat pricing. Pay only per result.**

Scout is a single Apify Actor that spans the gap between an OSINT tool and a lead-enrichment platform. From a single field of input it returns a structured JSON dossier: verified identity, social presence across 700+ platforms, work-email waterfall, public document evidence, sanctions screening, full company firmographics - and when you scout a domain, the org's actual people, each spawned as a related entity and enriched in the same run.

**Pricing:** $0.05 per person, $0.15 per domain/org, $0.02 per spawned team member. Compute on top (Apify default). No subscriptions, no API keys, no surprises.

---

### What's in the box

- 🔍 **Org → People discovery.** Pass `domain: "acme.com"` and Scout walks the team / about / leadership pages, NER-extracts each person, pairs them with their title, and **spawns a Person entity per teammate** - then runs the full person-enrichment pipeline on each (email finder, GitHub, LinkedIn, Gravatar, …). One domain input → fully populated team graph.
- 📧 **Email finder waterfall.** Given `(name, domain)` Scout generates plausible local-parts (`first.last`, `flast`, `firstl`, …), runs SMTP RCPT against each, returns the first verified hit with full provenance.
- 📄 **Document mining.** Filename signals (date / kind / subject), embedded PDF/DOCX metadata (Author / Producer / Created), OCR fallback for image-only PDFs, NER-based co-occurrence with proximity scoring (closer name = stronger relation), section-aware resume parsing → structured `WorkExperience` and `EducationEntry` entries.
- 🔗 **High-signal individual systems.** Booking links (Calendly, cal.com, SavvyCal, HubSpot Meetings, …). Newsletter detection (Substack, Beehiiv, Buttondown, Kit, Ghost). Certificate Transparency for subdomain history.
- 🥷 **Stealth fetch.** httpx-first with smart browser fallback for anti-bot hosts, `playwright-stealth` patches, randomised viewport, Google Referer header, and a curated list of "go straight to browser" hosts (LinkedIn, Twitter, Glassdoor, Crunchbase, …).
- 🧹 **Field normalization** at the model level. Phones canonicalise to E.164, dates to ISO, URLs lose `www.` / fragments / trailing slashes, handles drop leading `@`, emails lowercase, LinkedIn URLs canonicalise to `https://<host>/in/<slug>`.
- ⚡ **Single-entity quick mode.** No more "wrap in `entities: [...]`" - just paste `full_name` or `email` or `domain` directly into the input.
- 📜 **Per-component provenance.** Every component carries `_added_by`, `_added_at`, `_confidence`, `_sources`, `_evidence` - your compliance team can audit every datum.
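
The field-normalization rules listed above are easy to mirror on your side, for example to dedupe CRM records before sending them to Scout. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and this is not Scout's actual implementation. Phone → E.164 conversion and the LinkedIn-specific rule are left out because they need extra logic.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_handle(handle: str) -> str:
    """Trim whitespace and drop any leading '@'."""
    return handle.strip().lstrip("@")


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase the address."""
    return email.strip().lower()


def normalize_url(url: str) -> str:
    """Drop a leading 'www.', the fragment, and trailing slashes."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path.rstrip("/")
    # The last tuple element is the fragment; an empty string removes it.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))
```

Comparing normalized values rather than raw strings makes it easier to tell whether two input rows refer to the same entity.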

---

### Two ways to brief Scout

#### 1) Quick mode (typed inputs, single entity)

Just fill the fields you have. No JSON wrapping required.

```json
{
  "full_name": "Jane Doe",
  "email": "jane@acme.com"
}
````

```json
{
  "domain": "acme.com"
}
```

```json
{
  "linkedin_url": "https://www.linkedin.com/in/jane-doe"
}
```

#### 2) Bulk mode (entities array)

For multi-lead runs.

```json
{
  "entities": [
    { "kind": "person", "full_name": "Jane Doe", "email": "jane@acme.com" },
    { "kind": "domain", "domain": "example.com" },
    { "kind": "organization", "name": "Acme Inc", "domain": "acme.com" }
  ]
}
```

Scout infers `kind` from whatever fields you pass - set it explicitly only when you want to override the heuristic.

***

### Input field reference

All fields are optional individually; pass at least one.

#### Person fields

| Field | Type | Notes |
|---|---|---|
| `full_name` | string | Most useful single-anchor input |
| `first_name` / `last_name` | string | Use if you only have parts |
| `email` | string | Auto-validated (syntax, MX, SMTP, breach) |
| `phone` | string | Any format → E.164 |
| `company_name` | string | Combined with `full_name` enables email finder |
| `title` | string | Job title / role |
| `location` | string | City / country - for disambiguation |
| `linkedin_url` | string | Auto-canonicalised |
| `github_username` | string | Stripped of `@`, lowercased |
| `twitter_handle` | string | Stripped of `@`, lowercased |
| `bluesky_handle` / `mastodon_handle` | string | |
| `notes` | string | Free-form context attached to the entity |

#### Organization / domain fields

| Field | Type | Notes |
|---|---|---|
| `domain` | string | Triggers the full org pipeline + team discovery |
| `name` / `company_name` | string | Org / company name |

#### Run controls

| Field | Type | Default | Notes |
|---|---|---|---|
| `kind` | enum | auto | `person` / `organization` / `domain` / `email` / `phone` / blank |
| `processors` | array | all | Subset of enrichment systems (advanced) |
| `proxyConfiguration` | proxy | Apify proxy on | Strongly recommend keeping enabled |
| `headless` | boolean | true | Browser headless |
| `perLeadTimeoutSeconds` | int | 180 | Total runtime ceiling per top-level entity |
| `entities` | array | empty | Bulk mode; leave empty when using the typed single-entity fields |
| `verbose` | boolean | false | Adds per-source provenance + run telemetry to the output and detailed pipeline logs |
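
Because every field is optional but at least one is required, it can help to check the input before starting a run. Below is a hedged sketch of such a client-side helper; `build_quick_input` and `ANCHOR_FIELDS` are hypothetical names, and the choice of which fields count as sufficient anchors is an assumption, not something Scout documents.

```python
# Assumption: these fields are treated as identifying anchors.
ANCHOR_FIELDS = (
    "full_name", "first_name", "last_name", "email", "phone", "company_name",
    "linkedin_url", "github_username", "twitter_handle", "bluesky_handle",
    "mastodon_handle", "domain", "name",
)


def build_quick_input(timeout_s: int = 180, **fields) -> dict:
    """Build an Actor input dict, dropping empty values."""
    payload = {k: v for k, v in fields.items() if v not in (None, "", [])}
    has_anchor = any(k in payload for k in ANCHOR_FIELDS)
    if not has_anchor and not payload.get("entities"):
        raise ValueError("pass at least one identifying field or a non-empty 'entities' list")
    # Keep the proxy on unless the caller set it explicitly.
    payload.setdefault("proxyConfiguration", {"useApifyProxy": True})
    payload["perLeadTimeoutSeconds"] = timeout_s
    return payload
```

The returned dict can be passed as `run_input` to the Python client shown in the [API](#api) section.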

***

### What Scout brings back per run

#### Identity & verification

- Verified `FullName`, `Email`, `Phone` (E.164), `Location`, primary social handles
- Cross-source `Confirmed` components per field with which sources agreed
- `Conflict` components surfacing values Scout *didn't* commit to
- `SelfGrade` - per-field confidence + an overall `identity_locked` flag + rationale
- `EntityTypeClassification` - developer / executive / academic / creator / marketer / unknown
- `QualityGate` with input-ambiguity score and suggested extra inputs

#### Email

- RFC syntax + deliverability + MX + SMTP RCPT
- Disposable + public-mailbox detection
- Domain-level breach exposure (HaveIBeenPwned)
- `EmailPattern` detection (`first.last@`, `flast@`, …)
- **Email finder** - generates `name@domain` candidates and SMTP-verifies them when you have a name + an org domain anchor
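
To make the candidate-generation step of the email finder concrete, here is a small sketch that builds the three patterns named in this README (`first.last`, `flast`, `firstl`) from a name and a domain. It is an illustration under assumptions: `candidate_emails` is a hypothetical helper, the accent folding and single-token fallback are choices made here, and the SMTP verification step is not performed.

```python
import re
import unicodedata


def _ascii_token(text: str) -> str:
    """Fold accents, lowercase, and keep letters only."""
    folded = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z]", "", folded.lower())


def candidate_emails(full_name: str, domain: str) -> list[str]:
    """Return plausible work-email candidates, most specific pattern first."""
    tokens = [t for t in (_ascii_token(p) for p in full_name.split()) if t]
    if not tokens:
        return []
    first, last = tokens[0], tokens[-1]
    if len(tokens) == 1:
        local_parts = [first]  # assumption: fall back to the bare name
    else:
        local_parts = [f"{first}.{last}", f"{first[0]}{last}", f"{first}{last[0]}"]
    # dict.fromkeys removes duplicates while preserving order.
    return [f"{lp}@{domain.lower()}" for lp in dict.fromkeys(local_parts)]
```

Each candidate would then be checked in order, stopping at the first one the mail server does not reject.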

#### Social & public presence

- Bluesky, Mastodon, Reddit, Stack Overflow, Dev.to, Medium, Hacker News, Wikipedia/Wikidata
- Keybase + PGP keyservers
- Cross-platform handle map: probes ~700 sites via WhatsMyName (GitHub, GitLab, npm, PyPI, Hugging Face, Kaggle, Behance, Twitch, YouTube, Substack, Steam, ProductHunt, Strava, Letterboxd, …)
- Per-platform false-positive filtering against the lead's known name + location
- `BookingLink` - Calendly / cal.com / SavvyCal / Tidycal / HubSpot Meetings / Acuity / Bookings / Doodle / YCBM
- `Newsletter` - Substack / Beehiiv / Buttondown / Kit / Ghost / Revue

#### Document evidence

- Resume / CV / slide-deck discovery via dorked search
- PDF / DOCX text extraction + emails / phones / URLs / hyperlinks
- Embedded doc metadata: `doc_title`, `doc_author`, `doc_subject`, `doc_producer`, `doc_creator`, `doc_created`, `doc_modified`
- **Filename parser** - extracts `parsed_date`, `kind_hint` (resume / cv / deck / report / invoice / contract / thesis / photo / …), `subject_hint`
- **OCR fallback** - image-only PDFs retried via `tesseract` + `pdf2image`
- **Section-aware resume parser** - emits structured `WorkExperience` (title, company, location, start/end dates, bullets) and `EducationEntry` (institution, degree, field, dates)
- **NER co-occurrence** - extracts `PERSON` / `ORG` / `GPE` (location) entities from doc text with **proximity scoring** vs the primary person's name (closer = more related)
- **MentionedWith** relation between people who appear together in the same document
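
The idea behind proximity scoring can be sketched as follows: measure the smallest character gap between the primary person's name and another entity in the document text, then map that gap to a score that decays with distance. The exponential decay and the `scale` value are assumptions chosen for illustration; Scout's actual scoring formula is not documented here, and the function names are hypothetical.

```python
import math
import re


def min_distance_chars(text: str, primary: str, other: str):
    """Smallest gap in characters between any mention of the two names, or None."""
    lowered = text.lower()
    spans_a = [m.span() for m in re.finditer(re.escape(primary.lower()), lowered)]
    spans_b = [m.span() for m in re.finditer(re.escape(other.lower()), lowered)]
    if not spans_a or not spans_b:
        return None
    return min(
        max(b_start - a_end, a_start - b_end, 0)  # 0 when mentions overlap
        for a_start, a_end in spans_a
        for b_start, b_end in spans_b
    )


def proximity_score(distance: int, scale: float = 1000.0) -> float:
    """Map a distance to (0, 1]; closer mentions score higher."""
    return math.exp(-distance / scale)
```

A name mentioned in the same sentence as the primary person therefore outranks one that only appears pages later.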

#### GitHub depth (developer leads)

- Profile + organisation stats
- Public repos, top languages, most-starred repo
- Recent activity heatmap, timezone inference, most-active hour/day
- Repo README mining for emails
- Commit-author email extraction (skips `users.noreply.github.com`)
- Co-author / collaborator graph
- Package ownership across npm, PyPI, crates.io

#### Compliance & risk

- OFAC SDN sanctions screen (free public source)
- PEP / adverse-news search
- Domain-level breach history

#### Company / domain side

- WHOIS - registrar, dates, registrant org
- DNS - A, MX, NS, TXT, SPF, DMARC
- Hosting - ASN, ASN organisation, country
- MX provider detection - Google Workspace / Microsoft 365 / Zoho / self-hosted
- Tech-stack fingerprint via Wappalyzer
- CDN detection (Cloudflare, Fastly, Akamai, Vercel, Netlify, …)
- **Cert Transparency** subdomain history + cert-issuance emails (crt.sh)
- Subdomain enumeration
- Status pages, public API docs
- ProductHunt history, Wayback Machine timeline
- LinkedIn company page (auth-walled fields are skipped, not faked)
- ATS / hiring signals - Greenhouse, Lever, Ashby, Workable
- Crunchbase, Glassdoor, BuiltWith
- **OrgPeople** - scrapes team / about / leadership pages, spawns a Person entity per teammate, enriches each

***

### Single-input demo flows

#### Just a name

```json
{ "full_name": "Jane Doe" }
```

→ Scout anchors via dork search + identifier harvest → finds GitHub, LinkedIn, Twitter, personal domain → mines personal site for emails/handles → cross-corroborates → emits `Confirmed` on the agreed fields.

#### Just a domain

```json
{ "domain": "acme.com" }
```

→ Full domain enrichment (WHOIS, DNS, hosting, tech stack, cert history) → scrapes `/about` (browser + networkidle so SPA content renders) → spawns a Person entity per teammate found on the team page → runs email finder + GitHub + LinkedIn + Gravatar on each → returns one org with N enriched related Persons.

#### Email + company

```json
{ "email": "founder@acme.com", "company_name": "Acme Inc" }
```

→ Validates the email, mines breach data, checks SMTP → mines pattern → runs domain enrichment → infers WorksFor → completes person profile from socials.

***

### Pricing

Pay only for results Scout actually delivers. No monthly fee, no per-seat licence, no API keys to manage.

| Event | When it fires | Price |
|---|---|---|
| `person_enriched` | One Person / Email / Phone primary entity comes back enriched | **$0.05** |
| `domain_enriched` | One Domain / Organization primary entity comes back enriched | **$0.15** |
| `team_member_spawned` | A teammate found via org→people discovery and enriched as a related Person | **$0.02** each |

A typical run:

- `{full_name: "Jane Doe"}` → **$0.05** (single person)
- `{email: "founder@acme.com"}` → **$0.05** (email expanded into person)
- `{domain: "acme.com"}` (returns 8 teammates) → **$0.15 + 8 × $0.02 = $0.31**

Scout does **not** bill when the result is too thin to be useful (`SelfGrade.overall_confidence < 0.2` and identity not locked) - you only pay for runs that actually delivered something.

Apify charges its standard compute fee on top (typically a few cents per run depending on memory + duration).
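
The event prices in the table translate directly into a budget estimate. A minimal sketch, with `estimate_event_cost` as a hypothetical helper; it covers per-event charges only and ignores Apify compute, which varies by run.

```python
# Event prices in US cents, taken from the pricing table above.
PRICE_CENTS = {
    "person_enriched": 5,
    "domain_enriched": 15,
    "team_member_spawned": 2,
}


def estimate_event_cost(persons: int = 0, domains: int = 0, team_members: int = 0) -> float:
    """Return the expected per-event charge in USD (compute not included)."""
    total_cents = (
        persons * PRICE_CENTS["person_enriched"]
        + domains * PRICE_CENTS["domain_enriched"]
        + team_members * PRICE_CENTS["team_member_spawned"]
    )
    # Summing integer cents avoids floating-point drift on large batches.
    return total_cents / 100
```

For the domain example above, `estimate_event_cost(domains=1, team_members=8)` gives 0.31.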

***

### Use cases

- **Sales / lead enrichment** - turn a sparse CRM record into a usable contact + employer dossier; or expand a single domain into a team prospect list with verified emails.
- **OSINT / due diligence** - investigate a counterparty, verify identity across multiple independent sources, catch sanctions / adverse-news flags.
- **Recruiting** - enrich a candidate's full public footprint: structured employment history from their resume plus signals across 700+ sites.
- **CRM hygiene** - re-enrich existing leads, surface stale or wrong-person records, dedupe via cross-source identity verification.
- **Investigative reporting** - public-source person lookup with per-field provenance.
- **Company research** - go from a domain to a fully enriched org + team, including booking links and newsletter URLs for outreach.

***

### Output shape

Each input entity becomes one dataset record:

```json
{
  "primary": {
    "kind": "person",
    "id": "...",
    "FullName": { "value": "Jane Doe", "_added_by": "input", "_added_at": "..." },
    "Email": { "address": "jane@acme.com", "_confidence": 0.85, "_sources": ["email_finder"] },
    "GitHubProfile": { "username": "janedoe", "followers": 142, "..." : "..." },
    "LinkedInUrl": { "value": "https://www.linkedin.com/in/jane-doe" },
    "WorkExperience": [
      { "title": "Senior Software Engineer", "company": "Acme Inc", "start_date": "2022-01", "end_date": "Present", "bullets": [...] }
    ],
    "EducationEntry": [
      { "institution": "Stanford University", "degree": "MSc", "field": "Computer Science", "start_date": "2018", "end_date": "2020" }
    ],
    "BookingLink": [{ "provider": "calendly", "url": "https://calendly.com/jane-doe" }],
    "DocEntityMention": [
      { "entity_type": "person", "value": "John Smith", "proximity_score": 0.97, "min_distance_chars": 33 }
    ],
    "WorksFor": [{ "target_id": "...", "confidence": 0.85, "evidence": ["..."] }],
    "Confirmed": [{ "field": "email", "value": "jane@acme.com", "confidence": 0.95 }],
    "SelfGrade": { "overall_confidence": 0.78, "identity_locked": true, "evidence_strength": "strong" },
    "LeadScore": { "score": 72, "tier": "hot", "persona": "developer" }
    /* + 30+ more component blocks, each stamped with provenance */
  },
  "related": [
    { "kind": "organization", "id": "...", "FullName": { "value": "Acme Inc" }, "OwnsDomain": [...] },
    { "kind": "domain",       "id": "...", "Domain": { "value": "acme.com" }, "WhoisData": {...}, "HomepageData": {...} }
  ],
  "_meta": { "ticks_run": 4, "ran": {...}, "skipped": {...}, "failed": {...}, "elapsed_s": 137.4 }
}
```

Every component carries `_added_by` (which system emitted it), `_added_at` (UTC timestamp), `_confidence`, `_sources`, and `_evidence` - so the output is fully auditable.
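
When consuming dataset records downstream, a small summary per record is often enough. The sketch below reads only keys that appear in the example record above; `summarize_record` is a hypothetical helper, the `likely_billed` flag applies the "too thin" rule from the Pricing section as stated there, and treating the keys of `_meta.failed` as source names is an assumption.

```python
def summarize_record(record: dict, billing_floor: float = 0.2) -> dict:
    """Condense one Scout dataset record into a few decision fields."""
    primary = record.get("primary", {})
    grade = primary.get("SelfGrade", {})
    confidence = grade.get("overall_confidence", 0.0)
    locked = bool(grade.get("identity_locked", False))
    return {
        "kind": primary.get("kind"),
        "confirmed": {c["field"]: c["value"] for c in primary.get("Confirmed", [])},
        "overall_confidence": confidence,
        "identity_locked": locked,
        # Pricing rule: no charge when confidence < 0.2 and identity not locked.
        "likely_billed": not (confidence < billing_floor and not locked),
        # Assumption: _meta.failed is keyed by source name.
        "failed_sources": sorted(record.get("_meta", {}).get("failed", {})),
    }
```

Records with `identity_locked` set to false and an empty `confirmed` map are good candidates for a re-run with an extra anchor field.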

***

### How does Scout compare?

| | Scout | Apollo / ZoomInfo / Clearbit | Hunter.io | OSINT Industries |
|---|---|---|---|---|
| Data source | Public web only - every value tagged with origin | Proprietary database | Email guesses + verification | Public + private mix |
| API key required | None | Yes (paid) | Yes (paid) | Yes (paid) |
| Pricing | Per-result, cents per lead | $1K–$30K+ / yr per seat | $50–$500+ / mo | Subscription |
| Identity verification | Cross-source confidence + surfaced conflicts | Opaque "verified" stamps | Email-only | Mixed |
| Domain → team people | ✅ Spawn each + enrich in same run | Yes (db) | No | No |
| Email finder + SMTP verify | ✅ Built in | Yes (db) | ✅ Yes | No |
| Resume / PDF mining | ✅ Built in (incl. OCR) | Limited | No | No |
| GitHub depth | ✅ Commits, repos, co-authors, packages | None | No | Limited |
| OFAC / sanctions | ✅ Built in | Add-on | No | Yes |
| Output is auditable | ✅ Full ledger + per-component provenance | No | No | Mixed |
| Self-hostable | ✅ Run on your own Apify account | No | No | No |

Scout isn't a 200M-row contact DB. It's a *verifiable*, *one-shot* dossier per lead, paid by the result, with provenance your compliance team can audit.

***

### FAQ

**How do I find someone's email from just their name?**
Pass `full_name` and (optionally) `company_name` or `domain`. Scout probes public sources, mines documents and READMEs, runs cross-platform handle searches, and - when a corporate domain is involved - generates plausible work-email patterns and SMTP-verifies them. Success rate is highest when you provide at least one anchor beyond name (domain, company, or any handle).

**How does the org → team flow work?**
Pass `domain: "acme.com"`. Scout enriches the domain (WHOIS / DNS / hosting / tech stack / cert transparency) and *also* fetches the team / about / leadership pages with a real browser (so React-rendered content lands in the DOM), strips `<script>`/`<style>` via selectolax, runs spaCy NER + a strict heuristic name detector, pairs each name with the title text immediately following it, and spawns a Person entity for each teammate. The scheduler then enriches every spawned person on subsequent ticks - so a single domain input returns the org + N enriched team members.

**Does Scout work without a domain or email?**
Yes - pass just `full_name`. Scout anchors identity from public mentions, but `quality_gate.passed` will likely be `false` for very common names, and you'll see multiple alternative hypotheses. Cross-cultural common names are the hardest case.

**Is the output legally usable for outreach?**
Scout surfaces public data. Storing, redistributing, or selling that data may be regulated in your jurisdiction (GDPR, CCPA, PIPEDA, state data-broker laws). See "Use responsibly" below. Scout is a research tool - what you do with the output is your responsibility.

**Why don't I see LinkedIn personal-profile data even though the URL is in the output?**
LinkedIn aggressively gates personal profiles even with stealth + Google Referer. Scout records the URL but won't fake the content if LinkedIn returns 999. Company pages frequently come through; personal pages typically don't.

**How fast is it per lead?**
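Default `perLeadTimeoutSeconds` is 180s. A single person input typically finishes in 30–90s. A domain input that spawns 5–10 team members can take 90–240s as Scout enriches each person; because spawned entities share the top-level entity's budget, runs at the upper end of that range need `perLeadTimeoutSeconds` raised above the default. Lower the timeout for predictable cost; raise it for rich inputs you want fully exhausted.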

**Does it work for non-English names?**
Yes. Name-alias handling covers Anglo, Slavic, Persian, and several other naming conventions (e.g. nickname-to-formal: "Liz" ↔ "Elizabeth"; cross-cultural diminutives across Slavic / Persian / Arabic given-name traditions). Romanized names work best; CJK names work but disambiguation is harder.

**Can I run it on a list of 10,000 leads?**
Yes - pass them as `entities: [...]`. Apify scales the Actor automatically. Budget ~1.5 minutes per person and ~3–5 minutes per domain (because of team discovery). Account for proxy + upstream rate-limits at high concurrency.
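
As a rough planning aid, the per-lead budgets quoted above can be turned into a wall-clock range. This is a back-of-the-envelope sketch: `parallel_leads` is a hypothetical parameter standing in for however many leads end up being processed at once, which this README does not specify.

```python
def estimate_wall_clock_minutes(persons: int, domains: int, parallel_leads: int = 1):
    """Return a (low, high) estimate in minutes for a bulk run."""
    if parallel_leads < 1:
        raise ValueError("parallel_leads must be at least 1")
    person_minutes = persons * 1.5      # ~1.5 minutes per person
    low = person_minutes + domains * 3.0   # ~3 minutes per domain, best case
    high = person_minutes + domains * 5.0  # ~5 minutes per domain, worst case
    return (low / parallel_leads, high / parallel_leads)
```

For 10,000 person leads with an assumed 50 processed in parallel, this works out to about 300 minutes.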

**Why is some data missing on my run?**
Several upstream sources rate-limit single IPs. Apify Proxy is strongly recommended. Failures are recorded in `_meta.failed` and Scout degrades gracefully - you always get a record back.

**Will I be charged for runs that found nothing?**
No. Scout skips the per-result charge when `SelfGrade.overall_confidence` is below 0.2 and identity isn't locked. You'll still owe Apify's compute fee for the run, but the per-result fee that pays the Actor developer is waived.

**Can I customize which sources run?**
Yes - pass a `processors` array to enable a subset of enrichment systems. Default empty = run everything.

***

### Limitations

- **Anti-bot variability.** LinkedIn personal profiles, regional LinkedIn subdomains, and aggressive anti-bot sites still return 999/403 even with stealth + Referer. Scout records what it tried, never falsifies.
- **Identity ambiguity.** A common first name with no email, handle, or domain is hard to disambiguate. Scout will set `quality_gate.passed=false`, surface alternatives, and avoid guessing.
- **NER false positives on team pages.** spaCy's small English model occasionally tags marketing-copy phrases as PERSON. The post-NER filter catches most ("Front Row", "Calls Completed"); a few may slip through.
- **SMTP RCPT is unreliable in production.** Many providers accept-all or block from cloud IPs. A successful SMTP RCPT means "the server didn't reject" - not "this address truly works."
- **Public data only.** No customer-supplied API keys, no paid data brokers, no auth-walled content.

Scout never raises on a single source failure - you always get a result, with the gaps clearly marked.
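
The SMTP caveat above is worth encoding when you interpret verification results yourself. One common way to spot accept-all servers is to compare the reply for the candidate address with the reply for a random mailbox on the same domain that should not exist. The sketch below classifies a pair of SMTP reply codes; it is an illustration of that general technique, not a description of how Scout does it, and `classify_rcpt` is a hypothetical name.

```python
def classify_rcpt(candidate_code, control_code=None) -> str:
    """Classify RCPT TO replies.

    candidate_code: reply code for the address being checked (None if no reply).
    control_code:   reply code for a random, presumably nonexistent mailbox.
    """
    if candidate_code is None:
        return "unknown"        # connection blocked or timed out
    if 500 <= candidate_code < 600:
        return "rejected"       # permanent failure, e.g. 550
    if 400 <= candidate_code < 500:
        return "unknown"        # temporary failure, e.g. greylisting
    if 200 <= candidate_code < 300:
        if control_code is not None and 200 <= control_code < 300:
            return "accept_all"  # server accepts anything; proves nothing
        return "accepted"        # not rejected, which is still not proof of delivery
    return "unknown"
```

Treat `accept_all` and `unknown` as unverified rather than as failures.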

***

### Use responsibly

The output describes real people and organizations. Your jurisdiction may regulate how this kind of data can be stored, redistributed, or sold (GDPR, CCPA, PIPEDA, state data-broker statutes). Scout is a research tool - operating it for due diligence, sales research, recruiting, or investigative work on parties with whom you have a legitimate interest is what it's built for. Bulk dataset construction or resale without lawful basis is on you.

Scout never bypasses authentication, never solves CAPTCHAs, and never exceeds public-rate-limit guidance. If a source returns "auth required" or rate-limits the request, the corresponding field is left null and the failure is recorded.

***

**Keywords:** lead enrichment, person enrichment, contact enrichment, email finder, email lookup, email verification, email finder waterfall, work email finder, OSINT tool, OSINT lead enrichment, people search, person lookup, identity verification, B2B contact data, B2B prospecting, LinkedIn enrichment, GitHub user lookup, sanctions screening, OFAC screening, WHOIS lookup, DNS lookup, company enrichment, firmographic enrichment, lead scoring, CRM enrichment, sales intelligence, due diligence, KYC, recruiting research, candidate enrichment, resume parser, structured resume parser, document mining, NER entity extraction, OCR PDF, cert transparency, subdomain enumeration, organization team discovery, domain to team, scout, dossier, prospecting, ZoomInfo alternative, Apollo alternative, RocketReach alternative, Clearbit alternative, Hunter.io alternative.

# Actor input Schema

## `full_name` (type: `string`):

Person's full name. Single most useful anchor on its own - try it with no other fields.

## `email` (type: `string`):

Email address. Auto-validated, breach-checked, MX + SMTP verified.

## `phone` (type: `string`):

Phone number in any format - auto-normalized to E.164.

## `first_name` (type: `string`):

Optional split - supply if you only have the parts.

## `last_name` (type: `string`):

Optional split - supply if you only have the parts.

## `company_name` (type: `string`):

Employer name. Combined with `full_name` enables the email-finder waterfall.

## `title` (type: `string`):

The person's title or role.

## `location` (type: `string`):

City, state, country - used for disambiguation against same-name profiles.

## `linkedin_url` (type: `string`):

Person's LinkedIn profile URL. Auto-canonicalised to https://<host>/in/<slug>.

## `github_username` (type: `string`):

GitHub handle (with or without leading @). Auto-lowercased.

## `twitter_handle` (type: `string`):

X (Twitter) handle. Auto-stripped of leading @.

## `domain` (type: `string`):

Company website domain. Scout enriches the domain (WHOIS, DNS, hosting, tech stack, cert transparency) AND scrapes the team page to spawn employee Person entities - turning a single domain input into a fully populated team graph.

## `name` (type: `string`):

Company / organization name. Combine with `domain` for richest output.

## `kind` (type: `string`):

Override the auto-inferred kind. Leave blank to let Scout decide based on which fields you filled.

## `notes` (type: `string`):

Free-form context that gets attached to the entity (visible in output).

## `entities` (type: `array`):

Bulk mode: an array of entity objects, each with the same vocabulary as the typed fields above. Useful when you want to enrich multiple leads in one run. Leave empty if you're using the typed single-entity fields above.

Example entry shapes:

- `{ "kind": "person", "full_name": "Jane Doe", "email": "jane@acme.com" }`
- `{ "kind": "domain", "domain": "example.com" }`
- `{ "kind": "organization", "name": "Acme Inc", "domain": "acme.com" }`

## `processors` (type: `array`):

Subset of enrichment systems to run. Default empty = run everything. Narrow this to control runtime / cost. Identity, parsing, routing, inference, scoring, and dedup always run regardless.

## `proxyConfiguration` (type: `object`):

Apify Proxy is strongly recommended - several upstream sources rate-limit single IPs and serve harder anti-bot challenges to cloud egress.

## `headless` (type: `boolean`):

Run Chromium headless. Stealth + Google Referer + randomised viewport are always applied. Disable only for local debugging.

## `verbose` (type: `boolean`):

Include per-source provenance + run telemetry in the output, and detailed pipeline logs in the run log. Default off - output stays clean and compact.

## `perLeadTimeoutSeconds` (type: `integer`):

Total runtime ceiling per top-level entity. Spawned related entities (employer, owned domain, team members) share the same budget. Increase for org→team workflows; decrease for tight cost control.

## Actor input object example

```json
{
  "full_name": "Jane Doe",
  "email": "jane@acme.com",
  "phone": "+1 (415) 555-1234",
  "company_name": "Acme Inc",
  "title": "Head of Engineering",
  "linkedin_url": "https://www.linkedin.com/in/jane-doe",
  "github_username": "janedoe",
  "domain": "acme.com",
  "name": "Acme Inc",
  "kind": "",
  "entities": [],
  "processors": [],
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "headless": true,
  "verbose": false,
  "perLeadTimeoutSeconds": 180
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input (pass at least one identifying field)
const input = {
    "full_name": "Jane Doe",
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("logical_vivacity/scout").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "full_name": "Jane Doe",  # example anchor; pass at least one identifying field
    "proxyConfiguration": {"useApifyProxy": True},
}

# Run the Actor and wait for it to finish
run = client.actor("logical_vivacity/scout").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "full_name": "Jane Doe",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call logical_vivacity/scout --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=logical_vivacity/scout",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Scout — Lead Enrichment + OSINT",
        "description": "Email finder + lead enrichment + OSINT from public sources. Pass any fragment — name, email, or domain — get a verified dossier: 700+ identity sites, SMTP-validated emails, document mining, sanctions screen, domain→team discovery. $0.05 person, $0.15 domain. No API keys",
        "version": "0.1",
        "x-build-id": "isbpxPZT7zuk04qEZ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/logical_vivacity~scout/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-logical_vivacity-scout",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/logical_vivacity~scout/runs": {
            "post": {
                "operationId": "runs-sync-logical_vivacity-scout",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/logical_vivacity~scout/run-sync": {
            "post": {
                "operationId": "run-sync-logical_vivacity-scout",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "full_name": {
                        "title": "Full name",
                        "type": "string",
                        "description": "Person's full name. Single most useful anchor on its own - try it with no other fields."
                    },
                    "email": {
                        "title": "Email",
                        "type": "string",
                        "description": "Email address. Auto-validated, breach-checked, MX + SMTP verified."
                    },
                    "phone": {
                        "title": "Phone",
                        "type": "string",
                        "description": "Phone number in any format - auto-normalized to E.164."
                    },
                    "first_name": {
                        "title": "First name",
                        "type": "string",
                        "description": "Optional split - supply if you only have the parts."
                    },
                    "last_name": {
                        "title": "Last name",
                        "type": "string",
                        "description": "Optional split - supply if you only have the parts."
                    },
                    "company_name": {
                        "title": "Company name",
                        "type": "string",
                        "description": "Employer name. Combined with `full_name` enables the email-finder waterfall."
                    },
                    "title": {
                        "title": "Job title",
                        "type": "string",
                        "description": "The person's title or role."
                    },
                    "location": {
                        "title": "Location",
                        "type": "string",
                        "description": "City, state, country - used for disambiguation against same-name profiles."
                    },
                    "linkedin_url": {
                        "title": "LinkedIn URL",
                        "type": "string",
                        "description": "Person's LinkedIn profile URL. Auto-canonicalised to https://<host>/in/<slug>."
                    },
                    "github_username": {
                        "title": "GitHub username",
                        "type": "string",
                        "description": "GitHub handle (with or without leading @). Auto-lowercased."
                    },
                    "twitter_handle": {
                        "title": "Twitter / X handle",
                        "type": "string",
                        "description": "X (Twitter) handle. Auto-stripped of leading @."
                    },
                    "domain": {
                        "title": "Domain",
                        "type": "string",
                        "description": "Company website domain. Scout enriches the domain (WHOIS, DNS, hosting, tech stack, cert transparency) AND scrapes the team page to spawn employee Person entities - turning a single domain input into a fully populated team graph."
                    },
                    "name": {
                        "title": "Organization name",
                        "type": "string",
                        "description": "Company / organization name. Combine with `domain` for richest output."
                    },
                    "kind": {
                        "title": "Entity kind (advanced)",
                        "enum": [
                            "",
                            "person",
                            "organization",
                            "domain",
                            "email",
                            "phone"
                        ],
                        "type": "string",
                        "description": "Override the auto-inferred kind. Leave blank to let Scout decide based on which fields you filled.",
                        "default": ""
                    },
                    "notes": {
                        "title": "Notes",
                        "type": "string",
                        "description": "Free-form context that gets attached to the entity (visible in output)."
                    },
                    "entities": {
                        "title": "Bulk input - entities array",
                        "type": "array",
                        "description": "Bulk mode: an array of entity objects, each with the same vocabulary as the typed fields above. Useful when you want to enrich multiple leads in one run. Leave empty if you're using the typed single-entity fields above.\n\nExample entry shapes:\n  { \"kind\": \"person\", \"full_name\": \"Jane Doe\", \"email\": \"jane@acme.com\" }\n  { \"kind\": \"domain\", \"domain\": \"example.com\" }\n  { \"kind\": \"organization\", \"name\": \"Acme Inc\", \"domain\": \"acme.com\" }",
                        "default": []
                    },
                    "processors": {
                        "title": "Active enrichment systems",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Subset of enrichment systems to run. Default empty = run everything. Narrow this to control runtime / cost. Identity, parsing, routing, inference, scoring, and dedup always run regardless.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "gravatar",
                                "breach_check",
                                "email_pattern",
                                "email_finder",
                                "keybase_profile",
                                "pgp_keyserver",
                                "email_smtp_verify",
                                "wikipedia",
                                "bluesky_profile",
                                "mastodon_profile",
                                "reddit_user",
                                "stackoverflow_user",
                                "devto_user",
                                "medium_user",
                                "hackernews_mentions",
                                "username_pivot",
                                "personal_domain_check",
                                "personal_domain_dive",
                                "dork_search",
                                "file_extract",
                                "filename_parse",
                                "resume_sections",
                                "ocr_fallback",
                                "doc_mentions",
                                "doc_entities",
                                "booking_link",
                                "newsletter",
                                "github_profile",
                                "github_activity",
                                "github_email_commits",
                                "github_repo_dive",
                                "connection_graph",
                                "package_ownership",
                                "handle_disambiguator",
                                "linkedin_person",
                                "socid_enrich",
                                "exif_avatar",
                                "identifier_harvest",
                                "sanctions_check",
                                "adverse_news",
                                "whois_dns",
                                "hosting",
                                "mx_provider",
                                "homepage",
                                "cdn_detect",
                                "subdomain_enum",
                                "cert_transparency",
                                "status_page",
                                "api_docs",
                                "wayback",
                                "github_org",
                                "linkedin_company",
                                "crunchbase_company",
                                "glassdoor_company",
                                "builtwith_company",
                                "ats_hiring",
                                "product_hunt",
                                "oss_repo_detect",
                                "org_people"
                            ]
                        },
                        "default": []
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify Proxy is strongly recommended - several upstream sources rate-limit single IPs and serve harder anti-bot challenges to cloud egress.",
                        "default": {
                            "useApifyProxy": true
                        }
                    },
                    "headless": {
                        "title": "Headless browser",
                        "type": "boolean",
                        "description": "Run Chromium headless. Stealth + Google Referer + randomised viewport are always applied. Disable only for local debugging.",
                        "default": true
                    },
                    "verbose": {
                        "title": "Verbose output",
                        "type": "boolean",
                        "description": "Include per-source provenance + run telemetry in the output, and detailed pipeline logs in the run log. Default off - output stays clean and compact.",
                        "default": false
                    },
                    "perLeadTimeoutSeconds": {
                        "title": "Per-entity timeout (seconds)",
                        "minimum": 30,
                        "maximum": 900,
                        "type": "integer",
                        "description": "Total runtime ceiling per top-level entity. Spawned related entities (employer, owned domain, team members) share the same budget. Increase for org→team workflows; decrease for tight cost control.",
                        "default": 180
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
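
The `run-sync-get-dataset-items` path in the definition above takes the input object as a JSON body and the Apify token as a `token` query parameter. Below is a minimal sketch of that call using only the Python standard library. The helper names `build_request` and `run_scout` are illustrative and not part of the Actor, and the sketch assumes the response body is JSON (the definition only documents `200 OK` for this path).

```python
import json
import urllib.parse
import urllib.request

# Both values are taken from "servers" and "paths" in the definition above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/logical_vivacity~scout/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a synchronous run.

    Hypothetical helper: the token goes in the query string and the
    Actor input goes in the JSON body, as the definition describes.
    """
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_scout(token: str, actor_input: dict, timeout: float = 300.0):
    """Send the request and decode the body, assuming it is JSON."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_scout("<YOUR_API_TOKEN>", {"full_name": "Jane Doe", "company_name": "Acme Inc"})` would start a run and block until it finishes. Because the token travels in the URL, avoid logging the full request URL.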

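The `inputSchema` above puts a few machine-checkable constraints on the input: `kind` must be one of six enum values, `perLeadTimeoutSeconds` must be an integer from 30 to 900, and `processors` must not contain duplicates. A minimal sketch of a local pre-flight check for just those three constraints follows. `check_input` is a hypothetical helper, and it does not check processor names against the enum or cover any other field.

```python
# Constraint values copied from the inputSchema above.
KIND_VALUES = {"", "person", "organization", "domain", "email", "phone"}
TIMEOUT_MIN, TIMEOUT_MAX, TIMEOUT_DEFAULT = 30, 900, 180


def check_input(actor_input: dict) -> list:
    """Return a list of problems found; an empty list means none."""
    problems = []

    # "kind" defaults to "" (auto-inferred) when omitted.
    kind = actor_input.get("kind", "")
    if kind not in KIND_VALUES:
        problems.append(f"kind {kind!r} is not one of {sorted(KIND_VALUES)}")

    # bool is a subclass of int in Python, so reject it explicitly.
    timeout = actor_input.get("perLeadTimeoutSeconds", TIMEOUT_DEFAULT)
    if isinstance(timeout, bool) or not isinstance(timeout, int):
        problems.append("perLeadTimeoutSeconds must be an integer")
    elif not TIMEOUT_MIN <= timeout <= TIMEOUT_MAX:
        problems.append(
            f"perLeadTimeoutSeconds must be between {TIMEOUT_MIN} and {TIMEOUT_MAX}"
        )

    # The schema sets "uniqueItems": true on processors.
    processors = actor_input.get("processors", [])
    if len(processors) != len(set(processors)):
        problems.append("processors must not contain duplicates")

    return problems
```

Running this check before submitting a run catches some malformed inputs locally. The Apify platform may still reject an input for reasons this sketch does not cover.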