Backstory Lead Enrichment, Person Lookup & Company Intelligence

Pricing

from $300.00 / 1,000 results

Lead enrichment, person lookup & company intelligence from public sources. Pass any fragment — a name, email, domain, or handle — and get a structured dossier with verified identity, cross-platform handles, sanctions screening, and firmographics. No API keys. Pay per result.

Developer

Logical Vivacity

Maintained by Community

Backstory — Lead Enrichment, Person Lookup & Company Intelligence

Find a person's verified email, GitHub, LinkedIn, and 40+ other handles from just a name. Enrich a domain into full firmographics. Cross-verify identity across multiple public sources with full provenance — no customer-supplied API keys, no per-seat licenses, pay only for what you run.

Hand Backstory any fragment you have on a contact — a name, an email, a domain, a phone, a handle — and it returns a structured JSON dossier with verified identity, cross-platform social presence, document evidence, sanctions / adverse-news screening, and company firmographics. All from public sources.

Use Backstory to:

  • Find someone's email and verified handles from a partial input.
  • Cross-verify a lead's identity from multiple independent sources before adding them to your CRM.
  • Enrich a domain into WHOIS + DNS + hosting + tech-stack + hiring signals + public LinkedIn data.
  • Run lightweight OFAC sanctions and adverse-news screens on a counterparty.
  • Build pre-call briefs that include auditable provenance.
  • Investigate a person's public footprint across the web (OSINT).

What you put in (any subset)

{
  "full_name": "...",
  "email": "...",
  "phone": "...",
  "company_name": "...",
  "domain": "...",
  "linkedin_url": "...",
  "twitter_handle": "...",
  "github_username": "...",
  "title": "...",
  "location": "...",
  "notes": "..."
}

Pass an array of these objects as the leads input. Every field is optional. The thinner the input, the harder the actor has to work to anchor identity, but it will still try.
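
As a rough illustration of preparing that leads array, the sketch below drops empty values and rejects field names that are not in the list above. The helper names clean_lead and build_leads are hypothetical and not part of the actor; the only thing taken from this page is the set of field names.

```python
# Field names copied from the input object documented above.
ALLOWED_FIELDS = {
    "full_name", "email", "phone", "company_name", "domain",
    "linkedin_url", "twitter_handle", "github_username",
    "title", "location", "notes",
}


def clean_lead(raw: dict) -> dict:
    """Keep only documented fields whose values are non-empty strings."""
    unknown = set(raw) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown lead fields: {sorted(unknown)}")
    lead = {}
    for key, value in raw.items():
        if isinstance(value, str) and value.strip():
            lead[key] = value.strip()
    return lead


def build_leads(raw_leads: list[dict]) -> list[dict]:
    """Clean each lead and drop any that end up with no usable field."""
    cleaned = (clean_lead(raw) for raw in raw_leads)
    return [lead for lead in cleaned if lead]


leads = build_leads([
    {"full_name": " Jane Doe ", "email": "jane@example.com", "phone": ""},
    {"domain": "example.com"},
    {"notes": "   "},  # nothing usable, so this lead is dropped
])
print(leads)
```

Dropping empty strings before submission is a design choice of this sketch, made so that a blank spreadsheet cell is not mistaken for an identity anchor.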

What you get back — 50+ signal blocks per lead

Identity & verification

  • Verified full name, email, location, and primary handles
  • Multi-source confirmation (which sources agree on each value)
  • Conflict detection — alternative candidates the actor didn't commit to
  • Per-field confidence scores + an overall identity_locked flag
  • Entity-type classification: developer / executive / academic / creator / marketer / unknown
  • Input-ambiguity score and suggested extra inputs when the input is too thin

Email enrichment

  • RFC syntax & deliverability checks
  • MX record validation
  • Disposable & public-mailbox detection
  • Domain-level breach exposure
  • Email-pattern detection (first.last@, flast@, etc.)
  • Best-effort SMTP RCPT verification of guessed work emails
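
To make the email-pattern idea concrete, here is a minimal sketch that expands a name and domain into candidate work addresses. Only first.last@ and flast@ are named on this page; the other two patterns, and the function name candidate_emails, are assumptions for illustration. The actor's real pattern set and ordering are not documented here, and this sketch does no SMTP verification.

```python
import re
import unicodedata


def _slug(part: str) -> str:
    """Lowercase, strip accents, and keep ASCII letters only."""
    ascii_part = (
        unicodedata.normalize("NFKD", part).encode("ascii", "ignore").decode()
    )
    return re.sub(r"[^a-z]", "", ascii_part.lower())


def candidate_emails(full_name: str, domain: str) -> list[str]:
    """Return de-duplicated candidate addresses, most common pattern first."""
    parts = [p for p in (_slug(p) for p in full_name.split()) if p]
    if len(parts) < 2:
        return []  # a single token is too thin to build a pattern from
    first, last = parts[0], parts[-1]
    local_parts = [
        f"{first}.{last}",    # first.last@ (named in the list above)
        f"{first[0]}{last}",  # flast@ (named in the list above)
        first,                # first@ (assumed extra pattern)
        f"{first}{last}",     # firstlast@ (assumed extra pattern)
    ]
    domain = domain.strip().lower()
    seen: set[str] = set()
    result = []
    for local in local_parts:
        address = f"{local}@{domain}"
        if address not in seen:
            seen.add(address)
            result.append(address)
    return result


print(candidate_emails("Jane Doe", "example.com"))
```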

Social & public presence

  • Bluesky, Mastodon, Reddit, Stack Overflow, Dev.to, Medium, Hacker News mentions
  • Keybase, PGP keyservers
  • Wikipedia / Wikidata extracts (with occupation, nationality, notable-for)
  • Cross-platform handle map: probes ~40 sites including GitHub, GitLab, npm, PyPI, Hugging Face, Kaggle, Behance, Twitch, YouTube, Substack, Steam, ProductHunt, Strava, Letterboxd, Goodreads
  • Every hit verified against the lead's known name + location to filter same-name false positives

Document evidence

  • Resume / CV / slide-deck discovery via public web search
  • PDF / DOCX / PPTX text extraction
  • Mined emails, phones, employers, education, and skills from documents

GitHub depth (for developer leads)

  • User profile + organization stats
  • Public repos, top languages, most-starred repo
  • Recent activity heatmap, timezone inference, most-active hour/day
  • Repository README mining for emails
  • Commit-author email extraction (skips users.noreply.github.com)
  • Co-author / collaborator graph
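
The noreply skip mentioned above can be pictured with a small filter like the one below. It is a sketch that assumes commit-author emails have already been collected as plain strings; usable_commit_emails is a hypothetical name and the actor's own filtering may differ.

```python
NOREPLY_DOMAIN = "users.noreply.github.com"


def usable_commit_emails(author_emails: list[str]) -> list[str]:
    """Drop GitHub noreply addresses and duplicates, keeping first-seen order."""
    seen: set[str] = set()
    result = []
    for email in author_emails:
        normalized = email.strip().lower()
        if "@" not in normalized:
            continue  # not an email address at all
        domain = normalized.rsplit("@", 1)[1]
        if domain == NOREPLY_DOMAIN:
            continue  # privacy-preserving placeholder, not a real mailbox
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


emails = usable_commit_emails([
    "12345+janedoe@users.noreply.github.com",
    "Jane@Example.com",
    "jane@example.com",
    "not-an-email",
])
print(emails)
```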

Compliance & risk

  • OFAC SDN sanctions screen (free public source)
  • Adverse-news search
  • Domain-level breach history

Company side (when a domain is in play)

  • WHOIS — registrar, creation/expiration, registrant org
  • DNS — A, MX, NS, TXT, SPF, DMARC
  • MX provider detection — Google Workspace, Microsoft 365, Zoho, self-hosted
  • Hosting — ASN, ASN organization, country
  • CDN detection
  • Tech-stack fingerprint from the homepage
  • Public LinkedIn company page — name, employee range, industry, HQ, about
  • ATS / hiring signals — Greenhouse, Lever, Ashby, Workable
  • Status pages, public API docs
  • ProductHunt history, web-archive timeline
  • Subdomain enumeration via certificate transparency
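
One way MX provider detection could work is suffix matching on the MX hostnames. The sketch below assumes the MX records have already been resolved; the suffix table is illustrative rather than exhaustive, and the "self-hosted" rule (every MX host sits under the lead's own domain) is an assumption of this example, not the actor's documented logic.

```python
# Illustrative suffix table; real providers use more hostnames than these.
MX_PROVIDER_SUFFIXES = [
    ("Google Workspace", ("google.com", "googlemail.com")),
    ("Microsoft 365", ("protection.outlook.com",)),
    ("Zoho", ("zoho.com", "zoho.eu", "zoho.in")),
]


def _under(host: str, suffix: str) -> bool:
    """True if host equals suffix or is a subdomain of it."""
    return host == suffix or host.endswith("." + suffix)


def detect_mx_provider(mx_hosts: list[str], domain: str) -> str:
    """Classify a domain's mail provider from its MX hostnames."""
    hosts = [h.rstrip(".").lower() for h in mx_hosts]
    domain = domain.rstrip(".").lower()
    for provider, suffixes in MX_PROVIDER_SUFFIXES:
        if any(_under(h, s) for h in hosts for s in suffixes):
            return provider
    if hosts and all(_under(h, domain) for h in hosts):
        return "self-hosted"  # assumed heuristic: mail handled on own domain
    return "unknown"


print(detect_mx_provider(["aspmx.l.google.com."], "example.com"))
print(detect_mx_provider(["mail.example.com"], "example.com"))
```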

Use cases

  • Sales enrichment — paste a list of emails or names, get back ICP-ready records.
  • CRM hygiene — re-enrich existing leads weekly, surface stale or wrong-person records.
  • OSINT / due diligence — investigate a counterparty before signing a contract.
  • Compliance screening — quick OFAC and adverse-news pass on counterparties.
  • Investigative work — find someone's verified public footprint across the web.
  • Recruiting — enrich a candidate's public profile across 40+ platforms.
  • Lead qualification — entity-type + score to triage by persona.

Inputs

| Field | Type | Required | Description |
|---|---|---|---|
| leads | array of objects | yes | Partial lead objects (any subset of identity fields). Each becomes one dataset record. |
| processors | array of enum | no | Subset of enrichment categories to run. Default: all of them. |
| proxyConfiguration | proxy editor | no | Apify Proxy is strongly recommended; several upstream sources rate-limit single IPs. |
| headless | boolean | no | Run the browser headless. Default true. |
| perLeadTimeoutSeconds | integer | no | Hard ceiling on total runtime per lead. Default 90. |

Output sample

Trimmed example for an input like {"full_name": "Jane Doe", "email": "jane@example.com"}:

{
  "full_name": "Jane Doe",
  "email": "jane@example.com",
  "github_username": "janedoe",
  "linkedin_url": "https://linkedin.com/in/jane-doe",
  "twitter_handle": "janedoe",
  "location": "Berlin, Germany",
  "company_name": "Acme Inc",
  "quality_gate": { "passed": true, "ambiguity_score": 0.0 },
  "entity_type": {
    "primary": "developer",
    "secondary": ["creator"],
    "signals": { "developer": 1.7, "creator": 0.6 }
  },
  "evidence": {
    "confirmed": {
      "full_name": "Jane Doe",
      "email": "jane@example.com",
      "github_username": "janedoe"
    },
    "conflicts": { "full_name": ["Jane Doh"] }
  },
  "self_grade": {
    "overall_confidence": 0.78,
    "identity_locked": true,
    "evidence_strength": "strong",
    "rationale": [
      "identity locked: 3/4 core fields ≥0.7",
      "12 ledger entries across fields"
    ],
    "field_confidence": {
      "github_username": 1.0,
      "email": 0.95,
      "full_name": 0.85,
      "linkedin_url": 0.7
    }
  },
  "lead_score": { "score": 72, "tier": "hot", "persona": "developer" },
  // plus: email_info, gravatar, breach_check, keybase, wikipedia,
  // bluesky, mastodon, reddit_user, stackoverflow_user, devto_user,
  // medium_user, hackernews, username_pivot, dork_search, parsed_files,
  // github_user, github_org, github_activity, sanctions, adverse_news,
  // homepage, whois, dns, hosting, mx_provider, cdn, status_page,
  // api_docs, ats_hiring, product_hunt, linkedin_company, wayback,
  // (50+ signal blocks total)
  "processor_errors": {}
}
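
A downstream consumer might triage records like the sample above before writing them to a CRM. The sketch below reads only fields shown in that sample (quality_gate, self_grade, evidence.conflicts); the triage function, its 0.7 threshold, and the needs_review rule are assumptions of this example rather than anything the actor prescribes.

```python
def triage(record: dict, min_confidence: float = 0.7) -> dict:
    """Summarize how far a dossier can be trusted without manual review."""
    quality = record.get("quality_gate", {})
    grade = record.get("self_grade", {})
    field_conf = grade.get("field_confidence", {})
    conflicts = record.get("evidence", {}).get("conflicts", {})

    # Fields whose confidence falls strictly below the chosen threshold.
    weak = sorted(f for f, score in field_conf.items() if score < min_confidence)
    trusted = bool(quality.get("passed")) and bool(grade.get("identity_locked"))
    return {
        "trusted": trusted,
        "weak_fields": weak,
        "conflicting_fields": sorted(conflicts),
        "needs_review": (not trusted) or bool(conflicts) or bool(weak),
    }


sample = {
    "quality_gate": {"passed": True, "ambiguity_score": 0.0},
    "evidence": {"conflicts": {"full_name": ["Jane Doh"]}},
    "self_grade": {
        "identity_locked": True,
        "field_confidence": {
            "github_username": 1.0,
            "email": 0.95,
            "full_name": 0.85,
            "linkedin_url": 0.7,
        },
    },
}
print(triage(sample))
```

For the sample record this flags full_name for review because of the surfaced conflict, even though identity is locked.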

How does Backstory compare to Apollo, ZoomInfo, RocketReach, Clearbit, and Hunter.io?

| | Backstory | Apollo / ZoomInfo / RocketReach / Clearbit |
|---|---|---|
| Data source | Public web only; every value tagged with its origin | Mostly proprietary databases |
| API key required | None | Yes (paid) |
| Pricing | Per-run, cents per lead | Per-seat ($1K – $30K+ / yr) |
| Identity verification | Cross-source confidence + surfaced conflicts | Opaque "verified" stamps |
| Source provenance | Yes; every value carries source list | No |
| OFAC / sanctions | Built in | Add-on or absent |
| Resume / PDF mining | Built in | Limited |
| GitHub depth | Built in (commits, repos, co-authors) | None |
| Output is auditable | Yes; full ledger + raw payloads in output | No |
| Self-hostable | Yes (run on your own Apify account) | No |

Backstory isn't trying to replace a 200M-row contact database. It's giving you a verifiable, one-shot dossier per lead, paid by the run, with provenance your compliance team can audit.

FAQ

How do I find someone's email from just their name?
Pass full_name and (optionally) company_name or domain. Backstory probes public sources, mines documents and READMEs, runs cross-platform handle searches, and — when a corporate domain is involved — generates plausible work-email patterns and SMTP-verifies them. Success rate is highest when you provide at least one anchor beyond name (domain, company, or any handle).

Does Backstory work without a domain or email?
Yes — pass just full_name. The actor anchors identity from public mentions, but quality_gate.passed will likely be false for very common names, and you'll see multiple alternative hypotheses in entity_hypotheses. Cross-cultural common names are the hardest case.

Is the output legally usable for outreach?
The actor surfaces public data. Storing, redistributing, or selling that data may be regulated in your jurisdiction (GDPR, CCPA, PIPEDA, state data-broker laws). See "Use responsibly" below. Backstory is a research tool — what you do with the output is your responsibility.

How is this different from a regular LinkedIn scraper?
Backstory cross-references LinkedIn signals with GitHub, Gravatar, Keybase, breach data, ~40 platform probes, and document evidence to verify identity. A pure LinkedIn scraper gives you LinkedIn-says-so; Backstory gives you "five sources agree, two disagree, here's the conflict and the evidence."

What if the lead is on LinkedIn but locked behind auth?
The actor never bypasses authentication or solves CAPTCHAs. It reads only public Open Graph metadata from LinkedIn person pages and the public /about/ page for companies. If LinkedIn shows an auth wall, the relevant fields are left null and a reason is recorded.

How fast is it per lead?
Default perLeadTimeoutSeconds is 90s. Typical runs finish in 60–150s depending on input richness and upstream rate-limits. Lower the timeout for predictable cost; raise it for rich inputs you want fully exhausted.

Does it work for non-English / non-Anglo names?
Yes. Name-alias handling covers Anglo, Slavic, Persian, and several other naming conventions ("Pasha" matches "Pavel" / "Mohammad", "Liz" matches "Elizabeth", etc.). Romanized names work best; CJK names work but disambiguation is harder.

Can I run it on a list of 10,000 leads?
Yes — Apify scales the actor automatically. Budget ~1.5 minutes per lead, and account for proxy + upstream rate-limits at high concurrency.
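
For planning a large batch, the arithmetic can be sketched as follows. The 90 seconds per lead comes from the budget above and the $300.00 per 1,000 results from the listed starting price; the concurrency figure is purely an assumption of this example, since this page does not say how many leads are processed in parallel.

```python
import math


def estimate_batch(
    lead_count: int,
    seconds_per_lead: float = 90.0,
    concurrency: int = 1,
    usd_per_1000_results: float = 300.0,
) -> dict:
    """Rough wall-clock and cost estimate for a batch of leads."""
    if lead_count < 0 or concurrency < 1:
        raise ValueError("lead_count must be >= 0 and concurrency >= 1")
    # Leads are processed in waves of `concurrency` at a time.
    waves = math.ceil(lead_count / concurrency)
    return {
        "wall_clock_hours": waves * seconds_per_lead / 3600,
        "cost_usd": lead_count * usd_per_1000_results / 1000,
    }


print(estimate_batch(10_000))                  # strictly sequential
print(estimate_batch(10_000, concurrency=25))  # hypothetical parallelism
```

Processed one at a time, 10,000 leads at 90 seconds each come to 250 hours, so some form of parallelism matters at this scale. Proxy and upstream rate limits may cap how much is practical.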

Why is some data missing on my run?
Several upstream sources rate-limit single IPs. Apify Proxy is strongly recommended. Failures are recorded in processor_errors[<source>] and the actor degrades gracefully — you always get a record back.
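
After a run, the processor_errors block can be used to see which sources failed most and which leads are worth re-running. The sketch below assumes processor_errors maps a source name to some failure reason; the exact key names and value format are not documented on this page, so the ones used here are placeholders.

```python
from collections import Counter


def summarize_errors(records: list[dict]) -> Counter:
    """Count how many records failed per source."""
    counts: Counter = Counter()
    for record in records:
        counts.update((record.get("processor_errors") or {}).keys())
    return counts


def records_to_retry(records: list[dict], sources: list[str]) -> list[dict]:
    """Return records where at least one of the given sources failed."""
    wanted = set(sources)
    return [
        r for r in records
        if wanted & set(r.get("processor_errors") or {})
    ]


# Placeholder records; source names and reasons are illustrative only.
records = [
    {"email": "a@example.com", "processor_errors": {}},
    {"email": "b@example.com", "processor_errors": {"linkedin_company": "auth wall"}},
    {"email": "c@example.com",
     "processor_errors": {"linkedin_company": "rate limited", "whois": "timeout"}},
]
print(summarize_errors(records).most_common())
print([r["email"] for r in records_to_retry(records, ["whois"])])
```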

Is it a free Hunter.io / Apollo / Clearbit alternative?
Not free: pricing is pay-per-result on Apify (listed from $300.00 per 1,000 results, roughly $0.30 per lead). There are no third-party API keys to manage and no monthly subscriptions. Coverage and accuracy will differ from those products: Backstory is breadth-first across public sources rather than backed by a private contact database.

Can I customize which sources run?
Yes — pass a processors array to enable a subset of enrichment categories. Default runs everything.

Limitations

  • Anti-bot variability. LinkedIn, Crunchbase, Glassdoor and similar sites aggressively gate public pages. When that happens, the relevant fields are left null and a reason is recorded in processor_errors. Re-running with Apify Proxy in a different region usually helps.
  • Identity ambiguity. A common first name with no email, handle, or domain is hard to disambiguate. The actor will set quality_gate.passed=false, surface alternatives in entity_hypotheses, and avoid guessing.
  • Public data only. No customer-supplied API keys, no paid data brokers, no auth-walled content.
  • Rate limits. Some upstream sources enforce per-IP limits. Failures are recorded; the actor degrades gracefully.
  • OFAC matching is fuzzy. It's a screening signal, not a legal determination. Verify any match against the original Treasury source link before acting on it.
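
To show why a fuzzy sanctions match is only a screening signal, here is a minimal sketch using Python's difflib. The actor's real matching algorithm and threshold are not documented on this page; the screen function, the 0.85 threshold, and the watchlist entries below are placeholders, not real SDN data.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop commas, and sort tokens so "DOE, John" equals "John Doe"."""
    return " ".join(sorted(name.lower().replace(",", " ").split()))


def screen(name: str, watchlist: list[str], threshold: float = 0.85) -> list[tuple[str, float]]:
    """Return watchlist entries whose similarity to name meets the threshold."""
    target = normalize(name)
    hits = []
    for entry in watchlist:
        score = SequenceMatcher(None, target, normalize(entry)).ratio()
        if score >= threshold:
            hits.append((entry, round(score, 2)))
    # Strongest match first, so a reviewer sees the likeliest candidate on top.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


watchlist = ["DOE, John", "SMITH, Alice"]  # placeholder entries
print(screen("John Doe", watchlist))
print(screen("Jon Doe", watchlist))
print(screen("Maria Garcia", watchlist))
```

A near-spelling such as "Jon Doe" still matches "DOE, John" here, which is the kind of hit that needs checking against the original Treasury entry before anyone acts on it.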

The actor never raises on a single source failure — you always get a result, with the gaps clearly marked.

Use responsibly

The output describes real people and organizations. Your jurisdiction may regulate how this kind of data can be stored, redistributed, or sold (GDPR, CCPA, PIPEDA, state data-broker statutes). Backstory is a research tool — operating it for due diligence, sales research, recruiting, or investigative work on parties with whom you have a legitimate interest is what it's built for. Bulk dataset construction or resale without lawful basis is on you.

The actor never bypasses authentication, never solves CAPTCHAs, and never exceeds a source's published rate-limit guidance. If a source returns "auth required" or rate-limits the request, the corresponding field is left null.


Keywords: lead enrichment, person enrichment, contact enrichment, email finder, email lookup, email verification, OSINT tool, people search, person lookup, identity verification, B2B contact data, LinkedIn enrichment, GitHub user lookup, sanctions screening, OFAC screening, WHOIS lookup, DNS lookup, company enrichment, firmographic enrichment, lead scoring, CRM enrichment, sales intelligence, due diligence, KYC, recruiting research, investigative research, ZoomInfo alternative, Apollo alternative, RocketReach alternative, Clearbit alternative, Hunter.io alternative.