Person Data Enrichment — Email, Phone & LinkedIn

Pricing

from $150.00 / 1,000 enriched persons

Enriches person records with People Data Labs data, at a lower cost than Clay, Apollo, or ZoomInfo. Name + company → verified work email, job title, seniority, phone, LinkedIn, and location. Pay-per-event: $0.15/person. No subscription.

Developer

ryan clinton

Maintained by Community

Actor stats

Bookmarked: 1

Total users: 54

Monthly active users: 28

Last modified: 3 days ago

Person Data Enrichment

⚠️ Paid Apify plan required. This actor is restricted to users on any paid Apify plan (Starter / Scale / Business / Enterprise / Creator). Free-plan runs are blocked at start because Apify does not pay out developer revenue from free-plan users. See Apify pricing — even the entry tier unlocks the actor.

A deterministic contact intelligence engine: this actor enriches contacts with People Data Labs (PDL) data and, in the same run, classifies every record for outreach, verification, or rejection, with an explanation for each decision.

At $0.15 per match with no subscription, it replaces the manual filtering step that usually follows enrichment. The output tells you exactly which leads to contact, verify, or drop.

Explainable PDL-powered person enrichment with deterministic match strategies, a closed-loop decision layer (SEND_TO_OUTREACH / VERIFY_EMAIL / RESEARCH_MANUALLY / ENRICH_AGAIN / DROP), prebuilt segments, why-not-found intelligence, deduplication, cached refreshes, change triggers, and charge-safe quality filters — the same data source behind Clay's most-used enrichment action, at $0.15 per person versus Clay's $0.22–$5.63. Provide a list of names, emails, or LinkedIn URLs and get back verified work emails, job titles, seniority levels, phone numbers, company info, social profiles, and location data. No subscription required. Pay only for successful matches.

This actor calls the PDL Person Enrich API as the primary match method, then automatically falls back to PDL's Elasticsearch-based Person Search API when the direct lookup yields no result, and finally retries Search under known company aliases (Facebook → Meta, X → Twitter, etc.). Every record includes a matchConfidence score (0–10) so you always know how certain the match is. Batch up to 1,000 people in a single run, download as JSON or CSV, and plug into your CRM without manual cleanup.
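The fallback order described above can be sketched in a few lines. This is a minimal illustration, not the actor's code: the `enrich` and `search` callables stand in for the two PDL endpoints, the alias table holds only the two pairs named above, and the `stage` and `matchedAlias` keys are hypothetical names introduced for the example.

```python
from typing import Callable, Optional

# Illustrative alias table only. The two pairs come from the examples above;
# the actor's real table (~70 groups) is not reproduced here.
COMPANY_ALIASES = {
    "facebook": ["Meta"],
    "x": ["Twitter"],
}

Lookup = Callable[[dict], Optional[dict]]


def match_person(person: dict, enrich: Lookup, search: Lookup) -> Optional[dict]:
    """Run the three stages in order and stop at the first hit."""
    # Stage 1: direct lookup.
    hit = enrich(person)
    if hit is not None:
        return {**hit, "stage": "enrich"}

    # Stage 2: search with the company name exactly as given.
    hit = search(person)
    if hit is not None:
        return {**hit, "stage": "search"}

    # Stage 3: retry search once per known alias of the input company.
    company = (person.get("company") or "").strip().lower()
    for alias in COMPANY_ALIASES.get(company, []):
        hit = search({**person, "company": alias})
        if hit is not None:
            return {**hit, "stage": "alias_search", "matchedAlias": alias}

    return None  # the caller would emit a not_found row here


# Stub lookups standing in for the two PDL endpoints (hypothetical behaviour).
def stub_enrich(person: dict) -> Optional[dict]:
    return None


def stub_search(person: dict) -> Optional[dict]:
    return {"fullName": person["name"]} if person.get("company") == "Meta" else None


result = match_person({"name": "Sarah Chen", "company": "Facebook"}, stub_enrich, stub_search)
print(result)
```

With these stubs, the direct lookup and the literal search both miss, and the match is found only on the alias retry.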

Quick summary

This is a Clay alternative that enriches contacts and automatically tells you which leads to contact, verify, or drop — in one API call at $0.15 per match. Use this to go from raw names to a ready-to-send outreach list in one run.

  • Input: list of names, emails, or LinkedIn URLs (with optional company / domain identifiers)
  • Output: enriched contact data + composite confidence score + decision (SEND_TO_OUTREACH / VERIFY_EMAIL / RESEARCH_MANUALLY / ENRICH_AGAIN / DROP) per record
  • Cost: $0.15 per successfully enriched person (pay-per-event, no subscription, not-found rows are free)
  • Match rate: 70–85% for US/UK enterprise contacts; 40–65% for smaller companies and non-English markets
  • Best for: SDRs who want enriched contacts AND a ready-to-use outreach list — enrichment + qualification + decisioning in a single run, no separate filtering step
  • Also great for: CRM enrichment + hygiene, recruiting / talent sourcing, weekly job-change monitoring, scheduled re-engagement campaigns
  • Avoid if: you need multi-source waterfall (use Waterfall Contact Enrichment) or a UI-driven prospecting platform like Apollo / ZoomInfo

When should you use this actor?

Use Person Enrichment Lookup if you need:

  • The cheapest way to enrich B2B contacts with PDL data — $0.15/match vs Clay's $0.22–$5.63
  • A Clay alternative without subscriptions or platform fees — pure pay-per-event, you only pay for matches
  • Deterministic enrichment with explainable confidence scoring — same input always produces the same score, with per-factor breakdown
  • A way to decide which leads to contact vs drop automatically — built-in Decision Layer assigns recommendedAction to every record
  • Weekly job-change monitoring for CRM contacts — schedule with compareToPrevRun: true and get back JOB_CHANGED / PROMOTION triggers
  • An API-first enrichment primitive for your own pipelines — clean structured JSON, idempotent webhooks, dataset views per use case

Avoid this actor if:

  • You need multi-source enrichment cascading across Apollo + Cognism + Hunter + Datagma — use Waterfall Contact Enrichment instead
  • You want a full UI-driven prospecting platform with intent signals, lists, and CRM sync — use Apollo or ZoomInfo
  • You need real-time intent signals (hiring, funding, news) — use Intent Signal Tracker
  • You want to scrape LinkedIn directly (a TOS violation; this actor only queries the PDL API)

How it compares: this actor vs Clay vs Apollo vs ZoomInfo

The cheapest alternative to Clay for PDL-based enrichment is this actor at $0.15 per match, roughly 1.5× to 37× cheaper than Clay's $0.22–$5.63 depending on plan tier, with no subscription, no platform fee, and no cost for failed lookups. It is also a low-cost alternative to Apollo ($99–$249/month) and ZoomInfo ($15,000+/year) for teams that want raw enrichment + qualification without the platform overhead. It uses the same data source as Clay's most-used PDL action and adds deterministic scoring and built-in decisioning.

No subscriptions, no workflows, and no scoring rules — just enrich and get a ready-to-use outreach list. Unlike Clay, Apollo, or ZoomInfo, the decision layer is computed per record, deterministically, in the same run as the enrichment.

| Feature | This actor | Clay | Apollo | ZoomInfo |
|---|---|---|---|---|
| Tells you who to contact (vs verify vs drop) | Yes (deterministic) | No | No | No |
| Data source | People Data Labs | PDL + 50+ providers | Proprietary | Proprietary |
| Pricing | $0.15 per match | $0.22–$5.63 per PDL enrichment | $99–$249/month | $15,000+/year |
| Subscription required | No | Yes (platform fee) | Yes | Yes (annual) |
| Pay only for matches | Yes | Mixed | No | No |
| Deterministic confidence scoring | Yes (0–100, audited weights) | No | No | No |
| Match explainability + alternates | Yes (matchDebug.alternates[]) | Limited | None | None |
| Decision layer (what to do next) | Yes (recommendedAction enum) | Manual | No | No |
| Why-not-found classification | Yes (notFoundAnalysis) | No | No | No |
| Cross-run change monitoring | Yes (changeFlags[] + triggers) | Partial (workflows) | No | Limited |
| Email MX + disposable validation | Yes (built-in) | Add-on | Add-on | Add-on |
| Filter-before-charge | Yes (free filtering) | No | N/A | N/A |
| API-first / webhook-native | Yes | Partial | Partial | API tier only |
| Bulk CSV upload | Up to 1,000 / run | Yes | Yes | Yes |

Why this actor is different

Most B2B enrichment tools return raw data and leave the workflow to you. This actor closes the loop:

  • Tells you what to do with each record — every row carries a recommendedAction (SEND_TO_OUTREACH / VERIFY_EMAIL / RESEARCH_MANUALLY / ENRICH_AGAIN / DROP), an actionPriority, and a one-sentence actionReason. Branch your Slack / Zapier / CRM rules directly on the action enum
  • Filters BEFORE charging: requireFields[], minConfidenceScore, and changeFlagsFilter apply after enrichment but before PPE charging, so weak records cost zero
  • Explains every match — opt-in matchDebug block surfaces which input signals were queried, how many candidates PDL returned, why the winner was selected, and (with includeCandidates) the alternates that were rejected
  • Deterministic — same input always produces the same output — no LLM calls, no per-user weight magic, no hidden randomness. Confidence scores are comparable across teams and runs

Key advantages

  • Deterministic scoring — same input always produces the same result
  • Pay only for matches — failed lookups cost zero
  • Built-in decision layer — no manual filtering or rules engine required
  • Full match explainability — see exactly why each record was selected and which alternates were rejected
  • Filter-before-charge — requireFields, minConfidenceScore, and changeFlagsFilter apply before billing fires
  • Cross-run change detection — schedule weekly to catch job changes, promotions, and email changes while still timely
  • API-first — clean structured JSON, idempotent webhooks, named dataset views

Capabilities

Contact enrichment (PDL-powered) · Email discovery + MX / disposable / role-account validation · Composite confidence scoring (0–100, audited weights) · Decision layer (deterministic action enum) · Lead segmentation (6 prebuilt cohorts) · Why-not-found classification · Identity resolution + input-quality scoring · Input deduplication (strict + fuzzy modes) · Match strategy presets (strict / balanced / aggressive) · Match explainability with rejected alternates · Company-name alias retry (~70 alias groups) · Cross-run change detection (job changes, promotions, email changes) · Trigger events with priority + recommended action · Cached-snapshot refresh skipping · Charge-safe quality filters · Cohort insights for dashboards · CRM enrichment pipelines · Outreach decisioning · Scheduled monitoring

Beyond raw PDL coverage, the actor ships fifteen built-in intelligence layers that competitors charge separately for — every layer is deterministic, auditable, and adds zero per-record cost beyond the base $0.15:

  • Decision Layer (closes the loop) — every record carries a recommendedAction enum (SEND_TO_OUTREACH / VERIFY_EMAIL / RESEARCH_MANUALLY / ENRICH_AGAIN / DROP), actionPriority (high/medium/low), and actionReason (one sentence). Deterministic — same record always produces the same action. Branch downstream automation on recommendedAction = 'SEND_TO_OUTREACH' and you have a clean automation gate.
  • Prebuilt segments: every run computes 6 ready-to-use cohort lists in SUMMARY, namely decision_makers (VP+/Director/C-suite), contactable_high_confidence, promotion_triggers, new_job_movers, requires_verification, and low_quality_drop_candidates. Each carries name + count + memberKeys for direct use in CRM list-builders. Empty segments are dropped from the output.
  • Why-Not-Found intelligence: not_found records carry a notFoundAnalysis block with a likelyCause enum (insufficient_identifiers / company_mismatch / name_common / no_pdl_coverage / pdl_search_below_threshold), a causeConfidence (0–1), and a suggestion. Ops teams stop guessing why their list missed.
  • Plain-English confidence narrative: confidenceReason ("Strong match: exact PDL match (likelihood 9/10) + work email with valid MX") + confidenceRisks[] stable enum tags (weak_pdl_likelihood / mx_invalid / disposable_email / role_account / common_name / thin_input_identifiers / email_not_validated). Humans don't think in weights; they read sentences.
  • Identity strength score: identityStrength (0–100) + identitySignals[] measure the quality of the INPUT, distinct from confidenceScore which measures the quality of the MATCH. Same low score, different fix: weak input → add identifiers; weak match → PDL has thin coverage.
  • Deterministic match strategy presets (strict / balanced / aggressive) — locks PDL min_likelihood, search size, and fallback behaviour into auditable presets so cross-user confidenceScore stays comparable
  • Email quality validation — DNS MX check + bundled 100-domain disposable list + 50-prefix role-account detection (info@, sales@) on every returned email, with per-domain caching so a batch of 50 contacts at apify.com resolves the domain once
  • Composite confidence score — single 0–100 number combining PDL likelihood, email validity, and identifier richness; sort, filter, or gate outreach with one column. Paired with confidenceLevel band (high / medium / low) and confidenceBreakdown per-factor audit
  • Company-name alias retry — when literal PDL Search misses on "Facebook" the actor automatically retries with "Meta Platforms"; ~70 alias groups (Alphabet ↔ Google, X ↔ Twitter, Block ↔ Square) baked in
  • Match explainability with alternates — opt-in matchDebug block per record: which input signals were queried, how many candidates PDL returned, why the winner was selected. Set includeCandidates to surface up to 4 rejected alternates under matchDebug.alternates[] (never charged, never separate dataset rows)
  • Charge-safe quality filters: requireFields[], minConfidenceScore, and changeFlagsFilter all run AFTER enrichment + validation but BEFORE pushData and PPE charging, so low-confidence or unwanted records cost zero
  • Input deduplication: dedupeMode: 'strict' | 'fuzzy' collapses duplicate input rows; every output row carries a duplicateGroupId so you can join the collapsed entries back
  • Cached-snapshot refresh skipping: skipPreviouslyEnriched: true echoes records that were enriched recently (within freshnessWindowDays) without re-hitting PDL; flagged as fromCachedSnapshot: true and not charged
  • Cross-run change detection + trigger events — opt-in snapshot keyed by stable PDL ID. Schedule the actor weekly and get back JOB_CHANGED, PROMOTION, EMAIL_CHANGED, LOCATION_CHANGED flags on every record, plus a triggerEvents[] array with priority (high/medium/low), recommendedAction, and a Slack-ready summary string per event
  • Cohort insights in SUMMARY — every run writes top industries, top company-size bands, top countries, seniority distribution, contactable-rate percentage, average confidence, and duplicate-rate percentage to the SUMMARY KV value for dashboard / GTM-team consumption

What is the cheapest way to enrich contacts?

The cheapest way to enrich contacts and build a ready-to-use outreach list is this actor at $0.15 per match — it tells you exactly which leads to contact, verify, or drop in one run, with no subscription or manual filtering.

This works because the actor combines People Data Labs data with a deterministic decision layer, so every record is automatically classified for outreach, verification, or rejection in the same run as the enrichment.

Common questions

How do I enrich a list of names with verified work emails?

For most B2B workflows, the best way to enrich contacts is a PDL-based API — this actor adds built-in decisioning on top, so you get a ready-to-use outreach list instead of raw data. Pass the list with at least name + company (or name + domain) per row, set validateEmails: true, and filter the output with confidenceScore >= 50 (or recommendedAction = 'SEND_TO_OUTREACH'). The actor returns the work email, runs an MX check on the domain, flags disposable addresses, and assigns a deterministic action so you can pipe SEND_TO_OUTREACH rows straight into your sequencing tool.

What's the cheapest way to build an SDR outreach list?

The cheapest way to build an SDR outreach list is to enrich contacts and automatically filter to SEND_TO_OUTREACH records — this actor does both in one run at $0.15 per match, with no CRM filtering, spreadsheets, or manual scoring. Pass your prospect list with validateEmails: true and requireFields: ["email"], and only the records that pass the deterministic Decision Layer reach pushData (and PPE charging). Filtered records cost zero.

How do I know which leads to actually contact?

Read the recommendedAction field on every record:

  • SEND_TO_OUTREACH — high confidence + valid deliverable email; safe to add to a sequence
  • VERIFY_EMAIL — moderate confidence; run through Bulk Email Verifier first
  • RESEARCH_MANUALLY — no work email but LinkedIn URL (or PDL had nothing); reach out via LinkedIn or research the person
  • ENRICH_AGAIN — transient PDL failure; re-run the row
  • DROP — low confidence or disposable email; skip

No scoring model, CRM rules, or workflows required — the decision is computed per record, deterministically, the same way every time. The actionReason field carries a one-sentence explanation usable directly in CRM tasks, Slack messages, or AI agent prompts.
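As a rough sketch of branching on the action enum, the snippet below groups exported records by recommendedAction. The five action names come from the list above; the sample records and the extra `UNRECOGNISED` bucket are assumptions added so that unexpected values are not silently lost.

```python
ACTIONS = (
    "SEND_TO_OUTREACH",
    "VERIFY_EMAIL",
    "RESEARCH_MANUALLY",
    "ENRICH_AGAIN",
    "DROP",
)


def route_by_action(records):
    """Group records by recommendedAction, keeping unknown values separate."""
    buckets = {action: [] for action in ACTIONS}
    buckets["UNRECOGNISED"] = []  # hypothetical catch-all, not an actor value
    for record in records:
        action = record.get("recommendedAction")
        key = action if action in ACTIONS else "UNRECOGNISED"
        buckets[key].append(record)
    return buckets


# Made-up sample rows for illustration.
sample = [
    {"inputName": "Sarah Chen", "recommendedAction": "SEND_TO_OUTREACH"},
    {"inputName": "James Okafor", "recommendedAction": "VERIFY_EMAIL"},
    {"inputName": "Priya Nair", "recommendedAction": "DROP"},
    {"inputName": "No Action"},
]

buckets = route_by_action(sample)
for action, rows in buckets.items():
    print(action, len(rows))
```

Each bucket can then feed a different downstream step, for example a sequencing tool for SEND_TO_OUTREACH and a verifier for VERIFY_EMAIL.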

Why were some contacts not found?

Every not_found record carries a notFoundAnalysis block with a likelyCause enum and a suggestion string. The five causes:

  • insufficient_identifiers — input only had a name; add company / domain / email / LinkedIn
  • name_common — common first name + only company; add the company domain to disambiguate
  • company_mismatch — try the company's parent or DBA name (Meta vs Facebook), or add the website domain
  • no_pdl_coverage — PDL has no record (common for freelancers, very small companies, non-English markets); try Email Pattern Finder
  • pdl_search_below_threshold — PDL Search returned candidates but none cleared the strategy preset's likelihood floor; re-run with matchStrategy: 'aggressive' to relax the floor

How do I detect job changes and promotions?

Set compareToPrevRun: true and schedule the actor weekly. On the second run onwards, every record carries a changeFlags[] array (JOB_CHANGED, PROMOTION, TITLE_CHANGED, EMAIL_CHANGED, etc.) plus a triggerEvents[] array with priority + Slack-ready summary per change. Combine with changeFlagsFilter: ["JOB_CHANGED", "PROMOTION"] to only save (and pay for) rows where something actually changed.
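If you prefer to filter after export instead of with changeFlagsFilter, the same "at least one matching flag" rule is easy to apply client-side. The sketch below assumes each trigger event exposes `priority` and `summary` keys; the text above describes those values but does not confirm the exact key names, and the sample records are made up.

```python
def has_wanted_change(record, wanted_flags):
    """True when the record carries at least one of the wanted change flags."""
    return not set(record.get("changeFlags", [])).isdisjoint(wanted_flags)


def digest_lines(records, wanted_flags):
    """Build one digest line per trigger event on records that pass the filter."""
    lines = []
    for record in records:
        if not has_wanted_change(record, wanted_flags):
            continue
        for event in record.get("triggerEvents", []):
            # "priority" and "summary" are assumed key names.
            lines.append(f"[{event['priority']}] {event['summary']}")
    return lines


records = [
    {
        "changeFlags": ["JOB_CHANGED", "EMAIL_CHANGED"],
        "triggerEvents": [{"priority": "high", "summary": "Sarah Chen moved to Beta Industries"}],
    },
    {"changeFlags": ["UNCHANGED"], "triggerEvents": []},
    {
        "changeFlags": ["LOCATION_CHANGED"],
        "triggerEvents": [{"priority": "low", "summary": "Priya Nair relocated"}],
    },
]

lines = digest_lines(records, ["JOB_CHANGED", "PROMOTION"])
print("\n".join(lines))
```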

How do I cut my enrichment cost on weekly schedules?

Set skipPreviouslyEnriched: true with freshnessWindowDays: 30. Records the actor enriched in the last 30 days are echoed from the KV snapshot without re-hitting PDL — flagged as fromCachedSnapshot: true and not charged. On a weekly schedule against a stable list, this typically cuts cost 80%+ after the first run.
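The freshness check can be pictured as a simple date comparison. The following is a sketch under stated assumptions: the snapshot store is modelled as a plain dict of key to last-enriched timestamp, and a snapshot exactly freshnessWindowDays old is treated as still fresh, which the actor may or may not do.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(enriched_at, now, freshness_window_days=30):
    """True when the snapshot is no older than the freshness window."""
    return now - enriched_at <= timedelta(days=freshness_window_days)


def split_by_freshness(snapshots, now, freshness_window_days=30):
    """Separate keys that could be echoed from cache from keys needing a new lookup."""
    cached, stale = [], []
    for key, enriched_at in snapshots.items():
        target = cached if is_fresh(enriched_at, now, freshness_window_days) else stale
        target.append(key)
    return cached, stale


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
snapshots = {
    "person-1": datetime(2024, 6, 23, tzinfo=timezone.utc),  # 7 days old
    "person-2": datetime(2024, 5, 1, tzinfo=timezone.utc),   # 60 days old
}

cached, stale = split_by_freshness(snapshots, now)
print("echo from cache:", cached)
print("re-enrich:", stale)
```

Only the keys in the stale list would trigger a new lookup, and therefore a possible charge, on the next run.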

How do I ship a "decision-makers only" list to my SDR team?

Read the segments array from the SUMMARY KV record after the run. The decision_makers segment carries the count + memberKeys (join-back keys) for every record where seniority is c_suite, vp, or director. Five other prebuilt segments are computed every run: contactable_high_confidence, promotion_triggers, new_job_movers, requires_verification, low_quality_drop_candidates.
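A minimal sketch of pulling one segment out of the SUMMARY record, assuming it has already been loaded into a Python dict. The `segments`, `name`, `count`, and `memberKeys` names follow the description above; the sample values are invented.

```python
def segment_members(summary, segment_name):
    """Return memberKeys for a named segment, or an empty list if it is absent.

    Empty segments are dropped from the output, so a missing segment is
    treated as having no members.
    """
    for segment in summary.get("segments", []):
        if segment.get("name") == segment_name:
            return list(segment.get("memberKeys", []))
    return []


# Invented SUMMARY payload for illustration.
summary = {
    "segments": [
        {"name": "decision_makers", "count": 2, "memberKeys": ["key-1", "key-7"]},
        {"name": "requires_verification", "count": 1, "memberKeys": ["key-3"]},
    ]
}

decision_makers = segment_members(summary, "decision_makers")
print(decision_makers)
```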

What data can you extract?

| Data Point | Source | Example |
|---|---|---|
| 📧 Work email | PDL verified | m.okonkwo@pinnaclegroup.com |
| 📧 Personal email | PDL verified | michael.okonkwo@gmail.com |
| 👤 Full name | PDL canonical | Michael Okonkwo |
| 💼 Job title | PDL current | VP of Engineering |
| 📊 Seniority level | PDL normalised | vp |
| 🏢 Department | PDL job sub-role | software |
| 🏢 Company name | PDL current employer | Pinnacle Group |
| 🌐 Company domain | PDL verified | pinnaclegroup.com |
| 🏭 Company industry | PDL SIC mapping | information technology |
| 👥 Company size | PDL band | 1001-5000 |
| 🔗 LinkedIn URL | PDL social graph | linkedin.com/in/mokonkwo |
| 🐦 Twitter URL | PDL social graph | twitter.com/mokonkwo |
| 💻 GitHub URL | PDL social graph | github.com/mokonkwo |
| 📍 Location | PDL geo | San Francisco, California, US |
| 📞 Phone / Mobile | PDL verified | +1-415-555-0182 |
| 🎓 Education (optional) | PDL academic | UC Berkeley, B.S. Computer Science |
| 📋 Work history (optional) | PDL experience | Google → Stripe → Pinnacle Group |
| 🛠️ Skills (optional) | PDL profile | ["Python", "Kubernetes", "AWS"] |
| 🎯 PDL match confidence | Raw likelihood | 8 (scale 0–10) |
| 💯 Composite confidence | 0–100 weighted score | 83 |
| 🟢 Confidence level | Band: high / medium / low | high (≥75), medium (≥50), low |
| Is contactable | Reachable + score ≥50 + not disposable | true / false |
| 📬 Email MX valid | DNS lookup confirms mail server | true / false |
| 🚫 Disposable email | Temp-mail / burner provider | false |
| 👥 Role account | Generic local-part (info@, sales@) | false |
| 🔄 Change flags | Cross-run delta codes | ["PROMOTION", "EMAIL_CHANGED"] |
| 📅 Days since last seen | Time between snapshots | 7 |

Why use Person Enrichment Lookup?

Manual person research takes 5–15 minutes per contact: LinkedIn search, company lookup, guessing email formats, cross-checking data sources. For a list of 200 people, that is roughly 17–50 hours of work that still produces patchy results.

Clay automates this same workflow, but prices PDL enrichment at $0.22–$5.63 per person depending on your plan tier. Apollo charges $99–$249/month for similar data. ZoomInfo starts at $15,000/year. This actor uses the same PDL database and charges $0.15 per successfully enriched person — no subscription, no monthly minimum, no wasted spend on lookups that return nothing.

  • Scheduling — run daily, weekly, or custom intervals to keep CRM data fresh as people change roles
  • API access — trigger enrichment runs from Python, JavaScript, or any HTTP client with a single API call
  • Proxy rotation — Apify's built-in infrastructure handles outbound requests reliably at scale
  • Monitoring — get Slack or email alerts when runs fail or when match rates drop unexpectedly
  • Integrations — connect output directly to Zapier, Make, HubSpot, Google Sheets, or webhooks

Features

  • Three-stage matching strategy: tries PDL Person Enrich (exact lookup by email/LinkedIn/name) first, falls back to PDL Person Search (Elasticsearch DSL with must clauses on first_name, last_name, job_company_name, job_company_website), then retries Search under known company aliases (Facebook→Meta, X→Twitter, etc.) to maximise match rate
  • Identifier flexibility — accepts any combination of name, email, company name, domain, and LinkedIn URL; more identifiers produce higher match confidence
  • Minimum likelihood filter — queries PDL with min_likelihood=2 and required=name to avoid low-confidence junk matches polluting your dataset
  • Seniority normalisation — maps PDL's raw job_title_levels array to a single canonical value using a priority ladder: c_suite → vp → director → manager → senior → entry → training
  • Email type separation — splits PDL's typed emails array into distinct email (work) and personalEmail fields, using work/professional type flags
  • Email quality validation (validateEmails: true) — DNS MX lookup confirms the email's domain has a mail server, a bundled list flags ~100 known disposable / temp-mail domains, and a 50-prefix table flags generic role accounts (info@, sales@, support@). MX results are cached per domain so a batch of 50 contacts at the same company resolves the domain once.
  • Composite confidence score — every record gets a 0–100 score combining PDL likelihood (60% weight) + email present (10%) + MX valid (15%) + identifier richness (5%) − disposable penalty (30%) − role-account penalty (10%), plus a confidenceLevel band (high ≥75 / medium ≥50 / low <50) and a confidenceBreakdown showing each factor's contribution
  • Is-contactable gate — single boolean true when the record has any reachable signal (email/phone/LinkedIn) AND scores ≥50 AND is not disposable. Filter your CSV with one column instead of stacking three checkboxes.
  • Company-name alias retry (useCompanyAliases: true, default on): when literal PDL Search returns 0 hits, the actor retries with known aliases for ~70 well-known companies. This catches cases where your input and PDL's stored company name sit on opposite sides of a rebrand (for example, records stored under "Meta Platforms" when the input says "Facebook", or under "X Corp" when the input says "Twitter").
  • Cross-run change detection (compareToPrevRun: true): snapshots every enriched person to a named Apify KV store keyed by pdlId (with name@domain fallback), then on the next run diffs the new enrichment against the prior snapshot. Emits a stable changeFlags[] enum on every record: NEW_PERSON, JOB_CHANGED, TITLE_CHANGED, PROMOTION, DEMOTION, EMAIL_CHANGED, EMAIL_GAINED, EMAIL_LOST, COMPANY_DOMAIN_CHANGED, LOCATION_CHANGED, NEW_LINKEDIN, UNCHANGED. Each record also carries a changeSinceLastRun block with previousJobTitle, previousCompany, previousSeniority, previousEmail, seniorityDirection (up/down/flat), daysSinceLastSeen. Schedule weekly to convert the actor from one-shot enrichment into a job-change monitor.
  • Filter-before-charge: changeFlagsFilter (e.g. ["JOB_CHANGED", "PROMOTION"]) and minConfidenceScore filters apply BEFORE pushData and BEFORE PPE charging, so low-confidence or unchanged records cost nothing
  • Concrete recommendations on failures — every not_found / error record carries a recommendation string with a specific next step ("Try Email Pattern Finder for a probable address from the company domain", "Add your own pdlApiKey", etc.) and a stable failureType enum so SQL/Sheets/Zapier rules can branch on the cause
  • Rate-limit + 5xx resilience — exponential-backoff retry (3 attempts on 429 honouring Retry-After, 2 attempts on 5xx, 2 attempts on network errors) instead of a single 429 retry, so transient PDL hiccups don't surface as not_found rows
  • Circuit breaker — after 5 consecutive PDL API errors the run stops cleanly with a clear status message, preventing wasted compute on dead upstreams
  • Spending limit awareness — checks Apify's eventChargeLimitReached flag after each successful enrichment and stops cleanly without losing already-pushed data
  • Per-item push then charge — every record is pushed to the dataset BEFORE the PPE event fires, so you can never be charged for data that wasn't saved
  • SUMMARY + OUTPUT in KV store — machine-readable run summary (matchRatePct, sourceCounts, failureCounts, confidenceCounts, changeFlagCounts, ppeChargesUsd) written to SUMMARY and OUTPUT keys for orchestrator quick-read; the dataset itself stays clean for spreadsheet exports
  • Configurable payload size — work history, education, and skills are disabled by default and enabled individually to keep payloads lean for high-volume runs
  • Input echo — every output row includes inputName, inputEmail, inputCompany, inputDomain so you can join enriched results back to your original list without manual matching
  • Not-found rows kept — unmatched people produce a source: "not_found" row with nulls rather than being silently dropped, keeping your row count consistent
  • Custom API key support — bring your own PDL API key (free tier: 100 calls/month from PDL directly) or use the built-in shared key; BYOK removes the shared monthly ceiling
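
The confidence band and the is-contactable gate in the list above are simple threshold rules, sketched below. The thresholds (75 and 50) and the email, phone, linkedinUrl, and confidenceScore field names come from this page; `isDisposable` is a hypothetical field name, since the exact key for the disposable flag is not given.

```python
def confidence_level(score):
    """Map a 0-100 composite score to its band: high >= 75, medium >= 50, else low."""
    if score >= 75:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


def is_contactable(record):
    """Reachable signal AND score >= 50 AND not a disposable email."""
    reachable = any(record.get(field) for field in ("email", "phone", "linkedinUrl"))
    scored = record.get("confidenceScore", 0) >= 50
    # "isDisposable" is an assumed key name for the disposable-email flag.
    disposable = bool(record.get("isDisposable", False))
    return reachable and scored and not disposable


record = {
    "email": "m.okonkwo@pinnaclegroup.com",
    "confidenceScore": 83,
    "isDisposable": False,
}

print(confidence_level(record["confidenceScore"]), is_contactable(record))
```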

Use cases for person enrichment lookup

Sales prospecting and SDR list building

SDRs and BDRs often receive account lists with only company names and contact names — no emails, no titles, no direct dial numbers. Run this actor against your target list before any outreach sequence starts. Get verified work emails, job titles, and seniority levels back in minutes, then filter to decision-makers (VP and above) before importing to your sales engagement platform.

Marketing agency lead generation

Agencies building prospect databases for clients need enriched contacts at volume. Feed this actor a CSV of names from LinkedIn Sales Navigator exports or event attendee lists. The dual-API fallback strategy catches contacts that PDL's direct enrich misses, raising effective match rates by 15–25% over single-endpoint approaches.

Recruiting and talent sourcing

Recruiters with candidate names but no contact details can enrich a shortlist in one run. The optional includeWorkHistory flag surfaces full career timelines — useful for quickly assessing trajectory without opening each LinkedIn profile manually. Enable includeSkills to match candidates to open role requirements programmatically.

CRM data enrichment and hygiene

Stale CRM records where contacts have changed jobs are a constant problem. Export your contact list, run it through this actor, and compare jobTitle and companyName against your stored values. Records where enriched company data diverges from CRM data flag contacts who have likely moved on. Combine with HubSpot Lead Pusher to write updates back automatically.

Job-change and promotion monitoring

Set compareToPrevRun: true, schedule the actor weekly against your CRM contact list, and the actor will diff every enrichment against the prior snapshot. Records where someone moved companies come back with changeFlags: ["JOB_CHANGED"]. Promotions surface as ["PROMOTION"] (seniority moved from manager to director, etc.). Combine compareToPrevRun: true with changeFlagsFilter: ["JOB_CHANGED", "PROMOTION"] to only save records where something actually changed — costs $0 on weeks when nobody in your CRM moved. Pipe the filtered output to Slack via the Apify Slack integration for a weekly "who moved this week" digest.

B2B lead scoring and qualification

Raw contact lists need context to be useful. After enrichment, use seniority to score contacts by decision-making authority, companySize to filter by ideal customer profile, and companyIndustry to segment by vertical. Pipe the enriched output into B2B Lead Qualifier for a 0–100 composite score built from 30+ signals.

Research and due diligence

Analysts investigating company leadership, founders, or board members need structured data fast. Batch enrichment of key individuals at a target company surfaces their career history, education, and professional networks. The matchConfidence score tells you when to trust the result and when to verify manually.

How to enrich person data

  1. Enter your person list: click the persons field and paste a JSON array. Each object needs at least one identity signal (name, email, company, domain, or linkedinUrl); for name-based lookups, pair the name with a company, domain, or email. Example: [{"name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com"}]. More identifiers per person means higher match confidence.

  2. Configure optional fields — Toggle on includeWorkHistory, includeEducation, or includeSkills if you need the richer data. Leave them off for faster, smaller-payload runs. Set maxPersons to cap spend if you are testing a large list.

  3. Run the actor — Click "Start" and wait. Processing runs at 5 people per second (200ms between PDL API calls). A list of 100 people typically completes in 25–35 seconds. A list of 1,000 completes in under 10 minutes.

  4. Download results — Go to the Dataset tab, then export as JSON, CSV, or Excel. Filter the export to exclude source: "not_found" rows if you only want matched records. Join back to your original list using the inputEmail or inputName columns.
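
The same run can be started over HTTP instead of the Console. The sketch below only builds the request against Apify's "run actor" endpoint (POST /v2/acts/{actorId}/runs) with the Python standard library; it does not send it. The actor ID `username~actor-name` and the token value are placeholders, not this actor's real identifiers.

```python
import json
import urllib.request

APIFY_RUNS_URL = "https://api.apify.com/v2/acts/{actor_id}/runs"


def build_run_request(actor_id, token, run_input):
    """Build (but do not send) a POST request that starts an actor run."""
    return urllib.request.Request(
        APIFY_RUNS_URL.format(actor_id=actor_id),
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


run_input = {
    "persons": [
        {"name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com"},
    ],
    "maxPersons": 100,
}

# Placeholder actor ID and token: substitute your own values.
request = build_run_request("username~actor-name", "YOUR_APIFY_TOKEN", run_input)
print(request.get_method(), request.full_url)
```

To start the run, pass the request to `urllib.request.urlopen(request)` and read the run details from the JSON response.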

Input parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| persons | array | Yes | | Array of person objects to enrich. Each object: name, email, company, domain, linkedinUrl (all optional, but at least one identity signal required) |
| pdlApiKey | string | No | Built-in | Your PDL API key. Built-in key covers up to 100 lookups/month. Bring your own key for higher volume |
| maxPersons | integer | No | 100 | Maximum persons to process in one run (1–1,000). Safety cap to prevent accidental over-billing |
| includeWorkHistory | boolean | No | false | Include full job history array in output. Each entry: company, title, startDate, endDate |
| includeEducation | boolean | No | false | Include education history (school, degree, field of study) in output |
| includeSkills | boolean | No | false | Include skills array from PDL profile in output |
| validateEmails | boolean | No | false | DNS MX check + disposable + role-account flag on every returned email; powers the composite confidenceScore |
| useCompanyAliases | boolean | No | true | Retry PDL Search under known aliases when literal name returns 0 hits (Facebook→Meta, etc.) |
| compareToPrevRun | boolean | No | false | Snapshot to KV store, diff against prior run, emit changeFlags[] + changeSinceLastRun |
| monitorStateKey | string | No | person-enrichment-monitor | Override the KV store name for cross-run snapshots; use different names per CRM segment |
| changeFlagsFilter | array | No | | Only save records matching at least one of these change flags (e.g. ["JOB_CHANGED", "PROMOTION"]); filtered records are not charged |
| minConfidenceScore | integer | No | preset default | Only save records with composite score ≥ N (recommended floor for cold outreach: 50). Defaults to the matchStrategy preset's floor (strict = 75, others = none) |
| matchStrategy | enum | No | balanced | Locks PDL min_likelihood, search size, and alias retry into auditable presets. strict = high-confidence only (likelihood ≥6, default minConfidenceScore=75); balanced = current default (likelihood ≥2, full fallback); aggressive = max coverage (likelihood ≥1, top-5 candidates, accepts lower-confidence matches when company domain matches input) |
| requireFields | array | No | | Skip + don't charge records missing any of these fields. Filter runs AFTER enrichment + validation, BEFORE pushData + charge. Allowed values: email, personalEmail, phone, mobilePhone, linkedinUrl, jobTitle, companyName, companyDomain |
| dedupeMode | enum | No | off | off = enrich every input row. strict = collapse exact (lowercased) name + email + linkedin matches. fuzzy = also collapses whitespace, punctuation, and company-name suffixes (Inc/Corp/Ltd) so "Acme" + acme.com == "Acme Corp" + acme.com |
| skipPreviouslyEnriched | boolean | No | false | Echo cached values from monitor state instead of re-hitting PDL when a snapshot exists within freshnessWindowDays. Massive cost saver on weekly schedules. Requires a populated monitor state; first run on a new state key always hits PDL |
| freshnessWindowDays | integer | No | 30 | How many days a snapshot stays "fresh" enough to skip on rerun (only used when skipPreviouslyEnriched is on). Lower = more re-enrichment, higher = more cost savings but staler data |
| includeCandidates | boolean | No | false | Fetch up to 5 candidates per person from PDL Search and surface alternates under matchDebug.alternates[]. Alternates are NOT charged, NOT separate dataset rows, NOT enriched further; they exist purely so you can audit the winning match |
| includeMatchDebug | boolean | No | false | Attach a matchDebug block to every record with strategyPreset, inputSignalsUsed, candidatesConsidered, and selectedReason. Auto-enabled when includeCandidates is on |
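Before starting a run, the constraints in the table above can be checked client-side. The sketch below is illustrative only: `validate_input` is a hypothetical helper, and treating name, email, and linkedinUrl as the identity signals is an assumption (a row carrying only a company or domain is rejected here).

```python
IDENTITY_KEYS = ("name", "email", "linkedinUrl")  # assumption: company/domain alone is not enough
STRATEGIES = ("strict", "balanced", "aggressive")
DEDUPE_MODES = ("off", "strict", "fuzzy")

def validate_input(run_input):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    persons = run_input.get("persons")
    if not isinstance(persons, list) or not persons:
        problems.append("persons must be a non-empty array")
        persons = []
    for i, person in enumerate(persons):
        has_signal = isinstance(person, dict) and any(
            str(person.get(key) or "").strip() for key in IDENTITY_KEYS
        )
        if not has_signal:
            problems.append(f"persons[{i}] needs a name, email, or linkedinUrl")
    max_persons = run_input.get("maxPersons", 100)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_persons, bool) or not isinstance(max_persons, int) or not 1 <= max_persons <= 1000:
        problems.append("maxPersons must be an integer from 1 to 1000")
    if run_input.get("matchStrategy", "balanced") not in STRATEGIES:
        problems.append("matchStrategy must be strict, balanced, or aggressive")
    if run_input.get("dedupeMode", "off") not in DEDUPE_MODES:
        problems.append("dedupeMode must be off, strict, or fuzzy")
    return problems
```

Running this before calling the actor catches malformed rows without spending a run on them.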

Input examples

Enrich by name and company (most common):

{
  "persons": [
    { "name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com" },
    { "name": "James Okafor", "company": "Beta Industries", "domain": "betaindustries.io" },
    { "name": "Priya Nair", "company": "Vertex Solutions" }
  ],
  "maxPersons": 100
}

Enrich by email with full profile data:

{
  "persons": [
    { "email": "s.chen@acmecorp.com" },
    { "email": "james.okafor@betaindustries.io" },
    { "linkedinUrl": "https://www.linkedin.com/in/priya-nair-vertex" }
  ],
  "includeWorkHistory": true,
  "includeEducation": true,
  "includeSkills": true,
  "maxPersons": 50
}

Quick test — single person:

{
  "persons": [
    { "name": "Jan Curn", "company": "Apify", "domain": "apify.com" }
  ],
  "maxPersons": 1
}

Weekly job-change monitor (scheduled run):

{
  "persons": [
    { "name": "Sarah Chen", "company": "Acme Corp" },
    { "name": "James Okafor", "company": "Beta Industries" },
    { "name": "Priya Nair", "company": "Vertex Solutions" }
  ],
  "compareToPrevRun": true,
  "monitorStateKey": "key-accounts-q2",
  "changeFlagsFilter": ["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"],
  "validateEmails": true
}

High-deliverability outreach list:

{
  "persons": [{ "name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com" }],
  "validateEmails": true,
  "minConfidenceScore": 75
}

Input tips

  • More identifiers improve match accuracy — providing name + company + domain together is significantly more reliable than name alone; email or LinkedIn URL give the highest confidence matches
  • Use domain alongside company name — the actor passes the domain as both website and a fallback company parameter to PDL, which increases match rate for companies with non-obvious names
  • Set maxPersons when testing — start with 5–10 to verify output quality before processing your full list; you only pay for successful enrichments
  • Batch in one run — processing 500 people in one run is faster and cheaper than 500 individual runs due to Apify platform overhead
  • Leave optional fields off for large batches: includeWorkHistory and includeSkills add significant payload size; only enable them when you need that data specifically

Output example

{
  "inputName": "Sarah Chen",
  "inputEmail": null,
  "inputCompany": "Acme Corp",
  "inputDomain": "acmecorp.com",
  "fullName": "Sarah Chen",
  "firstName": "Sarah",
  "lastName": "Chen",
  "email": "s.chen@acmecorp.com",
  "personalEmail": "sarah.chen.personal@gmail.com",
  "phone": "+1-415-555-0147",
  "mobilePhone": "+1-415-555-0147",
  "jobTitle": "Head of Product Marketing",
  "jobTitleRole": "marketing",
  "seniority": "director",
  "department": "product marketing",
  "companyName": "Acme Corp",
  "companyDomain": "acmecorp.com",
  "companyIndustry": "software / saas",
  "companySize": "501-1000",
  "companyLinkedinUrl": "https://www.linkedin.com/company/acme-corp",
  "linkedinUrl": "https://www.linkedin.com/in/sarahchen-pm",
  "twitterUrl": "https://twitter.com/sarahchenpm",
  "githubUrl": null,
  "facebookUrl": null,
  "location": "San Francisco, California, United States",
  "city": "San Francisco",
  "state": "California",
  "country": "United States",
  "workHistory": [
    { "company": "Stripe", "title": "Senior Product Marketing Manager", "startDate": "2019-03", "endDate": "2022-01" },
    { "company": "Salesforce", "title": "Product Marketing Manager", "startDate": "2016-07", "endDate": "2019-02" }
  ],
  "education": [
    { "school": "University of California Berkeley", "degree": "B.S.", "field": "Marketing" }
  ],
  "skills": ["go-to-market strategy", "product launches", "demand generation", "salesforce", "hubspot"],
  "pdlId": "qEnOZ98lib6NRxLIWAk0Vg_0000",
  "matchConfidence": 8,
  "enrichedAt": "2026-03-23T14:22:11.403Z",
  "source": "pdl_enrich"
}

Example decision output (Decision Layer + confidence narrative)

The fields below are added to every record by the Decision Layer + confidence narrative — branch your downstream automation on recommendedAction, paste actionReason directly into Slack messages, and surface confidenceReason + confidenceRisks[] in CRM tasks:

{
  "fullName": "Sarah Chen",
  "jobTitle": "Head of Product Marketing",
  "companyName": "Acme Corp",
  "email": "s.chen@acmecorp.com",
  "confidenceScore": 83,
  "confidenceLevel": "high",
  "confidenceReason": "Strong match (83/100): exact PDL match (likelihood 8/10) + work email with valid MX + rich input identifiers.",
  "confidenceRisks": [],
  "isContactable": true,
  "recommendedAction": "SEND_TO_OUTREACH",
  "actionPriority": "high",
  "actionReason": "High composite confidence (83/100) with a deliverable, person-targeted work email — safe to add to an outreach sequence immediately.",
  "identityStrength": 80,
  "identitySignals": ["email", "name+domain"],
  "notFoundAnalysis": null
}

And a not_found record carrying the why-not-found analysis instead:

{
  "fullName": null,
  "inputName": "John Smith",
  "inputCompany": "Acme",
  "source": "not_found",
  "failureType": "pdl-not-found",
  "recommendedAction": "RESEARCH_MANUALLY",
  "actionPriority": "medium",
  "actionReason": "PDL has no record for this person. Try Email Pattern Finder for a probable email pattern from the company domain, or research the person on LinkedIn directly.",
  "notFoundAnalysis": {
    "likelyCause": "name_common",
    "causeConfidence": 0.7,
    "suggestion": "\"john\" is a very common name — add the company domain (e.g. \"company.com\") to narrow the match, or supply the LinkedIn URL."
  }
}
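Branching on recommendedAction could look like the sketch below. The helper names `route_records` and `slack_line` are hypothetical, and sending unknown action values to manual review is an assumption rather than documented behaviour.

```python
from collections import defaultdict

ACTIONS = ("SEND_TO_OUTREACH", "VERIFY_EMAIL", "RESEARCH_MANUALLY", "ENRICH_AGAIN", "DROP")
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}

def route_records(records):
    """Group dataset records into one queue per recommendedAction, high priority first."""
    queues = defaultdict(list)
    for rec in records:
        action = rec.get("recommendedAction")
        if action not in ACTIONS:
            action = "RESEARCH_MANUALLY"  # assumption: unknown values get human review
        queues[action].append(rec)
    for queue in queues.values():
        queue.sort(key=lambda r: PRIORITY_ORDER.get(r.get("actionPriority"), 3))
    return dict(queues)

def slack_line(rec):
    """Build a one-line message from the record's own actionReason."""
    who = rec.get("fullName") or rec.get("inputName") or "unknown"
    return f"{rec['recommendedAction']}: {who} ({rec['actionReason']})"
```

Each queue can then be handed to a different downstream step, for example the outreach tool for SEND_TO_OUTREACH and a verifier for VERIFY_EMAIL.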

Output fields

| Field | Type | Description |
| --- | --- | --- |
| inputName | string or null | Name from input (for join-back) |
| inputEmail | string or null | Email from input (for join-back) |
| inputCompany | string or null | Company from input (for join-back) |
| inputDomain | string or null | Domain from input (for join-back) |
| fullName | string or null | PDL canonical full name |
| firstName | string or null | First name |
| lastName | string or null | Last name |
| email | string or null | Work/professional email |
| personalEmail | string or null | Personal email (if PDL has it) |
| phone | string or null | Primary phone number |
| mobilePhone | string or null | Mobile phone if separately listed |
| jobTitle | string or null | Current job title |
| jobTitleRole | string or null | Broad role category (e.g. engineering, sales) |
| seniority | string or null | Normalised seniority: c_suite, vp, director, manager, senior, entry |
| department | string or null | Department/sub-role (e.g. product marketing, software) |
| companyName | string or null | Current employer name |
| companyDomain | string or null | Current employer website domain |
| companyIndustry | string or null | Industry classification |
| companySize | string or null | Employee count band (e.g. 501-1000) |
| companyLinkedinUrl | string or null | Company LinkedIn page URL |
| linkedinUrl | string or null | Person's LinkedIn profile URL |
| twitterUrl | string or null | Twitter/X profile URL |
| githubUrl | string or null | GitHub profile URL |
| facebookUrl | string or null | Facebook profile URL |
| location | string or null | Full location string |
| city | string or null | City |
| state | string or null | State or region |
| country | string or null | Country |
| workHistory[] | array or null | Past jobs (enabled via includeWorkHistory) |
| workHistory[].company | string | Employer name |
| workHistory[].title | string | Job title held |
| workHistory[].startDate | string or null | Start date (YYYY-MM) |
| workHistory[].endDate | string or null | End date (YYYY-MM), null if current |
| education[] | array or null | Education records (enabled via includeEducation) |
| education[].school | string | Institution name |
| education[].degree | string or null | Degree type (e.g. B.S., MBA) |
| education[].field | string or null | Field of study |
| skills | array or null | Skills list (enabled via includeSkills) |
| pdlId | string or null | PDL internal person ID (stable unique identifier) |
| matchConfidence | number or null | PDL likelihood score 0–10 (10 = exact match) |
| enrichedAt | string | ISO 8601 timestamp of enrichment |
| source | string | pdl_enrich, pdl_search, pdl_search_alias, or not_found |
| recordType | string | Discriminator: person for enrichment results, error for run-level errors |
| emailMxValid | boolean or null | True when work email's domain has at least one MX record (validateEmails: true only) |
| emailIsDisposable | boolean or null | True when work email matches a known temp-mail / burner provider |
| emailIsRoleAccount | boolean or null | True when local-part is a generic role (info@, sales@, support@); a bad cold-outreach target |
| confidenceScore | integer or null | Composite 0–100 score (PDL likelihood + email signals + identifier richness) |
| confidenceLevel | string or null | high (≥75) / medium (≥50) / low band |
| confidenceBreakdown | object or null | Per-factor point breakdown (pdlLikelihoodPoints, emailPresentPoints, mxValidPoints, identifierRichnessPoints, disposablePenalty, roleAccountPenalty) |
| isContactable | boolean | True when the record is reachable AND scores ≥50 AND not disposable |
| changeFlags[] | array or null | Cross-run change codes (when compareToPrevRun: true) |
| changeSinceLastRun | object or null | Diff against prior snapshot |
| changeSinceLastRun.previousJobTitle | string or null | Title at last snapshot |
| changeSinceLastRun.previousCompany | string or null | Company at last snapshot |
| changeSinceLastRun.previousSeniority | string or null | Seniority at last snapshot |
| changeSinceLastRun.previousEmail | string or null | Work email at last snapshot |
| changeSinceLastRun.seniorityDirection | string or null | up / down / flat |
| changeSinceLastRun.daysSinceLastSeen | integer or null | Days between snapshots |
| failureType | string or null | no-identifier / pdl-not-found / pdl-credits-exhausted / pdl-rate-limited / pdl-api-error / unexpected-error |
| recommendation | string or null | Concrete next step when enrichment failed |
| recommendedAction | string | Decision Layer enum: SEND_TO_OUTREACH / VERIFY_EMAIL / RESEARCH_MANUALLY / ENRICH_AGAIN / DROP. Deterministic: the same record always produces the same action |
| actionPriority | string | Routing priority: high / medium / low |
| actionReason | string | One sentence explaining why this action was chosen; usable directly in CRM tasks, Slack messages, AI agent prompts |
| confidenceReason | string or null | Plain-English summary of confidenceScore ("Strong match: exact PDL match (likelihood 9/10) + work email with valid MX") |
| confidenceRisks[] | array or null | Stable enum tags downstream automation can branch on (e.g. ["common_name", "mx_invalid"]). Empty array on records with no risks |
| identityStrength | integer or null | 0–100 score reflecting INPUT identifier richness (vs confidenceScore which scores the MATCH) |
| identitySignals[] | array or null | Names of identifier signals the input row provided (e.g. ["email", "linkedinUrl"]) |
| notFoundAnalysis | object or null | Why-Not-Found intelligence on not-found rows |
| notFoundAnalysis.likelyCause | string | insufficient_identifiers / company_mismatch / name_common / no_pdl_coverage / pdl_search_below_threshold |
| notFoundAnalysis.causeConfidence | number | 0–1 score for how certain the cause classification is |
| notFoundAnalysis.suggestion | string | Concrete next step the user can act on immediately |
| triggerEvents[] | array or null | Higher-priority view of changeFlags: each event carries type, priority (high/medium/low), recommendedAction, and a Slack-ready summary string. Sorted high-priority first. Null when no monitor change fired |
| matchDebug | object or null | Match transparency block (when includeMatchDebug or includeCandidates is true) |
| matchDebug.strategyPreset | string | Which strategy preset ran (strict / balanced / aggressive) |
| matchDebug.inputSignalsUsed[] | array | Input identifier names PDL was queried with (e.g. ["name", "company", "domain"]) |
| matchDebug.candidatesConsidered | integer | How many candidate matches PDL Search returned (≥1 when matched) |
| matchDebug.selectedReason | string | Plain-English reason the winning candidate was selected |
| matchDebug.alternates[] | array | Rejected candidates from PDL Search (when includeCandidates: true). Each: { fullName, jobTitle, companyName, companyDomain, location, matchConfidence, rejectedReason }. NOT charged, NOT separate dataset rows |
| duplicateGroupId | string or null | Stable group ID shared by all duplicate input rows (when dedupeMode != 'off'). Null when dedupe is off or the row was unique |
| isPrimaryRecord | boolean | Always true on the row that was actually enriched (the first member of each dedup group) |
| fromCachedSnapshot | boolean | True when skipPreviouslyEnriched was on and this row was echoed from a fresh KV snapshot instead of a fresh PDL call. Cached rows are NOT charged |

How much does it cost to enrich person data?

Person Enrichment Lookup uses pay-per-event pricing — you pay $0.15 per successfully enriched person. People who return no match (source: "not_found") are not charged. Platform compute costs are included.

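| Scenario | People processed | Match rate (typical) | People enriched | Total cost |
| --- | --- | --- | --- | --- |
| Quick test | 10 | 80% | ~8 | ~$1.20 |
| Small batch | 50 | 80% | ~40 | ~$6.00 |
| Medium batch | 200 | 80% | ~160 | ~$24.00 |
| Large batch | 500 | 80% | ~400 | ~$60.00 |
| Enterprise batch | 1,000 | 80% | ~800 | ~$120.00 |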

You can set a maximum spending limit per run to control costs. The actor stops cleanly when your budget is reached, and all already-enriched records are saved to the dataset before the run ends.
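The arithmetic behind a cost estimate is short enough to sketch. The example below assumes the $0.15 per-match price stated above and treats the 80% match rate as a planning figure, not a guarantee; both helper names are hypothetical.

```python
PRICE_PER_MATCH_USD = 0.15  # per successfully enriched person, as stated above

def estimate_cost(people_processed, match_rate=0.80, price=PRICE_PER_MATCH_USD):
    """Return (expected charged matches, expected cost in USD); not_found rows are free."""
    enriched = round(people_processed * match_rate)
    return enriched, round(enriched * price, 2)

def max_matches_within(budget_usd, price=PRICE_PER_MATCH_USD):
    """How many charged matches fit inside a per-run spending limit."""
    # Work in whole cents to avoid floating-point drift.
    return round(budget_usd * 100) // round(price * 100)

enriched, cost = estimate_cost(500)
print(f"{enriched} matches, about ${cost:.2f}")
```

Working backwards with `max_matches_within` is a quick way to choose a spending limit for a given list size.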

Compare this to Clay at $0.22–$5.63 per PDL enrichment depending on plan, or Apollo at $99–$249/month for a contact database. At $0.15 per match, 500–1,000 enrichments per month cost $75–$150 with no subscription commitment.

Person enrichment using the API

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("ryanclinton/person-enrichment-lookup").call(run_input={
    "persons": [
        {"name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com"},
        {"name": "James Okafor", "company": "Beta Industries"},
        {"email": "priya.nair@vertexsolutions.com"},
    ],
    "maxPersons": 100,
    "includeWorkHistory": False,
    "includeEducation": False,
    "includeSkills": False,
})

# Print matched records only; not_found rows carry no profile data
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("source") != "not_found":
        print(f"{item['fullName']} | {item['jobTitle']} @ {item['companyName']} | {item['email']} | confidence: {item['matchConfidence']}")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/person-enrichment-lookup").call({
    persons: [
        { name: "Sarah Chen", company: "Acme Corp", domain: "acmecorp.com" },
        { name: "James Okafor", company: "Beta Industries" },
        { email: "priya.nair@vertexsolutions.com" },
    ],
    maxPersons: 100,
    includeWorkHistory: false,
    includeEducation: false,
    includeSkills: false,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    if (item.source !== "not_found") {
        console.log(`${item.fullName} | ${item.jobTitle} @ ${item.companyName} | ${item.email} | confidence: ${item.matchConfidence}`);
    }
}

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~person-enrichment-lookup/runs?token=YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"persons": [
{"name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com"},
{"email": "james.okafor@betaindustries.io"}
],
"maxPersons": 100
}'
# Fetch results (replace DATASET_ID from the run response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"

How to think about this actor

In B2B enrichment, data reliability depends on three factors: input quality, match confidence, and validation signals. This actor exposes all three explicitly. Scoring is deterministic: the same input always produces the same confidence score, with a full breakdown of why the match was selected.

This actor separates three concepts that other enrichment tools collapse into a single confusing number:

  • Identity strength — how good your INPUT data was (did you provide email, LinkedIn, name+domain, or just a name?). Surfaced as identityStrength (0–100) + identitySignals[].
  • Match confidence — how certain the enrichment RESULT is (PDL likelihood + email validity + identifier richness, with disposable / role-account penalties). Surfaced as confidenceScore (0–100) + confidenceLevel band + confidenceBreakdown per-factor audit.
  • Action decision — what the user should DO with this record (SEND_TO_OUTREACH / VERIFY_EMAIL / RESEARCH_MANUALLY / ENRICH_AGAIN / DROP). Surfaced as recommendedAction + actionPriority + actionReason.

These three are computed independently and combined into a single deterministic output per record. Same low score, different fix:

  • Low identity strength → fix your input list (add identifiers)
  • Low match confidence → PDL has thin coverage on this person (try Email Pattern Finder)
  • DROP / RESEARCH_MANUALLY decision → don't outreach without manual verification

The six phases below walk through how the data flows.

How Person Enrichment Lookup works

Phase 1 — Identifier assembly and PDL Enrich call

The actor builds a PDL Enrich API query from every available identifier on the person object. Email maps to email, LinkedIn URL to profile, name is split on whitespace into first_name + last_name, company maps to company, and domain maps to website. When no company name is present but a domain is, the domain is passed as both website and company to maximise PDL's fuzzy matching surface. The request requires at minimum one of: email, linkedinUrl, or name. Under the default balanced preset, the query includes min_likelihood=2 and required=name to suppress spurious matches; the strict and aggressive presets use likelihood floors of 6 and 1 respectively. A 30-second timeout is enforced via AbortController. HTTP 429 responses trigger one automatic retry after a 2-second pause before the request is marked as failed.
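The identifier mapping described above can be sketched as a pure function. This is an illustration, not the actor's source: `build_enrich_params` is a hypothetical name, the handling of multi-word names is an assumption, and the HTTP call, timeout, and 429 retry are left out.

```python
def build_enrich_params(person, min_likelihood=2):
    """Map one input person object to PDL Enrich query parameters (sketch)."""
    email = (person.get("email") or "").strip()
    linkedin = (person.get("linkedinUrl") or "").strip()
    name = (person.get("name") or "").strip()
    company = (person.get("company") or "").strip()
    domain = (person.get("domain") or "").strip()

    if not (email or linkedin or name):
        return None  # no usable identifier, so there is nothing to send

    params = {"min_likelihood": min_likelihood, "required": "name"}
    if email:
        params["email"] = email
    if linkedin:
        params["profile"] = linkedin
    if name:
        parts = name.split()
        params["first_name"] = parts[0]
        if len(parts) > 1:
            # Assumption: everything after the first token is the last name.
            params["last_name"] = " ".join(parts[1:])
    if company:
        params["company"] = company
    if domain:
        params["website"] = domain
        if not company:
            params["company"] = domain  # domain doubles as the company value
    return params
```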

Phase 2 — PDL Search fallback

When the Enrich endpoint returns HTTP 404 (no match in PDL's direct lookup index), and the input includes both a name and a company or domain, the actor constructs an Elasticsearch DSL query against PDL's Person Search endpoint. The query uses bool.must clauses matching first_name, last_name, and either job_company_name or job_company_website (a term filter for the domain, a match clause for the company name). By default only the top-ranked result (size: 1) is requested; the aggressive preset and includeCandidates raise this to as many as five candidates. This fallback path catches people whose PDL record cannot be hit by exact identifier match but who are findable by name-and-employer combination.
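A query of the shape described above might be assembled as follows. This is a sketch under stated assumptions: `build_search_query` is a hypothetical helper, using match clauses for the name fields, lowercasing values, and preferring the domain when both company and domain are present are all guesses, not confirmed actor behaviour.

```python
def build_search_query(person, size=1):
    """Build an Elasticsearch-style bool.must query for the Search fallback (sketch)."""
    name = (person.get("name") or "").split()
    company = (person.get("company") or "").strip()
    domain = (person.get("domain") or "").strip().lower()

    # The fallback needs a first and last name plus a company or domain.
    if len(name) < 2 or not (company or domain):
        return None

    must = [
        {"match": {"first_name": name[0].lower()}},
        {"match": {"last_name": " ".join(name[1:]).lower()}},
    ]
    if domain:
        # Exact term filter on the employer's website domain.
        must.append({"term": {"job_company_website": domain}})
    else:
        # Looser match clause on the employer's name.
        must.append({"match": {"job_company_name": company.lower()}})

    return {"query": {"bool": {"must": must}}, "size": size}
```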

Phase 3 — Data transformation and normalisation

The raw PDL profile goes through a set of deterministic transforms before being written to the dataset. Seniority is normalised from job_title_levels[] using a fixed priority ladder (c_suite takes precedence over vp, which takes precedence over director, and so on). Emails are typed: addresses with type=work or type=professional become email; addresses with type=personal become personalEmail; all other typed emails fall back to the work slot. Work history entries are mapped from PDL's experience[] structure, filtering out entries with neither company nor title. Education entries filter to records with a school name. The pdlId field carries PDL's stable internal person identifier, useful for deduplication across multiple enrichment runs.
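Two of these transforms are easy to illustrate. The sketch below assumes the level values already use the normalised labels listed in the output fields table and that each email entry is a dict with `address` and `type` keys; both helper names, the exact ladder order below director, and the "first address wins within a slot" rule are assumptions.

```python
SENIORITY_LADDER = ("c_suite", "vp", "director", "manager", "senior", "entry")

def normalise_seniority(job_title_levels):
    """Return the highest-ranked level present, or None when nothing matches."""
    levels = {str(level).lower() for level in (job_title_levels or [])}
    for rank in SENIORITY_LADDER:
        if rank in levels:
            return rank
    return None

def split_emails(emails):
    """Sort typed email entries into the work and personal slots."""
    work = personal = fallback = None
    for entry in emails or []:
        address, kind = entry.get("address"), entry.get("type")
        if not address:
            continue
        if kind in ("work", "professional"):
            work = work or address
        elif kind == "personal":
            personal = personal or address
        else:
            fallback = fallback or address  # any other type falls back to the work slot
    return {"email": work or fallback, "personalEmail": personal}
```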

Phase 4 — Email validation and composite confidence

When validateEmails: true, every returned work email is run through three deterministic checks: (a) DNS MX lookup with a 5-second timeout to confirm the domain has a mail server, (b) a bundled list of ~100 known disposable / temp-mail providers (mailinator, guerrillamail, 10minutemail families, etc.) to flag burner addresses, and (c) a 50-prefix table of generic role accounts (info@, sales@, support@, noreply@) to flag bad cold-outreach targets. MX results are cached per domain across the run so a batch of 50 contacts at the same company resolves the domain once. The composite confidenceScore (0–100) then combines PDL's raw likelihood (60% weight), email-present (10%), MX-valid (15%), identifier richness (5%), minus a 30-point disposable penalty and a 10-point role-account penalty. The confidenceLevel band (high ≥75 / medium ≥50 / low <50) and isContactable boolean (reachable AND score ≥50 AND not disposable) make spreadsheet filtering one-click.
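The offline parts of this phase can be sketched without any network access. The example below takes the composite score as given and omits the DNS MX lookup; the two sets are tiny illustrative samples rather than the bundled lists, the function names are hypothetical, and reading "reachable" as "a work email is present" is an assumption.

```python
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com", "10minutemail.com"}  # sample only
ROLE_PREFIXES = {"info", "sales", "support", "noreply"}  # sample only

def email_flags(email):
    """Flag burner domains and generic role mailboxes for one address."""
    local, _, domain = email.strip().lower().partition("@")
    return {
        "emailIsDisposable": domain in DISPOSABLE_DOMAINS,
        "emailIsRoleAccount": local in ROLE_PREFIXES,
    }

def confidence_level(score):
    """Band a 0-100 composite score: high >= 75, medium >= 50, otherwise low."""
    if score >= 75:
        return "high"
    if score >= 50:
        return "medium"
    return "low"

def is_contactable(email, score, is_disposable):
    """Reachable (assumed: an email exists) AND score >= 50 AND not disposable."""
    return bool(email) and score >= 50 and not is_disposable
```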

Phase 5 — Cross-run change detection (optional)

When compareToPrevRun: true, the actor opens a named Apify Key-Value store (default name person-enrichment-monitor, override via monitorStateKey) and reads the prior run's snapshot map. After enriching each person, it builds a stable identity key (pdlId if available, else name@companyDomain, else name|companyName, else email) and looks up the prior snapshot. The diff function compares jobTitle, companyName, seniority, email, companyDomain, city, country, and linkedinUrl, then emits a stable changeFlags[] enum: NEW_PERSON (no prior), JOB_CHANGED (company differs), TITLE_CHANGED (title differs at same company), PROMOTION / DEMOTION (seniority moved up or down), EMAIL_CHANGED / EMAIL_GAINED / EMAIL_LOST, COMPANY_DOMAIN_CHANGED, LOCATION_CHANGED, NEW_LINKEDIN, UNCHANGED. Each record also carries a changeSinceLastRun block with the prior values, the seniority direction (up / down / flat), and the days between snapshots. The full snapshot map is persisted back to KV at the end of the run so the next scheduled execution picks up where this one left off.
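The identity key and the diff can be sketched as below. This is an illustration only: the function names are hypothetical, the seniority ordering is inferred from the order of labels in the output fields table, and details such as flag order and case handling are assumptions.

```python
SENIORITY_RANK = {"entry": 0, "senior": 1, "manager": 2, "director": 3, "vp": 4, "c_suite": 5}

def identity_key(rec):
    """pdlId, else name@companyDomain, else name|companyName, else email."""
    name = (rec.get("fullName") or rec.get("inputName") or "").strip().lower()
    if rec.get("pdlId"):
        return rec["pdlId"]
    if name and rec.get("companyDomain"):
        return f"{name}@{rec['companyDomain'].lower()}"
    if name and rec.get("companyName"):
        return f"{name}|{rec['companyName'].lower()}"
    return (rec.get("email") or "").lower() or None

def change_flags(prev, curr):
    """Compare a prior snapshot with the current record and list what changed."""
    if prev is None:
        return ["NEW_PERSON"]
    flags = []
    if prev.get("companyName") != curr.get("companyName"):
        flags.append("JOB_CHANGED")
    elif prev.get("jobTitle") != curr.get("jobTitle"):
        flags.append("TITLE_CHANGED")  # title differs at the same company
    old_rank = SENIORITY_RANK.get(prev.get("seniority"))
    new_rank = SENIORITY_RANK.get(curr.get("seniority"))
    if old_rank is not None and new_rank is not None and old_rank != new_rank:
        flags.append("PROMOTION" if new_rank > old_rank else "DEMOTION")
    old_email, new_email = prev.get("email") or None, curr.get("email") or None
    if old_email != new_email:
        if old_email is None:
            flags.append("EMAIL_GAINED")
        elif new_email is None:
            flags.append("EMAIL_LOST")
        else:
            flags.append("EMAIL_CHANGED")
    if prev.get("companyDomain") != curr.get("companyDomain"):
        flags.append("COMPANY_DOMAIN_CHANGED")
    if (prev.get("city"), prev.get("country")) != (curr.get("city"), curr.get("country")):
        flags.append("LOCATION_CHANGED")
    if not prev.get("linkedinUrl") and curr.get("linkedinUrl"):
        flags.append("NEW_LINKEDIN")
    return flags or ["UNCHANGED"]
```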

Phase 6 — Filtering, charging, and output

Filters apply BEFORE pushData and BEFORE PPE charging — changeFlagsFilter (e.g. ["JOB_CHANGED", "PROMOTION"]) keeps only records where something interesting changed; minConfidenceScore keeps only records above a quality floor. Filtered records are not saved and not charged. PPE charging fires only after a kept record is pushed to the dataset and confirmed as source !== "not_found" — ensuring you never pay for a charge where the data was not saved. The actor monitors chargeResult.eventChargeLimitReached after every charge event and stops cleanly if your spending limit is reached. A 5-consecutive-failure circuit breaker stops the run if PDL is throwing repeated errors, preserving partial results. The run summary (matchRatePct, sourceCounts, failureCounts, confidenceCounts, changeFlagCounts, ppeChargesUsd) is written to the SUMMARY and OUTPUT keys in the default Key-Value store for orchestrators and dashboards to read; the dataset itself stays clean of summary noise so spreadsheet exports look uniform.
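The ordering (filter first, then save, then charge) can be sketched as follows. The function names are hypothetical, list appends stand in for pushData and the charge event, treating a null confidenceScore as 0 is an assumption, and the spending-limit check and circuit breaker are omitted.

```python
def passes_filters(rec, change_flags_filter=(), min_confidence_score=None, require_fields=()):
    """Return True when a record survives every configured filter."""
    if change_flags_filter and not set(change_flags_filter) & set(rec.get("changeFlags") or []):
        return False
    if min_confidence_score is not None and (rec.get("confidenceScore") or 0) < min_confidence_score:
        return False
    if any(not rec.get(field) for field in require_fields):
        return False
    return True

def push_and_charge(records, price_usd=0.15, **filters):
    """Save kept records, then charge only the kept rows that actually matched."""
    pushed, charged = [], 0
    for rec in records:
        if not passes_filters(rec, **filters):
            continue  # filtered out: not saved and not charged
        pushed.append(rec)  # stands in for pushData
        if rec.get("source") != "not_found":
            charged += 1  # stands in for the pay-per-event charge
    return pushed, charged, round(charged * price_usd, 2)
```

Because the charge sits after the save, a record that is dropped by a filter can never produce a charge.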

What this actor does NOT do

Honest scope so you don't pay for a tool that won't fix your problem:

  • Not a multi-source waterfall. This is a single-source PDL enricher with smart fallbacks (Search → Search-with-aliases). For Apollo + Cognism + Hunter + Datagma cascading, use Waterfall Contact Enrichment.
  • Not a LinkedIn scraper. No HTML scraping of linkedin.com — that violates LinkedIn's Terms of Service. The actor only queries People Data Labs' API, which aggregates from public professional sources.
  • Not an SMTP verifier. The emailMxValid flag confirms the domain has a mail server, but does NOT send a test email or run RCPT TO probes. For full SMTP-level verification (catch-all detection, mailbox existence), pipe results through Bulk Email Verifier.
  • Not an email-pattern guesser. When PDL has no record (failureType: "pdl-not-found"), this actor doesn't fabricate a probable email from the domain. For pattern fallback, chain the not_found rows into Email Pattern Finder.
  • Not a real-time sales platform. This is a per-batch enrichment tool, not a live data feed. PDL refreshes its database on a rolling basis with profile lag of 2–6 months. For real-time intent signals (hiring, funding, news), use Intent Signal Tracker.
  • Not a tech-stack detector. Company industry and size come from PDL; the tools a company uses do not. Combine with Decision-Grade Website Intelligence for technographics.
  • Not a buying-committee mapper. This actor enriches one person at a time. For grouping multiple contacts at the same company by buying-committee role (decision-makers, champions, blockers), use Website Contact Scraper which emits a buyingCommittee block.
  • Not an Apollo / ZoomInfo replacement at the platform level. Those are full sales-intelligence platforms with prospecting search, intent signals, lists, and CRM sync. This actor is the per-person enrichment primitive — best chained into your own pipeline, not used as a CRM front-end.

Run summary in the Key-Value store

Every run writes a machine-readable summary to the default Key-Value store under both the SUMMARY and OUTPUT keys. Inspect via the Apify Console "Storage" tab or fetch via API:

curl "https://api.apify.com/v2/key-value-stores/<storeId>/records/SUMMARY"

Shape:

{
  "totalInput": 100,
  "totalUniqueAfterDedup": 96,
  "totalProcessed": 96,
  "enrichedCount": 78,
  "notFoundCount": 18,
  "pushedCount": 96,
  "chargedCount": 78,
  "filteredOut": 0,
  "skippedAsCached": 12,
  "duplicatesCollapsed": 4,
  "matchRatePct": 81,
  "ppeChargesUsd": 11.70,
  "spendingLimitReached": false,
  "creditsExhausted": false,
  "circuitBroken": false,
  "usingBuiltInKey": false,
  "matchStrategy": "balanced",
  "strategyDescription": "Default mode — PDL likelihood >=2, full fallback chain (Enrich -> Search -> Alias-Search), no automatic confidence floor. Best general-purpose mode.",
  "requireFields": [],
  "sourceCounts": { "pdl_enrich": 67, "pdl_search": 9, "pdl_search_alias": 2, "not_found": 18 },
  "failureCounts": { "pdl-not-found": 18 },
  "confidenceCounts": { "high": 60, "medium": 18, "low": 18 },
  "changeFlagCounts": { "JOB_CHANGED": 4, "PROMOTION": 2, "UNCHANGED": 72, "NEW_PERSON": 0 },
  "actionDistribution": {
    "SEND_TO_OUTREACH": 52,
    "VERIFY_EMAIL": 18,
    "RESEARCH_MANUALLY": 18,
    "DROP": 8
  },
  "segments": [
    { "name": "decision_makers", "label": "Decision makers", "filter": "seniority IN (c_suite, vp, director)", "count": 49, "memberKeys": ["sarah.chen@acmecorp.com", "..."] },
    { "name": "contactable_high_confidence", "label": "Contactable + high confidence", "filter": "isContactable = true AND confidenceLevel = \"high\"", "count": 52, "memberKeys": ["..."] },
    { "name": "promotion_triggers", "label": "Promotion triggers (since last run)", "filter": "changeFlags contains PROMOTION", "count": 2, "memberKeys": ["..."] },
    { "name": "new_job_movers", "label": "New job movers (since last run)", "filter": "changeFlags contains JOB_CHANGED", "count": 4, "memberKeys": ["..."] },
    { "name": "requires_verification", "label": "Requires email verification before outreach", "filter": "recommendedAction = VERIFY_EMAIL", "count": 18, "memberKeys": ["..."] },
    { "name": "low_quality_drop_candidates", "label": "Low quality / drop candidates", "filter": "recommendedAction = DROP", "count": 8, "memberKeys": ["..."] }
  ],
  "cohortInsights": {
    "topIndustries": [
      { "industry": "software / saas", "count": 31, "pct": 40 },
      { "industry": "fintech", "count": 14, "pct": 18 },
      { "industry": "healthcare", "count": 9, "pct": 12 }
    ],
    "topCompanySizes": [
      { "size": "1001-5000", "count": 22, "pct": 28 },
      { "size": "501-1000", "count": 18, "pct": 23 },
      { "size": "11-50", "count": 12, "pct": 15 }
    ],
    "topCountries": [
      { "country": "United States", "count": 52, "pct": 67 },
      { "country": "United Kingdom", "count": 11, "pct": 14 },
      { "country": "Germany", "count": 6, "pct": 8 }
    ],
    "seniorityDistribution": { "vp": 24, "director": 19, "manager": 17, "senior": 12, "c_suite": 6 },
    "contactableRatePct": 72,
    "avgConfidenceScore": 81,
    "duplicateRatePct": 4
  },
  "mxLookupsCached": 41,
  "completedAt": "2026-05-01T14:22:11.403Z"
}

This is what dashboards, orchestrators, and AI agents should read — the dataset stays clean for spreadsheet exports.
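An orchestrator reading the summary might use helpers like the ones below. This is a sketch that operates on an already-loaded summary dict; `segment_members` and `health_warnings` are hypothetical names, and only keys shown in the shape above are used.

```python
def segment_members(summary, segment_name):
    """Return the memberKeys of one prebuilt segment, or an empty list if it is absent."""
    for segment in summary.get("segments", []):
        if segment.get("name") == segment_name:
            return segment.get("memberKeys", [])
    return []

def health_warnings(summary):
    """List the run-level stop conditions that fired during the run."""
    flags = ("spendingLimitReached", "creditsExhausted", "circuitBroken")
    return [flag for flag in flags if summary.get(flag)]
```

With the Python client, the summary dict itself should be retrievable via `client.key_value_store(run["defaultKeyValueStoreId"]).get_record("SUMMARY")["value"]` after a run finishes.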

Tips for best results

  1. Provide email when you have it. Email is PDL's highest-confidence identifier. A lookup by email alone typically returns matchConfidence of 8–10, versus 5–7 for name + company alone.

  2. Include the company domain alongside the company name. The actor passes domain as both website and a fallback company parameter. This catches PDL records where the company name is stored differently than you have it (e.g. "Acme" vs "Acme Corp" vs "Acme Corporation").

  3. Filter output by matchConfidence before importing to CRM. Scores of 6 and above are generally reliable. Scores of 2–4 from the Search fallback path should be manually verified before sending outreach.

  4. Use source to understand your match rate. After a run, count records by source value: pdl_enrich (best), pdl_search (good, fallback), pdl_search_alias (fallback under a known company alias), not_found (no match). A high not_found rate often means input names need cleaning or the people are not in PDL's database (common for very small companies).

  5. For large batches, set maxPersons to a safe cap. If you have 5,000 names but are unsure of match rate, run 100 first to gauge quality and match rate before committing to the full list.

  6. Combine with Email Pattern Finder for un-enrichable contacts. When PDL returns not_found, you can still derive a probable work email by running the domain through Email Pattern Finder to get the company's email naming convention (e.g. {first}.{last}@domain.com).

  7. Schedule weekly runs to catch job changes. People change roles every 18–24 months on average. A weekly enrichment run against your active CRM contacts will surface companyName or jobTitle changes that indicate a contact has moved on.

  8. Bring your own PDL API key for enterprise volume. The built-in key is shared and subject to PDL's free-tier limits. A PDL paid account starts at $98/month and provides significantly higher call limits if you are running thousands of lookups daily.

  9. Use validateEmails: true when the output feeds an outreach tool. The DNS MX check is fast (cached per domain) and the disposable + role-account flags prevent paying enrichment fees on emails that will bounce or hit a generic inbox. Sort by confidenceScore descending and filter by isContactable === true for a clean outreach list.

  10. Schedule weekly with compareToPrevRun for job-change alerts. Set the schedule, point it at your CRM contact list, and add changeFlagsFilter: ["JOB_CHANGED", "PROMOTION"]. Only records where someone actually moved or got promoted are saved (and charged). Pipe to Slack via Apify's Slack integration for a "who moved this week" digest.

  11. Use a different monitorStateKey per CRM segment. The default state key is person-enrichment-monitor. If you're running monitors for multiple campaigns or segments (e.g. enterprise vs SMB, or different sales territories), set distinct monitorStateKey values per run so each segment keeps its own snapshot history.

Typical pipelines

Most teams chain this actor into one of three workflows. Each step's output flows into the next via Apify webhooks, Make / Zapier, or direct dataset reads.

Sales prospecting pipeline (cold outreach):

  1. Scrape contacts from company websites with Website Contact Scraper
  2. Enrich every contact with this actor (validateEmails: true, requireFields: ["email", "jobTitle"])
  3. Verify deliverability with Bulk Email Verifier on recommendedAction = VERIFY_EMAIL rows
  4. Push recommendedAction = SEND_TO_OUTREACH rows into your sequencing tool (HubSpot / Outreach / Apollo / Lemlist) via HubSpot Lead Pusher or a Zapier webhook

CRM hygiene + job-change monitoring (scheduled weekly):

  1. Export your active CRM contact list to Apify
  2. Run this actor with compareToPrevRun: true, changeFlagsFilter: ["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"], skipPreviouslyEnriched: true
  3. Pipe the filtered output to Slack for a "who moved this week" digest, or push back to your CRM to update stale records

Recruiting / talent sourcing:

  1. Build a candidate shortlist (LinkedIn Sales Navigator export, GitHub repo contributors, conference attendee list)
  2. Run this actor with includeWorkHistory: true, includeEducation: true, includeSkills: true, matchStrategy: 'aggressive'
  3. Read the decision_makers and contactable_high_confidence segments from the SUMMARY KV value to prioritise outreach

Combine with other Apify actors

| Actor | How to combine |
| --- | --- |
| Website Contact Scraper | Scrape a company website to find employee names and partial emails, then enrich each person here for complete profiles |
| Google Maps Email Extractor | Extract business owner contacts from Google Maps, then enrich each contact with job title and seniority via this actor |
| Email Pattern Finder | For contacts this actor returns as not_found, use Email Pattern Finder to derive a probable email from the company domain |
| Bulk Email Verifier | Verify the email field returned by enrichment before sending outreach; confirms MX record and SMTP deliverability |
| B2B Lead Qualifier | Pipe enriched records into the qualifier to score each contact 0–100 from seniority, company size, industry, and 27 other signals |
| HubSpot Lead Pusher | Write enriched contacts directly into HubSpot with mapped field names |
| Waterfall Contact Enrichment | Use as the PDL step in a 10-source waterfall cascade to maximise total enrichment coverage |
| Lead Enrichment Pipeline | All-in-one Clay alternative: email discovery, verification, company research, and scoring in one run ($0.12/lead) |
| AI Outreach Personalizer | Generate personalized cold emails using your own OpenAI/Anthropic key, with zero AI markup ($0.01/lead) |
| Intent Signal Tracker | Track buying signals: hiring, tech changes, funding, content updates. Prioritize outreach by intent score ($0.05/company) |
| Lead Data Quality Auditor | Audit lead data quality before outreach: email verification, phone validation, domain freshness ($0.005/record) |

Limitations

  • PDL database coverage — PDL covers approximately 3 billion professional profiles, but match rates vary by region and company size. Expect 70–85% match rates for US/UK professionals at companies with 50+ employees. Rates drop to 40–60% for freelancers, very small businesses, and professionals in markets where LinkedIn adoption is low.
  • No JavaScript rendering — this actor makes direct HTTP calls to the PDL API and does not scrape websites. If you need to extract contact details from JavaScript-rendered company pages before enriching, use Website Contact Scraper Pro first.
  • Data freshness — PDL refreshes its database on a rolling basis, but profiles may lag behind real-world job changes by 2–6 months. matchConfidence does not reflect recency, only identity match certainty.
  • 1,000 persons per run maximum — the maxPersons cap is enforced at 1,000. For larger lists, split into multiple runs or use the API to queue batches.
  • Built-in API key limit — the shared built-in PDL key covers a limited number of lookups per month across all users of the actor. For reliable high-volume use, bring your own PDL API key.
  • No bulk-lookup discount — PDL charges per API call regardless of batch size on the calling side. This actor's pricing reflects per-enriched-person charges.
  • Seniority for non-English titles — PDL's job_title_levels normalisation is less reliable for job titles in non-English languages. The seniority field may be null for some international profiles.
  • Phone number availability — PDL has phone data for roughly 20–30% of profiles. The phone and mobilePhone fields will be null for the majority of enriched contacts.
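
For the 1,000-person cap, splitting a larger list into batches before queuing runs can be as simple as the sketch below. How each batch is submitted (API call, schedule, webhook) is left out, and the contact records are made up.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

# Per-run cap stated in the limitations above.
MAX_PERSONS_PER_RUN = 1000


def chunk(items: Sequence[T], size: int = MAX_PERSONS_PER_RUN) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up input list of 2,500 contacts.
contacts = [{"email": f"user{i}@example.com"} for i in range(2500)]
batches = list(chunk(contacts))
batch_sizes = [len(batch) for batch in batches]
```

Here a 2,500-contact list becomes three runs of 1,000, 1,000, and 500 persons.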

Integrations

  • Zapier — trigger enrichment runs from a Zap when new contacts are added to a spreadsheet or CRM, then route enriched records to any Zapier-connected app
  • Make — build enrichment automations that run on a schedule or in response to form submissions, webhooks, or CRM events
  • Google Sheets — export enriched contact lists directly to a Sheet for team review or import into marketing tools
  • Apify API — trigger runs programmatically from your own backend, CRM integration scripts, or data pipeline
  • Webhooks — post enriched dataset results to your CRM or data warehouse endpoint immediately when a run completes
  • LangChain / LlamaIndex — use enriched person data as structured context for AI sales agents, automated outreach writers, or research assistants

Troubleshooting

  • High not_found rate despite correct names — verify your company names match how PDL stores them. "Meta" versus "Facebook" or "Alphabet" versus "Google" can cause misses on the Search fallback path. Try adding the domain (domain: "meta.com") alongside the company name to give PDL a domain-based match path.

  • matchConfidence is low (2–4) on returned records — low scores typically come from the PDL Search fallback path. These are fuzzy matches on name + employer. Always filter output by matchConfidence >= 5 before importing to CRM or sending outreach. Low-confidence records should be manually verified.

  • Run stops early with "PDL API credits exhausted" in the status — this means the built-in API key has hit its monthly limit. Either wait for the calendar month to reset, or add your own PDL API key in the pdlApiKey field. PDL's free tier provides 100 API calls/month per account at peopledatalabs.com.

  • Run stops with "Spending limit reached" — your Apify run budget cap was hit. This is by design. Increase your spending limit in the actor run settings, or process a smaller batch. All records enriched before the limit was reached are saved in the dataset.

  • Email field is null but personalEmail has a value — PDL has a personal email on file but no verified work email. You can use Email Pattern Finder on the person's company domain to derive the likely work email format.
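
The matchConfidence >= 5 filter recommended above could be applied to downloaded dataset rows roughly as follows. This is a sketch with made-up rows; note that rows with no score at all (such as not_found rows) also land in the review list here, which may or may not suit your workflow.

```python
from typing import Any


def split_by_match_confidence(
    rows: list[dict[str, Any]], threshold: int = 5
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Return (rows safe to import, rows needing manual verification)."""
    keep: list[dict[str, Any]] = []
    review: list[dict[str, Any]] = []
    for row in rows:
        score = row.get("matchConfidence")
        if score is not None and score >= threshold:
            keep.append(row)
        else:
            # Low-confidence or unscored rows are set aside, not deleted.
            review.append(row)
    return keep, review


# Made-up sample rows for illustration only.
rows = [
    {"fullName": "Jane Doe", "matchConfidence": 9},
    {"fullName": "Sam Roe", "matchConfidence": 3},
    {"fullName": "Ana Poe", "matchConfidence": None},
]
keep, review = split_by_match_confidence(rows)
```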

Responsible use

  • This actor only queries the People Data Labs API using data you provide. It does not scrape any websites or access private systems.
  • PDL's data is aggregated from public professional sources including LinkedIn, company websites, and professional directories.
  • Comply with GDPR, CAN-SPAM, CASL, and other applicable data protection laws when using enriched contact data for outreach.
  • Do not use enriched data for spam, harassment, profiling without lawful basis, or any purpose that violates the terms of service of the platforms from which the underlying data originates.
  • For guidance on the legal framework around B2B contact data, see Apify's guide on web scraping legality.

FAQ

How many people can I enrich in one run with Person Enrichment Lookup? Up to 1,000 persons per run. Set maxPersons in your input to control the limit. For lists larger than 1,000, split into multiple runs or use the Apify API to queue batches sequentially.

What identifiers does person enrichment require? Each person needs at minimum one of: email address, LinkedIn URL, or full name. For name-only records, at least one of company or domain is required to enable the Search fallback. More identifiers produce higher matchConfidence scores.

How accurate is person enrichment from PDL? PDL's direct Enrich endpoint (email or LinkedIn-based lookups) consistently returns matchConfidence of 8–10. Name + company searches via the fallback path return 5–7. Match rates vary by region: 75–85% for US/UK enterprise contacts, 40–65% for other markets. The matchConfidence score on every record tells you how much to trust each match.

How long does a typical person enrichment run take? The actor enforces a 200ms delay between PDL API calls to respect rate limits. A batch of 100 people completes in approximately 30 seconds. A batch of 1,000 completes in 4–8 minutes depending on how many Search fallback calls are needed.
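
The 200ms delay sets a floor on run time that is easy to check. The sketch below assumes one PDL call per person; Search fallback calls and network latency add to this floor, which is why the quoted durations are higher.

```python
def min_delay_seconds(api_calls: int, delay_ms: int = 200) -> float:
    """Lower bound on run time from the inter-call delay alone."""
    return api_calls * delay_ms / 1000


floor_100 = min_delay_seconds(100)    # 20.0 seconds of the ~30 second run
floor_1000 = min_delay_seconds(1000)  # 200.0 seconds, a little over 3 minutes
```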

Does person enrichment work for people outside the United States? Yes, PDL has global coverage, but match rates are higher for professionals in markets with strong LinkedIn adoption (US, UK, Canada, Australia, Western Europe). Match rates are lower for professionals in markets where LinkedIn is less dominant.

How is Person Enrichment Lookup different from Clay? Both use the People Data Labs API as a data source. Clay charges $0.22–$5.63 per PDL enrichment depending on your subscription tier, plus a monthly platform fee. This actor charges $0.15 per successfully enriched person with no subscription. You only pay for matches, not misses. The actor also provides a Search fallback path that Clay's PDL action does not expose, improving match rates for name + company inputs.

How is this different from Apollo or ZoomInfo? Apollo ($99–$249/month) and ZoomInfo ($15,000+/year) are full sales intelligence platforms. This actor is a single-purpose enrichment tool that uses PDL's independent B2B database. It costs less per lookup, requires no subscription, and outputs clean structured JSON that maps directly into any CRM or data pipeline. It does not replace the prospecting search features of Apollo or ZoomInfo.

Can I use my own People Data Labs API key? Yes. Add your PDL API key to the pdlApiKey field. The built-in key is shared and subject to PDL's free-tier rate limits. Bringing your own key removes that ceiling. PDL's free tier provides 100 API calls/month; paid plans start at $98/month.

Is person enrichment legal? PDL aggregates data from public professional sources. Querying PDL's API to enrich business contact data for legitimate B2B purposes is generally lawful. You must comply with applicable data protection regulations (GDPR, CAN-SPAM, CASL) in how you use the enriched data. For detailed guidance see Apify's legal guide.

What happens if a person is not found in PDL? The actor records a result row with source: "not_found" and null values for all enriched fields. You are not charged for not-found records. The row is still written to the dataset so your row count matches your input list, making join-back straightforward.

Can I schedule person enrichment to run automatically? Yes. Use Apify's built-in scheduler to run enrichment on a daily, weekly, or custom cron schedule. This is useful for refreshing CRM contact data on a recurring basis to catch job changes.

Can I combine person enrichment with other actors in an automated pipeline? Yes. The most common pipeline is: scrape company websites with Website Contact Scraper → enrich contacts here → verify emails with Bulk Email Verifier → push to HubSpot with HubSpot Lead Pusher. You can connect these steps via Apify webhooks or Make/Zapier automations.

How does the composite confidence score work? The confidenceScore (0–100) is a weighted combination of: PDL likelihood (60% weight, mapped from 0–10 to 0–60 points) + email present (10 points) + MX valid (15 points) + identifier richness (up to 5 points based on how many distinct identifiers the input had) − disposable email penalty (30 points) − role-account penalty (10 points). The confidenceLevel band is derived from the score: high ≥75, medium ≥50, low <50. The confidenceBreakdown object on every record shows exactly which factors contributed how many points so you can audit the score yourself. Use minConfidenceScore: 50 to filter out junk records before saving.
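
To audit a score by hand, the weights above can be reproduced in a few lines. This is a sketch of the described formula, not the actor's actual code: the richness points are taken as an input because the exact identifier-to-points mapping is not documented here, and clamping the result to 0–100 is an assumption. With the weights as listed, the highest reachable score is 90.

```python
def confidence_score(
    likelihood: int,           # PDL likelihood, 0-10
    has_email: bool,
    mx_valid: bool,
    richness_points: int,      # 0-5; the mapping from identifiers is not specified
    is_disposable: bool = False,
    is_role_account: bool = False,
) -> int:
    """Combine the documented weights into a single score."""
    score = likelihood * 6                      # 0-10 mapped to 0-60 points
    score += 10 if has_email else 0
    score += 15 if mx_valid else 0
    score += min(max(richness_points, 0), 5)    # capped at 5 points
    score -= 30 if is_disposable else 0
    score -= 10 if is_role_account else 0
    return max(0, min(100, score))              # ASSUMPTION: clamped to 0-100


def confidence_level(score: int) -> str:
    """Map a score to the documented bands."""
    if score >= 75:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


best_case = confidence_score(10, True, True, 5)
disposable_case = confidence_score(10, True, True, 5, is_disposable=True)
weak_case = confidence_score(6, True, False, 2)
```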

How does cross-run change detection work? Set compareToPrevRun: true and the actor opens a named Apify Key-Value store (default person-enrichment-monitor). It loads the prior run's snapshot map, enriches each person, looks them up by stable identity key (PDL ID first, falling back to name+domain), diffs the new record against the prior snapshot, and emits a changeFlags[] array with stable enum values: JOB_CHANGED, TITLE_CHANGED, PROMOTION, DEMOTION, EMAIL_CHANGED, LOCATION_CHANGED, NEW_PERSON, UNCHANGED, etc. The full updated snapshot map is persisted back to KV at the end of the run. The first run on a new state key produces NEW_PERSON flags on every record, because there is no prior snapshot to compare against. Schedule weekly to convert the actor from one-shot enrichment into a job-change monitor.
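
The diffing step can be pictured with the sketch below. It is an illustration under assumptions, not the actor's implementation: the `pdlId`, `fullName`, `domain`, and `location` field names are hypothetical, the rule "company changed means JOB_CHANGED" is assumed, and PROMOTION / DEMOTION are left out because they would need a seniority ranking.

```python
from typing import Any, Optional

# ASSUMPTION: which field drives which flag is inferred, not documented.
TRACKED_FIELDS = (
    ("companyName", "JOB_CHANGED"),
    ("jobTitle", "TITLE_CHANGED"),
    ("email", "EMAIL_CHANGED"),
    ("location", "LOCATION_CHANGED"),
)


def identity_key(record: dict[str, Any]) -> str:
    """PDL ID first, falling back to name + domain (field names hypothetical)."""
    if record.get("pdlId"):
        return f"pdl:{record['pdlId']}"
    name = (record.get("fullName") or "").strip().lower()
    domain = (record.get("domain") or "").strip().lower()
    return f"name:{name}|{domain}"


def change_flags(prev: Optional[dict[str, Any]], curr: dict[str, Any]) -> list[str]:
    """Diff one record against its prior snapshot."""
    if prev is None:
        return ["NEW_PERSON"]
    flags = [flag for field, flag in TRACKED_FIELDS if prev.get(field) != curr.get(field)]
    return flags or ["UNCHANGED"]


# Made-up prior snapshot map and a fresh enrichment result.
snapshots = {"pdl:abc123": {"pdlId": "abc123", "companyName": "Acme", "jobTitle": "VP Sales"}}
fresh = {"pdlId": "abc123", "companyName": "Globex", "jobTitle": "VP Sales"}

flags = change_flags(snapshots.get(identity_key(fresh)), fresh)
snapshots[identity_key(fresh)] = fresh  # persist the updated snapshot for the next run
```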

What happens when the actor encounters an alias-rebranded company like Facebook? With useCompanyAliases: true (default on), if PDL Search returns 0 hits for the literal company name (e.g. "Facebook"), the actor automatically retries with each known alias for that company ("Meta Platforms", "Meta", "Facebook Inc"). When an alias hits, the record's source field is set to pdl_search_alias so you can filter on aliased matches if needed. Roughly 70 alias groups are baked in, covering most Fortune 500 rebrands (Alphabet↔Google, X↔Twitter, Block↔Square, HPE / HP / Hewlett-Packard, etc.).
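
The retry logic described above might look roughly like this. The `search` callable stands in for the PDL Search call and is entirely hypothetical, as is the stub used to exercise it; only the Facebook alias group and the pdl_search_alias source value come from the description.

```python
from typing import Any, Callable, Optional

# One alias group taken from the example above; the real list is larger.
ALIAS_GROUPS = [("Facebook", "Meta Platforms", "Meta", "Facebook Inc")]

SearchFn = Callable[[str, str], list[dict[str, Any]]]


def aliases_for(company: str) -> list[str]:
    """Other names in the same alias group, in listed order."""
    wanted = company.strip().lower()
    for group in ALIAS_GROUPS:
        if wanted in (name.lower() for name in group):
            return [name for name in group if name.lower() != wanted]
    return []


def search_with_aliases(
    name: str, company: str, search: SearchFn
) -> tuple[list[dict[str, Any]], Optional[str]]:
    """Try the literal company first, then each alias; report which alias hit."""
    hits = search(name, company)
    if hits:
        return hits, None
    for alias in aliases_for(company):
        hits = search(name, alias)
        if hits:
            return hits, alias
    return [], None


def fake_search(name: str, company: str) -> list[dict[str, Any]]:
    """Hypothetical stub: only knows the company as "Meta"."""
    return [{"fullName": name, "companyName": company}] if company == "Meta" else []


hits, alias_used = search_with_aliases("Jane Doe", "Facebook", fake_search)
source = "pdl_search_alias" if alias_used else None
```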

How do I avoid being charged for low-confidence or unchanged records? Both filters apply BEFORE pushData and BEFORE PPE charging, so filtered records cost nothing. For confidence: set minConfidenceScore: 50 (or higher for cold outreach). For change-only monitoring: set compareToPrevRun: true plus changeFlagsFilter: ["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"]. Records that fail the filter are dropped from the dataset entirely and never trigger a PPE charge — perfect for weekly scheduled runs where most contacts haven't moved.
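
The filter-before-charge ordering can be sketched as follows. This illustrates the described behaviour rather than the actor's code, and it assumes a record passes the change filter when it carries at least one of the listed flags.

```python
from typing import Any, Optional


def passes_filters(
    record: dict[str, Any],
    min_confidence_score: Optional[int] = None,
    change_flags_filter: Optional[list[str]] = None,
) -> bool:
    """Decide whether a record would be saved (and therefore charged)."""
    if min_confidence_score is not None:
        if record.get("confidenceScore", 0) < min_confidence_score:
            return False
    if change_flags_filter is not None:
        # ASSUMPTION: any single matching flag is enough to pass.
        if not set(record.get("changeFlags", [])) & set(change_flags_filter):
            return False
    return True


# Made-up enriched records for illustration only.
records = [
    {"email": "a@example.com", "confidenceScore": 82, "changeFlags": ["JOB_CHANGED"]},
    {"email": "b@example.com", "confidenceScore": 90, "changeFlags": ["UNCHANGED"]},
    {"email": "c@example.com", "confidenceScore": 35, "changeFlags": ["PROMOTION"]},
]

saved = [
    r for r in records
    if passes_filters(r, min_confidence_score=50,
                      change_flags_filter=["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"])
]
charged_events = len(saved)  # only saved records would trigger a charge
```

In this example only the first record is saved: the second is unchanged and the third falls below the confidence threshold, so neither would be charged.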

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.