Person Data Enrichment — Email, Phone & LinkedIn
Pricing: from $150.00 per 1,000 enriched persons
Enriches person records with People Data Labs data at a lower cost than Clay, Apollo, or ZoomInfo. Name + company → verified work email, job title, seniority, phone, LinkedIn, and location. Pay-per-event: $0.15 per enriched person. No subscription.
Developer: ryan clinton
Person Data Enrichment
⚠️ Paid Apify plan required. This actor is restricted to users on any paid Apify plan (Starter / Scale / Business / Enterprise / Creator). Free-plan runs are blocked at start because Apify does not pay out developer revenue from free-plan users. See Apify pricing — even the entry tier unlocks the actor.
The deterministic contact intelligence engine that tells you exactly which records to use, which to drop, and why — in one run.
The cheapest way to enrich contacts and build a ready-to-use outreach list is this actor at $0.15 per match — it tells you exactly which leads to contact, verify, or drop in one run, with no subscription or manual filtering.
This works because the actor combines People Data Labs data with a deterministic decision layer that removes manual filtering — every record is automatically classified for outreach, verification, or rejection in the same run as the enrichment.
Most tools stop at enrichment. This actor enriches contacts using People Data Labs and deterministically tells you which leads to contact, verify, or drop in the same run, with full explainability and no subscription.
Explainable PDL-powered person enrichment with deterministic match strategies, a closed-loop decision layer (SEND_TO_OUTREACH / VERIFY_EMAIL / RESEARCH_MANUALLY / ENRICH_AGAIN / DROP), prebuilt segments, why-not-found intelligence, deduplication, cached refreshes, change triggers, and charge-safe quality filters — the same data source behind Clay's most-used enrichment action, at $0.15 per person versus Clay's $0.22–$5.63. Provide a list of names, emails, or LinkedIn URLs and get back verified work emails, job titles, seniority levels, phone numbers, company info, social profiles, and location data. No subscription required. Pay only for successful matches.
This actor calls the PDL Person Enrich API as the primary match method, then automatically falls back to PDL's Elasticsearch-based Person Search API when the direct lookup yields no result, and finally retries Search under known company aliases (Facebook → Meta, X → Twitter, etc.). Every record includes a matchConfidence score (0–10) so you always know how certain the match is. Batch up to 1,000 people in a single run, download as JSON or CSV, and plug into your CRM without manual cleanup.
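The three-stage cascade described above (direct enrich, then search, then search under company aliases) can be sketched as follows. This is a minimal illustration, not the actor's implementation: `match_person`, the `enrich` and `search` callables, the tiny `COMPANY_ALIASES` table, and the `"enrich"` / `"search"` / `"search_alias"` labels are all hypothetical stand-ins. Only the `"not_found"` source value comes from this README.

```python
from typing import Callable, Optional, Tuple

# Hypothetical two-entry alias table for illustration; the actor ships ~70 groups.
COMPANY_ALIASES = {
    "facebook": ["Meta Platforms", "Meta"],
    "twitter": ["X Corp", "X"],
}

# A lookup takes a person dict and returns a matched profile dict, or None.
Lookup = Callable[[dict], Optional[dict]]


def match_person(person: dict, enrich: Lookup, search: Lookup) -> Tuple[Optional[dict], str]:
    """Try direct enrich, then search, then search under each known alias."""
    hit = enrich(person)
    if hit is not None:
        return hit, "enrich"

    hit = search(person)
    if hit is not None:
        return hit, "search"

    company = (person.get("company") or "").strip().lower()
    for alias in COMPANY_ALIASES.get(company, []):
        # Re-run the search with only the company name swapped for the alias.
        hit = search({**person, "company": alias})
        if hit is not None:
            return hit, "search_alias"

    return None, "not_found"
```

Passing the two lookups in as callables keeps the ordering logic separate from any HTTP client, so the cascade can be exercised with stubs.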
Quick summary
This is a Clay alternative that enriches contacts and automatically tells you which leads to contact, verify, or drop — in one API call at $0.15 per match. Use this to go from raw names to a ready-to-send outreach list in one run.
- Input: list of names, emails, or LinkedIn URLs (with optional company / domain identifiers)
- Output: enriched contact data + composite confidence score + decision (`SEND_TO_OUTREACH` / `VERIFY_EMAIL` / `RESEARCH_MANUALLY` / `ENRICH_AGAIN` / `DROP`) per record
- Cost: $0.15 per successfully enriched person (pay-per-event, no subscription, not-found rows are free)
- Match rate: 70–85% for US/UK enterprise contacts; 40–65% for smaller companies and non-English markets
- Best for: SDRs who want enriched contacts AND a ready-to-use outreach list — enrichment + qualification + decisioning in a single run, no separate filtering step
- Also great for: CRM enrichment + hygiene, recruiting / talent sourcing, weekly job-change monitoring, scheduled re-engagement campaigns
- Avoid if: you need multi-source waterfall (use Waterfall Contact Enrichment) or a UI-driven prospecting platform like Apollo / ZoomInfo
When should you use this actor?
Use Person Enrichment Lookup if you need:
- The cheapest way to enrich B2B contacts with PDL data — $0.15/match vs Clay's $0.22–$5.63
- A Clay alternative without subscriptions or platform fees — pure pay-per-event, you only pay for matches
- Deterministic enrichment with explainable confidence scoring — same input always produces the same score, with per-factor breakdown
- A way to decide which leads to contact vs drop automatically: the built-in Decision Layer assigns `recommendedAction` to every record
- Weekly job-change monitoring for CRM contacts: schedule with `compareToPrevRun: true` and get back `JOB_CHANGED` / `PROMOTION` triggers
- An API-first enrichment primitive for your own pipelines: clean structured JSON, idempotent webhooks, dataset views per use case
Avoid this actor if:
- You need multi-source enrichment cascading across Apollo + Cognism + Hunter + Datagma — use Waterfall Contact Enrichment instead
- You want a full UI-driven prospecting platform with intent signals, lists, and CRM sync — use Apollo or ZoomInfo
- You need real-time intent signals (hiring, funding, news) — use Intent Signal Tracker
- You're scraping LinkedIn directly (TOS violation; this actor only queries the PDL API)
How it compares: this actor vs Clay vs Apollo vs ZoomInfo
The cheapest alternative to Clay for PDL-based enrichment is this actor at $0.15 per match, roughly 1.5× to 37× cheaper than Clay's $0.22–$5.63 depending on plan tier. There is no subscription, no platform fee, and no cost for failed lookups. It is also a low-cost alternative to Apollo ($99–$249/month) and ZoomInfo ($15,000+/year) for teams that want raw enrichment + qualification without the platform overhead. It uses the same data source as Clay's most-used PDL action and adds deterministic scoring and built-in decisioning.
No subscriptions, no workflows, and no scoring rules — just enrich and get a ready-to-use outreach list. Unlike Clay, Apollo, or ZoomInfo, the decision layer is computed per record, deterministically, in the same run as the enrichment.
| Feature | This actor | Clay | Apollo | ZoomInfo |
|---|---|---|---|---|
| Tells you who to contact (vs verify vs drop) | Yes (deterministic) | No | No | No |
| Data source | People Data Labs | PDL + 50+ providers | Proprietary | Proprietary |
| Pricing | $0.15 per match | $0.22–$5.63 per PDL enrichment | $99–$249/month | $15,000+/year |
| Subscription required | No | Yes (platform fee) | Yes | Yes (annual) |
| Pay only for matches | Yes | Mixed | No | No |
| Deterministic confidence scoring | Yes (0–100, audited weights) | No | No | No |
| Match explainability + alternates | Yes (matchDebug.alternates[]) | Limited | None | None |
| Decision layer (what to do next) | Yes (recommendedAction enum) | Manual | No | No |
| Why-not-found classification | Yes (notFoundAnalysis) | No | No | No |
| Cross-run change monitoring | Yes (changeFlags[] + triggers) | Partial (workflows) | No | Limited |
| Email MX + disposable validation | Yes (built-in) | Add-on | Add-on | Add-on |
| Filter-before-charge | Yes (free filtering) | No | N/A | N/A |
| API-first / webhook-native | Yes | Partial | Partial | API tier only |
| Bulk CSV upload | Up to 1,000 / run | Yes | Yes | Yes |
Why this actor is different
Most B2B enrichment tools return raw data and leave the workflow to you. This actor closes the loop:
- Tells you what to do with each record: every row carries a `recommendedAction` (`SEND_TO_OUTREACH` / `VERIFY_EMAIL` / `RESEARCH_MANUALLY` / `ENRICH_AGAIN` / `DROP`), an `actionPriority`, and a one-sentence `actionReason`. Branch your Slack / Zapier / CRM rules directly on the action enum
- Filters BEFORE charging: `requireFields[]`, `minConfidenceScore`, and `changeFlagsFilter` apply after enrichment but before PPE charging, so weak records cost zero
- Explains every match: the opt-in `matchDebug` block surfaces which input signals were queried, how many candidates PDL returned, why the winner was selected, and (with `includeCandidates`) the alternates that were rejected
- Deterministic: the same input always produces the same output, with no LLM calls, no per-user weight magic, and no hidden randomness. Confidence scores are comparable across teams and runs
Key advantages
- Deterministic scoring — same input always produces the same result
- Pay only for matches — failed lookups cost zero
- Built-in decision layer — no manual filtering or rules engine required
- Full match explainability — see exactly why each record was selected and which alternates were rejected
- Filter-before-charge: `requireFields`, `minConfidenceScore`, and `changeFlagsFilter` apply before billing fires
- Cross-run change detection: schedule weekly to catch job changes, promotions, and email changes while still timely
- API-first — clean structured JSON, idempotent webhooks, named dataset views
Capabilities
Contact enrichment (PDL-powered) · Email discovery + MX / disposable / role-account validation · Composite confidence scoring (0–100, audited weights) · Decision layer (deterministic action enum) · Lead segmentation (6 prebuilt cohorts) · Why-not-found classification · Identity resolution + input-quality scoring · Input deduplication (strict + fuzzy modes) · Match strategy presets (strict / balanced / aggressive) · Match explainability with rejected alternates · Company-name alias retry (~70 alias groups) · Cross-run change detection (job changes, promotions, email changes) · Trigger events with priority + recommended action · Cached-snapshot refresh skipping · Charge-safe quality filters · Cohort insights for dashboards · CRM enrichment pipelines · Outreach decisioning · Scheduled monitoring
Beyond raw PDL coverage, the actor ships fifteen built-in intelligence layers that competitors charge separately for — every layer is deterministic, auditable, and adds zero per-record cost beyond the base $0.15:
- Decision Layer (closes the loop): every record carries a `recommendedAction` enum (`SEND_TO_OUTREACH` / `VERIFY_EMAIL` / `RESEARCH_MANUALLY` / `ENRICH_AGAIN` / `DROP`), `actionPriority` (high/medium/low), and `actionReason` (one sentence). Deterministic: the same record always produces the same action. Branch downstream automation on `recommendedAction = 'SEND_TO_OUTREACH'` and you have a clean automation gate.
- Prebuilt segments: every run computes 6 ready-to-use cohort lists in SUMMARY: decision-makers (VP+/Director/C-suite), contactable-high-confidence, promotion-triggers, new-job-movers, requires-verification, low-quality-drop-candidates. Each carries name + count + memberKeys for direct use in CRM list-builders. Empty segments are dropped from the output.
- Why-Not-Found intelligence: `not_found` records carry a `notFoundAnalysis` block: `likelyCause` enum (`insufficient_identifiers` / `company_mismatch` / `name_common` / `no_pdl_coverage` / `pdl_search_below_threshold`) + `causeConfidence` (0–1) + `suggestion`. Ops teams stop guessing why their list missed.
- Plain-English confidence narrative: `confidenceReason` ("Strong match: exact PDL match (likelihood 9/10) + work email with valid MX") + `confidenceRisks[]` stable enum tags (`weak_pdl_likelihood` / `mx_invalid` / `disposable_email` / `role_account` / `common_name` / `thin_input_identifiers` / `email_not_validated`). Humans don't think in weights; they read sentences.
- Identity strength score: `identityStrength` (0–100) + `identitySignals[]` measure the quality of the INPUT, distinct from `confidenceScore` which measures the quality of the MATCH. Same low score, different fix: weak input → add identifiers; weak match → PDL has thin coverage.
- Deterministic match strategy presets (`strict` / `balanced` / `aggressive`): locks PDL `min_likelihood`, search size, and fallback behaviour into auditable presets so cross-user `confidenceScore` stays comparable
- Email quality validation: DNS MX check + bundled 100-domain disposable list + 50-prefix role-account detection (`info@`, `sales@`) on every returned email, with per-domain caching so a batch of 50 contacts at apify.com resolves the domain once
- Composite confidence score: single 0–100 number combining PDL likelihood, email validity, and identifier richness; sort, filter, or gate outreach with one column. Paired with a `confidenceLevel` band (high / medium / low) and a `confidenceBreakdown` per-factor audit
- Company-name alias retry: when literal PDL Search misses on "Facebook" the actor automatically retries with "Meta Platforms"; ~70 alias groups (Alphabet ↔ Google, X ↔ Twitter, Block ↔ Square) baked in
- Match explainability with alternates: opt-in `matchDebug` block per record: which input signals were queried, how many candidates PDL returned, why the winner was selected. Set `includeCandidates` to surface up to 4 rejected alternates under `matchDebug.alternates[]` (never charged, never separate dataset rows)
- Charge-safe quality filters: `requireFields[]`, `minConfidenceScore`, and `changeFlagsFilter` all run AFTER enrichment + validation but BEFORE pushData and PPE charging, so low-confidence or unwanted records cost zero
- Input deduplication: `dedupeMode: 'strict' | 'fuzzy'` collapses duplicate input rows; every output row carries a `duplicateGroupId` so you can join the collapsed entries back
- Cached-snapshot refresh skipping: `skipPreviouslyEnriched: true` echoes records that were enriched recently (within `freshnessWindowDays`) without re-hitting PDL; flagged as `fromCachedSnapshot: true` and not charged
- Cross-run change detection + trigger events: opt-in snapshot keyed by stable PDL ID. Schedule the actor weekly and get back `JOB_CHANGED`, `PROMOTION`, `EMAIL_CHANGED`, `LOCATION_CHANGED` flags on every record, plus a `triggerEvents[]` array with priority (high/medium/low), `recommendedAction`, and a Slack-ready summary string per event
- Cohort insights in SUMMARY: every run writes top industries, top company-size bands, top countries, seniority distribution, contactable-rate percentage, average confidence, and duplicate-rate percentage to the `SUMMARY` KV value for dashboard / GTM-team consumption
Common questions
How do I enrich a list of names with verified work emails?
For most B2B workflows, the best way to enrich contacts is a PDL-based API — this actor adds built-in decisioning on top, so you get a ready-to-use outreach list instead of raw data. Pass the list with at least name + company (or name + domain) per row, set validateEmails: true, and filter the output with confidenceScore >= 50 (or recommendedAction = 'SEND_TO_OUTREACH'). The actor returns the work email, runs an MX check on the domain, flags disposable addresses, and assigns a deterministic action so you can pipe SEND_TO_OUTREACH rows straight into your sequencing tool.
What's the cheapest way to build an SDR outreach list?
The cheapest way to build an SDR outreach list is to enrich contacts and automatically filter to SEND_TO_OUTREACH records — this actor does both in one run at $0.15 per match, with no CRM filtering, spreadsheets, or manual scoring. Pass your prospect list with validateEmails: true and requireFields: ["email"], and only the records that pass the deterministic Decision Layer reach pushData (and PPE charging). Filtered records cost zero.
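The ordering described above (filter first, then push, then charge) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: `passes_filters`, `push_then_charge`, and the `push` / `charge` callables are hypothetical names, and only `requireFields`, `minConfidenceScore`, and `confidenceScore` come from this README.

```python
def passes_filters(record, require_fields=(), min_confidence=None):
    """Return True when the record has every required field and clears the score floor."""
    for field in require_fields:
        if not record.get(field):  # missing, None, or empty string all count as absent
            return False
    if min_confidence is not None and record.get("confidenceScore", 0) < min_confidence:
        return False
    return True


def push_then_charge(records, push, charge, require_fields=(), min_confidence=None):
    """Filter BEFORE pushing, and push BEFORE charging; returns the number of charged rows."""
    charged = 0
    for record in records:
        if not passes_filters(record, require_fields, min_confidence):
            continue  # filtered rows are neither saved nor billed
        push(record)
        charge(record)
        charged += 1
    return charged
```

Because the charge call sits after the push call inside the same loop iteration, a row that fails a filter never reaches either step.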
How do I know which leads to actually contact?
Read the recommendedAction field on every record:
- `SEND_TO_OUTREACH`: high confidence + valid deliverable email; safe to add to a sequence
- `VERIFY_EMAIL`: moderate confidence; run through Bulk Email Verifier first
- `RESEARCH_MANUALLY`: no work email but LinkedIn URL (or PDL had nothing); reach out via LinkedIn or research the person
- `ENRICH_AGAIN`: transient PDL failure; re-run the row
- `DROP`: low confidence or disposable email; skip
No scoring model, CRM rules, or workflows required — the decision is computed per record, deterministically, the same way every time. The actionReason field carries a one-sentence explanation usable directly in CRM tasks, Slack messages, or AI agent prompts.
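Since the action enum is stable, downstream routing can be a plain grouping step. The sketch below assumes the dataset rows have been loaded as dicts; `route_by_action` and the `"UNKNOWN"` bucket are hypothetical additions for illustration, while the five action values are the ones listed above.

```python
from collections import defaultdict

ACTIONS = ("SEND_TO_OUTREACH", "VERIFY_EMAIL", "RESEARCH_MANUALLY", "ENRICH_AGAIN", "DROP")


def route_by_action(records):
    """Group records by recommendedAction; unexpected values land in 'UNKNOWN'."""
    buckets = defaultdict(list)
    for record in records:
        action = record.get("recommendedAction")
        if action not in ACTIONS:
            action = "UNKNOWN"  # hypothetical catch-all so nothing is silently lost
        buckets[action].append(record)
    return dict(buckets)
```

For example, `route_by_action(rows).get("SEND_TO_OUTREACH", [])` would give the rows to hand to a sequencing tool.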
Why were some contacts not found?
Every not_found record carries a notFoundAnalysis block with a likelyCause enum and a suggestion string. The five causes:
- `insufficient_identifiers`: input only had a name; add company / domain / email / LinkedIn
- `name_common`: common first name + only company; add the company domain to disambiguate
- `company_mismatch`: try the company's parent or DBA name (Meta vs Facebook), or add the website domain
- `no_pdl_coverage`: PDL has no record (common for freelancers, very small companies, non-English markets); try Email Pattern Finder
- `pdl_search_below_threshold`: PDL Search returned candidates but none cleared the strategy preset's likelihood floor; re-run with `matchStrategy: 'aggressive'` to relax the floor
How do I detect job changes and promotions?
Set compareToPrevRun: true and schedule the actor weekly. On the second run onwards, every record carries a changeFlags[] array (JOB_CHANGED, PROMOTION, TITLE_CHANGED, EMAIL_CHANGED, etc.) plus a triggerEvents[] array with priority + Slack-ready summary per change. Combine with changeFlagsFilter: ["JOB_CHANGED", "PROMOTION"] to only save (and pay for) rows where something actually changed.
How do I cut my enrichment cost on weekly schedules?
Set skipPreviouslyEnriched: true with freshnessWindowDays: 30. Records the actor enriched in the last 30 days are echoed from the KV snapshot without re-hitting PDL — flagged as fromCachedSnapshot: true and not charged. On a weekly schedule against a stable list, this typically cuts cost 80%+ after the first run.
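The freshness check behind this saving amounts to a date comparison. The sketch below is an assumption about how such a check could look, not the actor's code: `is_fresh`, `split_by_freshness`, and the shape of the `snapshots` mapping (key to last-enriched timestamp) are hypothetical; only the 30-day default mirrors `freshnessWindowDays`.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(enriched_at, now, freshness_window_days=30):
    """True when the snapshot is recent enough to reuse instead of re-querying."""
    return now - enriched_at <= timedelta(days=freshness_window_days)


def split_by_freshness(snapshots, keys, now, freshness_window_days=30):
    """Partition keys into (served from cache, needs a fresh lookup)."""
    cached, to_enrich = [], []
    for key in keys:
        enriched_at = snapshots.get(key)  # None when the key was never enriched
        if enriched_at is not None and is_fresh(enriched_at, now, freshness_window_days):
            cached.append(key)
        else:
            to_enrich.append(key)
    return cached, to_enrich
```

Only the keys in the second list would reach PDL, so on a stable list the billable portion shrinks to whatever has aged past the window.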
How do I ship a "decision-makers only" list to my SDR team?
Read the segments array from the SUMMARY KV record after the run. The decision_makers segment carries the count + memberKeys (join-back keys) for every record where seniority is c_suite, vp, or director. Five other prebuilt segments are computed every run: contactable_high_confidence, promotion_triggers, new_job_movers, requires_verification, low_quality_drop_candidates.
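Pulling one segment out of the SUMMARY record could look like the sketch below. It assumes the SUMMARY value has already been loaded as a dict with a `segments` list whose entries carry `name`, `count`, and `memberKeys`, as described above; the helper name `segment_member_keys` is hypothetical.

```python
def segment_member_keys(summary, segment_name):
    """Return the memberKeys for one named segment, or [] when it is absent.

    Empty segments are dropped from the output, so a missing segment is normal.
    """
    for segment in summary.get("segments", []):
        if segment.get("name") == segment_name:
            return list(segment.get("memberKeys", []))
    return []
```

A call such as `segment_member_keys(summary, "decision_makers")` then yields the join-back keys to match against dataset rows.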
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| 📧 Work email | PDL verified | m.okonkwo@pinnaclegroup.com |
| 📧 Personal email | PDL verified | michael.okonkwo@gmail.com |
| 👤 Full name | PDL canonical | Michael Okonkwo |
| 💼 Job title | PDL current | VP of Engineering |
| 📊 Seniority level | PDL normalised | vp |
| 🏢 Department | PDL job sub-role | software |
| 🏢 Company name | PDL current employer | Pinnacle Group |
| 🌐 Company domain | PDL verified | pinnaclegroup.com |
| 🏭 Company industry | PDL SIC mapping | information technology |
| 👥 Company size | PDL band | 1001-5000 |
| 🔗 LinkedIn URL | PDL social graph | linkedin.com/in/mokonkwo |
| 🐦 Twitter URL | PDL social graph | twitter.com/mokonkwo |
| 💻 GitHub URL | PDL social graph | github.com/mokonkwo |
| 📍 Location | PDL geo | San Francisco, California, US |
| 📞 Phone / Mobile | PDL verified | +1-415-555-0182 |
| 🎓 Education (optional) | PDL academic | UC Berkeley — B.S. Computer Science |
| 📋 Work history (optional) | PDL experience | Google → Stripe → Pinnacle Group |
| 🛠️ Skills (optional) | PDL profile | ["Python", "Kubernetes", "AWS"] |
| 🎯 PDL match confidence | Raw likelihood | 8 (scale 0–10) |
| 💯 Composite confidence | 0–100 weighted score | 83 |
| 🟢 Confidence level | Band: high / medium / low | high (≥75), medium (≥50), low |
| ✅ Is contactable | Reachable + score ≥50 + not disposable | true / false |
| 📬 Email MX valid | DNS lookup confirms mail server | true / false |
| 🚫 Disposable email | Temp-mail / burner provider | false |
| 👥 Role account | Generic local-part (info@, sales@) | false |
| 🔄 Change flags | Cross-run delta codes | ["PROMOTION", "EMAIL_CHANGED"] |
| 📅 Days since last seen | Time between snapshots | 7 |
Why use Person Enrichment Lookup?
Manual person research takes 5–15 minutes per contact: LinkedIn search, company lookup, guessing email formats, cross-checking data sources. For a list of 200 people, that is roughly 17–50 hours of work that still produces patchy results.
Clay automates this same workflow, but prices PDL enrichment at $0.22–$5.63 per person depending on your plan tier. Apollo charges $99–$249/month for similar data. ZoomInfo starts at $15,000/year. This actor uses the same PDL database and charges $0.15 per successfully enriched person — no subscription, no monthly minimum, no wasted spend on lookups that return nothing.
- Scheduling — run daily, weekly, or custom intervals to keep CRM data fresh as people change roles
- API access — trigger enrichment runs from Python, JavaScript, or any HTTP client with a single API call
- Proxy rotation — Apify's built-in infrastructure handles outbound requests reliably at scale
- Monitoring — get Slack or email alerts when runs fail or when match rates drop unexpectedly
- Integrations — connect output directly to Zapier, Make, HubSpot, Google Sheets, or webhooks
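The pricing comparison in this section reduces to simple arithmetic: only matched rows are billed, so the expected spend is list size × match rate × price per match. A minimal sketch, where `estimated_cost` is a hypothetical helper and the match rate is whatever you expect for your list (the README quotes 70–85% for US/UK enterprise contacts):

```python
PRICE_PER_MATCH_USD = 0.15  # pay-per-event price quoted in this README


def estimated_cost(list_size, match_rate, price_per_match=PRICE_PER_MATCH_USD):
    """Expected spend in USD when only successful matches are charged."""
    if not 0.0 <= match_rate <= 1.0:
        raise ValueError("match_rate must be between 0 and 1")
    return round(list_size * match_rate * price_per_match, 2)


# 1,000 people at an 80% match rate: 800 billable matches.
print(estimated_cost(1000, 0.80))                          # this actor
print(estimated_cost(1000, 0.80, price_per_match=0.22))    # Clay, low end of quoted range
```

The same function can be reused with any per-match price to compare providers on an equal number of matches.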
Features
- Three-stage matching strategy: tries PDL Person Enrich (exact lookup by email/LinkedIn/name) first, falls back to PDL Person Search (Elasticsearch DSL with `must` clauses on `first_name`, `last_name`, `job_company_name`, `job_company_website`), then retries Search under known company aliases (Facebook→Meta, X→Twitter, etc.) to maximise match rate
- Identifier flexibility: accepts any combination of name, email, company name, domain, and LinkedIn URL; more identifiers produce higher match confidence
- Minimum likelihood filter: queries PDL with `min_likelihood=2` and `required=name` to avoid low-confidence junk matches polluting your dataset
- Seniority normalisation: maps PDL's raw `job_title_levels` array to a single canonical value using a priority ladder: `c_suite → vp → director → manager → senior → entry → training`
- Email type separation: splits PDL's typed emails array into distinct `email` (work) and `personalEmail` fields, using `work` / `professional` type flags
- Email quality validation (`validateEmails: true`): DNS MX lookup confirms the email's domain has a mail server, a bundled list flags ~100 known disposable / temp-mail domains, and a 50-prefix table flags generic role accounts (`info@`, `sales@`, `support@`). MX results are cached per domain so a batch of 50 contacts at the same company resolves the domain once.
- Composite confidence score: every record gets a 0–100 score combining PDL likelihood (60% weight) + email present (10%) + MX valid (15%) + identifier richness (5%) − disposable penalty (30%) − role-account penalty (10%), plus a `confidenceLevel` band (`high` ≥75 / `medium` ≥50 / `low` <50) and a `confidenceBreakdown` showing each factor's contribution
- Is-contactable gate: single boolean, true when the record has any reachable signal (email/phone/LinkedIn) AND scores ≥50 AND is not disposable. Filter your CSV with one column instead of stacking three checkboxes.
- Company-name alias retry (`useCompanyAliases: true`, default on): when literal PDL Search returns 0 hits, the actor retries with known aliases for ~70 well-known companies. Catches PDL records that lag behind rebrands (Facebook records updated to "Meta Platforms"; Twitter records still under "X Corp").
- Cross-run change detection (`compareToPrevRun: true`): snapshots every enriched person to a named Apify KV store keyed by `pdlId` (with name@domain fallback), then on the next run diffs the new enrichment against the prior snapshot. Emits a stable `changeFlags[]` enum on every record: `NEW_PERSON`, `JOB_CHANGED`, `TITLE_CHANGED`, `PROMOTION`, `DEMOTION`, `EMAIL_CHANGED`, `EMAIL_GAINED`, `EMAIL_LOST`, `COMPANY_DOMAIN_CHANGED`, `LOCATION_CHANGED`, `NEW_LINKEDIN`, `UNCHANGED`. Each record also carries a `changeSinceLastRun` block with `previousJobTitle`, `previousCompany`, `previousSeniority`, `previousEmail`, `seniorityDirection` (up/down/flat), `daysSinceLastSeen`. Schedule weekly to convert the actor from one-shot enrichment into a job-change monitor.
- Filter-before-charge: `changeFlagsFilter` (e.g. `["JOB_CHANGED", "PROMOTION"]`) and `minConfidenceScore` filters apply BEFORE pushData and BEFORE PPE charging, so low-confidence or unchanged records cost nothing
- Concrete recommendations on failures: every `not_found` / error record carries a `recommendation` string with a specific next step ("Try Email Pattern Finder for a probable address from the company domain", "Add your own pdlApiKey", etc.) and a stable `failureType` enum so SQL/Sheets/Zapier rules can branch on the cause
- Rate-limit + 5xx resilience: exponential-backoff retry (3 attempts on 429 honouring `Retry-After`, 2 attempts on 5xx, 2 attempts on network errors) instead of a single 429 retry, so transient PDL hiccups don't surface as `not_found` rows
- Circuit breaker: after 5 consecutive PDL API errors the run stops cleanly with a clear status message, preventing wasted compute on dead upstreams
- Spending limit awareness: checks Apify's `eventChargeLimitReached` flag after each successful enrichment and stops cleanly without losing already-pushed data
- Per-item push then charge: every record is pushed to the dataset BEFORE the PPE event fires, so you can never be charged for data that wasn't saved
- SUMMARY + OUTPUT in KV store: machine-readable run summary (matchRatePct, sourceCounts, failureCounts, confidenceCounts, changeFlagCounts, ppeChargesUsd) written to `SUMMARY` and `OUTPUT` keys for orchestrator quick-read; the dataset itself stays clean for spreadsheet exports
- Configurable payload size: work history, education, and skills are disabled by default and enabled individually to keep payloads lean for high-volume runs
- Input echo: every output row includes `inputName`, `inputEmail`, `inputCompany`, `inputDomain` so you can join enriched results back to your original list without manual matching
- Not-found rows kept: unmatched people produce a `source: "not_found"` row with nulls rather than being silently dropped, keeping your row count consistent
- Custom API key support: bring your own PDL API key (free tier: 100 calls/month from PDL directly) or use the built-in shared key; BYOK removes the shared monthly ceiling
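The confidence bands and the is-contactable gate listed above are simple threshold rules, which the sketch below reproduces on the consumer side. The thresholds (75 and 50) and the reachable fields (`email`, `phone`, `linkedinUrl`) come from this README; the function names and the `isDisposable` field name are assumptions made for illustration, so check the actual output schema before relying on them.

```python
def confidence_level(score):
    """Map a 0-100 composite score to its band: high >= 75, medium >= 50, else low."""
    if score >= 75:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


def is_contactable(record):
    """Reachable signal AND score >= 50 AND not a disposable email."""
    reachable = any(record.get(key) for key in ("email", "phone", "linkedinUrl"))
    score_ok = record.get("confidenceScore", 0) >= 50
    disposable = bool(record.get("isDisposable", False))  # hypothetical field name
    return reachable and score_ok and not disposable
```

Recomputing the band locally is mainly useful for sanity-checking exports or for applying a stricter private threshold without re-running the actor.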
Use cases for person enrichment lookup
Sales prospecting and SDR list building
SDRs and BDRs often receive account lists with only company names and contact names — no emails, no titles, no direct dial numbers. Run this actor against your target list before any outreach sequence starts. Get verified work emails, job titles, and seniority levels back in minutes, then filter to decision-makers (VP and above) before importing to your sales engagement platform.
Marketing agency lead generation
Agencies building prospect databases for clients need enriched contacts at volume. Feed this actor a CSV of names from LinkedIn Sales Navigator exports or event attendee lists. The dual-API fallback strategy catches contacts that PDL's direct enrich misses, raising effective match rates by 15–25% over single-endpoint approaches.
Recruiting and talent sourcing
Recruiters with candidate names but no contact details can enrich a shortlist in one run. The optional includeWorkHistory flag surfaces full career timelines — useful for quickly assessing trajectory without opening each LinkedIn profile manually. Enable includeSkills to match candidates to open role requirements programmatically.
CRM data enrichment and hygiene
Stale CRM records where contacts have changed jobs are a constant problem. Export your contact list, run it through this actor, and compare jobTitle and companyName against your stored values. Records where enriched company data diverges from CRM data flag contacts who have likely moved on. Combine with HubSpot Lead Pusher to write updates back automatically.
Job-change and promotion monitoring
Set compareToPrevRun: true, schedule the actor weekly against your CRM contact list, and the actor will diff every enrichment against the prior snapshot. Records where someone moved companies come back with changeFlags: ["JOB_CHANGED"]. Promotions surface as ["PROMOTION"] (seniority moved from manager to director, etc.). Combine compareToPrevRun: true with changeFlagsFilter: ["JOB_CHANGED", "PROMOTION"] to only save records where something actually changed — costs $0 on weeks when nobody in your CRM moved. Pipe the filtered output to Slack via the Apify Slack integration for a weekly "who moved this week" digest.
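A weekly "who moved" digest can be assembled from the change fields directly. The sketch below assumes dataset rows are dicts carrying `changeFlags`, `changeSinceLastRun`, `jobTitle`, `companyName`, and `inputName` as named in this README; `digest_lines` and the line format are hypothetical choices for illustration.

```python
def digest_lines(records, wanted=("JOB_CHANGED", "PROMOTION")):
    """Build one summary line per record that carries at least one wanted change flag."""
    lines = []
    for record in records:
        flags = [flag for flag in record.get("changeFlags", []) if flag in wanted]
        if not flags:
            continue
        previous = record.get("changeSinceLastRun") or {}
        before = f"{previous.get('previousJobTitle')} at {previous.get('previousCompany')}"
        after = f"{record.get('jobTitle')} at {record.get('companyName')}"
        name = record.get("inputName", "?")
        lines.append(f"{name}: {', '.join(flags)} ({before} -> {after})")
    return lines
```

Joining the returned lines with newlines gives a message body that can be posted through whichever notification integration you use.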
B2B lead scoring and qualification
Raw contact lists need context to be useful. After enrichment, use seniority to score contacts by decision-making authority, companySize to filter by ideal customer profile, and companyIndustry to segment by vertical. Pipe the enriched output into B2B Lead Qualifier for a 0–100 composite score built from 30+ signals.
Research and due diligence
Analysts investigating company leadership, founders, or board members need structured data fast. Batch enrichment of key individuals at a target company surfaces their career history, education, and professional networks. The matchConfidence score tells you when to trust the result and when to verify manually.
How to enrich person data
- Enter your person list: click the `persons` field and paste a JSON array. Each object needs at minimum a `name` plus one of `company`, `domain`, or `email`. Example: `[{"name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com"}]`. More identifiers per person means higher match confidence.
- Configure optional fields: toggle on `includeWorkHistory`, `includeEducation`, or `includeSkills` if you need the richer data. Leave them off for faster, smaller-payload runs. Set `maxPersons` to cap spend if you are testing a large list.
- Run the actor: click "Start" and wait. Processing runs at 5 people per second (200ms between PDL API calls). A list of 100 people typically completes in 25–35 seconds. A list of 1,000 completes in under 10 minutes.
- Download results: go to the Dataset tab, then export as JSON, CSV, or Excel. Filter the export to exclude `source: "not_found"` rows if you only want matched records. Join back to your original list using the `inputEmail` or `inputName` columns.
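The join-back step can be done with a dictionary keyed on the echoed input fields. This is a minimal sketch assuming both lists are loaded as dicts; `join_enriched` is a hypothetical helper, and it keys on `inputEmail` first and `inputName` second, which is one reasonable choice rather than a documented rule.

```python
def join_enriched(original_rows, enriched_rows):
    """Attach each matched enrichment to its original row via the echoed input fields."""
    by_key = {}
    for row in enriched_rows:
        if row.get("source") == "not_found":
            continue  # keep only matched records
        key = (row.get("inputEmail") or row.get("inputName") or "").strip().lower()
        if key:
            by_key[key] = row

    joined = []
    for row in original_rows:
        key = (row.get("email") or row.get("name") or "").strip().lower()
        joined.append({**row, "enriched": by_key.get(key)})  # None when unmatched
    return joined
```

Every original row is preserved in order, so the output row count always equals the input row count.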
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `persons` | array | Yes | — | Array of person objects to enrich. Each object: name, email, company, domain, linkedinUrl (all optional, but at least one identity signal required) |
| `pdlApiKey` | string | No | Built-in | Your PDL API key. Built-in key covers up to 100 lookups/month. Bring your own key for higher volume |
| `maxPersons` | integer | No | 100 | Maximum persons to process in one run (1–1,000). Safety cap to prevent accidental over-billing |
| `includeWorkHistory` | boolean | No | false | Include full job history array in output. Each entry: company, title, startDate, endDate |
| `includeEducation` | boolean | No | false | Include education history (school, degree, field of study) in output |
| `includeSkills` | boolean | No | false | Include skills array from PDL profile in output |
| `validateEmails` | boolean | No | false | DNS MX check + disposable + role-account flag on every returned email; powers the composite confidenceScore |
| `useCompanyAliases` | boolean | No | true | Retry PDL Search under known aliases when literal name returns 0 hits (Facebook→Meta, etc.) |
| `compareToPrevRun` | boolean | No | false | Snapshot to KV store, diff against prior run, emit changeFlags[] + changeSinceLastRun |
| `monitorStateKey` | string | No | person-enrichment-monitor | Override the KV store name for cross-run snapshots; use different names per CRM segment |
| `changeFlagsFilter` | array | No | — | Only save records matching at least one of these change flags (e.g. ["JOB_CHANGED", "PROMOTION"]); filtered records are not charged |
| `minConfidenceScore` | integer | No | preset default | Only save records with composite score ≥ N (recommended floor for cold outreach: 50). Defaults to the matchStrategy preset's floor (strict = 75, others = none) |
| `matchStrategy` | enum | No | balanced | Locks PDL min_likelihood, search size, and alias retry into auditable presets. strict = high-confidence only (likelihood ≥6, default minConfidenceScore=75); balanced = current default (likelihood ≥2, full fallback); aggressive = max coverage (likelihood ≥1, top-5 candidates, accepts lower-confidence matches when company domain matches input) |
| `requireFields` | array | No | — | Skip + don't charge records missing any of these fields. Filter runs AFTER enrichment + validation, BEFORE pushData + charge. Allowed values: email, personalEmail, phone, mobilePhone, linkedinUrl, jobTitle, companyName, companyDomain |
| `dedupeMode` | enum | No | off | off = enrich every input row. strict = collapse exact (lowercased) name + email + linkedin matches. fuzzy = also collapses whitespace, punctuation, and company-name suffixes (Inc/Corp/Ltd) so "Acme" + acme.com == "Acme Corp" + acme.com |
| `skipPreviouslyEnriched` | boolean | No | false | Echo cached values from monitor state instead of re-hitting PDL when a snapshot exists within freshnessWindowDays. Massive cost saver on weekly schedules. Requires a populated monitor state; first run on a new state key always hits PDL |
| `freshnessWindowDays` | integer | No | 30 | How many days a snapshot stays "fresh" enough to skip on rerun (only used when skipPreviouslyEnriched is on). Lower = more re-enrichment, higher = more cost savings but staler data |
| `includeCandidates` | boolean | No | false | Fetch up to 5 candidates per person from PDL Search and surface alternates under matchDebug.alternates[]. Alternates are NOT charged, NOT separate dataset rows, NOT enriched further; they exist purely so you can audit the winning match |
| `includeMatchDebug` | boolean | No | false | Attach a matchDebug block to every record with strategyPreset, inputSignalsUsed, candidatesConsidered, and selectedReason. Auto-enabled when includeCandidates is on |
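The `dedupeMode` row above describes two ways of collapsing duplicate input rows. As a rough sketch of that idea (the helper names, the exact key composition for fuzzy mode, and the suffix set are assumptions for illustration, not the actor's code), the two modes could be modelled as key functions:

```python
import re

# Illustrative suffix set taken from the table's "Inc/Corp/Ltd" example (assumption).
COMPANY_SUFFIXES = {"inc", "corp", "ltd"}


def strict_key(person):
    """strict: exact match on lowercased name + email + linkedinUrl."""
    return tuple((person.get(f) or "").lower() for f in ("name", "email", "linkedinUrl"))


def _normalise(text):
    # Lowercase, turn punctuation into spaces, then collapse runs of whitespace.
    cleaned = re.sub(r"[^\w\s]", " ", (text or "").lower())
    return " ".join(cleaned.split())


def fuzzy_key(person):
    """fuzzy: also ignores whitespace, punctuation, and trailing company suffixes."""
    words = _normalise(person.get("company")).split()
    while words and words[-1] in COMPANY_SUFFIXES:
        words.pop()
    domain = (person.get("domain") or "").strip().lower()
    return (_normalise(person.get("name")), " ".join(words), domain)


def collapse(persons, key_fn):
    """Keep the first row of each duplicate group as the primary record."""
    primaries = {}
    for person in persons:
        primaries.setdefault(key_fn(person), person)
    return list(primaries.values())
```

With these keys, "Acme Corp." + acme.com and "Acme" + acme.com for the same name land in one group under `fuzzy_key` but stay separate under `strict_key`.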
Input examples
Enrich by name and company (most common):

```json
{
  "persons": [
    { "name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com" },
    { "name": "James Okafor", "company": "Beta Industries", "domain": "betaindustries.io" },
    { "name": "Priya Nair", "company": "Vertex Solutions" }
  ],
  "maxPersons": 100
}
```

Enrich by email with full profile data:

```json
{
  "persons": [
    { "email": "s.chen@acmecorp.com" },
    { "email": "james.okafor@betaindustries.io" },
    { "linkedinUrl": "https://www.linkedin.com/in/priya-nair-vertex" }
  ],
  "includeWorkHistory": true,
  "includeEducation": true,
  "includeSkills": true,
  "maxPersons": 50
}
```

Quick test (single person):

```json
{
  "persons": [{ "name": "Jan Curn", "company": "Apify", "domain": "apify.com" }],
  "maxPersons": 1
}
```

Weekly job-change monitor (scheduled run):

```json
{
  "persons": [
    { "name": "Sarah Chen", "company": "Acme Corp" },
    { "name": "James Okafor", "company": "Beta Industries" },
    { "name": "Priya Nair", "company": "Vertex Solutions" }
  ],
  "compareToPrevRun": true,
  "monitorStateKey": "key-accounts-q2",
  "changeFlagsFilter": ["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"],
  "validateEmails": true
}
```

High-deliverability outreach list:

```json
{
  "persons": [{ "name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com" }],
  "validateEmails": true,
  "minConfidenceScore": 75
}
```
Input tips
- More identifiers improve match accuracy: providing `name + company + domain` together is significantly more reliable than `name` alone; email or LinkedIn URL give the highest confidence matches
- Use domain alongside company name: the actor passes the domain as both `website` and a fallback `company` parameter to PDL, which increases match rate for companies with non-obvious names
- Set `maxPersons` when testing: start with 5–10 to verify output quality before processing your full list; you only pay for successful enrichments
- Batch in one run: processing 500 people in one run is faster and cheaper than 500 individual runs due to Apify platform overhead
- Leave optional fields off for large batches: `includeWorkHistory` and `includeSkills` add significant payload size; only enable them when you need that data specifically
Output example

```json
{
  "inputName": "Sarah Chen",
  "inputEmail": null,
  "inputCompany": "Acme Corp",
  "inputDomain": "acmecorp.com",
  "fullName": "Sarah Chen",
  "firstName": "Sarah",
  "lastName": "Chen",
  "email": "s.chen@acmecorp.com",
  "personalEmail": "sarah.chen.personal@gmail.com",
  "phone": "+1-415-555-0147",
  "mobilePhone": "+1-415-555-0147",
  "jobTitle": "Head of Product Marketing",
  "jobTitleRole": "marketing",
  "seniority": "director",
  "department": "product marketing",
  "companyName": "Acme Corp",
  "companyDomain": "acmecorp.com",
  "companyIndustry": "software / saas",
  "companySize": "501-1000",
  "companyLinkedinUrl": "https://www.linkedin.com/company/acme-corp",
  "linkedinUrl": "https://www.linkedin.com/in/sarahchen-pm",
  "twitterUrl": "https://twitter.com/sarahchenpm",
  "githubUrl": null,
  "facebookUrl": null,
  "location": "San Francisco, California, United States",
  "city": "San Francisco",
  "state": "California",
  "country": "United States",
  "workHistory": [
    { "company": "Stripe", "title": "Senior Product Marketing Manager", "startDate": "2019-03", "endDate": "2022-01" },
    { "company": "Salesforce", "title": "Product Marketing Manager", "startDate": "2016-07", "endDate": "2019-02" }
  ],
  "education": [
    { "school": "University of California Berkeley", "degree": "B.S.", "field": "Marketing" }
  ],
  "skills": ["go-to-market strategy", "product launches", "demand generation", "salesforce", "hubspot"],
  "pdlId": "qEnOZ98lib6NRxLIWAk0Vg_0000",
  "matchConfidence": 8,
  "enrichedAt": "2026-03-23T14:22:11.403Z",
  "source": "pdl_enrich"
}
```
Example decision output (Decision Layer + confidence narrative)
The fields below are added to every record by the Decision Layer + confidence narrative. Branch your downstream automation on recommendedAction, paste actionReason directly into Slack messages, and surface confidenceReason + confidenceRisks[] in CRM tasks:

```json
{
  "fullName": "Sarah Chen",
  "jobTitle": "Head of Product Marketing",
  "companyName": "Acme Corp",
  "email": "s.chen@acmecorp.com",
  "confidenceScore": 83,
  "confidenceLevel": "high",
  "confidenceReason": "Strong match (83/100): exact PDL match (likelihood 8/10) + work email with valid MX + rich input identifiers.",
  "confidenceRisks": [],
  "isContactable": true,
  "recommendedAction": "SEND_TO_OUTREACH",
  "actionPriority": "high",
  "actionReason": "High composite confidence (83/100) with a deliverable, person-targeted work email — safe to add to an outreach sequence immediately.",
  "identityStrength": 80,
  "identitySignals": ["email", "name+domain"],
  "notFoundAnalysis": null
}
```
And a not_found record carrying the why-not-found analysis instead:

```json
{
  "fullName": null,
  "inputName": "John Smith",
  "inputCompany": "Acme",
  "source": "not_found",
  "failureType": "pdl-not-found",
  "recommendedAction": "RESEARCH_MANUALLY",
  "actionPriority": "medium",
  "actionReason": "PDL has no record for this person. Try Email Pattern Finder for a probable email pattern from the company domain, or research the person on LinkedIn directly.",
  "notFoundAnalysis": {
    "likelyCause": "name_common",
    "causeConfidence": 0.7,
    "suggestion": "\"john\" is a very common name — add the company domain (e.g. \"company.com\") to narrow the match, or supply the LinkedIn URL."
  }
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| `inputName` | string \| null | Name from input (for join-back) |
| `inputEmail` | string \| null | Email from input (for join-back) |
| `inputCompany` | string \| null | Company from input (for join-back) |
| `inputDomain` | string \| null | Domain from input (for join-back) |
| `fullName` | string \| null | PDL canonical full name |
| `firstName` | string \| null | First name |
| `lastName` | string \| null | Last name |
| `email` | string \| null | Work/professional email |
| `personalEmail` | string \| null | Personal email (if PDL has it) |
| `phone` | string \| null | Primary phone number |
| `mobilePhone` | string \| null | Mobile phone if separately listed |
| `jobTitle` | string \| null | Current job title |
| `jobTitleRole` | string \| null | Broad role category (e.g. engineering, sales) |
| `seniority` | string \| null | Normalised seniority: c_suite, vp, director, manager, senior, entry |
| `department` | string \| null | Department/sub-role (e.g. product marketing, software) |
| `companyName` | string \| null | Current employer name |
| `companyDomain` | string \| null | Current employer website domain |
| `companyIndustry` | string \| null | Industry classification |
| `companySize` | string \| null | Employee count band (e.g. 501-1000) |
| `companyLinkedinUrl` | string \| null | Company LinkedIn page URL |
| `linkedinUrl` | string \| null | Person's LinkedIn profile URL |
| `twitterUrl` | string \| null | Twitter/X profile URL |
| `githubUrl` | string \| null | GitHub profile URL |
| `facebookUrl` | string \| null | Facebook profile URL |
| `location` | string \| null | Full location string |
| `city` | string \| null | City |
| `state` | string \| null | State or region |
| `country` | string \| null | Country |
| `workHistory[]` | array \| null | Past jobs (enabled via includeWorkHistory) |
| `workHistory[].company` | string | Employer name |
| `workHistory[].title` | string | Job title held |
| `workHistory[].startDate` | string \| null | Start date (YYYY-MM) |
| `workHistory[].endDate` | string \| null | End date (YYYY-MM), null if current |
| `education[]` | array \| null | Education records (enabled via includeEducation) |
| `education[].school` | string | Institution name |
| `education[].degree` | string \| null | Degree type (e.g. B.S., MBA) |
| `education[].field` | string \| null | Field of study |
| `skills` | array \| null | Skills list (enabled via includeSkills) |
| `pdlId` | string \| null | PDL internal person ID (stable unique identifier) |
| `matchConfidence` | number \| null | PDL likelihood score 0–10 (10 = exact match) |
| `enrichedAt` | string | ISO 8601 timestamp of enrichment |
| `source` | string | pdl_enrich, pdl_search, pdl_search_alias, or not_found |
| `recordType` | string | Discriminator: person for enrichment results, error for run-level errors |
| `emailMxValid` | boolean \| null | True when work email's domain has at least one MX record (validateEmails: true only) |
| `emailIsDisposable` | boolean \| null | True when work email matches a known temp-mail / burner provider |
| `emailIsRoleAccount` | boolean \| null | True when local-part is a generic role (info@, sales@, support@); a bad cold-outreach target |
| `confidenceScore` | integer \| null | Composite 0–100 score (PDL likelihood + email signals + identifier richness) |
| `confidenceLevel` | string \| null | high (≥75) / medium (≥50) / low band |
| `confidenceBreakdown` | object \| null | Per-factor point breakdown (pdlLikelihoodPoints, emailPresentPoints, mxValidPoints, identifierRichnessPoints, disposablePenalty, roleAccountPenalty) |
| `isContactable` | boolean | True when the record is reachable AND scores ≥50 AND not disposable |
| `changeFlags[]` | array \| null | Cross-run change codes (when compareToPrevRun: true) |
| `changeSinceLastRun` | object \| null | Diff against prior snapshot |
| `changeSinceLastRun.previousJobTitle` | string \| null | Title at last snapshot |
| `changeSinceLastRun.previousCompany` | string \| null | Company at last snapshot |
| `changeSinceLastRun.previousSeniority` | string \| null | Seniority at last snapshot |
| `changeSinceLastRun.previousEmail` | string \| null | Work email at last snapshot |
| `changeSinceLastRun.seniorityDirection` | string \| null | up / down / flat |
| `changeSinceLastRun.daysSinceLastSeen` | integer \| null | Days between snapshots |
| `failureType` | string \| null | no-identifier / pdl-not-found / pdl-credits-exhausted / pdl-rate-limited / pdl-api-error / unexpected-error |
| `recommendation` | string \| null | Concrete next step when enrichment failed |
| `recommendedAction` | string | Decision Layer enum: SEND_TO_OUTREACH / VERIFY_EMAIL / RESEARCH_MANUALLY / ENRICH_AGAIN / DROP. Deterministic: the same record always produces the same action |
| `actionPriority` | string | Routing priority: high / medium / low |
| `actionReason` | string | One sentence explaining why this action was chosen; usable directly in CRM tasks, Slack messages, AI agent prompts |
| `confidenceReason` | string \| null | Plain-English summary of confidenceScore ("Strong match: exact PDL match (likelihood 9/10) + work email with valid MX") |
| `confidenceRisks[]` | array \| null | Stable enum tags downstream automation can branch on (e.g. ["common_name", "mx_invalid"]). Empty array on records with no risks |
| `identityStrength` | integer \| null | 0–100 score reflecting INPUT identifier richness (vs confidenceScore which scores the MATCH) |
| `identitySignals[]` | array \| null | Names of identifier signals the input row provided (e.g. ["email", "linkedinUrl"]) |
| `notFoundAnalysis` | object \| null | Why-Not-Found intelligence on not-found rows |
| `notFoundAnalysis.likelyCause` | string | insufficient_identifiers / company_mismatch / name_common / no_pdl_coverage / pdl_search_below_threshold |
| `notFoundAnalysis.causeConfidence` | number | 0–1 score for how certain the cause classification is |
| `notFoundAnalysis.suggestion` | string | Concrete next step the user can act on immediately |
| `triggerEvents[]` | array \| null | Higher-priority view of changeFlags: each event carries type, priority (high/medium/low), recommendedAction, and a Slack-ready summary string. Sorted high-priority first. Null when no monitor change fired |
| `matchDebug` | object \| null | Match transparency block (when includeMatchDebug or includeCandidates is true) |
| `matchDebug.strategyPreset` | string | Which strategy preset ran (strict / balanced / aggressive) |
| `matchDebug.inputSignalsUsed[]` | array | Input identifier names PDL was queried with (e.g. ["name", "company", "domain"]) |
| `matchDebug.candidatesConsidered` | integer | How many candidate matches PDL Search returned (≥1 when matched) |
| `matchDebug.selectedReason` | string | Plain-English reason the winning candidate was selected |
| `matchDebug.alternates[]` | array | Rejected candidates from PDL Search (when includeCandidates: true). Each: { fullName, jobTitle, companyName, companyDomain, location, matchConfidence, rejectedReason }. NOT charged, NOT separate dataset rows |
| `duplicateGroupId` | string \| null | Stable group ID shared by all duplicate input rows (when dedupeMode != 'off'). Null when dedupe is off or the row was unique |
| `isPrimaryRecord` | boolean | Always true on the row that was actually enriched (the first member of each dedup group) |
| `fromCachedSnapshot` | boolean | True when skipPreviouslyEnriched was on and this row was echoed from a fresh KV snapshot instead of a fresh PDL call. Cached rows are NOT charged |
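One way to consume these fields downstream is to bucket records by `recommendedAction` and build a one-line message from `actionPriority` and `actionReason`. The sketch below is illustrative only: `route_records` and `slack_line` are hypothetical helper names, and the `"UNKNOWN"` bucket for records without an action is an assumption.

```python
from collections import defaultdict


def route_records(records):
    """Group dataset items by their Decision Layer action."""
    buckets = defaultdict(list)
    for record in records:
        # "UNKNOWN" is a hypothetical catch-all, e.g. for run-level error rows.
        buckets[record.get("recommendedAction") or "UNKNOWN"].append(record)
    return dict(buckets)


def slack_line(record):
    """Format one record as a short, human-readable routing message."""
    who = record.get("fullName") or record.get("inputName") or "unknown person"
    priority = record.get("actionPriority", "low")
    return f"[{priority}] {who}: {record.get('actionReason', '')}"
```

A pipeline might send the `SEND_TO_OUTREACH` bucket to a sequencing tool and post `slack_line(...)` output for the `RESEARCH_MANUALLY` bucket to a review channel.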
How much does it cost to enrich person data?
Person Enrichment Lookup uses pay-per-event pricing — you pay $0.15 per successfully enriched person. People who return no match (source: "not_found") are not charged. Platform compute costs are included.
| Scenario | People processed | Match rate (typical) | People enriched | Total cost |
|---|---|---|---|---|
| Quick test | 10 | 80% | ~8 | ~$1.20 |
| Small batch | 50 | 80% | ~40 | ~$6.00 |
| Medium batch | 200 | 80% | ~160 | ~$24.00 |
| Large batch | 500 | 80% | ~400 | ~$60.00 |
| Enterprise batch | 1,000 | 80% | ~800 | ~$120.00 |
You can set a maximum spending limit per run to control costs. The actor stops cleanly when your budget is reached, and all already-enriched records are saved to the dataset before the run ends.
Compare this to Clay at $0.22–$5.63 per PDL enrichment depending on plan, or Apollo at $99–$249/month for a contact database. At $0.15 per match, 500–1,000 successful enrichments per month cost $75–$150, with no subscription commitment. Because the actor is restricted to paid Apify plans, free-tier credits cannot be used for runs.
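The per-match arithmetic is easy to reproduce for your own list size. This is a minimal sketch assuming the $0.15 per-match price stated above and a match rate you supply yourself (80% is only the "typical" figure quoted in the table); `estimate_cost` is a hypothetical helper name.

```python
PRICE_PER_MATCH_USD = 0.15  # pay-per-event price stated in this section


def estimate_cost(people, match_rate=0.80, price=PRICE_PER_MATCH_USD):
    """Return (expected matches, expected charge in USD) for a batch."""
    matched = round(people * match_rate)
    return matched, round(matched * price, 2)
```

Unmatched (`not_found`) rows contribute nothing to the estimate, which mirrors the rule that only successful enrichments are charged.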
Person enrichment using the API
Python

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("ryanclinton/person-enrichment-lookup").call(run_input={
    "persons": [
        {"name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com"},
        {"name": "James Okafor", "company": "Beta Industries"},
        {"email": "priya.nair@vertexsolutions.com"},
    ],
    "maxPersons": 100,
    "includeWorkHistory": False,
    "includeEducation": False,
    "includeSkills": False,
})

# Print every matched person from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("source") != "not_found":
        print(f"{item['fullName']} | {item['jobTitle']} @ {item['companyName']} | {item['email']} | confidence: {item['matchConfidence']}")
```
JavaScript

```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Start the actor and wait for the run to finish
const run = await client.actor("ryanclinton/person-enrichment-lookup").call({
    persons: [
        { name: "Sarah Chen", company: "Acme Corp", domain: "acmecorp.com" },
        { name: "James Okafor", company: "Beta Industries" },
        { email: "priya.nair@vertexsolutions.com" },
    ],
    maxPersons: 100,
    includeWorkHistory: false,
    includeEducation: false,
    includeSkills: false,
});

// Print every matched person from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    if (item.source !== "not_found") {
        console.log(`${item.fullName} | ${item.jobTitle} @ ${item.companyName} | ${item.email} | confidence: ${item.matchConfidence}`);
    }
}
```
cURL

```bash
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~person-enrichment-lookup/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"persons": [{"name": "Sarah Chen", "company": "Acme Corp", "domain": "acmecorp.com"},{"email": "james.okafor@betaindustries.io"}],"maxPersons": 100}'

# Fetch results (replace DATASET_ID from the run response above)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```
How to think about this actor
In B2B enrichment, data reliability depends on three factors: input quality, match confidence, and validation signals. This actor exposes all three explicitly. Scoring is deterministic: the same input always produces the same confidence score, with a full breakdown of why the match was selected.
This actor separates three concepts that other enrichment tools collapse into a single confusing number:
- Identity strength: how good your INPUT data was (did you provide email, LinkedIn, name+domain, or just a name?). Surfaced as `identityStrength` (0–100) + `identitySignals[]`.
- Match confidence: how certain the enrichment RESULT is (PDL likelihood + email validity + identifier richness, with disposable / role-account penalties). Surfaced as `confidenceScore` (0–100) + `confidenceLevel` band + `confidenceBreakdown` per-factor audit.
- Action decision: what the user should DO with this record (`SEND_TO_OUTREACH` / `VERIFY_EMAIL` / `RESEARCH_MANUALLY` / `ENRICH_AGAIN` / `DROP`). Surfaced as `recommendedAction` + `actionPriority` + `actionReason`.
These three are computed independently and combined into a single deterministic output per record. Same low score, different fix:
- Low identity strength → fix your input list (add identifiers)
- Low match confidence → PDL has thin coverage on this person (try Email Pattern Finder)
- DROP / RESEARCH_MANUALLY decision → don't outreach without manual verification
The six phases below walk through how the data flows.
How Person Enrichment Lookup works
Phase 1 — Identifier assembly and PDL Enrich call
The actor builds a PDL Enrich API query from every available identifier on the person object. Email maps to email, LinkedIn URL to profile, name is split on whitespace into first_name + last_name, company maps to company, and domain maps to website. When no company name is present but a domain is, the domain is passed as both website and company to maximise PDL's fuzzy matching surface. The request requires at minimum one of: email, linkedinUrl, or name. Under the default balanced strategy the query includes min_likelihood=2 and required=name to suppress spurious matches; the strict and aggressive presets use likelihood floors of 6 and 1 respectively. A 30-second timeout is enforced via AbortController. HTTP 429 responses trigger one automatic retry after a 2-second pause before the request is marked as failed.
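The identifier mapping described above can be sketched as a small pure function. This is not the actor's source: `build_enrich_params` is a hypothetical name, and the handling of multi-word names (first token as `first_name`, the rest as `last_name`) is an assumption, since the text only says the name is split on whitespace.

```python
def build_enrich_params(person, min_likelihood=2):
    """Map one input person object to PDL Enrich query parameters."""
    email = person.get("email")
    linkedin = person.get("linkedinUrl")
    name_parts = (person.get("name") or "").split()
    if not (email or linkedin or name_parts):
        raise ValueError("need at least one of: email, linkedinUrl, name")

    params = {}
    if email:
        params["email"] = email
    if linkedin:
        params["profile"] = linkedin
    if name_parts:
        params["first_name"] = name_parts[0]
        if len(name_parts) > 1:
            # Assumption: everything after the first token is the last name.
            params["last_name"] = " ".join(name_parts[1:])

    company, domain = person.get("company"), person.get("domain")
    if company:
        params["company"] = company
    if domain:
        params["website"] = domain
        if not company:
            # No company name given: reuse the domain as the company value.
            params["company"] = domain

    params["min_likelihood"] = min_likelihood
    params["required"] = "name"
    return params
```

The timeout and the single retry on HTTP 429 belong to the HTTP layer and are deliberately left out of this sketch.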
Phase 2 — PDL Search fallback
When the Enrich endpoint returns HTTP 404 (no match in PDL's direct lookup index), and the input includes both a name and a company or domain, the actor constructs an Elasticsearch DSL query against PDL's Person Search endpoint. The query uses bool.must clauses matching first_name, last_name, and either job_company_name or job_company_website (using a term filter for domain, a match clause for company name). Only the top-ranked result (size: 1) is returned. This fallback path catches people whose PDL record cannot be hit by exact identifier match but who are findable by name-and-employer combination.
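A query of the shape described above might look like the following sketch. The clause type used for the name fields (`match`), the preference for domain over company name when both are present, and the lowercasing of the domain are assumptions; `build_search_query` is a hypothetical helper name.

```python
def build_search_query(first_name, last_name, company=None, domain=None, size=1):
    """Build an Elasticsearch-style body for a name + employer person search."""
    if not (company or domain):
        raise ValueError("fallback search needs a company name or a domain")

    must = [
        {"match": {"first_name": first_name}},
        {"match": {"last_name": last_name}},
    ]
    if domain:
        # Exact term filter on the employer website (assumed to win over company name).
        must.append({"term": {"job_company_website": domain.lower()}})
    else:
        # Analysed match on the employer name.
        must.append({"match": {"job_company_name": company}})

    return {"query": {"bool": {"must": must}}, "size": size}
```

For example, `build_search_query("Sarah", "Chen", domain="acmecorp.com")` yields three `must` clauses and `size: 1`, so only the top-ranked hit would be requested.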
Phase 3 — Data transformation and normalisation
The raw PDL profile goes through a set of deterministic transforms before being written to the dataset. Seniority is normalised from job_title_levels[] using a fixed priority ladder (c_suite takes precedence over vp, which takes precedence over director, and so on). Emails are typed: addresses with type=work or type=professional become email; addresses with type=personal become personalEmail; all other typed emails fall back to the work slot. Work history entries are mapped from PDL's experience[] structure, filtering out entries with neither company nor title. Education entries filter to records with a school name. The pdlId field carries PDL's stable internal person identifier, useful for deduplication across multiple enrichment runs.
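Two of these transforms are simple enough to sketch. The ladder order below follows the seniority values listed in the output fields table; how raw PDL level names are mapped onto those values is not shown in this document, so the sketch assumes already-normalised level names. Both function names are hypothetical.

```python
# Highest rung first; order taken from the documented seniority values.
SENIORITY_LADDER = ["c_suite", "vp", "director", "manager", "senior", "entry"]


def normalise_seniority(levels):
    """Return the highest-priority level present, or None if none match."""
    present = {str(level).lower() for level in (levels or [])}
    for rung in SENIORITY_LADDER:
        if rung in present:
            return rung
    return None


def split_emails(emails):
    """Return (work_email, personal_email) from typed email entries."""
    work = personal = fallback = None
    for entry in emails or []:
        address, kind = entry.get("address"), entry.get("type")
        if not address:
            continue
        if kind in ("work", "professional"):
            work = work or address
        elif kind == "personal":
            personal = personal or address
        else:
            # Any other type falls back to the work slot.
            fallback = fallback or address
    return (work or fallback), personal
```

Because the ladder is a fixed list, the same `job_title_levels[]` input always resolves to the same seniority value.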
Phase 4 — Email validation and composite confidence
When validateEmails: true, every returned work email is run through three deterministic checks: (a) DNS MX lookup with a 5-second timeout to confirm the domain has a mail server, (b) a bundled list of ~100 known disposable / temp-mail providers (mailinator, guerrillamail, 10minutemail families, etc.) to flag burner addresses, and (c) a 50-prefix table of generic role accounts (info@, sales@, support@, noreply@) to flag bad cold-outreach targets. MX results are cached per domain across the run so a batch of 50 contacts at the same company resolves the domain once. The composite confidenceScore (0–100) then combines PDL's raw likelihood (60% weight), email-present (10%), MX-valid (15%), identifier richness (5%), minus a 30-point disposable penalty and a 10-point role-account penalty. The confidenceLevel band (high ≥75 / medium ≥50 / low <50) and isContactable boolean (reachable AND score ≥50 AND not disposable) make spreadsheet filtering one-click.
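The disposable and role-account checks, the confidence band, and the contactable flag can be sketched without any network access. The two sets below are tiny illustrative subsets, not the bundled lists, and treating "reachable" as "has an email or a phone" is an assumption. The DNS MX lookup and the weighted composite score are intentionally not reproduced here.

```python
# Illustrative subsets only (assumption); the real lists are much longer.
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com", "10minutemail.com"}
ROLE_PREFIXES = {"info", "sales", "support", "noreply"}


def email_flags(email):
    """Flag burner domains and generic role mailboxes."""
    local, _, domain = email.lower().partition("@")
    return {
        "emailIsDisposable": domain in DISPOSABLE_DOMAINS,
        "emailIsRoleAccount": local in ROLE_PREFIXES,
    }


def confidence_level(score):
    """Band a 0-100 composite score: high >= 75, medium >= 50, else low."""
    if score >= 75:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


def is_contactable(record):
    """Reachable AND score >= 50 AND not disposable."""
    reachable = bool(record.get("email") or record.get("phone"))  # assumption
    score = record.get("confidenceScore") or 0
    return reachable and score >= 50 and not record.get("emailIsDisposable")
```

Caching MX results per domain, as described above, would typically be a dictionary keyed by domain wrapped around whatever DNS resolver is used.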
Phase 5 — Cross-run change detection (optional)
When compareToPrevRun: true, the actor opens a named Apify Key-Value store (default name person-enrichment-monitor, override via monitorStateKey) and reads the prior run's snapshot map. After enriching each person, it builds a stable identity key (pdlId if available, else name@companyDomain, else name|companyName, else email) and looks up the prior snapshot. The diff function compares jobTitle, companyName, seniority, email, companyDomain, city, country, and linkedinUrl, then emits a stable changeFlags[] enum: NEW_PERSON (no prior), JOB_CHANGED (company differs), TITLE_CHANGED (title differs at same company), PROMOTION / DEMOTION (seniority moved up or down), EMAIL_CHANGED / EMAIL_GAINED / EMAIL_LOST, COMPANY_DOMAIN_CHANGED, LOCATION_CHANGED, NEW_LINKEDIN, UNCHANGED. Each record also carries a changeSinceLastRun block with the prior values, the seniority direction (up / down / flat), and the days between snapshots. The full snapshot map is persisted back to KV at the end of the run so the next scheduled execution picks up where this one left off.
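The identity key and the diff can be sketched as two pure functions over record dictionaries. The flag names come from the list above; the seniority ranking, the lowercasing of names in the key, and the exact comparison rules are assumptions made for illustration.

```python
# Assumed ordering, lowest to highest, of the documented seniority values.
SENIORITY_RANK = {"entry": 0, "senior": 1, "manager": 2, "director": 3, "vp": 4, "c_suite": 5}


def identity_key(record):
    """pdlId, else name@companyDomain, else name|companyName, else email."""
    if record.get("pdlId"):
        return record["pdlId"]
    name = (record.get("fullName") or "").lower()
    if name and record.get("companyDomain"):
        return f"{name}@{record['companyDomain']}"
    if name and record.get("companyName"):
        return f"{name}|{record['companyName']}"
    return record.get("email")


def change_flags(prev, curr):
    """Compare the prior snapshot with the current record and emit flags."""
    if prev is None:
        return ["NEW_PERSON"]
    flags = []
    if prev.get("companyName") != curr.get("companyName"):
        flags.append("JOB_CHANGED")
    elif prev.get("jobTitle") != curr.get("jobTitle"):
        flags.append("TITLE_CHANGED")  # title differs at the same company
    before = SENIORITY_RANK.get(prev.get("seniority"))
    after = SENIORITY_RANK.get(curr.get("seniority"))
    if before is not None and after is not None and before != after:
        flags.append("PROMOTION" if after > before else "DEMOTION")
    old_email, new_email = prev.get("email"), curr.get("email")
    if old_email and new_email and old_email != new_email:
        flags.append("EMAIL_CHANGED")
    elif new_email and not old_email:
        flags.append("EMAIL_GAINED")
    elif old_email and not new_email:
        flags.append("EMAIL_LOST")
    if prev.get("companyDomain") != curr.get("companyDomain"):
        flags.append("COMPANY_DOMAIN_CHANGED")
    if (prev.get("city"), prev.get("country")) != (curr.get("city"), curr.get("country")):
        flags.append("LOCATION_CHANGED")
    if curr.get("linkedinUrl") and not prev.get("linkedinUrl"):
        flags.append("NEW_LINKEDIN")
    return flags or ["UNCHANGED"]
```

A scheduled run would load the snapshot map, call `change_flags(snapshots.get(identity_key(rec)), rec)` per record, and then write the updated map back.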
Phase 6 — Filtering, charging, and output
Filters apply BEFORE pushData and BEFORE PPE charging: changeFlagsFilter (e.g. ["JOB_CHANGED", "PROMOTION"]) keeps only records where something interesting changed, and minConfidenceScore keeps only records above a quality floor. Filtered records are not saved and not charged. PPE charging fires only after a kept record is pushed to the dataset and confirmed as source !== "not_found", so you never pay for a record that was not saved. The actor monitors chargeResult.eventChargeLimitReached after every charge event and stops cleanly if your spending limit is reached. A 5-consecutive-failure circuit breaker stops the run if PDL is throwing repeated errors, preserving partial results. The run summary (matchRatePct, sourceCounts, failureCounts, confidenceCounts, changeFlagCounts, ppeChargesUsd) is written to the SUMMARY and OUTPUT keys in the default Key-Value store for orchestrators and dashboards to read; the dataset itself stays clean of summary noise so spreadsheet exports look uniform.
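The filter-then-charge ordering can be illustrated with two predicates. This is a sketch under assumptions: the function names are hypothetical, and treating a record with no `confidenceScore` as failing a `minConfidenceScore` floor is a guess about behaviour the text does not spell out.

```python
def should_keep(record, change_flags_filter=None, min_confidence_score=None, require_fields=None):
    """Decide whether a record is saved at all (runs before any charge)."""
    if change_flags_filter:
        flags = set(record.get("changeFlags") or [])
        if not flags & set(change_flags_filter):
            return False
    if min_confidence_score is not None:
        score = record.get("confidenceScore")
        if score is None or score < min_confidence_score:  # assumption for missing scores
            return False
    if require_fields and any(not record.get(field) for field in require_fields):
        return False
    return True


def is_billable(record):
    """Only saved records that actually matched are charged."""
    return record.get("source") != "not_found"
```

In a processing loop the order would be: enrich, validate, `should_keep`, push to the dataset, and only then charge if `is_billable` holds.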
What this actor does NOT do
Honest scope so you don't pay for a tool that won't fix your problem:
- Not a multi-source waterfall. This is a single-source PDL enricher with smart fallbacks (Search → Search-with-aliases). For Apollo + Cognism + Hunter + Datagma cascading, use Waterfall Contact Enrichment.
- Not a LinkedIn scraper. No HTML scraping of linkedin.com — that violates LinkedIn's Terms of Service. The actor only queries People Data Labs' API, which aggregates from public professional sources.
- Not an SMTP verifier. The `emailMxValid` flag confirms the domain has a mail server, but does NOT send a test email or run RCPT TO probes. For full SMTP-level verification (catch-all detection, mailbox existence), pipe results through Bulk Email Verifier.
- Not an email-pattern guesser. When PDL has no record (`failureType: "pdl-not-found"`), this actor doesn't fabricate a probable email from the domain. For pattern fallback, chain the `not_found` rows into Email Pattern Finder.
- Not a real-time sales platform. This is a per-batch enrichment tool, not a live data feed. PDL refreshes its database on a rolling basis with profile lag of 2–6 months. For real-time intent signals (hiring, funding, news), use Intent Signal Tracker.
- Not a tech-stack detector. Company industry and size come from PDL; the tools a company uses do not. Combine with Decision-Grade Website Intelligence for technographics.
- Not a buying-committee mapper. This actor enriches one person at a time. For grouping multiple contacts at the same company by buying-committee role (decision-makers, champions, blockers), use Website Contact Scraper which emits a `buyingCommittee` block.
- Not an Apollo / ZoomInfo replacement at the platform level. Those are full sales-intelligence platforms with prospecting search, intent signals, lists, and CRM sync. This actor is the per-person enrichment primitive, best chained into your own pipeline rather than used as a CRM front-end.
Run summary in the Key-Value store
Every run writes a machine-readable summary to the default Key-Value store under both the SUMMARY and OUTPUT keys. Inspect via the Apify Console "Storage" tab or fetch via API:
```bash
curl "https://api.apify.com/v2/key-value-stores/<storeId>/records/SUMMARY"
```
Shape:

```json
{
  "totalInput": 100,
  "totalUniqueAfterDedup": 96,
  "totalProcessed": 96,
  "enrichedCount": 78,
  "notFoundCount": 18,
  "pushedCount": 96,
  "chargedCount": 78,
  "filteredOut": 0,
  "skippedAsCached": 12,
  "duplicatesCollapsed": 4,
  "matchRatePct": 81,
  "ppeChargesUsd": 11.70,
  "spendingLimitReached": false,
  "creditsExhausted": false,
  "circuitBroken": false,
  "usingBuiltInKey": false,
  "matchStrategy": "balanced",
  "strategyDescription": "Default mode — PDL likelihood >=2, full fallback chain (Enrich -> Search -> Alias-Search), no automatic confidence floor. Best general-purpose mode.",
  "requireFields": [],
  "sourceCounts": { "pdl_enrich": 67, "pdl_search": 9, "pdl_search_alias": 2, "not_found": 18 },
  "failureCounts": { "pdl-not-found": 18 },
  "confidenceCounts": { "high": 60, "medium": 18, "low": 18 },
  "changeFlagCounts": { "JOB_CHANGED": 4, "PROMOTION": 2, "UNCHANGED": 72, "NEW_PERSON": 0 },
  "actionDistribution": {
    "SEND_TO_OUTREACH": 52,
    "VERIFY_EMAIL": 18,
    "RESEARCH_MANUALLY": 18,
    "DROP": 8
  },
  "segments": [
    { "name": "decision_makers", "label": "Decision makers", "filter": "seniority IN (c_suite, vp, director)", "count": 49, "memberKeys": ["sarah.chen@acmecorp.com", "..."] },
    { "name": "contactable_high_confidence", "label": "Contactable + high confidence", "filter": "isContactable = true AND confidenceLevel = \"high\"", "count": 52, "memberKeys": ["..."] },
    { "name": "promotion_triggers", "label": "Promotion triggers (since last run)", "filter": "changeFlags contains PROMOTION", "count": 2, "memberKeys": ["..."] },
    { "name": "new_job_movers", "label": "New job movers (since last run)", "filter": "changeFlags contains JOB_CHANGED", "count": 4, "memberKeys": ["..."] },
    { "name": "requires_verification", "label": "Requires email verification before outreach", "filter": "recommendedAction = VERIFY_EMAIL", "count": 18, "memberKeys": ["..."] },
    { "name": "low_quality_drop_candidates", "label": "Low quality / drop candidates", "filter": "recommendedAction = DROP", "count": 8, "memberKeys": ["..."] }
  ],
  "cohortInsights": {
    "topIndustries": [
      { "industry": "software / saas", "count": 31, "pct": 40 },
      { "industry": "fintech", "count": 14, "pct": 18 },
      { "industry": "healthcare", "count": 9, "pct": 12 }
    ],
    "topCompanySizes": [
      { "size": "1001-5000", "count": 22, "pct": 28 },
      { "size": "501-1000", "count": 18, "pct": 23 },
      { "size": "11-50", "count": 12, "pct": 15 }
    ],
    "topCountries": [
      { "country": "United States", "count": 52, "pct": 67 },
      { "country": "United Kingdom", "count": 11, "pct": 14 },
      { "country": "Germany", "count": 6, "pct": 8 }
    ],
    "seniorityDistribution": { "vp": 24, "director": 19, "manager": 17, "senior": 12, "c_suite": 6 },
    "contactableRatePct": 72,
    "avgConfidenceScore": 81,
    "duplicateRatePct": 4
  },
  "mxLookupsCached": 41,
  "completedAt": "2026-05-01T14:22:11.403Z"
}
```
This is what dashboards, orchestrators, and AI agents should read — the dataset stays clean for spreadsheet exports.
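Once the SUMMARY record has been fetched and parsed into a dictionary, an orchestrator typically needs two things from it: whether the run ended early, and which records belong to a given segment. A minimal sketch, with hypothetical helper names, operating on an already-loaded summary:

```python
def run_warnings(summary):
    """List the early-stop conditions that fired during the run."""
    flags = ("spendingLimitReached", "creditsExhausted", "circuitBroken")
    return [flag for flag in flags if summary.get(flag)]


def segment_members(summary, name):
    """Return the memberKeys of one prebuilt segment, or [] if it is absent."""
    for segment in summary.get("segments", []):
        if segment.get("name") == name:
            return segment.get("memberKeys", [])
    return []
```

For example, `segment_members(summary, "decision_makers")` returns the keys of the records in that segment, which can then be joined back to dataset rows.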
Tips for best results
- Provide email when you have it. Email is PDL's highest-confidence identifier. A lookup by email alone typically returns `matchConfidence` of 8–10, versus 5–7 for name + company alone.
- Include the company domain alongside the company name. The actor passes domain as both `website` and a fallback `company` parameter. This catches PDL records where the company name is stored differently than you have it (e.g. "Acme" vs "Acme Corp" vs "Acme Corporation").
- Filter output by `matchConfidence` before importing to CRM. Scores of 6 and above are generally reliable. Scores of 2–4 from the Search fallback path should be manually verified before sending outreach.
- Use `source` to understand your match rate. After a run, count records by `source` value: `pdl_enrich` (best), `pdl_search` (good, fallback), `not_found` (no match). A high `not_found` rate often means input names need cleaning or the people are not in PDL's database (common for very small companies).
- For large batches, set `maxPersons` to a safe cap. If you have 5,000 names but are unsure of match rate, run 100 first to gauge quality and match rate before committing to the full list.
- Combine with Email Pattern Finder for un-enrichable contacts. When PDL returns `not_found`, you can still derive a probable work email by running the domain through Email Pattern Finder to get the company's email naming convention (e.g. `{first}.{last}@domain.com`).
- Schedule weekly runs to catch job changes. People change roles every 18–24 months on average. A weekly enrichment run against your active CRM contacts will surface `companyName` or `jobTitle` changes that indicate a contact has moved on.
- Bring your own PDL API key for enterprise volume. The built-in key is shared and subject to PDL's free-tier limits. A PDL paid account starts at $98/month and provides significantly higher call limits if you are running thousands of lookups daily.
- Use `validateEmails: true` when the output feeds an outreach tool. The DNS MX check is fast (cached per domain) and the disposable + role-account flags prevent paying enrichment fees on emails that will bounce or hit a generic inbox. Sort by `confidenceScore` descending and filter by `isContactable === true` for a clean outreach list.
- Schedule weekly with `compareToPrevRun` for job-change alerts. Set the schedule, point it at your CRM contact list, and add `changeFlagsFilter: ["JOB_CHANGED", "PROMOTION"]`. Only records where someone actually moved or got promoted are saved (and charged). Pipe to Slack via Apify's Slack integration for a "who moved this week" digest.
- Use a different `monitorStateKey` per CRM segment. The default state key is `person-enrichment-monitor`. If you're running monitors for multiple campaigns or segments (e.g. enterprise vs SMB, or different sales territories), set distinct `monitorStateKey` values per run so each segment keeps its own snapshot history.
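The "sort by confidenceScore, filter by isContactable" tip amounts to a couple of lines once the dataset items are loaded. A minimal sketch (`outreach_list` is a hypothetical name; records with a missing score are treated as 0, which is an assumption):

```python
def outreach_list(records):
    """Keep contactable records, highest composite confidence first."""
    contactable = [r for r in records if r.get("isContactable") is True]
    return sorted(contactable, key=lambda r: r.get("confidenceScore") or 0, reverse=True)
```

Python's sort is stable, so records with equal scores keep their original dataset order.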
Typical pipelines
Most teams chain this actor into one of three workflows. Each step's output flows into the next via Apify webhooks, Make / Zapier, or direct dataset reads.
Sales prospecting pipeline (cold outreach):
- Scrape contacts from company websites with Website Contact Scraper
- Enrich every contact with this actor (validateEmails: true, requireFields: ["email", "jobTitle"])
- Verify deliverability with Bulk Email Verifier on recommendedAction = VERIFY_EMAIL rows
- Push recommendedAction = SEND_TO_OUTREACH rows into your sequencing tool (HubSpot / Outreach / Apollo / Lemlist) via HubSpot Lead Pusher or a Zapier webhook
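Routing rows by recommendedAction between the pipeline steps above might look like the sketch below. It assumes each dataset row is a dict with a recommendedAction string; the UNKNOWN bucket and the id field are assumptions added for illustration, not part of the actor's output.

```python
from collections import defaultdict
from typing import Any, Dict, List

def route_by_action(rows: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group rows by their recommendedAction value."""
    buckets: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for row in rows:
        # Rows without the field go to a hypothetical "UNKNOWN" bucket.
        buckets[row.get("recommendedAction", "UNKNOWN")].append(row)
    return dict(buckets)

# Invented sample rows; "id" is a hypothetical field.
rows = [
    {"id": 1, "recommendedAction": "SEND_TO_OUTREACH"},
    {"id": 2, "recommendedAction": "VERIFY_EMAIL"},
    {"id": 3, "recommendedAction": "SEND_TO_OUTREACH"},
    {"id": 4, "recommendedAction": "DROP"},
]

routed = route_by_action(rows)
to_verify = routed.get("VERIFY_EMAIL", [])        # hand to Bulk Email Verifier
to_sequence = routed.get("SEND_TO_OUTREACH", [])  # hand to the sequencing tool
print(len(to_verify), len(to_sequence))  # 1 2
```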
CRM hygiene + job-change monitoring (scheduled weekly):
- Export your active CRM contact list to Apify
- Run this actor with compareToPrevRun: true, changeFlagsFilter: ["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"], skipPreviouslyEnriched: true
- Pipe the filtered output to Slack for a "who moved this week" digest, or push back to your CRM to update stale records
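A run input for the monitoring workflow could be assembled as sketched here. The option names compareToPrevRun, changeFlagsFilter, skipPreviouslyEnriched and monitorStateKey come from this page; the persons field name and the per-segment key suffix are assumptions, so check the actor's input schema before relying on them.

```python
import json
from typing import Any, Dict, List

DEFAULT_STATE_KEY = "person-enrichment-monitor"

def monitor_input(persons: List[Dict[str, Any]], segment: str) -> Dict[str, Any]:
    """Build a job-change monitoring input for one CRM segment."""
    return {
        "persons": persons,  # hypothetical name for the contact list field
        "compareToPrevRun": True,
        "changeFlagsFilter": ["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"],
        "skipPreviouslyEnriched": True,
        # One state key per segment keeps snapshot histories separate.
        "monitorStateKey": f"{DEFAULT_STATE_KEY}-{segment}",
    }

contacts = [{"email": "jane@example.com"}]  # invented sample contact
enterprise = monitor_input(contacts, "enterprise")
smb = monitor_input(contacts, "smb")
print(json.dumps(enterprise, indent=2))
```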
Recruiting / talent sourcing:
- Build a candidate shortlist (LinkedIn Sales Navigator export, GitHub repo contributors, conference attendee list)
- Run this actor with includeWorkHistory: true, includeEducation: true, includeSkills: true, matchStrategy: 'aggressive'
- Read the decision_makers and contactable_high_confidence segments from the SUMMARY KV value to prioritise outreach
Combine with other Apify actors
| Actor | How to combine |
|---|---|
| Website Contact Scraper | Scrape a company website to find employee names and partial emails, then enrich each person here for complete profiles |
| Google Maps Email Extractor | Extract business owner contacts from Google Maps, then enrich each contact with job title and seniority via this actor |
| Email Pattern Finder | For contacts this actor returns as not_found, use Email Pattern Finder to derive a probable email from the company domain |
| Bulk Email Verifier | Verify the email field returned by enrichment before sending outreach — confirms MX record and SMTP deliverability |
| B2B Lead Qualifier | Pipe enriched records into the qualifier to score each contact 0–100 from seniority, company size, industry, and 27 other signals |
| HubSpot Lead Pusher | Write enriched contacts directly into HubSpot with mapped field names |
| Waterfall Contact Enrichment | Use as the PDL step in a 10-source waterfall cascade to maximise total enrichment coverage |
| Lead Enrichment Pipeline | All-in-one Clay alternative: email discovery, verification, company research, and scoring in one run ($0.12/lead) |
| AI Outreach Personalizer | Generate personalized cold emails using your own OpenAI/Anthropic key — zero AI markup ($0.01/lead) |
| Intent Signal Tracker | Track buying signals: hiring, tech changes, funding, content updates. Prioritize outreach by intent score ($0.05/company) |
| Lead Data Quality Auditor | Audit lead data quality before outreach — email verification, phone validation, domain freshness ($0.005/record) |
Limitations
- PDL database coverage: PDL covers approximately 3 billion professional profiles, but match rates vary by region and company size. Expect 70–85% match rates for US/UK professionals at companies with 50+ employees. Rates drop to 40–60% for freelancers, very small businesses, and professionals in markets where LinkedIn adoption is low.
- No JavaScript rendering: this actor makes direct HTTP calls to the PDL API and does not scrape websites. If you need to extract contact details from JavaScript-rendered company pages before enriching, use Website Contact Scraper Pro first.
- Data freshness: PDL refreshes its database on a rolling basis, but profiles may lag behind real-world job changes by 2–6 months. matchConfidence does not reflect recency, only identity match certainty.
- 1,000 persons per run maximum: the maxPersons cap is enforced at 1,000. For larger lists, split into multiple runs or use the API to queue batches.
- Built-in API key limit: the shared built-in PDL key covers a limited number of lookups per month across all users of the actor. For reliable high-volume use, bring your own PDL API key.
- No bulk-lookup discount: PDL charges per API call regardless of batch size on the calling side. This actor's pricing reflects per-enriched-person charges.
- Seniority for non-English titles: PDL's job_title_levels normalisation is less reliable for job titles in non-English languages. The seniority field may be null for some international profiles.
- Phone number availability: PDL has phone data for roughly 20–30% of profiles. The phone and mobilePhone fields will be null for the majority of enriched contacts.
Integrations
- Zapier — trigger enrichment runs from a Zap when new contacts are added to a spreadsheet or CRM, then route enriched records to any Zapier-connected app
- Make — build enrichment automations that run on a schedule or in response to form submissions, webhooks, or CRM events
- Google Sheets — export enriched contact lists directly to a Sheet for team review or import into marketing tools
- Apify API — trigger runs programmatically from your own backend, CRM integration scripts, or data pipeline
- Webhooks — post enriched dataset results to your CRM or data warehouse endpoint immediately when a run completes
- LangChain / LlamaIndex — use enriched person data as structured context for AI sales agents, automated outreach writers, or research assistants
Troubleshooting
- High not_found rate despite correct names: verify your company names match how PDL stores them. "Meta" versus "Facebook" or "Alphabet" versus "Google" can cause misses on the Search fallback path. Try adding the domain (domain: "meta.com") alongside the company name to give PDL a domain-based match path.
- matchConfidence is low (2–4) on returned records: low scores typically come from the PDL Search fallback path. These are fuzzy matches on name + employer. Always filter output by matchConfidence >= 5 before importing to CRM or sending outreach. Low-confidence records should be manually verified.
- Run stops early with "PDL API credits exhausted" in the status: this means the built-in API key has hit its monthly limit. Either wait for the calendar month to reset, or add your own PDL API key in the pdlApiKey field. PDL's free tier provides 100 API calls/month per account at peopledatalabs.com.
- Run stops with "Spending limit reached": your Apify run budget cap was hit. This is by design. Increase your spending limit in the actor run settings, or process a smaller batch. All records enriched before the limit was reached are saved in the dataset.
- Email field is null but personalEmail has a value: PDL has a personal email on file but no verified work email. You can use Email Pattern Finder on the person's company domain to derive the likely work email format.
Responsible use
- This actor only queries the People Data Labs API using data you provide. It does not scrape any websites or access private systems.
- PDL's data is aggregated from public professional sources including LinkedIn, company websites, and professional directories.
- Comply with GDPR, CAN-SPAM, CASL, and other applicable data protection laws when using enriched contact data for outreach.
- Do not use enriched data for spam, harassment, profiling without lawful basis, or any purpose that violates the terms of service of the platforms from which the underlying data originates.
- For guidance on the legal framework around B2B contact data, see Apify's guide on web scraping legality.
FAQ
How many people can I enrich in one run with Person Enrichment Lookup?
Up to 1,000 persons per run. Set maxPersons in your input to control the limit. For lists larger than 1,000, split into multiple runs or use the Apify API to queue batches sequentially.
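Splitting a longer list into runs of at most 1,000 can be sketched as below. Only the chunking is shown; how each batch is submitted (Apify API, scheduler, or manual runs) is left open.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_PERSONS_PER_RUN = 1000  # cap stated on this page

def split_into_batches(persons: Sequence[T], size: int = MAX_PERSONS_PER_RUN) -> List[List[T]]:
    """Split a contact list into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(persons[i:i + size]) for i in range(0, len(persons), size)]

batches = split_into_batches(list(range(2500)))
print([len(b) for b in batches])  # [1000, 1000, 500]
```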
What identifiers does person enrichment require?
Each person needs at minimum one of: email address, LinkedIn URL, or full name. For name-only records, at least one of company or domain is required to enable the Search fallback. More identifiers produce higher matchConfidence scores.
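A pre-flight check for the identifier rule above could look like this sketch. The field names email, linkedinUrl, name and company are assumptions about the input shape (only domain appears elsewhere on this page), so adapt them to the actor's actual input schema.

```python
from typing import Any, Dict

def lookup_path(person: Dict[str, Any]) -> str:
    """Classify an input record by which lookup it can support.

    Returns "direct" for email or LinkedIn URL inputs, "search" for a
    name plus company or domain, and "insufficient" otherwise.
    """
    if person.get("email") or person.get("linkedinUrl"):
        return "direct"
    if person.get("name") and (person.get("company") or person.get("domain")):
        return "search"
    return "insufficient"

print(lookup_path({"email": "jane@example.com"}))               # direct
print(lookup_path({"name": "Jane Doe", "domain": "meta.com"}))  # search
print(lookup_path({"name": "Jane Doe"}))                        # insufficient
```

Running this over a list before submitting it shows how many records can only be attempted through the Search fallback, and which name-only records need a company or domain added first.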
How accurate is person enrichment from PDL?
PDL's direct Enrich endpoint (email or LinkedIn-based lookups) consistently returns matchConfidence of 8–10. Name + company searches via the fallback path return 5–7. Match rates vary by region: 75–85% for US/UK enterprise contacts, 40–65% for other markets. The matchConfidence score on every record tells you how much to trust each match.
How long does a typical person enrichment run take?
The actor enforces a 200ms delay between PDL API calls to respect rate limits. A batch of 100 people completes in approximately 30 seconds. A batch of 1,000 completes in 4–8 minutes depending on how many Search fallback calls are needed.
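The share of those run times that comes from the 200ms delay alone can be worked out with simple arithmetic. The sketch below assumes roughly one delay per API call and, as a further assumption, that each Search fallback adds one extra call; network latency is ignored.

```python
def delay_floor_seconds(api_calls: int, delay_ms: int = 200) -> float:
    """Seconds spent purely on the inter-call delay, ignoring latency."""
    return api_calls * delay_ms / 1000

print(delay_floor_seconds(100))         # 20.0
print(delay_floor_seconds(1000))        # 200.0
# Hypothetical case: 1,000 people, 400 of whom need a Search fallback call.
print(delay_floor_seconds(1000 + 400))  # 280.0
```

The gap between these floors and the quoted run times is API response time and any fallback calls.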
Does person enrichment work for people outside the United States?
Yes, PDL has global coverage, but match rates are higher for professionals in markets with strong LinkedIn adoption (US, UK, Canada, Australia, Western Europe). Match rates are lower for professionals in markets where LinkedIn is less dominant.
How is Person Enrichment Lookup different from Clay?
Both use the People Data Labs API as a data source. Clay charges $0.22–$5.63 per PDL enrichment depending on your subscription tier, plus a monthly platform fee. This actor charges $0.15 per successfully enriched person with no subscription. You only pay for matches, not misses. The actor also provides a Search fallback path that Clay's PDL action does not expose, improving match rates for name + company inputs.
How is this different from Apollo or ZoomInfo?
Apollo ($99–$249/month) and ZoomInfo ($15,000+/year) are full sales intelligence platforms. This actor is a single-purpose enrichment tool that uses PDL's independent B2B database. It costs less per lookup, requires no subscription, and outputs clean structured JSON that maps directly into any CRM or data pipeline. It does not replace the prospecting search features of Apollo or ZoomInfo.
Can I use my own People Data Labs API key?
Yes. Add your PDL API key to the pdlApiKey field. The built-in key is shared and subject to PDL's free-tier rate limits. Bringing your own key removes that ceiling. PDL's free tier provides 100 API calls/month; paid plans start at $98/month.
Is person enrichment legal?
PDL aggregates data from public professional sources. Querying PDL's API to enrich business contact data for legitimate B2B purposes is generally lawful. You must comply with applicable data protection regulations (GDPR, CAN-SPAM, CASL) in how you use the enriched data. For detailed guidance see Apify's legal guide.
What happens if a person is not found in PDL?
The actor records a result row with source: "not_found" and null values for all enriched fields. You are not charged for not-found records. The row is still written to the dataset so your row count matches your input list, making join-back straightforward.
Can I schedule person enrichment to run automatically?
Yes. Use Apify's built-in scheduler to run enrichment on a daily, weekly, or custom cron schedule. This is useful for refreshing CRM contact data on a recurring basis to catch job changes.
Can I combine person enrichment with other actors in an automated pipeline?
Yes. The most common pipeline is: scrape company websites with Website Contact Scraper → enrich contacts here → verify emails with Bulk Email Verifier → push to HubSpot with HubSpot Lead Pusher. You can connect these steps via Apify webhooks or Make/Zapier automations.
How does the composite confidence score work?
The confidenceScore (0–100) is a weighted combination of: PDL likelihood (60% weight, mapped from 0–10 to 0–60 points) + email present (10 points) + MX valid (15 points) + identifier richness (up to 5 points based on how many distinct identifiers the input had) − disposable email penalty (30 points) − role-account penalty (10 points). The confidenceLevel band is derived from the score: high ≥75, medium ≥50, low <50. The confidenceBreakdown object on every record shows exactly which factors contributed how many points so you can audit the score yourself. Use minConfidenceScore: 50 to filter out junk records before saving.
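To audit a score by hand, the weighting described above can be reproduced in a few lines. This is a sketch of the arithmetic as stated in this answer, not the actor's source code: the breakdown key names, the clamp to 0–100, and passing identifier richness in as a precomputed 0–5 value are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class Signals:
    pdl_likelihood: int     # PDL likelihood, 0-10
    email_present: bool
    mx_valid: bool
    identifier_points: int  # 0-5, how it is derived is not specified here
    disposable: bool
    role_account: bool

def confidence_breakdown(s: Signals) -> Dict[str, int]:
    """Points per factor (key names are hypothetical)."""
    return {
        "pdlLikelihood": s.pdl_likelihood * 6,  # maps 0-10 onto 0-60
        "emailPresent": 10 if s.email_present else 0,
        "mxValid": 15 if s.mx_valid else 0,
        "identifierRichness": max(0, min(5, s.identifier_points)),
        "disposablePenalty": -30 if s.disposable else 0,
        "roleAccountPenalty": -10 if s.role_account else 0,
    }

def confidence_score(s: Signals) -> int:
    # Clamping to 0-100 is an assumption; the page only states the range.
    return max(0, min(100, sum(confidence_breakdown(s).values())))

def confidence_level(score: int) -> str:
    return "high" if score >= 75 else "medium" if score >= 50 else "low"

strong = Signals(9, True, True, 3, False, False)
weak = Signals(6, True, False, 1, False, True)
print(confidence_score(strong), confidence_level(confidence_score(strong)))  # 82 high
print(confidence_score(weak), confidence_level(confidence_score(weak)))      # 37 low
```

With the weights as listed, the positive components sum to at most 90, so a record needs a high PDL likelihood plus a valid MX record to reach the high band.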
How does cross-run change detection work?
Set compareToPrevRun: true and the actor opens a named Apify Key-Value store (default person-enrichment-monitor). It loads the prior run's snapshot map, enriches each person, looks them up by stable identity key (PDL ID first, falling back to name+domain), diffs the new record against the prior snapshot, and emits a changeFlags[] array with stable enum values: JOB_CHANGED, TITLE_CHANGED, PROMOTION, DEMOTION, EMAIL_CHANGED, LOCATION_CHANGED, NEW_PERSON, UNCHANGED, etc. The full updated snapshot map is persisted back to KV at the end of the run. First run on a new state key produces NEW_PERSON flags on every record (no prior to compare against). Schedule weekly to convert the actor from one-shot enrichment into a job-change monitor.
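The load, diff, persist loop described above can be illustrated with an in-memory sketch. The record field names (pdlId, fullName, domain, companyName, jobTitle, email, location) and the mapping from field to flag are assumptions for illustration; PROMOTION and DEMOTION are left out because this page does not say how they are derived, and a plain dict stands in for the Key-Value store.

```python
from typing import Any, Dict, List, Optional, Tuple

Record = Dict[str, Any]

# Assumed mapping from a changed field to the flag it raises.
FIELD_FLAGS = {
    "companyName": "JOB_CHANGED",
    "jobTitle": "TITLE_CHANGED",
    "email": "EMAIL_CHANGED",
    "location": "LOCATION_CHANGED",
}

def identity_key(rec: Record) -> str:
    """PDL ID first, falling back to name + domain."""
    if rec.get("pdlId"):
        return f"pdl:{rec['pdlId']}"
    name = str(rec.get("fullName", "")).strip().lower()
    domain = str(rec.get("domain", "")).strip().lower()
    return f"name:{name}|{domain}"

def change_flags(previous: Optional[Record], current: Record) -> List[str]:
    if previous is None:
        return ["NEW_PERSON"]  # nothing to compare against
    flags = [flag for field, flag in FIELD_FLAGS.items()
             if previous.get(field) != current.get(field)]
    return flags or ["UNCHANGED"]

def diff_run(snapshot: Dict[str, Record],
             records: List[Record]) -> Tuple[Dict[str, List[str]], Dict[str, Record]]:
    """Return flags per identity key and the snapshot to persist."""
    updated = dict(snapshot)
    flagged: Dict[str, List[str]] = {}
    for rec in records:
        key = identity_key(rec)
        flagged[key] = change_flags(snapshot.get(key), rec)
        updated[key] = rec
    return flagged, updated

first = {"pdlId": "abc", "companyName": "Acme", "jobTitle": "Engineer",
         "email": "a@acme.com", "location": "London"}
moved = dict(first, companyName="Globex", email="a@globex.com")

flags_run1, snap = diff_run({}, [first])
flags_run2, snap = diff_run(snap, [moved])
print(flags_run1)  # {'pdl:abc': ['NEW_PERSON']}
print(flags_run2)  # {'pdl:abc': ['JOB_CHANGED', 'EMAIL_CHANGED']}
```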
What happens when the actor encounters an alias-rebranded company like Facebook?
With useCompanyAliases: true (default on), if PDL Search returns 0 hits for the literal company name (e.g. "Facebook"), the actor automatically retries with each known alias for that company ("Meta Platforms", "Meta", "Facebook Inc"). When an alias hits, the record's source field is set to pdl_search_alias so you can filter on aliased matches if needed. Roughly 70 alias groups are baked in, covering most Fortune 500 rebrands (Alphabet↔Google, X↔Twitter, Block↔Square, HPE / HP / Hewlett-Packard, etc.).
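The retry behaviour can be sketched with a stand-in search function. Nothing here calls the real PDL API: search is a caller-supplied callable, the two alias groups are a tiny excerpt based on the examples named above, exact-string matching is a simplification, and the pdl_search source value for a first-try hit is an assumption (only pdl_search_alias and not_found are confirmed on this page).

```python
from typing import Any, Callable, Dict, List, Optional

Hit = Optional[Dict[str, Any]]

# Illustrative excerpt only; the actor ships roughly 70 groups.
ALIAS_GROUPS = [
    {"Facebook", "Meta", "Meta Platforms", "Facebook Inc"},
    {"Google", "Alphabet"},
]

def aliases_for(company: str) -> List[str]:
    """Other known names for a company, in a stable order."""
    for group in ALIAS_GROUPS:
        if company in group:
            return sorted(group - {company})
    return []

def search_with_aliases(name: str, company: str,
                        search: Callable[[str, str], Hit]) -> Dict[str, Any]:
    hit = search(name, company)
    if hit is not None:
        return {**hit, "source": "pdl_search"}  # assumed value
    for alias in aliases_for(company):
        hit = search(name, alias)
        if hit is not None:
            return {**hit, "source": "pdl_search_alias"}
    return {"source": "not_found"}

def fake_search(name: str, company: str) -> Hit:
    """Stand-in that only knows the company as 'Meta'."""
    if company == "Meta":
        return {"fullName": name, "companyName": company}
    return None

result = search_with_aliases("Jane Doe", "Facebook", fake_search)
print(result["source"], result["companyName"])  # pdl_search_alias Meta
```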
How do I avoid being charged for low-confidence or unchanged records?
Both filters apply BEFORE pushData and BEFORE PPE charging, so filtered records cost nothing. For confidence: set minConfidenceScore: 50 (or higher for cold outreach). For change-only monitoring: set compareToPrevRun: true plus changeFlagsFilter: ["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"]. Records that fail the filter are dropped from the dataset entirely and never trigger a PPE charge — perfect for weekly scheduled runs where most contacts haven't moved.
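The ordering that makes filtered records free (filter first, then save, then charge) can be sketched as follows. The push and charge callables are stand-ins rather than Apify SDK calls, the id field is invented, and treating changeFlagsFilter as "keep the record if any of its flags is listed" is an assumption.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Record = Dict[str, Any]

def passes_filters(record: Record,
                   min_confidence_score: Optional[int] = None,
                   change_flags_filter: Optional[List[str]] = None) -> bool:
    if (min_confidence_score is not None
            and record.get("confidenceScore", 0) < min_confidence_score):
        return False
    if change_flags_filter is not None:
        # Assumed semantics: keep if at least one flag matches the filter.
        if not set(record.get("changeFlags", [])) & set(change_flags_filter):
            return False
    return True

def save_and_charge(records: Iterable[Record],
                    push: Callable[[Record], None],
                    charge: Callable[[Record], None],
                    **filters: Any) -> int:
    """Filter first; only surviving records are saved and charged."""
    saved = 0
    for rec in records:
        if not passes_filters(rec, **filters):
            continue  # dropped before any save or charge happens
        push(rec)
        charge(rec)
        saved += 1
    return saved

records = [
    {"id": 1, "confidenceScore": 80, "changeFlags": ["JOB_CHANGED"]},
    {"id": 2, "confidenceScore": 40, "changeFlags": ["PROMOTION"]},
    {"id": 3, "confidenceScore": 90, "changeFlags": ["UNCHANGED"]},
]
dataset: List[Record] = []
charged: List[int] = []

count = save_and_charge(
    records,
    push=dataset.append,
    charge=lambda r: charged.append(r["id"]),
    min_confidence_score=50,
    change_flags_filter=["JOB_CHANGED", "PROMOTION", "EMAIL_CHANGED"],
)
print(count, charged)  # 1 [1]
```

Record 2 fails the confidence threshold and record 3 has no matching change flag, so neither reaches the dataset or the charge step.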
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.