Deduped Job Intelligence for AI Agents
Pricing: from $3.50 per 1,000 valid jobs
Developer: DeepAPI (maintained by Community)
Extract, normalize, and deduplicate public job postings into clean hiring-signal records with source evidence, role classification, duplicate groups, and confidence scores.
Use this Actor when you need hiring intelligence rather than another raw job feed. Provide public careers, job-board, or direct posting URLs, and the Actor returns validated dataset rows that are ready for sales workflows, recruiting research, market analysis, and AI agents.
What this Actor does
- Crawls user-supplied public careers, job-board, and direct job posting URLs.
- Extracts job candidates from supported ATS/job-board pages and fallback job links.
- Visits extracted job detail pages to enrich records with location, remote type, seniority, and evidence when available.
- Extracts public salary or compensation ranges from structured job data and visible job detail text when available.
- Normalizes job titles for cleaner grouping and analysis.
- Detects common role functions, seniority signals, locations, and remote types when present.
- Matches optional role keywords such as `account executive`, `customer success`, or `sales engineer`.
- Groups duplicate postings with a stable `duplicateGroupId`.
- Preserves source URLs and evidence for auditability.
- Filters output with `maxJobs` and `minConfidence`.
- Returns validated dataset rows suitable for automation and AI-agent workflows.
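The Actor's internal grouping logic is not documented here, so the following is only a sketch of how a stable duplicate key could be derived. It assumes the key is a slug of the company name and job title joined by a colon, which matches the shape of the `duplicateGroupId` in the sample output (`notion:account-executive-commercial`). The helper names `slugify`, `duplicate_group_id`, and `group_duplicates` are hypothetical.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Lowercase, strip accents, and collapse non-alphanumeric runs to hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def duplicate_group_id(company_name: str, job_title: str) -> str:
    """Hypothetical key format: '<company-slug>:<title-slug>'."""
    return f"{slugify(company_name)}:{slugify(job_title)}"


def group_duplicates(postings: list[dict]) -> dict[str, list[str]]:
    """Map each group ID to the source URLs of the postings that share it."""
    groups: dict[str, list[str]] = {}
    for posting in postings:
        key = duplicate_group_id(posting["companyName"], posting["jobTitle"])
        groups.setdefault(key, []).append(posting["sourceUrl"])
    return groups
```

Because the key depends only on normalized text, the same posting published under two URLs maps to one group, and the group ID stays the same across runs.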
Use cases
- B2B sales teams using hiring as an account signal.
- Lead generation agencies enriching company lists.
- Recruiters monitoring active roles across target companies.
- Investors and analysts tracking hiring momentum.
- AI agents that need normalized hiring data rather than raw HTML or duplicate job feeds.
Supported source types
Use public source URLs such as company careers pages, public job-board pages, direct public job posting URLs, and public ATS pages.
The output schema currently detects these providers when available: Greenhouse, Lever, Ashby, Workable, SmartRecruiters, Teamtailor, Breezy, and Workday.
Provider coverage depends on the public page structure available at crawl time.
Input
Use public HTTP(S) sourceUrls as the primary input. Optional filters let you cap results, match role keywords, filter by locations, include duplicate source URLs, set clean-mode requirements, and set a minimum confidence threshold.
{
  "sourceUrls": [
    "https://jobs.ashbyhq.com/notion",
    "https://jobs.lever.co/posthog"
  ],
  "maxJobs": 25,
  "roleKeywords": [
    "account executive",
    "customer success",
    "sales engineer",
    "solutions engineer",
    "engineer"
  ],
  "locations": ["United States", "Remote"],
  "includeSourceDuplicates": true,
  "requireLocation": false,
  "requireSalary": false,
  "requireRoleKeywordMatch": false,
  "minConfidence": 0.65
}
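When the input is assembled programmatically, it can help to reject obviously invalid values before starting a run. The sketch below builds an input object with the field names shown above. The helper `build_run_input` is hypothetical, and the checks are assumptions: the `[0, 1]` range for `minConfidence` is inferred from the example values, not from a published schema.

```python
from urllib.parse import urlparse


def build_run_input(source_urls, max_jobs=25, min_confidence=0.65, **filters):
    """Assemble an Actor input dict after basic sanity checks (illustrative only)."""
    for url in source_urls:
        parsed = urlparse(url)
        # The README asks for public HTTP(S) URLs, so reject other schemes.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an HTTP(S) URL: {url!r}")
    # Assumption: confidence scores lie between 0 and 1.
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("minConfidence is assumed to lie in [0, 1]")
    if max_jobs < 1:
        raise ValueError("maxJobs must be at least 1")
    return {
        "sourceUrls": list(source_urls),
        "maxJobs": max_jobs,
        "minConfidence": min_confidence,
        **filters,  # e.g. roleKeywords, locations, requireSalary
    }
```

For example, `build_run_input(["https://jobs.lever.co/posthog"], roleKeywords=["engineer"])` returns a dict that can be serialized to JSON and passed as the run input.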
Clean mode toggles can return only jobs that are ready for downstream routing:
| Toggle | Effect |
|---|---|
| `requireLocation` | Return only jobs with a detected location. |
| `requireSalary` | Return only jobs with a detected public salary range. |
| `requireRoleKeywordMatch` | Return only jobs that matched at least one configured role keyword. |
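The combined effect of the toggles with `minConfidence` and `maxJobs` can be approximated as a simple filter over output-shaped records. This is a sketch of the described behavior, not the Actor's code. The function `apply_clean_mode` is hypothetical, and the order of checks (confidence first, cap last) is an assumption.

```python
def apply_clean_mode(
    jobs,
    *,
    require_location=False,
    require_salary=False,
    require_role_keyword_match=False,
    min_confidence=0.0,
    max_jobs=None,
):
    """Keep only jobs that satisfy every enabled toggle, then cap the result."""
    kept = []
    for job in jobs:
        if job.get("confidence", 0.0) < min_confidence:
            continue
        if require_location and not job.get("location"):
            continue
        if require_salary and not job.get("salary"):
            continue
        if require_role_keyword_match and not job.get("roleKeywords"):
            continue
        kept.append(job)
    # Assumed to apply after filtering, so the cap counts only rows that passed.
    return kept if max_jobs is None else kept[:max_jobs]
```

With all toggles off, only the confidence threshold and the cap affect the result, which mirrors the default values in the input example.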
Output
Each pushed dataset item is a validated, normalized job intelligence record.
{
  "companyName": "Notion",
  "jobTitle": "Account Executive, Commercial",
  "normalizedTitle": "Account Executive Commercial",
  "function": "sales",
  "location": "New York, United States",
  "salary": {"currency": "USD", "min": 150000, "max": 180000, "period": "year"},
  "sourceUrl": "https://jobs.ashbyhq.com/notion/9526496b-5c39-456b-a454-ebec889e7149",
  "sourceUrls": [
    "https://jobs.ashbyhq.com/notion/9526496b-5c39-456b-a454-ebec889e7149",
    "https://jobs.ashbyhq.com/notion/fdc2a6c0-396a-45db-b465-683bacf4201e"
  ],
  "sourceUrlsText": "https://jobs.ashbyhq.com/notion/9526496b-5c39-456b-a454-ebec889e7149 | https://jobs.ashbyhq.com/notion/fdc2a6c0-396a-45db-b465-683bacf4201e",
  "jobBoardProvider": "ashby",
  "duplicateGroupId": "notion:account-executive-commercial",
  "duplicateCount": 2,
  "roleKeywords": ["account executive"],
  "roleKeywordsText": "account executive",
  "hiringSignal": "Hiring Account Executive, Commercial",
  "evidence": [
    {
      "type": "job_title",
      "value": "Account Executive, Commercial",
      "sourceUrl": "https://jobs.ashbyhq.com/notion/9526496b-5c39-456b-a454-ebec889e7149"
    },
    {
      "type": "role_keyword",
      "value": "account executive",
      "sourceUrl": "https://jobs.ashbyhq.com/notion/9526496b-5c39-456b-a454-ebec889e7149"
    },
    {
      "type": "salary",
      "value": "USD 150000-180000 per year",
      "sourceUrl": "https://jobs.ashbyhq.com/notion/9526496b-5c39-456b-a454-ebec889e7149"
    }
  ],
  "evidenceSummary": "job_title: Account Executive, Commercial (https://jobs.ashbyhq.com/notion/9526496b-5c39-456b-a454-ebec889e7149) | salary: USD 150000-180000 per year (https://jobs.ashbyhq.com/notion/9526496b-5c39-456b-a454-ebec889e7149)",
  "salaryText": "USD 150000-180000 per year",
  "confidence": 0.95,
  "scrapedAt": "2026-06-29T15:52:57.933Z"
}
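One way a sales or research workflow might consume these records is to count open roles per company and function. The sketch below assumes the records have already been loaded as a list of dicts with the field names shown above. The function `hiring_signals_by_function` and the `"unknown"` fallback label are hypothetical.

```python
from collections import Counter


def hiring_signals_by_function(records, min_confidence=0.65):
    """Count deduplicated roles per (company, function) above a confidence floor."""
    counts = Counter()
    for record in records:
        if record.get("confidence", 0.0) < min_confidence:
            continue
        # "function" is only present when detected, so fall back to a label.
        key = (record["companyName"], record.get("function") or "unknown")
        counts[key] += 1
    return counts
```

Since each record is already a deduplicated job, the counts reflect distinct roles rather than the number of posting URLs; `duplicateCount` carries the latter.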
Spreadsheet preview:
| companyName | normalizedTitle | function | location | salaryText | roleKeywordsText | duplicateCount | sourceUrlsText |
|---|---|---|---|---|---|---|---|
| Notion | Account Executive Commercial | sales | New York, United States | USD 150000-180000 per year | account executive | 2 | https://jobs.ashbyhq.com/notion/9526496b... \| https://jobs.ashbyhq.com/notion/fdc2a6c0... |
Example results
A local sample run against two public sources produced:
- Source URLs: 2
- Requests processed: 2
- Raw jobs found: 74
- Valid jobs returned: 25
- Pushed jobs: 25
The sample included merged duplicate sources for equivalent postings, role keyword matches, function detection, and source-backed evidence.
Output Fields For CSV Buyers
The dataset keeps rich arrays and objects for agents, but also includes flat text fields for spreadsheet exports:
| Field | Meaning |
|---|---|
| `sourceUrlsText` | Pipe-separated source URLs represented by the deduplicated job. |
| `roleKeywordsText` | Pipe-separated matched role keywords. |
| `evidenceSummary` | Pipe-separated source evidence summary for review in CSV, Excel, or Sheets. |
| `salaryText` | Human-readable salary range when detected. |
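The relationship between the structured fields and their flat counterparts can be sketched as follows. The separator (`" | "`) and the salary format are inferred from the sample record, and the helper names `salary_text` and `flatten_for_csv` are hypothetical. `evidenceSummary` is left out because the sample does not make clear which evidence entries it includes.

```python
def salary_text(salary):
    """Render a salary object like the sample's 'USD 150000-180000 per year'."""
    if not salary:
        return ""
    return f'{salary["currency"]} {salary["min"]}-{salary["max"]} per {salary["period"]}'


def flatten_for_csv(record):
    """Derive the flat text columns from the structured arrays and objects."""
    return {
        "sourceUrlsText": " | ".join(record.get("sourceUrls", [])),
        "roleKeywordsText": " | ".join(record.get("roleKeywords", [])),
        "salaryText": salary_text(record.get("salary")),
    }
```

The Actor already emits these columns, so a helper like this is mainly useful for rebuilding them after records have been edited or merged downstream.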
Troubleshooting
| Symptom | Most likely reason | What to do |
|---|---|---|
| No `validJob` rows | The public page did not expose job links or data matching your filters and `minConfidence`. | Lower `minConfidence`, remove restrictive `locations`, or pass the direct ATS/job-board URL. |
| Jobs are missing locations or salary | The public job detail page does not expose those fields in visible text or JSON-LD. | Keep the row if the title/source evidence is enough, or filter with locations only when location is required. |
| Fewer rows than expected | Duplicates were merged, charge limits stopped writes, or maxJobs capped output. | Check OUTPUT.summary, duplicateCount, and sourceUrlsText. |
Limitations
- Works with public pages only.
- Does not log in to private job boards or social networks.
- Does not collect applicant data.
- Does not guarantee complete coverage for every ATS or custom careers site.
- Salary, department, seniority, location, and remote type are returned only when detected from public source content.
- Provider support can vary when public page markup changes.
Local Development
From the portfolio root:
pnpm jobs:test
pnpm jobs:typecheck
pnpm jobs:sample
The sample script uses a separate storage-live-sample local storage directory and writes inspection files to data/live-sample.
Pricing Unit
Recommended pay-per-event unit: `validJob`
The Actor charges only when a normalized job passes validation and is pushed to the dataset.
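As a rough budgeting aid, the charge for a run can be estimated from the number of pushed rows. The sketch below assumes the listed starting price of $3.50 per 1,000 valid jobs applies and that nothing else is billed. The actual rate may differ by plan, and the names `PRICE_PER_1000_VALID_JOBS` and `estimate_cost` are hypothetical.

```python
from decimal import Decimal

# Assumption: the listed "from" price applies to this run.
PRICE_PER_1000_VALID_JOBS = Decimal("3.50")


def estimate_cost(valid_jobs: int) -> Decimal:
    """Estimated charge in USD for a number of pushed validJob rows."""
    cost = PRICE_PER_1000_VALID_JOBS * valid_jobs / 1000
    return cost.quantize(Decimal("0.0001"))
```

Under these assumptions, the sample run's 25 pushed jobs would come to `estimate_cost(25)`, that is, $0.0875. The 74 raw jobs found are not the billing basis, because only validated, pushed rows are charged.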