Job Posting Extractor (title, company, salary, remote)

Pricing

Pay per usage

Extract clean, normalized job data — title, company, location, remote flag, salary min/max/currency/period, employment type, dates, apply URL — from public job pages via JSON-LD JobPosting. HTML-only, fast, structured output.

Developer

Tommy G

Maintained by Community

Job Posting Extractor (Apify Actor)

Give it public job posting URLs and get back clean, normalized job data (title, company, location, remote flag, salary min/max/currency/period, employment type, date posted, valid through, apply URL) extracted from the page's JSON-LD JobPosting markup. The Actor is HTML-only (no headless browser), which keeps runs fast and cheap. It is suited to job-board aggregation, salary datasets, and recruiting or market research.
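To illustrate what reading JSON-LD JobPosting from plain HTML can look like, here is a minimal sketch. The function name findJobPosting and the regex-based scan are assumptions made for illustration; this is not the Actor's actual src/extract.js, which may parse pages differently.

```javascript
// Hypothetical helper, not the Actor's real extractor.
// Scans <script type="application/ld+json"> blocks and returns the first
// node whose @type includes "JobPosting", or null if none is found.
function findJobPosting(html) {
  const scriptRe = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptRe)) {
    let data;
    try {
      data = JSON.parse(match[1]);
    } catch {
      continue; // skip malformed JSON-LD blocks
    }
    // JSON-LD may be a single object, an array, or wrapped in @graph.
    const queue = Array.isArray(data) ? [...data] : [data];
    while (queue.length > 0) {
      const node = queue.shift();
      if (node === null || typeof node !== 'object') continue;
      if (Array.isArray(node['@graph'])) queue.push(...node['@graph']);
      const types = [].concat(node['@type'] ?? []); // @type can be a string or an array
      if (types.includes('JobPosting')) return node;
    }
  }
  return null;
}

const html = `<html><head><script type="application/ld+json">
{"@context":"https://schema.org","@type":"JobPosting","title":"Senior Backend Engineer",
 "hiringOrganization":{"@type":"Organization","name":"Acme Corp"}}
</script></head></html>`;

const job = findJobPosting(html);
console.log(job.title, '|', job.hiringOrganization.name);
// prints: Senior Backend Engineer | Acme Corp
```

A page that only injects this markup with client-side JavaScript would yield null here, since nothing is executed.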

Input

{
  "startUrls": [{ "url": "https://boards.greenhouse.io/acme/jobs/123" }],
  "maxConcurrency": 5,
  "maxPages": 100
}

maxPages is capped at 200 and maxConcurrency at 20.
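The caps can be pictured as a simple clamp on the input. The sketch below is an assumption about how such limits might be applied: normalizeInput and clampInt are hypothetical names, the defaults are borrowed from the example input, and the Actor may instead reject out-of-range values.

```javascript
// Hypothetical input normalizer. The limits come from the text above;
// the defaults are assumed from the example input.
const LIMITS = { maxPages: 200, maxConcurrency: 20 };
const DEFAULTS = { maxPages: 100, maxConcurrency: 5 };

// Use the fallback for non-integer values, then clamp into [1, max].
function clampInt(value, fallback, max) {
  const n = Number.isInteger(value) ? value : fallback;
  return Math.min(Math.max(n, 1), max);
}

function normalizeInput(input) {
  return {
    startUrls: Array.isArray(input.startUrls) ? input.startUrls : [],
    maxPages: clampInt(input.maxPages, DEFAULTS.maxPages, LIMITS.maxPages),
    maxConcurrency: clampInt(input.maxConcurrency, DEFAULTS.maxConcurrency, LIMITS.maxConcurrency),
  };
}

console.log(normalizeInput({ startUrls: [], maxPages: 500, maxConcurrency: 50 }));
// prints: { startUrls: [], maxPages: 200, maxConcurrency: 20 }
```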

Output — one STABLE record per URL (ok and error rows share the shape)

{
  "status": "ok",
  "requested_url": "...",
  "final_url": "...",
  "http_status": 200,
  "found": true,
  "complete": true,
  "page_type": "job",
  "source": "json-ld",
  "title": "Senior Backend Engineer",
  "company": "Acme Corp",
  "company_logo": "https://acme.com/logo.png",
  "location": "Berlin, BE, DE",
  "remote": false,
  "employment_type": "FULL_TIME",
  "salary_min": 70000,
  "salary_max": 95000,
  "salary_currency": "EUR",
  "salary_period": "YEAR",
  "date_posted": "2026-05-01T00:00:00.000Z",
  "valid_through": "2026-07-01T00:00:00.000Z",
  "description": "...",
  "identifier": "JOB-123",
  "apply_url": "https://acme.com/jobs/be-eng",
  "fields_found": ["title", "company", "location", "salary_min", "employment_type", "date_posted", "apply_url"],
  "missing_reason": null,
  "extracted_at": "2026-05-29T..."
}

page_type is one of job, listing, or unknown. A search or listing URL returns found:false with page_type:"listing", because it is not a single posting. complete is true when the record has a title, a company, and either a location or a remote flag. Failed fetches return the same keys with status:"error" plus an error field.
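One way to keep ok and error rows in the same shape is to start every record from a single key list. The sketch below is illustrative only: RECORD_KEYS is copied from the example record, while emptyRecord, errorRecord, and isComplete are hypothetical names. It also assumes that ok rows carry error: null and that a remote value of true satisfies the location/remote condition; the source does not confirm either.

```javascript
// Keys copied from the example record above, plus `error`
// (assumed to be null on ok rows so both row types share one shape).
const RECORD_KEYS = [
  'status', 'requested_url', 'final_url', 'http_status', 'found', 'complete',
  'page_type', 'source', 'title', 'company', 'company_logo', 'location',
  'remote', 'employment_type', 'salary_min', 'salary_max', 'salary_currency',
  'salary_period', 'date_posted', 'valid_through', 'description', 'identifier',
  'apply_url', 'fields_found', 'missing_reason', 'extracted_at', 'error',
];

// Every record starts with all keys present and set to null or a safe default.
function emptyRecord(requestedUrl) {
  const record = {};
  for (const key of RECORD_KEYS) record[key] = null;
  record.requested_url = requestedUrl;
  record.found = false;
  record.complete = false;
  record.page_type = 'unknown';
  record.fields_found = [];
  return record;
}

// complete = title + company + (location or remote), per the text above.
// Treating only `remote === true` as sufficient is an assumption.
function isComplete(record) {
  const hasPlace = Boolean(record.location) || record.remote === true;
  return Boolean(record.title) && Boolean(record.company) && hasPlace;
}

// Error rows reuse the same keys and only fill status, error, and the timestamp.
function errorRecord(requestedUrl, message) {
  return {
    ...emptyRecord(requestedUrl),
    status: 'error',
    error: message,
    extracted_at: new Date().toISOString(),
  };
}

const failed = errorRecord('https://example.com/jobs/1', 'fetch timed out');
console.log(Object.keys(failed).length === RECORD_KEYS.length); // prints: true
```

Because downstream consumers always see the same columns, a dataset export needs no special handling for failed URLs.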

Run locally / test

npm install
npm test # unit tests on the pure extractor (node:test)

Publish to Apify (account-holder's step)

npm install -g apify-cli && apify login && apify push

Keep the Actor free initially; the adult account holder can enable pricing later, once it shows repeat usage.

Notes / safety

  • SSRF-guarded, robots-respecting, rate-limited, cost-capped (shared src/lib/actor_runner.js).
  • Stores only derived job fields, never raw page bodies. Salary is often legitimately absent, in which case the salary fields are left null.
  • HTML-only: client-rendered job boards that inject JSON via JS return found:false with missing_reason:"js_rendered". Core logic in src/extract.js (pure, unit-tested).
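As a sketch of how the salary fields might be derived while leaving absent values null: schema.org models JobPosting.baseSalary as a MonetaryAmount with a currency and a value that is either a number or a QuantitativeValue (minValue, maxValue, value, unitText). The function name normalizeSalary and the choice to use a single value for both min and max are assumptions for illustration, not the Actor's confirmed behavior.

```javascript
// Hypothetical normalizer for schema.org baseSalary (MonetaryAmount).
// Anything missing or unparseable stays null rather than being guessed.
function normalizeSalary(baseSalary) {
  const empty = { salary_min: null, salary_max: null, salary_currency: null, salary_period: null };
  if (baseSalary === null || typeof baseSalary !== 'object') return empty;

  // Accept numbers and numeric strings such as "70,000"; reject everything else.
  const toNum = (v) => {
    if (typeof v === 'string') {
      const cleaned = v.replace(/,/g, '').trim();
      if (cleaned === '') return null;
      v = Number(cleaned);
    }
    return typeof v === 'number' && Number.isFinite(v) ? v : null;
  };

  // `value` may be a bare number or a QuantitativeValue object.
  const raw = baseSalary.value;
  const q = raw !== null && typeof raw === 'object' ? raw : { value: raw };
  const single = toNum(q.value);

  return {
    salary_min: toNum(q.minValue) ?? single, // assumed: a single value fills both ends
    salary_max: toNum(q.maxValue) ?? single,
    salary_currency: typeof baseSalary.currency === 'string' ? baseSalary.currency : null,
    salary_period: typeof q.unitText === 'string' ? q.unitText.toUpperCase() : null,
  };
}

console.log(normalizeSalary({
  '@type': 'MonetaryAmount',
  currency: 'EUR',
  value: { '@type': 'QuantitativeValue', minValue: 70000, maxValue: 95000, unitText: 'YEAR' },
}));
// prints: { salary_min: 70000, salary_max: 95000, salary_currency: 'EUR', salary_period: 'YEAR' }
```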