Pricing

from $3.00 / 1,000 job results

AI & ML Engineer Jobs Scraper — 8 Boards in One

AI & ML jobs aggregator: one run merges 8 boards (aijobs.net, LinkedIn, Hacker News Who-is-Hiring, Y Combinator, Built In, RemoteOK/Remotive/WeWorkRemotely, WTTJ, JustJoin.it) into a URL-deduped dataset with structured salary, remote flag and seniority. Delta mode for daily alerts.

Developer

Nomad.Dev

Last modified: 6 days ago

What machine learning jobs data does this scraper extract?

Each result is one flat JSON record per job posting. The same structured columns are filled for every source (from whichever field that source exposes), so you never have to special-case per board:

| Field | Type | Meaning |
| --- | --- | --- |
| `source` | string | Which child board the record came from, e.g. `"ai_jobs_net"` |
| `id` | string | Stable source-side identifier (`""` when the source has none) |
| `title` | string | Job title as posted |
| `company` | string | Hiring company / organisation |
| `location` | string | Location / duty station (may include remote hints) |
| `url` | string | Direct link to the posting (primary dedupe / delta key) |
| `postedAt` | string | Posting date where the source provides it, else `""` |
| `snippet` | string | Short description excerpt |
| `description` | string | Full description text where the source exposes one, else `""` |
| `salary` | string | Human-readable salary text, or composed from the numbers below |
| `salaryMin` | number \| null | Lower bound of the pay range as a number |
| `salaryMax` | number \| null | Upper bound of the pay range as a number |
| `salaryCurrency` | string \| null | Currency code (USD, EUR, GBP, PLN, …) |
| `salaryPeriod` | string \| null | Pay period, one of `year`/`month`/`week`/`day`/`hour` |
| `isRemote` | boolean \| null | `true` only for fully-remote, `false` for hybrid/on-site, `null` if unknown |
| `remoteType` | string \| null | `remote` / `hybrid` / `on-site` (keeps the hybrid nuance) |
| `seniority` | string \| null | Experience level, e.g. `"Senior"`, `"Mid-Senior level"` |
| `employmentType` | string \| null | Commitment / contract type, e.g. `"Full-time"`, `"Contract"` |

Structured salary and the remote/seniority/type fields are populated from each source's own structured data where available: ai_jobs_net, builtin, remote_boards, wttj and justjoinit expose salaryMin/Max/Currency/Period directly; for ycombinator_was and hackernews the bundle parses the numbers out of the free-text pay line when the poster includes one. linkedin doesn't quote pay on its listings, so its salary fields stay empty/null; a hackernews post that names no figure is left null too. Any field a source doesn't provide is "" (strings) or null (numbers/booleans) rather than fabricated.
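Because the salary numbers can be null and the pay period varies by source, downstream code usually has to normalize before comparing offers. The following is a minimal sketch of one way to do that; the helper name `annual_midpoint` and the conversion factors (52 weeks, 260 working days, 2080 working hours per year) are assumptions for illustration, not something the Actor provides.

```python
from typing import Optional

# Assumed conversion factors to a yearly figure (not defined by the Actor):
# 12 months, 52 weeks, 260 working days, 2080 working hours per year.
PERIODS_PER_YEAR = {"year": 1, "month": 12, "week": 52, "day": 260, "hour": 2080}


def annual_midpoint(record: dict) -> Optional[float]:
    """Return the midpoint of the pay range scaled to one year, or None."""
    factor = PERIODS_PER_YEAR.get(record.get("salaryPeriod"))
    # Keep only the bounds the source actually provided (null becomes None).
    bounds = [b for b in (record.get("salaryMin"), record.get("salaryMax"))
              if b is not None]
    if not bounds or factor is None:
        return None  # no usable salary data, e.g. a linkedin row
    return sum(bounds) / len(bounds) * factor
```

When only one bound is present, the sketch uses that bound as the midpoint. Amounts are left in the record's own `salaryCurrency`; no currency conversion is attempted.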

How the bundle works

This is a bundle Actor: one endpoint that runs every job-board scraper listed below in-process — their code ships inside this Actor, so no child actor runs are launched and no per-source fees stack on top. You pay this bundle's pay-per-event pricing only. Sources run concurrently, every record is mapped onto one flat schema with salary/remote/seniority normalized uniformly, and results are deduped by URL across boards. You can restrict the run to a subset with the sources input. Each source fails open independently: if one board errors or times out, the others still return.
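The pattern described above (concurrent sources, fail-open per source, URL dedupe across boards) can be sketched roughly as follows. This is not the Actor's actual code: `run_bundle` and the three demo boards are hypothetical stand-ins, and for simplicity a timed-out source contributes nothing here rather than its partial results.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Fetcher = Callable[[], Awaitable[List[dict]]]


async def run_bundle(fetchers: Dict[str, Fetcher], timeout_secs: float) -> List[dict]:
    async def guarded(fetch: Fetcher) -> List[dict]:
        try:
            # Each source gets its own timeout, not a shared budget.
            return await asyncio.wait_for(fetch(), timeout=timeout_secs)
        except Exception:
            # Fail open: a broken or slow board yields no records.
            return []

    # Run all sources concurrently; gather keeps the input order.
    batches = await asyncio.gather(*(guarded(f) for f in fetchers.values()))

    seen, merged = set(), []
    for batch in batches:
        for record in batch:
            if record["url"] not in seen:  # first board to deliver a URL wins
                seen.add(record["url"])
                merged.append(record)
    return merged


# Hypothetical demo boards: two overlap on one URL, one always fails.
async def board_a() -> List[dict]:
    return [{"source": "a", "url": "https://example.com/job/1"}]


async def board_b() -> List[dict]:
    return [{"source": "b", "url": "https://example.com/job/1"},
            {"source": "b", "url": "https://example.com/job/2"}]


async def board_down() -> List[dict]:
    raise RuntimeError("board unavailable")


merged = asyncio.run(run_bundle(
    {"a": board_a, "b": board_b, "down": board_down}, timeout_secs=5))
```

In the demo, the failing board does not affect the other two, and the posting both boards share appears once in `merged`.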

Delta mode — only new jobs since last run

Turn on onlyNewSinceLastRun for a scheduled alert bot. The bundle remembers every listing URL it has delivered (in a private key-value store on your own Apify account) and, on later runs, drops — and does not charge its per-result fee for — anything you already received. The first run returns everything; subsequent runs surface only fresh roles. Leave it off for a full snapshot each run.
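Conceptually, delta mode is a set-membership filter on the listing URL. The sketch below illustrates the idea with an in-memory set standing in for the persisted key-value store; the function name `only_new` is hypothetical and the Actor's real storage format is not documented here.

```python
from typing import Iterable, List, Set


def only_new(records: Iterable[dict], delivered_urls: Set[str]) -> List[dict]:
    """Return records whose URL was not delivered before, and remember them."""
    fresh = []
    for record in records:
        url = record["url"]
        if url in delivered_urls:
            continue  # already delivered on an earlier run: dropped, not billed
        delivered_urls.add(url)
        fresh.append(record)
    return fresh


# Two simulated runs sharing one store of delivered URLs.
delivered: Set[str] = set()
first_run = only_new([{"url": "https://example.com/a"},
                      {"url": "https://example.com/b"}], delivered)
second_run = only_new([{"url": "https://example.com/b"},
                       {"url": "https://example.com/c"}], delivered)
```

The first call returns both records; the second returns only the record for `https://example.com/c`, matching the "first run returns everything" behaviour described above.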

How to scrape machine learning jobs with this Actor

Click Try for free / Run — no login to the target site, no cookies, no proxies to configure.
Adjust the input (sources, keyword, maxItems) or keep the defaults.
Run it and export the dataset as JSON, CSV or Excel, or read it over the API.

Run it from your own code:

from apify_client import ApifyClient

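# Authenticate with your personal Apify API token.
client = ApifyClient("<YOUR_APIFY_TOKEN>")
# Start the Actor and block until the run finishes.
run = client.actor("nomad-agent/ml-ai-dev-bundle").call(run_input={"maxItems": 50})
# Stream the merged, deduped records from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():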
    print(item["title"], "—", item["company"], item["url"])

Or a single HTTP call that runs the Actor and returns items in one response:

curl -X POST \
  "https://api.apify.com/v2/acts/nomad-agent~ml-ai-dev-bundle/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"maxItems": 50}'

Input

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| `sources` | array | `["linkedin", "ai_jobs_net", "hackernews", "ycombinator_was", "builtin", "remote_boards", "wttj", "justjoinit"]` | Which boards to include. Leave empty to run the full default set. All sources run in-process, so there are no per-source fees. |
| `keyword` | string | `""` | Optional free-text filter forwarded to sources that support it (others ignore it). |
| `onlyNewSinceLastRun` | boolean | `false` | Delta mode: return (and bill) only listings not delivered on a previous run with this flag on. See "Delta mode" above. |
| `maxItemsPerSource` | integer | `36` | Cap on items fetched from EACH board before merge. |
| `maxItems` | integer | `288` | Hard cap on the merged, deduped output. Default is sources × `maxItemsPerSource` (the zero-config ceiling). Set `0` for no cap. |
| `cacheTtlSeconds` | integer | `1800` | How long to reuse results already fetched from a source instead of re-fetching. `0` = always fetch fresh. |
| `concurrency` | integer | `6` | How many boards to fetch in parallel. (Advanced) |
| `runTimeoutSecs` | integer | `240` | How long to give each source before returning what it has collected so far. Sources run in parallel, so this is a per-source ceiling, not a budget shared across them. (Advanced) |
| `apifyToken` | string (secret) | `""` | Leave empty; it is injected automatically on the Apify platform. Only set for local runs outside the platform. (Advanced) |
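As an illustration of how these inputs fit together, the sketch below builds a `run_input` dict for a narrowed run. The helper `build_run_input` is hypothetical. It sets `maxItems` explicitly to sources × `maxItemsPerSource`, because the table does not say whether the Actor recomputes that ceiling when `sources` is narrowed.

```python
from typing import List, Optional

# Source identifiers taken from the `sources` default in the table above.
DEFAULT_SOURCES = ["linkedin", "ai_jobs_net", "hackernews", "ycombinator_was",
                   "builtin", "remote_boards", "wttj", "justjoinit"]


def build_run_input(sources: Optional[List[str]] = None,
                    keyword: str = "",
                    max_items_per_source: int = 36,
                    only_new: bool = False) -> dict:
    """Assemble an input dict, rejecting source names the table does not list."""
    chosen = list(sources) if sources else list(DEFAULT_SOURCES)
    unknown = sorted(set(chosen) - set(DEFAULT_SOURCES))
    if unknown:
        raise ValueError(f"unknown sources: {unknown}")
    return {
        "sources": chosen,
        "keyword": keyword,
        "onlyNewSinceLastRun": only_new,
        "maxItemsPerSource": max_items_per_source,
        # Explicit ceiling: number of boards times the per-board cap.
        "maxItems": len(chosen) * max_items_per_source,
    }


# Example: a daily alert limited to two boards, 10 items each.
alert_input = build_run_input(["ai_jobs_net", "wttj"], keyword="mlops",
                              max_items_per_source=10, only_new=True)
```

The resulting dict can be passed as `run_input` to the client call shown earlier, or sent as the JSON body of the HTTP request.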

Output example

{
  "source": "ai_jobs_net",
  "id": "200475",
  "title": "Machine Learning Engineer",
  "company": "Hugging Face",
  "location": "Remote",
  "url": "https://aijobs.net/job/machine-learning-engineer-remote-200475/",
  "postedAt": "2026-06-28",
  "snippet": "We're hiring an ML engineer to work on...",
  "description": "We're hiring an ML engineer to work on our open-source model tooling. You will...",
  "salary": "$140,000–$200,000/yr",
  "salaryMin": 140000,
  "salaryMax": 200000,
  "salaryCurrency": "USD",
  "salaryPeriod": "year",
  "isRemote": true,
  "remoteType": "remote",
  "seniority": "Senior",
  "employmentType": "Full-time"
}

Every record from every board has this same shape. Fields the source doesn't provide come back as "" (strings) or null (numbers/booleans). For example, linkedin rows, and hackernews rows whose post names no pay figure, have empty salary fields.

Pricing

Pay per event: $0.01 per Actor start and $0.003 per job returned ($3 per 1,000 jobs). That is the whole bill — every source runs inside this Actor, so there are no child-actor fees on top.

Zero-config run estimate (defaults, all 8 sources): up to 288 merged items (8 sources × 36 per source) for roughly $0.87 all-in ($0.01 start + 288 × $0.003 = $0.874). Real runs usually cost less: not every board returns the full cap, cross-board duplicates are billed once, and delta mode never re-bills a listing you already received.
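The pricing arithmetic is simple enough to script when budgeting scheduled runs. A minimal sketch, using only the two event prices quoted above; the function name `estimate_cost` and the monthly scenario are hypothetical.

```python
START_FEE_USD = 0.01     # per Actor start
PER_JOB_FEE_USD = 0.003  # per job returned


def estimate_cost(jobs_returned: int, runs: int = 1) -> float:
    """Total pay-per-event cost in USD for the given number of runs and jobs."""
    return round(runs * START_FEE_USD + jobs_returned * PER_JOB_FEE_USD, 4)


# Worst case for one zero-config run: 8 sources x 36 items = 288 jobs.
zero_config = estimate_cost(288)  # 0.874

# Hypothetical month of daily delta runs returning 20 new jobs each.
monthly_delta = estimate_cost(jobs_returned=30 * 20, runs=30)
```

The monthly figure assumes 20 fresh listings per day, which is an illustrative number rather than a measured one.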

Use cases

AI-specialist job boards
ML-engineer alert bots
AI-talent market research
Recruiting pipelines for data/ML teams

FAQ

Is it legal to scrape machine learning jobs? This Actor reads only publicly available job postings — data any visitor can see without logging in. No personal data behind authentication is touched. Review the target site's terms and your local regulations for your specific use case.

Do I need an account on the target site? No. Postings are fetched from public pages/APIs — no login, cookies or session tokens.

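How fresh is the data? Each run fetches live listings unless a source's results were already fetched within the last cacheTtlSeconds (default 1800 seconds, i.e. 30 minutes), in which case the cached results are reused. Set it to 0 to always hit the source live.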

How many jobs can I get? maxItems caps the run (set 0 for no cap). Most sources paginate from newest to oldest.

Something broken or missing? Open an issue on the Actor's Issues tab — it is monitored and reliability fixes ship fast.

Integrations

Export the dataset as JSON, CSV or Excel, or read it straight from the Apify API. Works out of the box with Make, Zapier and n8n via their Apify integrations, can be called synchronously with run-sync-get-dataset-items from any backend, and is usable by AI agents through the Apify MCP server.

Is this Actor useful to you? A quick ⭐ review on the Actor's Reviews tab helps other AI/ML and developer job seekers find it — and tells us what to build next.

From the maker of Oink — an open-source, AI-powered job-search bot for Telegram that runs on these Actors. Try the free bot, get a managed instance at oinkjobsearch.com, or browse the full catalog of 50+ Actors.
