AI & ML Engineer Jobs Scraper — 8 Boards in One
Every strong AI/ML job source behind one endpoint: aijobs.net, LinkedIn, Hacker News Who-is-Hiring, Y Combinator, Built In, RemoteOK/Remotive/WeWorkRemotely, WTTJ and JustJoin.it. One run returns a merged, URL-deduped dataset of live ML, AI and data roles.
One call, eight sources: the Actor fans out to 8 job-source Actors tuned for AI/ML roles, then merges and dedupes their results into a single dataset.
What machine learning jobs data does this scraper extract?
Each result is one flat JSON record per job posting:
| Field | Meaning |
|---|---|
| title | Job title as posted |
| company | Hiring company / organisation |
| location | Location / duty station (may include remote hints) |
| url | Direct link to the posting |
| postedAt | Posting date where the source provides it |
| salary | Salary text where the source provides it |
| snippet | Short description excerpt |
| id | Stable source-side identifier |
How the bundle works
This is a bundle Actor: one endpoint that fans out to the individual job-source Actors listed below, runs them concurrently, maps every record onto one flat schema and dedupes by URL across boards. You can restrict the run to a subset with the sources input. Each child source is charged its own pay-per-event pricing on top of this bundle's — that is the cost of one-call breadth.
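The merge-and-dedupe step described above can be sketched as follows. This is a minimal illustration, not the Actor's actual code: `merge_and_dedupe` and `normalize_url` are hypothetical names, the trailing-slash normalisation is an assumption (the listing only says records are deduped by URL), and the concurrent fan-out to child Actors is left out.

```python
from typing import Iterable

# Field names taken from the output table and output example in this README.
FIELDS = ("source", "id", "title", "company", "location",
          "url", "postedAt", "salary", "snippet")


def normalize_url(url: str) -> str:
    """Assumed normalisation: trim whitespace and trailing slashes."""
    return url.strip().rstrip("/")


def merge_and_dedupe(batches: Iterable[Iterable[dict]], max_items: int = 0) -> list[dict]:
    """Flatten per-board batches and keep the first record seen for each URL."""
    seen: set[str] = set()
    merged: list[dict] = []
    for batch in batches:
        for record in batch:
            key = normalize_url(record.get("url") or "")
            if not key or key in seen:
                continue  # skip records without a URL and cross-board repeats
            seen.add(key)
            # Project onto the flat schema; fields a board lacks become None.
            merged.append({field: record.get(field) for field in FIELDS})
            if max_items and len(merged) >= max_items:
                return merged  # max_items == 0 means no cap
    return merged
```

Keeping the first record per URL means the order of the batches decides which board "wins" a duplicate; a real implementation might instead prefer the record with the most populated fields.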
How to scrape machine learning jobs with this Actor
- Click Try for free / Run — no login to the target site, no cookies, no proxies to configure.
- Adjust the input (keyword, filters, maxItems) or keep the defaults.
- Run it and export the dataset as JSON, CSV or Excel, or read it over the API.
Run it from your own code:

    from apify_client import ApifyClient

    client = ApifyClient("<YOUR_APIFY_TOKEN>")
    run = client.actor("nomad-jobs/ml-ai-dev-bundle").call(run_input={"maxItems": 50})
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item["title"], "—", item["company"], item["url"])
Or a single HTTP call that runs the Actor and returns items in one response:

    curl -X POST \
      "https://api.apify.com/v2/acts/nomad-jobs~ml-ai-dev-bundle/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
      -H "Content-Type: application/json" \
      -d '{"maxItems": 50}'
Input
| Field | Type | Default | Notes |
|---|---|---|---|
| sources | array | ["linkedin", "ai_jobs_net", "hackernews", "ycombinator_was", "builtin", "remote_boards", "wttj", "justjoinit"] | Which boards to include. Leave empty to query the full bundle. |
| keyword | string | "" | Optional free-text filter forwarded to children that support it (others ignore it). |
| maxItemsPerSource | integer | 36 | Cap on items fetched from EACH child board before merge. |
| maxItems | integer | 0 | Hard cap on the merged, deduped output. 0 = no cap. |
| cacheTtlSeconds | integer | 1800 | Forwarded to children: cache upstream fetch for this long. 0 disables. |
| concurrency | integer | 6 | How many child boards to run in parallel. |
| runTimeoutSecs | integer | 120 | Server-side cap on each child actor run. |
| apifyToken | string | "" | Only needed for LOCAL runs. On the platform the token is injected automatically. |
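As an illustration of how these fields combine, the sketch below assembles a `run_input` that restricts the run to two boards. `build_run_input` is a hypothetical helper, not part of the Actor or the Apify client; the field names and defaults are copied from the table above, and the client-side check for unknown source names is an added assumption.

```python
# Source identifiers copied from the default of the `sources` field.
ALL_SOURCES = ["linkedin", "ai_jobs_net", "hackernews", "ycombinator_was",
               "builtin", "remote_boards", "wttj", "justjoinit"]


def build_run_input(sources=None, keyword="", max_items_per_source=36, max_items=0):
    """Assemble a run_input dict from the documented fields (hypothetical helper)."""
    sources = list(sources or [])
    unknown = [s for s in sources if s not in ALL_SOURCES]
    if unknown:
        raise ValueError(f"unknown sources: {unknown}")
    return {
        "sources": sources,                    # empty list = full bundle
        "keyword": keyword,                    # ignored by children that lack search
        "maxItemsPerSource": max_items_per_source,
        "maxItems": max_items,                 # 0 = no cap on merged output
    }


run_input = build_run_input(
    sources=["ai_jobs_net", "hackernews"],
    keyword="computer vision",
    max_items=100,
)
```

The resulting dict can be passed as `run_input` to the Python client call shown earlier, or sent as the JSON body of the HTTP request.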
Output example

    {
      "source": "ai_jobs_net",
      "id": "200475",
      "title": "Machine Learning Engineer",
      "company": "Hugging Face",
      "location": "Remote",
      "url": "https://aijobs.net/job/machine-learning-engineer-remote-200475/",
      "postedAt": "2026-06-28",
      "salary": "$140K–$200K"
    }
Pricing
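Pay per event: $0.05 per Actor start and $0.004 per job returned, so 100 jobs cost about $0.45 ($0.05 + 100 × $0.004) at the bundle level. Child sources bill their own pay-per-event charges on top of this. No subscription, no rental; you pay only for what you fetch.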
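The bundle-level arithmetic can be checked with a short sketch. `estimate_bundle_cost` is a hypothetical helper using only the two prices quoted above; it deliberately leaves out the child-source charges, whose rates are not listed here.

```python
START_FEE_USD = 0.05      # per Actor start
PER_JOB_FEE_USD = 0.004   # per job returned


def estimate_bundle_cost(jobs_returned: int, runs: int = 1) -> float:
    """Bundle-level charge only; child-source events are billed separately."""
    if jobs_returned < 0 or runs < 1:
        raise ValueError("jobs_returned must be >= 0 and runs >= 1")
    return round(runs * START_FEE_USD + jobs_returned * PER_JOB_FEE_USD, 4)


print(estimate_bundle_cost(100))  # 0.45
```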
Use cases
- AI-specialist job boards
- ML-engineer alert bots
- AI-talent market research
- Recruiting pipelines for data/ML teams
FAQ
Is it legal to scrape machine learning jobs?
This Actor reads only publicly available job postings: data any visitor can see without logging in. No personal data behind authentication is touched. Review the target site's terms and your local regulations for your specific use case.
Do I need an account on the target site?
No. Postings are fetched from public pages/APIs, with no login, cookies or session tokens.
How fresh is the data?
Every run fetches live listings unless an upstream response cached within the last cacheTtlSeconds is available (default 1800 s = 30 min). Set it to 0 to always hit the source live.
How many jobs can I get?
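maxItems caps the merged, deduped output (0 = no cap), and maxItemsPerSource caps each board before the merge (default 36). With the defaults and all eight boards, a run therefore fetches at most 8 × 36 = 288 postings before deduplication. Most sources paginate from newest to oldest.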
Something broken or missing?
Open an issue on the Actor's Issues tab. It is monitored and reliability fixes ship fast.
Related Actors
- AI Jobs Scraper (aijobs.net) — ML & Data Roles
- LinkedIn Jobs Scraper — No Login, No Cookies
- Hacker News Who Is Hiring Scraper — HN Jobs
- Y Combinator Jobs Scraper — Work at a Startup
- Built In Jobs Scraper — US Tech & Startup Jobs
- Remote Jobs Scraper — RemoteOK Remotive WWR
- Welcome to the Jungle Jobs Scraper (WTTJ)
- JustJoin.it Jobs Scraper — Polish Tech & IT Jobs