# AI Jobs Scraper (aijobs.net) — ML & Data Roles (`nomad-agent/ai-jobs-net-scraper`) Actor

Extract curated AI and machine-learning vacancies from aijobs.net: ML engineer, data scientist, MLOps, research scientist and more. Records include title, company, location, salary band, seniority and apply URL. The cleanest single source for AI hiring data.

- **URL**: https://apify.com/nomad-agent/ai-jobs-net-scraper.md
- **Developed by:** [Nomad.Dev](https://apify.com/nomad-agent) (community)
- **Categories:** Jobs
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## AI Jobs Scraper (aijobs.net) — ML & Data Roles

Fetch curated AI / ML / data-science openings from aijobs.net with salary bands (where disclosed) and seniority levels.

### What AI jobs data does this scraper extract?

Each result is one flat JSON record per job posting:

| Field | Meaning |
|---|---|
| `id` | Stable source-side identifier |
| `slug` | URL slug segment of the posting |
| `title` | Job title as posted |
| `company` | Hiring company / organisation. **`null` by default** — the listing page never shows it, only each job's own detail page does. Set `includeCompany: true` to fetch it (one extra request per job). |
| `location` | Location / duty station (may include remote hints), or `null` |
| `url` | Direct link to the posting |
| `postedAt` | Relative posting age as shown by the source (e.g. `"5d ago"`), or `null` |
| `seniority` | Experience-level badge from the card: `Entry-level`, `Mid-level`, `Senior-level` or `Executive-level`, or `null` |
| `snippet` | Short description excerpt (from the listing card, ~400 chars) |
| `description` | Fuller job-description text (Tasks / Perks / Skills / Education / Roles) assembled from the job's detail page, up to 5000 chars. Only populated when `includeCompany: true` fetches that page (same request, no extra cost); `null` otherwise |
| `salary` | Raw salary-band text, e.g. `"USD 80K-160K"`. Only ~half of postings disclose one; `null` otherwise |
| `salaryMin` / `salaryMax` | Parsed band bounds (whole currency units, e.g. `80000`), or `null` |
| `salaryCurrency` | 3-letter currency code parsed from the badge, or `null` |
| `salaryPeriod` | Pay period (year/month/hour) — always `null`; the source never labels which period the band covers |

### How to scrape AI jobs with this Actor

1. Click **Try for free** / **Run** — no login to the target site, no cookies, no proxies to configure.
2. Adjust the input (`maxItems`, `euBias`, `includeCompany`) or keep the defaults.
3. Run it and export the dataset as JSON, CSV or Excel, or read it over the [API](https://docs.apify.com/api/v2).

Run it from your own code:

```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")
run = client.actor("nomad-agent/ai-jobs-net-scraper").call(run_input={"maxItems": 50, "includeCompany": True})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["title"], "—", item["company"], item["url"])
````

Or a single HTTP call that runs the Actor and returns items in one response:

```bash
curl -X POST \
  "https://api.apify.com/v2/acts/nomad-agent~ai-jobs-net-scraper/run-sync-get-dataset-items?token=<YOUR_APIFY_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"maxItems": 50}'
```
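
If you prefer not to install a client library, the same synchronous call can be sketched with the Python standard library alone. The helper names (`build_sync_url`, `build_request`, `fetch_jobs`) and the 300-second timeout are assumptions for illustration; only the endpoint path and the `token` query parameter come from the curl example above.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "nomad-agent~ai-jobs-net-scraper"  # "~" replaces "/" in API paths


def build_sync_url(token: str) -> str:
    """Return the run-sync-get-dataset-items URL with the token as a query parameter."""
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request carrying the Actor input as a JSON body."""
    return urllib.request.Request(
        build_sync_url(token),
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_jobs(token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Run the Actor and return its dataset items (performs a real network call)."""
    with urllib.request.urlopen(build_request(token, run_input), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_jobs("<YOUR_APIFY_TOKEN>", {"maxItems": 50})` would then return the list of job records.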

### Input

| Field | Type | Default | Notes |
|---|---|---|---|
| `maxItems` | integer | `50` | Maximum number of job listings to return. aijobs.net's listing is a single, non-paginated page holding roughly 40-50 cards — this Actor cannot structurally return more than that page has, however high you set this. Set `0` to return every card found. |
| `euBias` | boolean | `false` | When enabled, listings whose location or title hints at a European country or remote-EU are sorted to the top. Non-EU jobs are still returned; they just appear later. |
| `includeCompany` | boolean | `false` | The listing page never shows the company name — only the per-job detail page does. Enable this to fetch that page (one extra request per job) and populate `company` **and** `description`. Off by default, in which case both stay `null`. |
| `cacheTtlSeconds` | integer | `1800` | *(Advanced)* Cache the upstream fetch(es) in the key-value store for this many seconds; re-runs within the window skip the network call. Set `0` to disable. |

### Output example

Default run (`includeCompany: false`):

```json
{
  "id": "200475",
  "slug": "competitive-coder-remote",
  "title": "Competitive Coder",
  "company": null,
  "location": "Remote",
  "url": "https://aijobs.net/job/competitive-coder-remote-200475/",
  "postedAt": "5d ago",
  "seniority": "Entry-level",
  "snippet": "Competitive Coder USD 80K-160K C plus plus | Codeforces | Competitive programming ...",
  "description": null,
  "salary": "USD 80K-160K",
  "salaryMin": 80000,
  "salaryMax": 160000,
  "salaryCurrency": "USD",
  "salaryPeriod": null
}
```

With `includeCompany: true`, `company` and `description` are both populated (e.g. `company: "micro1"`, `description: "Tasks: ... | Perks/Benefits: ... | Skills/Tech-stack: ... | Education: ... | Roles: ..."`) at the cost of one extra HTTP request per job — the same detail-page fetch fills both fields. Roughly half of postings don't disclose a salary band at all; on those, `salary`, `salaryMin`, `salaryMax` and `salaryCurrency` are all `null`.
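
Because the salary fields are `null` on roughly half of the records, downstream code should treat them as optional. The sketch below shows one way to do that; `salary_midpoint` and `disclosed_share` are hypothetical helper names, and the midpoint is only a rough summary of a band, not a figure the source publishes.

```python
from typing import Optional


def salary_midpoint(item: dict) -> Optional[float]:
    """Return the middle of the salary band, or None when the band is undisclosed."""
    low, high = item.get("salaryMin"), item.get("salaryMax")
    if low is None or high is None:
        return None
    return (low + high) / 2


def disclosed_share(items: list) -> float:
    """Fraction of records that carry a raw salary band."""
    if not items:
        return 0.0
    return sum(1 for item in items if item.get("salary") is not None) / len(items)


records = [
    {"title": "Competitive Coder", "salary": "USD 80K-160K", "salaryMin": 80000, "salaryMax": 160000},
    {"title": "Undisclosed role", "salary": None, "salaryMin": None, "salaryMax": None},
]
print([salary_midpoint(r) for r in records], disclosed_share(records))
```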

### Pricing

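Pay per event: **$0.05 per Actor start** and **$0.004 per job returned**.
A run that returns 50 jobs costs $0.05 + 50 × $0.004 = $0.25. Because one run returns at most the 40-50 cards on the listing page, 100 jobs take at least two starts (at least $0.50 in total). No subscription, no rental: you pay only for what you fetch. Enabling `includeCompany` adds an extra request per job but does not change the per-job price.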
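
The per-event prices above combine into a simple linear estimate. Here is a small sketch of that arithmetic; the function name is hypothetical, and prices are held as integer thousandths of a dollar to avoid floating-point drift.

```python
START_PRICE_MILLS = 50  # $0.05 per Actor start, in thousandths of a dollar
JOB_PRICE_MILLS = 4     # $0.004 per job returned


def estimate_cost_usd(jobs: int, starts: int = 1) -> float:
    """Estimated charge in USD for `jobs` returned records across `starts` runs."""
    if jobs < 0 or starts < 0:
        raise ValueError("jobs and starts must be non-negative")
    return (starts * START_PRICE_MILLS + jobs * JOB_PRICE_MILLS) / 1000


print(estimate_cost_usd(50))             # one run returning 50 jobs
print(estimate_cost_usd(100, starts=2))  # 100 jobs spread over two runs
```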

### Use cases

- AI-specialist job boards and newsletters
- Tracking ML-engineer demand and salaries
- Sourcing AI talent pipelines
- Feeding the ml-ai-dev-bundle with anchor data

### FAQ

**Is it legal to scrape AI jobs?**
This Actor reads only publicly available job postings — data any visitor can see without logging in. No personal data behind authentication is touched. Review the target site's terms and your local regulations for your specific use case.

**Do I need an account on the target site?**
No. Postings are fetched from public pages/APIs — no login, cookies or session tokens.

**How fresh is the data?**
A run fetches live listings unless a cached copy younger than `cacheTtlSeconds` exists (default `1800`, i.e. 30 minutes), in which case it reuses that copy. Set `0` to always hit the source live.
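
The caching behaviour can be pictured with a small time-to-live cache. The sketch below is an in-memory stand-in, not the Actor's implementation (which, per the input notes, uses the key-value store); the `TtlCache` class, `fetch_with_cache` function and injectable clock are all assumptions made so the logic is easy to follow and test.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """In-memory stand-in for a cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: int, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        if self.ttl_seconds <= 0:
            return None  # a TTL of 0 disables caching
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self.ttl_seconds:
            del self._entries[key]  # expired: force a fresh fetch
            return None
        return value

    def put(self, key: str, value: str) -> None:
        if self.ttl_seconds > 0:
            self._entries[key] = (self._clock(), value)


def fetch_with_cache(url: str, cache: TtlCache, fetch: Callable[[str], str]) -> str:
    """Return the cached body for url, calling fetch only on a miss or expiry."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    body = fetch(url)
    cache.put(url, body)
    return body
```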

**How many jobs can I get?**
`maxItems` caps the run, up to whatever the aijobs.net homepage currently holds — typically 40-50 listings. This is a single, non-paginated page, so that's the hard ceiling; there's no "page 2" to crawl for more.
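
In other words, `maxItems` can only trim the page, never extend it. A minimal sketch of that rule, with a hypothetical `cap_items` helper standing in for whatever the Actor does internally:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def cap_items(cards: Sequence[T], max_items: int) -> List[T]:
    """Apply the maxItems rule: 0 keeps every card, otherwise keep at most max_items."""
    if max_items < 0:
        raise ValueError("max_items must be 0 or positive")
    if max_items == 0:
        return list(cards)
    return list(cards[:max_items])


page = [f"card-{n}" for n in range(45)]  # pretend the listing page holds 45 cards
print(len(cap_items(page, 10)), len(cap_items(page, 60)), len(cap_items(page, 0)))
```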

**Why is `company` empty?**
By default this Actor doesn't fetch it — the listing page never includes the company name, only each job's own detail page does. Set `includeCompany: true` to fetch it (one extra request per job).

**Something broken or missing?**
Open an issue on the Actor's **Issues** tab — it is monitored and reliability fixes ship fast.

### Integrations

Export the dataset as JSON, CSV, Excel/XLSX, or plug it straight into **Make**, **Zapier** or **n8n** via the Apify integrations. For one-shot pulls, call [`run-sync-get-dataset-items`](https://docs.apify.com/api/v2#/reference/actors/run-actor-synchronously-and-get-dataset-items) to run the Actor and get items back in a single HTTP response, or drive it through the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp) from any MCP-compatible agent.
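
The platform exports CSV for you, but if you already hold the items in memory (for example from the Python client), a local conversion is short. This sketch uses only the standard library; the `items_to_csv` name is hypothetical, and the column list simply mirrors the output fields documented above.

```python
import csv
import io
from typing import Iterable

FIELDS = [
    "id", "slug", "title", "company", "location", "url", "postedAt", "seniority",
    "snippet", "description", "salary", "salaryMin", "salaryMax", "salaryCurrency",
    "salaryPeriod",
]


def items_to_csv(items: Iterable[dict]) -> str:
    """Serialise job records to CSV text; null or missing fields become empty cells."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for item in items:
        writer.writerow({field: item.get(field) for field in FIELDS})
    return buffer.getvalue()


sample = {"id": "200475", "title": "Competitive Coder", "company": None, "salaryMin": 80000}
print(items_to_csv([sample]))
```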

### Related Actors

- [AI & ML Engineer Jobs Scraper — 8 Boards in One](https://apify.com/nomad-agent/ml-ai-dev-bundle)
- [Hacker News Who Is Hiring Scraper — HN Jobs](https://apify.com/nomad-agent/hackernews-scraper)
- [LinkedIn Jobs Scraper — No Login, No Cookies](https://apify.com/nomad-agent/linkedin-scraper)
- [Built In Jobs Scraper — US Tech & Startup Jobs](https://apify.com/nomad-agent/builtin-scraper)

# Actor input Schema

## `maxItems` (type: `integer`):

Maximum number of job listings to return. aijobs.net's listing page is a single, non-paginated page that holds roughly 40-50 cards — this Actor structurally cannot return more than what that one page has, no matter how high you set this. Set 0 to return every card found on the page.

## `euBias` (type: `boolean`):

When enabled, listings whose location or title hints at a European country or remote-EU are sorted to the top. Non-EU jobs are still returned; they just appear later.
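
The "sorted to the top, nothing dropped" behaviour corresponds to a stable sort on a boolean key. The sketch below illustrates the idea only: the `EU_HINTS` keyword list and both function names are assumptions, and the Actor's real matching rules are not documented here.

```python
from typing import List

# Hypothetical hint list; the Actor's actual keywords are not published.
EU_HINTS = ("germany", "france", "spain", "netherlands", "remote-eu", "europe")


def looks_european(item: dict) -> bool:
    """True when the location or title contains one of the assumed EU hints."""
    text = f"{item.get('location') or ''} {item.get('title') or ''}".lower()
    return any(hint in text for hint in EU_HINTS)


def apply_eu_bias(items: List[dict]) -> List[dict]:
    """Move EU-hinted listings first; sorted() is stable, so relative order is kept."""
    return sorted(items, key=lambda item: not looks_european(item))


jobs = [
    {"title": "A", "location": "Remote"},
    {"title": "B", "location": "Berlin, Germany"},
    {"title": "C", "location": "New York"},
    {"title": "D (Remote-EU)", "location": None},
]
print([job["title"] for job in apply_eu_bias(jobs)])
```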

## `includeCompany` (type: `boolean`):

The listing page never shows the company name; it appears only on each job's own detail page. Enabling this fetches that detail page (one extra HTTP request per job) to fill in `company` and `description`; it increases run time and request count. Off by default, in which case both fields are returned as null.
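
The request overhead of this option is easy to estimate. The helper below is a hypothetical back-of-the-envelope calculation that assumes a cold cache, one request for the listing page, and no retries.

```python
def expected_requests(jobs: int, include_company: bool) -> int:
    """One listing request, plus one detail request per job when includeCompany is on."""
    if jobs < 0:
        raise ValueError("jobs must be non-negative")
    return 1 + (jobs if include_company else 0)


print(expected_requests(50, include_company=False))
print(expected_requests(50, include_company=True))
```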

## `cacheTtlSeconds` (type: `integer`):

Cache the upstream fetch in the key-value store for this many seconds; re-runs within the window skip the network call. Set 0 to disable.

## Actor input object example

```json
{
  "maxItems": 50,
  "euBias": false,
  "includeCompany": false,
  "cacheTtlSeconds": 1800
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("nomad-agent/ai-jobs-net-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("nomad-agent/ai-jobs-net-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{}' |
apify call nomad-agent/ai-jobs-net-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=nomad-agent/ai-jobs-net-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "AI Jobs Scraper (aijobs.net) — ML & Data Roles",
        "description": "Extract curated AI and machine-learning vacancies from aijobs.net: ML engineer, data scientist, MLOps, research scientist and more. Records include title, company, location, salary band, seniority and apply URL. The cleanest single source for AI hiring data.",
        "version": "0.1",
        "x-build-id": "HWENeg1Cl9haCcxuI"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/nomad-agent~ai-jobs-net-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-nomad-agent-ai-jobs-net-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/nomad-agent~ai-jobs-net-scraper/runs": {
            "post": {
                "operationId": "runs-sync-nomad-agent-ai-jobs-net-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/nomad-agent~ai-jobs-net-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-nomad-agent-ai-jobs-net-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 0,
                        "maximum": 60,
                        "type": "integer",
                        "description": "Maximum number of job listings to return. aijobs.net's listing page is a single, non-paginated page that holds roughly 40-50 cards — this Actor structurally cannot return more than what that one page has, no matter how high you set this. Set 0 to return every card found on the page.",
                        "default": 50
                    },
                    "euBias": {
                        "title": "EU / Europe bias",
                        "type": "boolean",
                        "description": "When enabled, listings whose location or title hints at a European country or remote-EU are sorted to the top. Non-EU jobs are still returned; they just appear later.",
                        "default": false
                    },
                    "includeCompany": {
                        "title": "Include company name",
                        "type": "boolean",
                        "description": "The listing page never shows the company name — it only appears on each job's own detail page. Enabling this fetches that detail page (one extra HTTP request per job) to fill in `company`; it increases run time and request count. Off by default, in which case `company` is returned as null.",
                        "default": false
                    },
                    "cacheTtlSeconds": {
                        "title": "Cache TTL (seconds)",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Cache the upstream fetch in the key-value store for this many seconds; re-runs within the window skip the network call. Set 0 to disable.",
                        "default": 1800
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
