# Arbeitsagentur Job Scraper — Salary, Contact & Details (`haketa/arbeitsagentur-scraper`) Actor

Scrape Germany's largest job board (arbeitsagentur.de). Extracts salary amount, employer email/phone, multi-location, home-office, occupation codes and full descriptions from the v6 API. Richer data than any competitor — no proxy or login needed.

- **URL**: https://apify.com/haketa/arbeitsagentur-scraper.md
- **Developed by:** [Haketa](https://apify.com/haketa) (community)
- **Categories:** Jobs, Lead generation
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $1.30 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
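
As a rough budgeting aid, the arithmetic behind the headline price can be sketched as follows. This assumes every dataset item is billed as one result event at the quoted starting rate of $1.30 per 1,000 results. The actual event names and per-plan prices are not listed here, so treat the output as an estimate only.

```python
from decimal import Decimal

# Assumption: one billed event per result, at the quoted "from" price.
PRICE_PER_1000_RESULTS = Decimal("1.30")


def estimate_cost_usd(result_count, price_per_1000=PRICE_PER_1000_RESULTS):
    """Return the estimated charge in USD for a given number of results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return (price_per_1000 * result_count / 1000).quantize(Decimal("0.01"))


print(estimate_cost_usd(100))     # 0.13
print(estimate_cost_usd(25_000))  # 32.50
```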

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Arbeitsagentur Jobs Scraper — Salary, Employer Contact & Full Job Details

Germany's largest job board at your fingertips. This Actor scrapes **Bundesagentur für Arbeit** (arbeitsagentur.de), the official German Federal Employment Agency portal, and returns structured, enriched job listings with **salary data, employer contact information, multi-location support, occupation classification, and full job descriptions**. Built on the **v6 REST API** for richer data than any competitor: exact salaries, home-office flags, career-changer suitability, and detailed work-time breakdowns.

No proxy. No browser. No login. The underlying API is publicly accessible — we just structure the data for you.

---

### Why This Actor?

#### Richer Than the Competition

Most arbeitsagentur.de scrapers use the older v4 API or basic HTML parsing. We use the **v6 search API + v4 detail enrichment**, giving you fields that no other Actor extracts:

- **`salaryAmount`** — exact salary figure (e.g. `13.90`) from the API's `festgehalt` field, not just text
- **`salaryPeriod`** — `STUNDENLOHN` / `MONATSLOHN` / `JAHRESLOHN` (hourly / monthly / yearly)
- **`salaryType`** — `FESTGEHALT` (fixed) vs `VERHANDLUNGSBASIS` (negotiable)
- **`isHomeOffice`** — dedicated remote-work boolean from the API
- **`careerChangeSuitable`** — `quereinstiegGeeignet` flag for career changers
- **`locations`** — full JSON array of all job locations, not just the first one
- **`contractDuration`** — `BEFRISTET` (fixed-term) / `UNBEFRISTET` (permanent)
- **Detailed work-time flags** — full-time, part-time (morning/afternoon/evening/flex split), shift/night/weekend
- **`allOccupations`** — all matched occupation classifications, not just the primary

#### Contact & Lead Generation

When detail enrichment is enabled (`includeDetails: true`), we mine each job description for:

- **Email addresses** — extracted with regex from the full description
- **Phone numbers** — German landline and mobile patterns (+49, 0049, 0...)
- **External URLs** — company career pages, application portals
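
To make the extraction step concrete, here is a minimal sketch of regex-based contact mining. The patterns and the helper names `first_match` and `extract_contacts` are illustrative assumptions. The expressions the Actor actually uses are not documented here and may be stricter or broader.

```python
import re

# Illustrative patterns only; the Actor's actual expressions are not published here.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Numbers starting with +49, 0049 or a national 0, with digits optionally
# separated by spaces, slashes or hyphens.
PHONE_RE = re.compile(r"(?<!\w)(?:\+49|0049|0)[\s/-]?\d(?:[\s/-]?\d){5,13}(?!\d)")
URL_RE = re.compile(r"https?://[^\s<>\"')]+")


def first_match(pattern, text):
    """Return the first match of `pattern` in `text`, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


def extract_contacts(description):
    """Return the first email, phone number and URL found in a description."""
    url = first_match(URL_RE, description)
    return {
        "extractedEmail": first_match(EMAIL_RE, description),
        "extractedPhone": first_match(PHONE_RE, description),
        # Drop sentence punctuation that trails the URL.
        "extractedUrl": url.rstrip(".,;:") if url else None,
    }


sample = "Kontakt: jobs@example.com, Tel. +49 30 1234567, https://example.com/karriere."
print(extract_contacts(sample))
```

Returning only the first match per field mirrors the single-valued `extractedEmail`, `extractedPhone`, and `extractedUrl` output fields.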

#### Requirements Parsing

We automatically detect:

- **Education level** — PhD/Doctorate, Master/Diplom, Bachelor, Ausbildung (vocational), Quereinstieg
- **Language requirements** — Deutsch and/or Englisch with CEFR levels (A1–C2)
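
The language detection described above could work roughly like the sketch below. It looks for a language name and then for a CEFR level within a short window after it. The window size, the function name, and the dictionary return shape are assumptions for illustration; the Actor itself emits a formatted string in `languageRequirements`.

```python
import re

LANGUAGES = ("Deutsch", "Englisch")
# A CEFR level such as B2 or C1, matched as a standalone token.
LEVEL_RE = re.compile(r"\b([ABC][12])\b")


def detect_language_requirements(description, window=40):
    """Map each mentioned language to its CEFR level, or None if no level is given."""
    found = {}
    for language in LANGUAGES:
        match = re.search(language, description, flags=re.IGNORECASE)
        if not match:
            continue
        # Only look at the text shortly after the language name.
        tail = description[match.end():match.end() + window]
        level = LEVEL_RE.search(tail)
        found[language] = level.group(1) if level else None
    return found


text = "Deutschkenntnisse auf Niveau B2 erforderlich, Englisch von Vorteil."
print(detect_language_requirements(text))
```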

#### Multi-Location Jobs

Some jobs are at multiple sites. We output a full `locations` JSON array with city, postal code, street, region, latitude, and longitude for each site, rather than only the primary location as other Actors do.
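
Because `locations` is delivered as a JSON-encoded string, consumers need to decode it before use. A minimal sketch follows, assuming the key names shown in the Example Output (`city`, `postalCode`, `region`, `lat`, `lng`); `parse_locations` is a hypothetical helper, not part of the Actor.

```python
import json


def parse_locations(record):
    """Decode the `locations` JSON string and convert coordinates to floats."""
    raw = record.get("locations") or "[]"
    sites = json.loads(raw)
    for site in sites:
        for key in ("lat", "lng"):
            if site.get(key) not in (None, ""):
                site[key] = float(site[key])
    return sites


record = {
    "locations": '[{"city":"Berlin","postalCode":"12167","region":"BERLIN",'
                 '"lat":"52.449107911001","lng":"13.333178738"}]'
}
for site in parse_locations(record):
    print(site["city"], site["postalCode"], site["lat"], site["lng"])
```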

---

### Data Fields

Every job record includes these fields:

| Field | Description | Source |
|-------|-------------|--------|
| `referenceNumber` | BA reference number (e.g. `12016-10004701286-S`) | v6 API |
| `title` | Job title in German | v6 API |
| `occupation` | Primary occupation classification (Hauptberuf) | v6 API |
| `allOccupations` | All matched occupation codes | v6 API |
| `offerType` | `ARBEIT` / `AUSBILDUNG` / `SELBSTAENDIGKEIT` / `PRAKTIKUM` | v6 API |
| `employer` | Company / organization name | v6 API |
| `employerHash` | Hash for fetching the employer logo | v6 API |
| `employerLogo` | Logo image URL (if available) | logo endpoint |
| `isPrivateAgency` | Posted by a private recruitment agency | detail API |
| `isTempStaffing` | Temporary staffing / Arbeitnehmerüberlassung | detail API |
| `allianzPartner` | Partner network name | detail API |
| `allianzPartnerUrl` | Partner network URL | detail API |
| `locations` | JSON array of all job sites (city, postalCode, street, region, lat, lng) | detail API |
| `city` | Primary city | v6 API |
| `postalCode` | Primary postal code (PLZ) | v6 API |
| `street` | Primary street address | v6 API |
| `region` | Federal state / region | v6 API |
| `country` | Country (`DEUTSCHLAND`) | v6 API |
| `latitude` | GPS latitude | v6 API |
| `longitude` | GPS longitude | v6 API |
| `distanceKm` | Distance from search location | v6 API |
| `salary` | Formatted salary display (e.g. `13,90 €/Std.`) | built |
| `salaryAmount` | Exact salary figure from API | v6 API |
| `salaryType` | `FESTGEHALT` / `VERHANDLUNGSBASIS` | v6 API |
| `salaryPeriod` | `STUNDENLOHN` / `MONATSLOHN` / `JAHRESLOHN` | v6 API |
| `isFullTime` | Full-time position | v6 API |
| `isPartTime` | Part-time options available | v6 API |
| `isHomeOffice` | Remote / home-office possible | v6 API |
| `isMiniJob` | Mini-job (geringfügige Beschäftigung) | v6 API |
| `isShiftWork` | Shift / night / weekend work | v6 API |
| `contractDuration` | `BEFRISTET` / `UNBEFRISTET` / `KEINE_ANGABE` | v6 API |
| `careerChangeSuitable` | Suitable for career changers (Quereinsteiger) | v6 API |
| `disabilityFriendly` | Suitable for people with disabilities | detail API |
| `datePosted` | First publication date (YYYY-MM-DD) | v6 API |
| `dateModified` | Last modification timestamp (ISO 8601) | v6 API |
| `startDate` | Earliest start date (YYYY-MM-DD) | v6 API |
| `description` | Full job description (text) | detail API |
| `extractedEmail` | Email found in description | extracted |
| `extractedPhone` | Phone number found in description | extracted |
| `extractedUrl` | External URL found in description | extracted |
| `educationRequired` | Education level (PhD/Master/Bachelor/Ausbildung) | extracted |
| `languageRequirements` | Language requirements (Deutsch/Englisch + CEFR level) | extracted |
| `externalUrl` | External application URL (externeURL) | v6 API |
| `portalUrl` | Job page on arbeitsagentur.de | built |
| `searchKeyword` | Keyword used for this search | input |
| `searchLocation` | Location used for this search | input |
| `scrapedAt` | ISO scrape timestamp | runtime |

---

### Input Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `keyword` | string | — | Job title or keyword, e.g. `"Softwareentwickler"`, `"Pflegefachkraft"`, `"Marketing Manager"` |
| `location` | string | — | City, region or postal code, e.g. `"Berlin"`, `"München"`, `"10115"` |
| `radius` | integer | 25 | Search radius in km (0–200). 0 = exact location match only |
| `employer` | string | — | Search by employer name, e.g. `"Siemens AG"`, `"Deutsche Bahn"` |
| `startUrls` | array | — | Paste arbeitsagentur.de search or detail URLs |
| `angebotsart` | select | All | Offer type: Arbeit / Selbstständigkeit / Ausbildung / Praktikum |
| `arbeitszeit` | select | All | Working time: Vollzeit / Teilzeit / Homeoffice / Minijob / Schicht |
| `befristung` | select | All | Contract: Befristet (fixed-term) / Unbefristet (permanent) |
| `behinderung` | boolean | false | Disability-friendly jobs only |
| `publishedSince` | integer | 0 | Only jobs from the last N days (1 = today, 7 = last week, 30 = last month). 0 = all time |
| `includeDetails` | boolean | true | Fetch full description, multi-location, contacts. Turn off for speed |
| `maxItems` | integer | 100 | Max jobs to scrape. 0 = no limit |
| `maxPagesPerSearch` | integer | 10 | Max result pages per query (100 jobs/page). 0 = no limit |
| `requestDelay` | integer | 100 | Delay between API requests in ms |
| `maxConcurrency` | integer | 10 | Parallel detail requests |
| `language` | select | original | Output language: original German or German + English |

#### Search Modes

**1. Keyword + Location (default):**
```json
{
    "keyword": "Softwareentwickler",
    "location": "Berlin",
    "radius": 25
}
````

**2. Employer-focused:**

```json
{
    "employer": "Siemens AG",
    "location": "München"
}
```

**3. Start URLs (advanced filters):**

```json
{
    "startUrls": [
        "https://www.arbeitsagentur.de/jobsuche/suche?was=Ingenieur&wo=München&umkreis=50&arbeitszeit=vz"
    ]
}
```

**4. Direct job detail URLs:**

```json
{
    "startUrls": [
        "https://www.arbeitsagentur.de/jobsuche/jobdetail/12016-10004701286-S"
    ]
}
```
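
When several pre-filtered searches are needed, the `startUrls` list can be generated rather than pasted by hand. The sketch below reuses only the query parameters visible in the example URL above (`was`, `wo`, `umkreis`, `arbeitszeit`). The helper `build_search_url` is hypothetical, and filter values other than `vz` are not documented here.

```python
from typing import Optional
from urllib.parse import urlencode

SEARCH_BASE = "https://www.arbeitsagentur.de/jobsuche/suche"


def build_search_url(was: str, wo: str = "", umkreis: Optional[int] = None, **extra: str) -> str:
    """Build a search URL; `extra` carries further filters such as arbeitszeit."""
    params = {"was": was}
    if wo:
        params["wo"] = wo
    if umkreis is not None:
        params["umkreis"] = str(umkreis)
    params.update(extra)
    # urlencode percent-encodes non-ASCII characters such as "ü".
    return f"{SEARCH_BASE}?{urlencode(params)}"


run_input = {
    "startUrls": [
        build_search_url("Ingenieur", "München", 50, arbeitszeit="vz"),
        build_search_url("Pflegefachkraft", "Hamburg", 25),
    ]
}
print(run_input)
```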

***

### Use Cases

#### 1. Recruitment & Talent Sourcing

Identify active hiring companies, extract HR contact emails and phone numbers, and build targeted outreach lists. The `employer` search mode lets you monitor specific companies' hiring patterns.

#### 2. Labour Market Intelligence

Track demand by occupation (`occupation` field), region, and time. Analyze which `region` (Bundesland) is hiring for which roles. Use `publishedSince` to see only fresh postings.

#### 3. Salary Benchmarking

The `salaryAmount` + `salaryPeriod` fields give you exact, structured salary data for compensation benchmarking across occupations and regions. Build salary heatmaps with `latitude`/`longitude` coordinates.
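
Comparing salaries across listings requires putting hourly, monthly, and yearly figures on one scale. The following sketch converts them to an approximate yearly amount. The conversion factors (40 hours per week, 52 weeks, 12 months) are assumptions chosen for illustration and will not match every contract.

```python
from typing import Optional

# Assumptions for illustration: 40 working hours per week, 52 weeks, 12 months.
HOURS_PER_YEAR = 40 * 52
PERIOD_TO_YEARLY = {
    "STUNDENLOHN": HOURS_PER_YEAR,
    "MONATSLOHN": 12,
    "JAHRESLOHN": 1,
}


def yearly_salary(record: dict) -> Optional[float]:
    """Convert salaryAmount to an approximate yearly figure, or None if unavailable."""
    amount = record.get("salaryAmount")
    factor = PERIOD_TO_YEARLY.get(record.get("salaryPeriod"))
    if not amount or factor is None:
        return None
    return round(float(amount) * factor, 2)


print(yearly_salary({"salaryAmount": "13.9", "salaryPeriod": "STUNDENLOHN"}))
```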

#### 4. Lead Generation for B2B Services

Companies posting jobs are growing — they need office space, IT services, recruitment software, training, and more. Extract `employer` names, locations, and `extractedEmail`/`extractedPhone` for sales prospecting.

#### 5. Competitive Intelligence

Monitor which competitors are hiring, for which roles, and in which cities. The `employer` search parameter makes this trivial.

#### 6. Academic & Policy Research

The Bundesagentur data is the most comprehensive source of German labour market information. Export structured data for economic research, workforce planning, and policy analysis.

#### 7. Job Board / Aggregator

Feed structured job data into your own job board, search engine, or analytics dashboard. Clean, consistent field names across all records.

#### 8. AI / ML Training Data

Use the `description` field (when `includeDetails: true`) and structured metadata for training NLP models on job classification, skill extraction, or salary prediction.

***

### Example Output

```json
{
    "referenceNumber": "12016-10004701286-S",
    "title": "Softwareentwickler/in - Medienbüro!",
    "occupation": "Softwareentwickler/in",
    "allOccupations": "Softwareentwickler/in",
    "offerType": "ARBEIT",
    "employer": "PerZukunft Arbeitsvermittlung GmbH&Co.KG",
    "employerHash": "K-odMSiWh6Flr85j5gueeE_9FFhpOtsHIPbrBKoeCCs=",
    "employerLogo": "https://rest.arbeitsagentur.de/jobboerse/jobsuche-service/ct/v1/arbeitgeberlogo/K-odMSiWh6Flr85j5gueeE_9FFhpOtsHIPbrBKoeCCs%3D",
    "isPrivateAgency": "true",
    "isTempStaffing": "false",
    "locations": "[{\"city\":\"Berlin\",\"postalCode\":\"12167\",\"region\":\"BERLIN\",\"lat\":\"52.449107911001\",\"lng\":\"13.333178738\"},{\"city\":\"Berlin\",\"postalCode\":\"13407\",\"region\":\"BERLIN\",\"lat\":\"52.571956228\",\"lng\":\"13.351067072\"}]",
    "city": "Berlin",
    "postalCode": "10249",
    "region": "BERLIN",
    "country": "DEUTSCHLAND",
    "latitude": "52.523643977",
    "longitude": "13.445373265",
    "distanceKm": "2",
    "salary": "13,90 €/Std.",
    "salaryAmount": "13.9",
    "salaryType": "FESTGEHALT",
    "salaryPeriod": "STUNDENLOHN",
    "isFullTime": "true",
    "isPartTime": "false",
    "isHomeOffice": "false",
    "isMiniJob": "false",
    "isShiftWork": "false",
    "contractDuration": "KEINE_ANGABE",
    "careerChangeSuitable": "false",
    "disabilityFriendly": "false",
    "datePosted": "2026-06-21",
    "dateModified": "2026-06-21T07:11:46.160",
    "startDate": "2026-06-22",
    "description": "Zum nächstmöglichen Zeitpunkt suchen wir für ein Berliner Medienbüro...",
    "extractedEmail": "steglitz.it@perzukunft.de",
    "extractedPhone": "+49 30 2065800",
    "extractedUrl": "https://www.perzukunft.de/job/softwareentwickler-in-medienburo-1201610004701286",
    "educationRequired": "Ausbildung (vocational); University degree (unspecified)",
    "languageRequirements": "Deutsch: required",
    "externalUrl": null,
    "portalUrl": "https://www.arbeitsagentur.de/jobsuche/jobdetail/12016-10004701286-S",
    "searchKeyword": "Softwareentwickler",
    "searchLocation": "Berlin",
    "scrapedAt": "2026-06-21T10:30:00.000Z"
}
```
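
As the example shows, booleans and numbers arrive as strings (`"true"`, `"13.9"`). A small post-processing step can convert them to native types before analysis. The field lists below are taken from the record above; `to_native` is a hypothetical helper.

```python
BOOLEAN_FIELDS = (
    "isPrivateAgency", "isTempStaffing", "isFullTime", "isPartTime",
    "isHomeOffice", "isMiniJob", "isShiftWork",
    "careerChangeSuitable", "disabilityFriendly",
)
FLOAT_FIELDS = ("latitude", "longitude", "distanceKm", "salaryAmount")


def to_native(record):
    """Return a copy of the record with string booleans and numbers converted."""
    out = dict(record)
    for field in BOOLEAN_FIELDS:
        if isinstance(out.get(field), str):
            out[field] = out[field].lower() == "true"
    for field in FLOAT_FIELDS:
        # Leave missing or empty values untouched.
        if out.get(field) not in (None, ""):
            out[field] = float(out[field])
    return out


item = {"isFullTime": "true", "isHomeOffice": "false", "salaryAmount": "13.9", "externalUrl": None}
print(to_native(item))
```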

***

### Performance

| Mode | Speed | Memory | Notes |
|------|-------|--------|-------|
| Search only (`includeDetails: false`) | ~80 jobs/sec | ~64 MB | Pure v6 API, no detail calls |
| With details (`includeDetails: true`) | ~15–25 jobs/sec | ~128 MB | One detail API call per job |
| Direct URLs | ~15–25 jobs/sec | ~128 MB | Always fetches full details |

The public API has no documented rate limit. We use a polite 100 ms default delay between requests. You can lower `requestDelay` to 0 for maximum speed.

**Typical run**: Scraping 100 jobs with full details takes ~10–15 seconds. Because this Actor is billed per event, the charge follows the billed events (see Pricing) rather than compute units (CU).

***

### Integration

#### JavaScript / TypeScript

```javascript
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: 'YOUR_TOKEN' });
const run = await client.actor('haketa/arbeitsagentur-scraper').call({
    keyword: 'Softwareentwickler',
    location: 'Berlin',
    maxItems: 100,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
```

#### Python

```python
from apify_client import ApifyClient
client = ApifyClient(token='YOUR_TOKEN')
run = client.actor('haketa/arbeitsagentur-scraper').call(run_input={
    'keyword': 'Softwareentwickler',
    'location': 'Berlin',
    'maxItems': 100,
})
items = client.dataset(run['defaultDatasetId']).list_items().items
```

#### CLI

```bash
apify call haketa/arbeitsagentur-scraper \
    --input '{"keyword":"Softwareentwickler","location":"Berlin","maxItems":100}'
```

#### MCP (AI Agents)

```
https://mcp.apify.com?tools=haketa/arbeitsagentur-scraper
```

***

### Tips & Best Practices

- **Start small**: Set `maxItems` to 50–100 for a first test. The job board has 500K–800K active listings — a `maxItems: 0` unbounded search can run for hours.
- **Use `publishedSince`**: Combine with `publishedSince: 7` for weekly fresh-job monitoring. Much faster than full searches.
- **Employer monitoring**: Use the `employer` field to track specific companies. No keyword needed.
- **Detail enrichment tradeoff**: Set `includeDetails: false` for fast bulk list extraction (title, employer, location, salary). Toggle on when you need descriptions, contacts, and requirements.
- **Multi-keyword**: Use `startUrls` with multiple pre-filtered arbeitsagentur.de search URLs for complex multi-keyword runs.
- **Scheduling**: Set up recurring runs (daily/weekly) for continuous job market monitoring.
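
With recurring runs, consecutive datasets are likely to contain some of the same listings. One way to keep only new ones is to track `referenceNumber` values between runs, as sketched below. The reference numbers are made up, and how the `seen` set is persisted (file, database, key-value store) is left open.

```python
def new_listings(items, seen):
    """Return items whose referenceNumber is not in `seen`, updating `seen` in place."""
    fresh = []
    for item in items:
        ref = item.get("referenceNumber")
        if ref and ref not in seen:
            seen.add(ref)
            fresh.append(item)
    return fresh


seen_refs = set()
first_run = [{"referenceNumber": "A-1"}, {"referenceNumber": "A-2"}]
second_run = [{"referenceNumber": "A-2"}, {"referenceNumber": "A-3"}]
print(len(new_listings(first_run, seen_refs)))   # 2
print(len(new_listings(second_run, seen_refs)))  # 1
```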

***

### Data Source

This Actor extracts publicly available job listings from the **Bundesagentur für Arbeit** (German Federal Employment Agency) at arbeitsagentur.de. The data is accessed through the same public API that powers the official job search interface. No authentication, proxy, or login is required.

### Legal & Responsible Use

This Actor retrieves publicly accessible job posting data. Users are responsible for complying with applicable laws and regulations, including the German Federal Data Protection Act (BDSG) and GDPR where applicable. We recommend:

- Using the data for legitimate business, research, or personal purposes
- Respecting the API's availability by using reasonable request delays
- Not redistributing the data in violation of the source site's terms of service

***

### About the Bundesagentur Job Board

The **Bundesagentur für Arbeit** (BA) operates Germany's largest and most authoritative job portal at arbeitsagentur.de. Key facts:

- **500K–800K** active job listings at any time
- **100% of Germany** covered — all 16 federal states, urban and rural
- All industries: IT, engineering, healthcare, trades, logistics, finance, education & more
- All employment types: full-time, part-time, mini-job, remote, shift, freelance, training
- Official data: listings are validated through Germany's employment agency system
- Updated continuously: new jobs added and filled positions removed in near real-time

***

### FAQ

**Q: Do I need a proxy?**
A: No. The API is publicly accessible from any IP address.

**Q: Is there a rate limit?**
A: No documented rate limit. We default to 100 ms between requests as a courtesy.

**Q: How many jobs can I scrape?**
A: The total active catalogue is 500K–800K listings. Set `maxItems` to control your run size.

**Q: Does this include jobs from Austria or Switzerland?**
A: The BA portal primarily covers Germany. Some cross-border listings may appear near the Austrian and Swiss borders, but this is not guaranteed.

**Q: Are salaries always available?**
A: No. German law does not require salary disclosure in job ads. The `salaryAmount` field is populated when the employer voluntarily includes a figure in the listing. Salaries appear more frequently for hourly/tariff-bound roles (e.g. nursing, trades, service).

**Q: What's the difference between this and other arbeitsagentur.de Actors?**
A: We use the v6 API, which provides exact salary amounts, home-office flags, career-changer suitability, multi-location data, and detailed work-time breakdowns. We also extract email, phone, URLs, education level, and language requirements from full descriptions. No other Actor combines all of these.

**Q: Can I search by occupation code (Berufsnummer)?**
A: The API supports the `berufsfeld` parameter. Use `startUrls` with a manually constructed URL to pass this filter.

***

*Made for recruiters, researchers, and businesses who need Germany's most complete job market data in structured form.*

# Actor input Schema

## `keyword` (type: `string`):

What to search for, e.g. 'Softwareentwickler', 'Pflegefachkraft', 'Marketing Manager'. Leave empty to search all jobs in a location.

## `location` (type: `string`):

City, region or postal code, e.g. 'Berlin', 'München', 'Hamburg'. Leave empty for nationwide search.

## `radius` (type: `integer`):

Search radius around the location in km. 0 = exact location match only.

## `employer` (type: `string`):

Search for jobs from a specific employer, e.g. 'Siemens AG', 'Deutsche Bahn'. Leave empty for general keyword/location search.

## `startUrls` (type: `array`):

Paste arbeitsagentur.de search-result URLs (with your own filters) or individual job detail URLs. Used instead of / alongside keyword search.

## `angebotsart` (type: `string`):

Type of job offer to search for.

## `arbeitszeit` (type: `string`):

Filter by working-time model. Leave empty for all.

## `befristung` (type: `string`):

Filter by contract type.

## `behinderung` (type: `boolean`):

Only return jobs suitable for people with disabilities.

## `publishedSince` (type: `integer`):

Only jobs published in the last N days. 0 = all time, 1 = today, 7 = last week, 30 = last month.

## `includeDetails` (type: `boolean`):

Fetch the full job description, multi-location data, contact info and employer profile for each listing. Slower but far richer — enables email/phone extraction and requirements parsing.

## `maxItems` (type: `integer`):

Maximum number of jobs to scrape. 0 = no limit (careful — can be 100K+).

## `maxPagesPerSearch` (type: `integer`):

Maximum result pages per search query (each page = up to 100 jobs). 0 = no limit.

## `requestDelay` (type: `integer`):

Delay between API requests in milliseconds. The API has no strict rate-limit but be polite.

## `maxConcurrency` (type: `integer`):

Maximum parallel detail-page requests. Keep moderate.

## `language` (type: `string`):

Keep original German field values or add English translations for title and occupation.

## Actor input object example

```json
{
  "keyword": "Softwareentwickler",
  "location": "Berlin",
  "radius": 25,
  "startUrls": [],
  "angebotsart": "",
  "arbeitszeit": "",
  "befristung": "",
  "behinderung": false,
  "publishedSince": 0,
  "includeDetails": true,
  "maxItems": 100,
  "maxPagesPerSearch": 10,
  "requestDelay": 100,
  "maxConcurrency": 10,
  "language": "original"
}
```

# Actor output Schema

## `referenceNumber` (type: `string`):

BA reference number (refnr)

## `title` (type: `string`):

Job title (German)

## `titleEn` (type: `string`):

Job title (English translation)

## `occupation` (type: `string`):

Primary occupation classification (Hauptberuf)

## `allOccupations` (type: `string`):

All matched occupation codes

## `offerType` (type: `string`):

ARBEIT / AUSBILDUNG / SELBSTAENDIGKEIT / PRAKTIKUM

## `employer` (type: `string`):

Company / organization name

## `employerHash` (type: `string`):

Employer logo hash (for logo URL)

## `employerLogo` (type: `string`):

Employer logo URL (if available)

## `isPrivateAgency` (type: `string`):

Posted by a private recruitment agency

## `isTempStaffing` (type: `string`):

Posted by a temporary staffing agency

## `allianzPartner` (type: `string`):

Partner network name (if any)

## `allianzPartnerUrl` (type: `string`):

Partner network URL

## `locations` (type: `string`):

All job locations (JSON array)

## `city` (type: `string`):

Primary city

## `postalCode` (type: `string`):

Primary postal code (PLZ)

## `street` (type: `string`):

Primary street address

## `region` (type: `string`):

Federal state / region (Bundesland)

## `country` (type: `string`):

Country (DEUTSCHLAND)

## `latitude` (type: `string`):

GPS latitude

## `longitude` (type: `string`):

GPS longitude

## `distanceKm` (type: `string`):

Distance from search location

## `salary` (type: `string`):

Salary display string

## `salaryAmount` (type: `string`):

Exact salary amount (festgehalt from v6 API)

## `salaryType` (type: `string`):

FESTGEHALT / VERHANDLUNGSBASIS

## `salaryPeriod` (type: `string`):

STUNDENLOHN / MONATSLOHN / JAHRESLOHN

## `isFullTime` (type: `string`):

Full-time position

## `isPartTime` (type: `string`):

Part-time position available

## `isHomeOffice` (type: `string`):

Remote / home-office possible

## `isMiniJob` (type: `string`):

Mini-job (geringfügige Beschäftigung)

## `isShiftWork` (type: `string`):

Shift / night / weekend work

## `contractDuration` (type: `string`):

BEFRISTET / UNBEFRISTET / KEINE\_ANGABE

## `careerChangeSuitable` (type: `string`):

Suitable for career changers (Quereinsteiger)

## `disabilityFriendly` (type: `string`):

Suitable for people with disabilities

## `datePosted` (type: `string`):

First publication date (YYYY-MM-DD)

## `dateModified` (type: `string`):

Last modification timestamp (ISO)

## `startDate` (type: `string`):

Earliest start date (YYYY-MM-DD)

## `description` (type: `string`):

Full job description (text, from detail page)

## `extractedEmail` (type: `string`):

Email found in description

## `extractedPhone` (type: `string`):

Phone number found in description

## `extractedUrl` (type: `string`):

External URL found in description

## `educationRequired` (type: `string`):

Education level from description (Ausbildung/Studium/Bachelor/Master)

## `languageRequirements` (type: `string`):

Language requirements from description (Deutsch/Englisch + level)

## `externalUrl` (type: `string`):

External application URL (externeURL)

## `portalUrl` (type: `string`):

Job page on arbeitsagentur.de

## `searchKeyword` (type: `string`):

Keyword used for this search

## `searchLocation` (type: `string`):

Location used for this search

## `scrapedAt` (type: `string`):

ISO scrape timestamp

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keyword": "Softwareentwickler",
    "location": "Berlin",
    "startUrls": [],
    "maxItems": 100
};

// Run the Actor and wait for it to finish
const run = await client.actor("haketa/arbeitsagentur-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "keyword": "Softwareentwickler",
    "location": "Berlin",
    "startUrls": [],
    "maxItems": 100,
}

# Run the Actor and wait for it to finish
run = client.actor("haketa/arbeitsagentur-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "keyword": "Softwareentwickler",
  "location": "Berlin",
  "startUrls": [],
  "maxItems": 100
}' |
apify call haketa/arbeitsagentur-scraper --silent --output-dataset

```
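
Where neither a client library nor the CLI is available, the same run can be started over plain HTTP. The sketch below builds a request for the `run-sync-get-dataset-items` endpoint listed in the OpenAPI specification in this document, passing the token as the `token` query parameter. It uses only the Python standard library. The actual network call is left commented out, and the assumption that the response body is a JSON array of items should be verified against the API reference.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "haketa~arbeitsagentur-scraper"


def build_sync_request(token, run_input):
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {"keyword": "Softwareentwickler", "location": "Berlin", "maxItems": 100},
)
print(request.get_method(), request.full_url.split("?")[0])

# To execute the call (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.loads(response.read())  # assumed: JSON array of dataset items
```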

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=haketa/arbeitsagentur-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Arbeitsagentur Job Scraper — Salary, Contact & Details",
        "description": "Scrape Germany's largest job board (arbeitsagentur.de). Extracts salary amount, employer email/phone, multi-location, home-office, occupation codes and full descriptions from the v6 API. Richer data than any competitor — no proxy or login needed.",
        "version": "0.1",
        "x-build-id": "pu8ldmdFadHuRmBgX"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/haketa~arbeitsagentur-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-haketa-arbeitsagentur-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/haketa~arbeitsagentur-scraper/runs": {
            "post": {
                "operationId": "runs-sync-haketa-arbeitsagentur-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/haketa~arbeitsagentur-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-haketa-arbeitsagentur-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "keyword": {
                        "title": "Keyword / job title",
                        "type": "string",
                        "description": "What to search for, e.g. 'Softwareentwickler', 'Pflegefachkraft', 'Marketing Manager'. Leave empty to search all jobs in a location."
                    },
                    "location": {
                        "title": "Location",
                        "type": "string",
                        "description": "City, region or postal code, e.g. 'Berlin', 'München', 'Hamburg'. Leave empty for nationwide search."
                    },
                    "radius": {
                        "title": "Radius (km)",
                        "minimum": 0,
                        "maximum": 200,
                        "type": "integer",
                        "description": "Search radius around the location in km. 0 = exact location match only.",
                        "default": 25
                    },
                    "employer": {
                        "title": "Employer name",
                        "type": "string",
                        "description": "Search for jobs from a specific employer, e.g. 'Siemens AG', 'Deutsche Bahn'. Leave empty for general keyword/location search."
                    },
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Paste arbeitsagentur.de search-result URLs (with your own filters) or individual job detail URLs. Used instead of / alongside keyword search.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "angebotsart": {
                        "title": "Offer type",
                        "enum": [
                            "",
                            "1",
                            "2",
                            "4",
                            "34"
                        ],
                        "type": "string",
                        "description": "Type of job offer to search for.",
                        "default": ""
                    },
                    "arbeitszeit": {
                        "title": "Working time",
                        "enum": [
                            "",
                            "vz",
                            "tz",
                            "ho",
                            "mj",
                            "snw"
                        ],
                        "type": "string",
                        "description": "Filter by working-time model. Leave empty for all.",
                        "default": ""
                    },
                    "befristung": {
                        "title": "Contract duration",
                        "enum": [
                            "",
                            "1",
                            "2"
                        ],
                        "type": "string",
                        "description": "Filter by contract type.",
                        "default": ""
                    },
                    "behinderung": {
                        "title": "Disability-friendly only",
                        "type": "boolean",
                        "description": "Only return jobs suitable for people with disabilities.",
                        "default": false
                    },
                    "publishedSince": {
                        "title": "Published since (days)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Only jobs published in the last N days. 0 = all time, 1 = today, 7 = last week, 30 = last month.",
                        "default": 0
                    },
                    "includeDetails": {
                        "title": "Include job details",
                        "type": "boolean",
                        "description": "Fetch the full job description, multi-location data, contact info and employer profile for each listing. Slower but far richer — enables email/phone extraction and requirements parsing.",
                        "default": true
                    },
                    "maxItems": {
                        "title": "Max jobs",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of jobs to scrape. 0 = no limit (careful — can be 100K+).",
                        "default": 100
                    },
                    "maxPagesPerSearch": {
                        "title": "Max pages per search",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum result pages per search query (each page = up to 100 jobs). 0 = no limit.",
                        "default": 10
                    },
                    "requestDelay": {
                        "title": "Request delay (ms)",
                        "minimum": 0,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Delay between API requests in milliseconds. The API has no strict rate-limit but be polite.",
                        "default": 100
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "maximum": 25,
                        "type": "integer",
                        "description": "Maximum parallel detail-page requests. Keep moderate.",
                        "default": 10
                    },
                    "language": {
                        "title": "Output language",
                        "enum": [
                            "original",
                            "both"
                        ],
                        "type": "string",
                        "description": "Keep original German field values or add English translations for title and occupation.",
                        "default": "original"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
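
The `inputSchema` above declares numeric bounds and enum values for several fields. The following is a minimal Python sketch of a client-side pre-check against those limits, so that an out-of-range value is caught before a run is started. The helper name `check_input` and its message format are hypothetical and not part of the Actor or the Apify client. It covers only the ranges and enums listed in the schema, not the full JSON Schema semantics.

```python
from typing import Any, Dict, List, Optional, Tuple

# (minimum, maximum) pairs copied from the inputSchema; None means no upper bound.
INT_RANGES: Dict[str, Tuple[int, Optional[int]]] = {
    "radius": (0, 200),
    "publishedSince": (0, 100),
    "maxItems": (0, None),
    "maxPagesPerSearch": (0, None),
    "requestDelay": (0, 10000),
    "maxConcurrency": (1, 25),
}

# Allowed values copied from the "enum" lists in the inputSchema.
ENUMS: Dict[str, Tuple[str, ...]] = {
    "angebotsart": ("", "1", "2", "4", "34"),
    "arbeitszeit": ("", "vz", "tz", "ho", "mj", "snw"),
    "befristung": ("", "1", "2"),
    "language": ("original", "both"),
}


def check_input(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means every checked field is within limits."""
    problems: List[str] = []
    for key, (low, high) in INT_RANGES.items():
        if key not in payload:
            continue  # no field is marked as required, so absent keys are fine
        value = payload[key]
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{key}: expected an integer")
        elif value < low or (high is not None and value > high):
            problems.append(f"{key}: {value} is outside the allowed range")
    for key, allowed in ENUMS.items():
        if key in payload and payload[key] not in allowed:
            problems.append(f"{key}: {payload[key]!r} is not an allowed value")
    return problems
```

Calling `check_input({"keyword": "Pflegefachkraft", "location": "Berlin", "radius": 25})` returns an empty list, while a payload with `"radius": 500` yields one entry describing the violation.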

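The `/acts/haketa~arbeitsagentur-scraper/runs` operation above takes the Actor input as a JSON body and the API token as a required `token` query parameter. Below is a minimal sketch of that call using only the Python standard library. The base URL `https://api.apify.com/v2` is an assumption, since it is not stated in this part of the definition, and the function names `build_run_request` and `start_run` are hypothetical. For production use, the official Python client mentioned earlier is likely the better choice.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

# Assumption: the API base URL is not given in this part of the spec.
API_BASE = "https://api.apify.com/v2"
RUNS_PATH = "/acts/haketa~arbeitsagentur-scraper/runs"


def build_run_request(token: str, actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that starts a run."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}{RUNS_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_run(token: str, actor_input: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request and return the `data` object described by runsResponseSchema."""
    with urllib.request.urlopen(build_run_request(token, actor_input)) as response:
        return json.load(response)["data"]
```

A call such as `start_run(token, {"keyword": "Softwareentwickler", "location": "Berlin", "maxItems": 50})` would return the run object. According to `runsResponseSchema`, it carries `status` and `defaultDatasetId`, the latter identifying the dataset that belongs to the run. Splitting request construction from sending keeps the URL and body easy to inspect without any network access.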