Arbeitsagentur.de Scraper - German Federal Job Board
Pricing
from $8.00 / 1,000 job serp results
Extract jobs from Germany's official employment agency (Bundesagentur für Arbeit). Get job titles, companies, locations, salaries, descriptions & contact details with ML-powered captcha solving. Supports search filters, direct URLs & job status checks.
Developer: Alessandro Santamaria
Actor stats
- Bookmarked: 1
- Total users: 19
- Monthly active users: 9
- Last modified: a day ago
Arbeitsagentur.de Job Scraper
Scraper for job listings from Arbeitsagentur.de (Bundesagentur für Arbeit), the German Federal Employment Agency's official job portal and one of the largest job boards in Germany, with over 1 million active listings.
Features
- Pure HTTP Architecture: No browser needed -- fast API-based search + HTTP captcha flow (~80MB memory)
- Multi-Query Support: Run multiple search keywords in a single run with automatic deduplication
- ML-Powered Captcha Solving: Extracts contact details (email, phone, contact person) behind the captcha using a trained ONNX neural network (~95% first-attempt accuracy)
- Rich Company Data: Company logo, website (Homepage), description, and "Alle Stellen" link via employer profile API
- Full Job Descriptions: Complete text with structured salary, dates, and remote work flags
- Filter Options: Search by keywords, location, federal state (Bundesland), and employment type
- Direct URL Mode: Check status of specific job listings
- Standardized Output: Compatible with the Santamaria ecosystem `JobListing` schema
How It Works
- Search Phase: Uses the public REST API (`v4/jobs`) with `X-API-Key` authentication to search and paginate
- Job Details API (optional): Fetches rich structured data from `v4/jobdetails` -- full description, salary range, dates, employment flags
- Employer Profile API: Fetches company description, Homepage link, and logo URL from `ag-darstellung-service`
- Captcha + Contact API: Requests captcha assignment, solves with ONNX model, submits solution to unlock hidden contact data (email, phone, name)
All steps use pure HTTP -- no browser or Playwright required.
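The search phase above can be sketched as a small request builder plus a paging loop. This is an illustration under stated assumptions, not the actor's actual code: the base URL is passed in by the caller, and the query-parameter names (`was`, `wo`, `page`, `size`) and the `stellenangebote` response field are assumptions that the README does not confirm.

```javascript
// Sketch only: parameter names and the response field are assumptions.
function buildSearchRequest({ baseUrl, apiKey, query, location, page = 1, size = 50 }) {
  // baseUrl must end with "/" so that "v4/jobs" is appended, not substituted.
  const url = new URL("v4/jobs", baseUrl);
  url.searchParams.set("was", query);                  // keyword (assumed name)
  if (location) url.searchParams.set("wo", location);  // city or region (assumed name)
  url.searchParams.set("page", String(page));
  url.searchParams.set("size", String(size));
  return {
    url: url.toString(),
    headers: { "X-API-Key": apiKey, Accept: "application/json" },
  };
}

// Requests pages until one comes back short or the cap is reached.
async function searchAll(params, maxResults, fetchImpl = fetch) {
  const size = params.size ?? 50;
  const jobs = [];
  for (let page = 1; jobs.length < maxResults; page++) {
    const { url, headers } = buildSearchRequest({ ...params, size, page });
    const res = await fetchImpl(url, { headers });
    if (!res.ok) throw new Error(`Search failed with HTTP ${res.status}`);
    const body = await res.json();
    const batch = body.stellenangebote ?? []; // assumed response field
    jobs.push(...batch);
    if (batch.length < size) break; // last page
  }
  return jobs.slice(0, maxResults);
}
```

Keeping the request builder separate from the network call makes the URL and header logic easy to check without contacting the API.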
Input
| Field | Type | Description | Default |
|---|---|---|---|
| `searchQueries` | string[] | Multiple search keywords (each runs as separate search, deduplicated) | - |
| `searchQuery` | string | Single search keyword (legacy, backward compatible) | - |
| `location` | string | City or region (Wo) | - |
| `bundesland` | string | Federal state code | All states |
| `employmentType` | string | Type of employment | All types |
| `maxResultsPerQuery` | integer | Maximum results per search keyword | 100 |
| `maxResults` | integer | Total cap across all queries (0 = unlimited) | 0 |
| `includeJobDetails` | boolean | Extract contact details, full description, company data (~3-5s/job) | false |
| `directUrls` | array | Specific job URLs to scrape | - |
| `proxyConfiguration` | object | Proxy settings (datacenter works fine) | Apify proxy |
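To make the interaction of `maxResultsPerQuery`, `maxResults`, and deduplication concrete, here is a minimal sketch of how per-query result lists might be merged. The function name and the rule that the first query to find a job keeps it are assumptions for illustration; the actor's real ordering may differ.

```javascript
// Illustrative merge step, not the actor's implementation.
function mergeQueryResults(resultsByQuery, { maxResultsPerQuery = 100, maxResults = 0 } = {}) {
  const seen = new Set();
  const merged = [];
  for (const [query, jobs] of Object.entries(resultsByQuery)) {
    // Apply the per-query cap before deduplicating.
    for (const job of jobs.slice(0, maxResultsPerQuery)) {
      // maxResults = 0 means "no total cap".
      if (maxResults > 0 && merged.length >= maxResults) return merged;
      if (seen.has(job.id)) continue; // already found by an earlier query
      seen.add(job.id);
      merged.push({ ...job, search_query: query });
    }
  }
  return merged;
}
```

With this rule, a job returned by both "Pflege" and "Altenpfleger" appears once and carries the `search_query` of whichever keyword ran first.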
Bundesland Codes
| Code | State |
|---|---|
| `BW` | Baden-Württemberg |
| `BY` | Bayern (Bavaria) |
| `BE` | Berlin |
| `BB` | Brandenburg |
| `HB` | Bremen |
| `HH` | Hamburg |
| `HE` | Hessen |
| `MV` | Mecklenburg-Vorpommern |
| `NI` | Niedersachsen |
| `NW` | Nordrhein-Westfalen |
| `RP` | Rheinland-Pfalz |
| `SL` | Saarland |
| `SN` | Sachsen |
| `ST` | Sachsen-Anhalt |
| `SH` | Schleswig-Holstein |
| `TH` | Thüringen |
Employment Types
| Code | Description |
|---|---|
| `VOLLZEIT` | Full-time |
| `TEILZEIT` | Part-time |
| `MINIJOB` | Mini job |
| `AUSBILDUNG` | Apprenticeship |
| `PRAKTIKUM` | Internship |
| `FREIBERUFLICH` | Freelance |
| `HEIMARBEIT` | Remote work |
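Before starting a run, it can help to check the filter codes on the client side. The sketch below uses the codes from the two tables above. The helper name `checkInput` is hypothetical, and whether the actor performs similar validation itself is not stated here.

```javascript
const BUNDESLAND_CODES = new Set([
  "BW", "BY", "BE", "BB", "HB", "HH", "HE", "MV",
  "NI", "NW", "RP", "SL", "SN", "ST", "SH", "TH",
]);
const EMPLOYMENT_TYPES = new Set([
  "VOLLZEIT", "TEILZEIT", "MINIJOB", "AUSBILDUNG",
  "PRAKTIKUM", "FREIBERUFLICH", "HEIMARBEIT",
]);

// Hypothetical pre-flight check; returns normalized queries plus any errors.
function checkInput(input) {
  const errors = [];
  // Fold the legacy single keyword into the array form.
  const queries = [...(input.searchQueries ?? [])];
  if (input.searchQuery && !queries.includes(input.searchQuery)) {
    queries.push(input.searchQuery);
  }
  if (input.bundesland && !BUNDESLAND_CODES.has(input.bundesland)) {
    errors.push(`Unknown bundesland code: ${input.bundesland}`);
  }
  if (input.employmentType && !EMPLOYMENT_TYPES.has(input.employmentType)) {
    errors.push(`Unknown employmentType: ${input.employmentType}`);
  }
  return { queries, errors };
}
```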
Output
Each job listing includes:
{
  "id": "12016-10004030577-S",
  "title": "Pflegeassistent /in",
  "company": "PerZukunft Arbeitsvermittlung GmbH&Co.KG",
  "location": "10179, Berlin",
  "country": "DE",
  "canton": null,
  "salary_min": 12.82,
  "salary_max": 12.82,
  "salary_currency": "EUR",
  "salary_period": "hourly",
  "salary_text": "12,82 EUR/Std.",
  "employment_type": "full-time",
  "workload_min": null,
  "workload_max": null,
  "remote_option": null,
  "description_snippet": "Fur [mehrere] Standorte des Grossraums Berlin...",
  "description_full": "Full job description with markdown formatting...",
  "requirements": [],
  "company_benefits": [],
  "posted_at": "2026-02-26T00:00:00.000Z",
  "modified_at": "2026-02-26T14:51:45.062Z",
  "expires_at": null,
  "source_url": "https://www.arbeitsagentur.de/jobsuche/jobdetail/12016-10004030577-S",
  "source_platform": "arbeitsagentur.de",
  "contact_salutation": "Frau",
  "contact_firstname": "Delia",
  "contact_lastname": "Schneider",
  "contact_email": "wedding.pflege@perzukunft.de",
  "contact_phone": "+49302200870",
  "apply_url": "https://www.perzukunft.de/job/pflegeassistent-in-1201610004030577",
  "apply_email": "wedding.pflege@perzukunft.de",
  "company_url": null,
  "company_website": "http://www.perzukunft.de/",
  "company_logo_url": "https://rest.arbeitsagentur.de/.../arbeitgeberlogo/QCGq...",
  "company_description": "Perzukunft - Unternehmensprofil...",
  "company_jobs_url": "https://www.arbeitsagentur.de/jobsuche/suche?angebotsart=1&arbeitgeberKundennummerHash=...",
  "search_query": "Pflege",
  "scraped_at": "2026-03-04T15:36:48.707Z"
}
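As a small example of consuming this output, the following sketch turns a dataset item into a compact contact record and skips listings that carry no contact data. The helper `toContact` and the shape of its return value are hypothetical; only the `contact_*`, `id`, and `company` field names come from the output above.

```javascript
// Hypothetical consumer-side helper for items in the shape shown above.
function toContact(item) {
  // Listings scraped without details, or without contact data, yield null.
  if (!item.contact_email && !item.contact_phone) return null;
  const name = [item.contact_salutation, item.contact_firstname, item.contact_lastname]
    .filter(Boolean)
    .join(" ");
  return {
    jobId: item.id,
    company: item.company,
    name: name || null,
    email: item.contact_email ?? null,
    phone: item.contact_phone ?? null,
  };
}
```

Applied to a whole dataset, `items.map(toContact).filter(Boolean)` would leave only the listings with usable contact details.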
Field Reference
| Field | Source | Requires includeJobDetails |
|---|---|---|
| `company_website` | Employer profile API (Homepage link) | Yes |
| `company_logo_url` | Constructed from `arbeitgeberKundennummerHash` | Yes |
| `company_description` | Employer profile API | Yes |
| `company_jobs_url` | Constructed from employer hash | Yes |
| `description_full` | v4 jobdetails API | Yes |
| `modified_at` | v4 jobdetails API (`aenderungsdatum`) | Yes |
| `salary_min` / `salary_max` / `salary_period` | v4 jobdetails API | Yes |
| `contact_*` | Captcha-protected bewerbung API | Yes |
| `remote_option` | v4 jobdetails API | Yes |
| `search_query` | Input search keyword that found this job | No |
Usage
Multi-Query Search
{
  "searchQueries": ["Pflege", "Krankenschwester", "Altenpfleger"],
  "location": "Berlin",
  "maxResultsPerQuery": 50,
  "maxResults": 0,
  "includeJobDetails": false
}
Quick Search (without contact details)
{
  "searchQueries": ["Elektriker"],
  "location": "München",
  "maxResultsPerQuery": 50,
  "includeJobDetails": false
}
Full Extraction (with contact details)
{
  "searchQueries": ["Pflege"],
  "bundesland": "BY",
  "maxResultsPerQuery": 20,
  "includeJobDetails": true
}
Legacy Single Query (backward compatible)
{
  "searchQuery": "Softwareentwickler",
  "location": "Berlin",
  "maxResults": 100,
  "includeJobDetails": true
}
Via API
curl -X POST "https://api.apify.com/v2/acts/santamaria~arbeitsagentur-de-scraper/runs" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "searchQueries": ["Softwareentwickler", "Programmierer"],
    "location": "Berlin",
    "employmentType": "VOLLZEIT",
    "maxResultsPerQuery": 100,
    "includeJobDetails": true
  }'
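The same call can be made from Node.js with the built-in `fetch`. This is a minimal sketch that mirrors the curl command above. The helper names `buildRunRequest` and `startRun` are hypothetical, and error handling is kept to a bare status check.

```javascript
// Builds the same request as the curl example; no network access here.
function buildRunRequest(token, input) {
  return {
    url: "https://api.apify.com/v2/acts/santamaria~arbeitsagentur-de-scraper/runs",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(input), // the actor input is sent as the request body
    },
  };
}

// Starts a run and returns the run object from the API response.
async function startRun(token, input, fetchImpl = fetch) {
  const { url, options } = buildRunRequest(token, input);
  const res = await fetchImpl(url, options);
  if (!res.ok) throw new Error(`Apify API returned HTTP ${res.status}`);
  const { data } = await res.json(); // Apify wraps the run object in "data"
  return data;
}
```

A call such as `await startRun(process.env.APIFY_TOKEN, { searchQueries: ["Pflege"], location: "Berlin" })` starts the run; results can then be read from the run's default dataset once it finishes.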
Performance & Cost
| Mode | Memory | Speed | CU (50 jobs) |
|---|---|---|---|
| Search only (`includeJobDetails: false`) | ~40 MB | ~25 jobs/sec | ~0.005 |
| With details (`includeJobDetails: true`) | ~120 MB | ~1 job/3-5 sec | ~0.05 |
Tip: Use `includeJobDetails: false` (the default) for high-volume scraping, and enable it only when you need contact details and company data.
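For rough planning, the table's figures can be extrapolated linearly. The sketch below assumes that compute units and run time scale proportionally with the number of jobs, which is a simplification, so treat the output as an order-of-magnitude estimate rather than a quote. The function name is hypothetical.

```javascript
// Figures taken from the table above (CU per 50 jobs).
const CU_PER_50_JOBS = { searchOnly: 0.005, withDetails: 0.05 };

// Linear extrapolation; real runs will vary with retries and network latency.
function estimateRun(jobs, includeJobDetails) {
  const cuPer50 = includeJobDetails ? CU_PER_50_JOBS.withDetails : CU_PER_50_JOBS.searchOnly;
  const computeUnits = (jobs / 50) * cuPer50;
  const seconds = includeJobDetails
    ? { min: jobs * 3, max: jobs * 5 }    // ~1 job per 3-5 s
    : { min: jobs / 25, max: jobs / 25 }; // ~25 jobs per second
  return { computeUnits, seconds };
}
```

Under these assumptions, 1,000 jobs with details would take roughly 50 to 83 minutes, while a search-only run of the same size would take well under a minute.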
Technical Details
Captcha Solver
- Model: CNN + Bidirectional LSTM with CTC loss
- Input: 250x50 grayscale image
- Accuracy: ~95% first-attempt solve rate
- Character set: `0-9`, `a-z` (29 characters)
- Format: ONNX for fast inference (`onnxruntime-node`)
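A model trained with CTC loss produces one score vector per time step, which must be decoded into text. The sketch below shows greedy CTC decoding, a common choice for this kind of model; whether the actor uses greedy or beam-search decoding is not stated. The class list and the blank index of 0 are assumptions for illustration, and running the ONNX session itself is not shown.

```javascript
// Greedy CTC decoding: take the best class per time step,
// collapse consecutive repeats, then drop blanks.
// `classes[blankIndex]` is the CTC blank; its position is an assumption.
function ctcGreedyDecode(logits, classes, blankIndex = 0) {
  let text = "";
  let previous = blankIndex;
  for (const step of logits) {
    // Argmax over the class scores of this time step.
    let best = 0;
    for (let i = 1; i < step.length; i++) {
      if (step[i] > step[best]) best = i;
    }
    // Emit only when the class changes and is not the blank.
    if (best !== blankIndex && best !== previous) text += classes[best];
    previous = best;
  }
  return text;
}
```

The blank between two identical characters is what allows doubled letters: the per-step sequence `a, a, blank, a` decodes to `aa`, while `a, a, a` decodes to `a`.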
Key Implementation Notes
- Job IDs must be base64url-encoded in API URLs (raw IDs return 404)
- Captcha flow uses plain `fetch` -- `gotScraping` mangles AAS custom headers causing 403
- Assignment body requires `formId: 'ARBEITGEBERDATEN'` and `formProtectionLevel: 'JB_JOBSUCHE_20'`
- Bewerbung headers: `aas-info: sessionId=..., challengeId=...` and `aas-answer: <solution>`
- Phone numbers from API are structured objects `{laendervorwahl, vorwahl, rufnummer}`, not flat strings
- Docker image: `node:20-slim` (Debian-based, required for onnxruntime-node glibc dependency)
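Several of the notes above translate into small helpers. The following sketch shows one possible shape for them. The function names are hypothetical, and the phone-number handling (whether `laendervorwahl` includes a `+`, and dropping a leading zero from `vorwahl`) is an assumption chosen so that the result matches the `contact_phone` format in the sample output.

```javascript
// Job IDs go into API URLs base64url-encoded (raw IDs return 404).
function encodeJobId(jobId) {
  return Buffer.from(jobId, "utf8").toString("base64url");
}

// Body for the captcha assignment request, per the note above.
const ASSIGNMENT_BODY = {
  formId: "ARBEITGEBERDATEN",
  formProtectionLevel: "JB_JOBSUCHE_20",
};

// Headers for the bewerbung (contact) request once the captcha is solved.
function buildBewerbungHeaders(sessionId, challengeId, solution) {
  return {
    "aas-info": `sessionId=${sessionId}, challengeId=${challengeId}`,
    "aas-answer": solution,
  };
}

// Flattens the structured phone object into a single string.
// Assumed normalization: digits only, leading zero of the area code dropped.
function formatPhone({ laendervorwahl = "", vorwahl = "", rufnummer = "" } = {}) {
  const digits = (value) => String(value).replace(/\D/g, "");
  const country = digits(laendervorwahl);
  const area = digits(vorwahl).replace(/^0+/, "");
  const number = digits(rufnummer);
  if (!number) return null;
  return `${country ? "+" + country : ""}${area}${number}`;
}
```

`Buffer`'s `"base64url"` encoding is available in the Node.js 20 runtime that the Docker image provides, so no extra dependency is needed for the ID encoding.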
Limitations
- Rate Limiting: 2s delay between requests to avoid blocks
- Salary Data: German job listings rarely include salary information
- Job Expiration: API doesn't provide expiration dates
- Contact Availability: Not all listings have contact details behind captcha
- Company Website: Only available when the employer has configured a Homepage link in their profile
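The 2-second spacing between requests can be illustrated with a small throttle. This is a sketch of the general technique, not the actor's code. It assumes requests are issued one after another (not concurrently), and the names are hypothetical.

```javascript
// How long to wait so that requests are at least minGapMs apart.
function msUntilNextRequest(lastRequestAt, now, minGapMs = 2000) {
  if (lastRequestAt === null) return 0; // first request goes out immediately
  return Math.max(0, lastRequestAt + minGapMs - now);
}

// Wraps sequential tasks so each starts no sooner than minGapMs after the last.
function createThrottle(minGapMs = 2000) {
  let lastRequestAt = null;
  return async function throttled(task) {
    const wait = msUntilNextRequest(lastRequestAt, Date.now(), minGapMs);
    if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
    lastRequestAt = Date.now();
    return task();
  };
}
```

Usage would look like `const throttled = createThrottle(); await throttled(() => fetch(url));`, with each call awaited before the next one is made.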
Version History
- 3.1.0 (2026-03-17):
  - Multi-query support: `searchQueries` array with per-query limits and deduplication
  - Added `maxResultsPerQuery` (default 100), `maxResults` (0 = unlimited)
  - Added `search_query` output field to track which query found each job
  - Backward compatible: `searchQuery` (singular) still works
  - Memory limit raised to 512MB
- 3.0.0 (2026-03-04):
  - Full HTTP-only migration -- removed Playwright entirely
  - Added employer profile API for company website, description, logo
  - Added `company_website`, `company_jobs_url`, `company_description`, `modified_at` fields
  - Logo displayed as image in Apify results table
  - Memory reduced from ~530MB to ~80MB, CU from 0.148 to ~0.005
  - Docker: `node:20-slim` (was `apify/actor-node-playwright-chrome:20`)
- 2.0.0 (2026-01-26):
  - Added Playwright browser automation for detail extraction
  - Integrated ML-powered captcha solver (ONNX)
  - Hybrid architecture: API search + browser details
- 1.0.0 (2024-12-22): Initial implementation with v4 API
Support
For issues or feature requests: Actor Issues
Part of the Santamaria Job Scrapers Suite - Professional-grade job data for the DACH region.
Built with Apify | Arbeitsagentur.de