Bundesagentur Scraper
Under maintenance
Pricing: from $2.99 / 1,000 results
Extract job listings from Bundesagentur für Arbeit, Germany's federal employment agency (Jobbörse).
Developer: Jobs Scraper
Maintained by Community
Overview
Navigate Germany's federal employment agency portal (Bundesagentur für Arbeit / Jobbörse) to compile officially registered job openings. This actor extracts listing data from Germany's most comprehensive public employment database, covering positions across all industries and qualification levels in the DACH region.
Features
- Official German federal employment agency data
- Berufenet occupation classification integration
- Bundesland and city-level regional filtering
- Arbeitszeit and contract type specifications
- Proxy rotation with automatic fallback (residential → datacenter)
- CAPTCHA detection and session rotation
- Automatic retry on failures with exponential backoff
- Deduplication of results by application URL
- Dataset validation with auto-fix capability
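The deduplication feature listed above can be pictured with a short sketch. This is not the actor's own code: `dedupeByApplyUrl` is a hypothetical helper, and the choice to keep the first occurrence and to leave listings without a URL untouched is an assumption.

```javascript
// Hypothetical helper (not taken from the actor's source): keeps the first
// listing seen for each applyUrl and drops later duplicates.
function dedupeByApplyUrl(listings) {
  const seen = new Set();
  const unique = [];
  for (const listing of listings) {
    // Only surrounding whitespace is normalised; URL paths can be case-sensitive.
    const key = String(listing.applyUrl ?? '').trim();
    if (key === '') {
      unique.push(listing); // assumption: nothing to compare, so keep the record
      continue;
    }
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(listing);
    }
  }
  return unique;
}
```

Keeping the first occurrence preserves the portal's original result ordering, which matters when results are sorted by relevance or date.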
Supported Inputs
| Field | Type | Default | Description |
|---|---|---|---|
| keyword | string | "software engineer" | Search terms for job discovery |
| location | string | "Berlin" | Geographic filter for results |
| country | string | "DE" | Country code for proxy routing |
| maxItems | integer | 50 | Upper limit on extracted listings |
| proxyEnabled | boolean | true | Toggle proxy rotation on/off |
| sortBy | string | "relevance" | Result ordering (relevance/date/salary) |
| jobType | string | "" | Employment type filter |
| experienceLevel | string | "" | Seniority level filter |
| datePosted | string | "" | Recency filter (24h/3d/7d/14d/30d) |
| remoteOnly | boolean | false | Restrict to remote positions only |
| includeCompanyDetails | boolean | true | Fetch extra company information |
| includeSalary | boolean | true | Include compensation data |
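To make the defaults in the table concrete, here is a minimal sketch of how an input object could be normalised before a run. The defaults and allowed values come from the table, and the cap of 1000 comes from the Limitations section; `normalizeInput` itself, and the decision to throw on unknown `sortBy` or `datePosted` values, are assumptions rather than the actor's documented behaviour.

```javascript
// Defaults copied from the Supported Inputs table.
const DEFAULTS = {
  keyword: 'software engineer',
  location: 'Berlin',
  country: 'DE',
  maxItems: 50,
  proxyEnabled: true,
  sortBy: 'relevance',
  jobType: '',
  experienceLevel: '',
  datePosted: '',
  remoteOnly: false,
  includeCompanyDetails: true,
  includeSalary: true,
};
const SORT_OPTIONS = ['relevance', 'date', 'salary'];
const DATE_OPTIONS = ['', '24h', '3d', '7d', '14d', '30d'];
const MAX_ITEMS_CAP = 1000; // per-run cap stated under Limitations

// Hypothetical helper: merges user input over the defaults and checks enums.
function normalizeInput(raw = {}) {
  const input = { ...DEFAULTS, ...raw };
  if (!SORT_OPTIONS.includes(input.sortBy)) {
    throw new Error(`Unsupported sortBy: ${input.sortBy}`);
  }
  if (!DATE_OPTIONS.includes(input.datePosted)) {
    throw new Error(`Unsupported datePosted: ${input.datePosted}`);
  }
  // Fall back to the default for non-positive or non-integer values, then cap.
  const requested = Number.isInteger(input.maxItems) && input.maxItems > 0
    ? input.maxItems
    : DEFAULTS.maxItems;
  input.maxItems = Math.min(requested, MAX_ITEMS_CAP);
  return input;
}
```

For example, `normalizeInput({ keyword: 'data analyst', maxItems: 5000 })` keeps the keyword, fills in every other default, and reduces `maxItems` to the cap.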
Output Format
Each scraped listing produces a JSON object with these fields:
{
  "jobTitle": "Senior Software Engineer",
  "companyName": "Example Corp",
  "location": "Berlin",
  "salary": "$120,000 - $160,000",
  "jobType": "Full-time",
  "experienceLevel": "Senior",
  "postedDate": "2 days ago",
  "applyUrl": "https://www.arbeitsagentur.de/job/12345",
  "companyUrl": "https://www.arbeitsagentur.de/company/example",
  "description": "We are looking for a skilled engineer...",
  "requirements": ["JavaScript", "Node.js", "React"],
  "benefits": ["Health Insurance", "Remote Work"],
  "sourcePortal": "Bundesagentur für Arbeit",
  "country": "DE",
  "scrapedAt": "2025-01-15T10:30:00.000Z"
}
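A record of this shape can be checked before it is stored. The sketch below illustrates the kind of validation with auto-fix that the feature list mentions; it does not describe what `dataset-validator.js` actually does. The set of required fields and the specific fixes (trimming strings, defaulting missing arrays) are assumptions.

```javascript
// Assumption: these fields are treated as mandatory for a usable record.
const REQUIRED_FIELDS = ['jobTitle', 'companyName', 'location', 'applyUrl', 'scrapedAt'];

// Hypothetical validator: returns a repaired copy plus any remaining errors.
function validateListing(listing) {
  const fixed = { ...listing };
  const errors = [];

  // Auto-fix 1: trim stray whitespace from every string field.
  for (const [key, value] of Object.entries(fixed)) {
    if (typeof value === 'string') fixed[key] = value.trim();
  }
  // Auto-fix 2: list fields default to empty arrays when missing or malformed.
  for (const key of ['requirements', 'benefits']) {
    if (!Array.isArray(fixed[key])) fixed[key] = [];
  }

  // Problems that cannot be repaired automatically are reported.
  for (const key of REQUIRED_FIELDS) {
    if (typeof fixed[key] !== 'string' || fixed[key] === '') {
      errors.push(`missing ${key}`);
    }
  }
  if (typeof fixed.scrapedAt === 'string' && fixed.scrapedAt !== ''
      && Number.isNaN(Date.parse(fixed.scrapedAt))) {
    errors.push('scrapedAt is not a valid date');
  }
  return { listing: fixed, errors };
}
```

Returning the errors instead of throwing lets a caller decide whether to drop the record or keep it with a warning.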
Proxy Handling
The actor uses a multi-tier proxy strategy, trying each tier in the order listed and falling back to the next when the current one is unavailable:
- Apify Residential Proxy (country-targeted): first choice for Bundesagentur für Arbeit
- Apify Residential Proxy (any region): fallback if the country proxy is unavailable
- Apify Datacenter Proxy: secondary fallback for cost efficiency
- Direct Connection: last resort when all proxies fail
Proxies auto-rotate on each request. Blocked sessions are discarded and replaced automatically.
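The fallback order can be expressed as a small selection function. This is a sketch of the ordering only, not of how the actor talks to Apify Proxy: the tier names and the `selectProxyTier` helper are hypothetical labels introduced for illustration.

```javascript
// Tier names are illustrative labels for the four tiers described above.
const PROXY_TIERS = [
  'residential-country', // Apify Residential Proxy, country-targeted
  'residential-any',     // Apify Residential Proxy, any region
  'datacenter',          // Apify Datacenter Proxy
  'direct',              // no proxy, last resort
];

// Hypothetical helper: returns the first tier that has not failed yet,
// or null when every tier (including the direct connection) has failed.
function selectProxyTier({ proxyEnabled = true, failedTiers = [] } = {}) {
  const failed = new Set(failedTiers);
  // Assumption: with proxies disabled only the direct connection is eligible.
  const candidates = proxyEnabled ? PROXY_TIERS : ['direct'];
  const tier = candidates.find((name) => !failed.has(name));
  return tier ?? null;
}
```

For instance, once both residential tiers are marked as failed, the function returns `'datacenter'`.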
Retry Logic
Failed requests are retried up to 5 times with automatic session rotation.
- Maximum 5 retries per request
- Fresh browser session on each retry
- Automatic proxy rotation between attempts
- Blocked status codes (401, 403, 429) trigger session refresh
- Configurable request timeout (120 seconds)
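The retry rules above, combined with the exponential backoff mentioned under Features, might look roughly like the following. The retry limit and the blocked status codes come from this section; the one-second base delay and the `sendRequest` / `rotateSession` callbacks are assumptions made so the sketch is self-contained.

```javascript
const MAX_RETRIES = 5;                                  // from this section
const BLOCKED_STATUS_CODES = new Set([401, 403, 429]);  // from this section
const BASE_DELAY_MS = 1000;                             // assumed; not stated in the source

function isBlockedStatus(status) {
  return BLOCKED_STATUS_CODES.has(status);
}

// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, 8s, 16s.
function backoffDelayMs(attempt) {
  return BASE_DELAY_MS * 2 ** attempt;
}

// sendRequest and rotateSession are caller-supplied (hypothetical) callbacks.
async function requestWithRetry(
  sendRequest,
  rotateSession,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
) {
  let lastError;
  // One initial attempt plus up to MAX_RETRIES retries.
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt += 1) {
    try {
      const response = await sendRequest();
      if (!isBlockedStatus(response.status)) return response;
      lastError = new Error(`Blocked with status ${response.status}`);
    } catch (error) {
      lastError = error;
    }
    if (attempt === MAX_RETRIES) break;
    await rotateSession();                 // fresh session and proxy per retry
    await sleep(backoffDelayMs(attempt));  // wait longer after each failure
  }
  throw lastError;
}
```

Passing `sleep` as a parameter makes the loop easy to exercise in tests without real waiting.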
Anti-block Handling
The actor incorporates multiple stealth techniques to minimize detection.
- `navigator.webdriver` property masked
- Human-like delays between page interactions (2–5 seconds)
- Browser language and plugin fingerprints normalised
- Session pool with automatic rotation on blocks
- CAPTCHA detection with graceful retry
- Rate limit detection (HTTP 429) with backoff
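Two of these techniques are simple enough to sketch: the randomised 2–5 second pause and a basic CAPTCHA check. Both helpers are hypothetical, and the keyword match used for CAPTCHA detection is a deliberately crude assumption, since the source does not say which signals the actor inspects.

```javascript
// Returns a delay in milliseconds drawn uniformly from [minMs, maxMs].
// `random` is injectable so the function can be tested deterministically.
function humanDelayMs(minMs = 2000, maxMs = 5000, random = Math.random) {
  return Math.floor(minMs + random() * (maxMs - minMs + 1));
}

// Crude heuristic (assumption): treat any page mentioning "captcha" as a challenge.
function looksLikeCaptcha(html) {
  return /captcha/i.test(html);
}

// Hypothetical usage between page interactions.
function pauseLikeHuman() {
  return new Promise((resolve) => setTimeout(resolve, humanDelayMs()));
}
```

A page flagged by `looksLikeCaptcha` would then be handed to the retry path, which rotates the session before trying again.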
Sample Input
{
  "keyword": "data analyst",
  "location": "Berlin",
  "maxItems": 25,
  "proxyEnabled": true,
  "sortBy": "date",
  "remoteOnly": false
}
Sample Output
{
  "jobTitle": "Data Analyst",
  "companyName": "TechCorp International",
  "location": "Berlin",
  "salary": "Competitive",
  "jobType": "Full-time",
  "experienceLevel": "Mid-level",
  "postedDate": "1 day ago",
  "applyUrl": "https://www.arbeitsagentur.de/job/example-123",
  "companyUrl": "",
  "description": "Seeking a detail-oriented data analyst to join our growing team...",
  "requirements": ["SQL", "Python", "Tableau"],
  "benefits": ["Health Insurance", "Flexible Hours"],
  "sourcePortal": "Bundesagentur für Arbeit",
  "country": "DE",
  "scrapedAt": "2025-01-15T14:22:00.000Z"
}
Usage
Local Development
# Install dependencies
npm install
# Set Apify token (required for proxy)
export APIFY_TOKEN=your_token_here
# Run the actor
npm start
# Validate scraped data
node dataset-validator.js
Apify Platform
# Login to Apify
apify login
# Push actor to platform
apify push
# Run from Apify Console or API
Deployment
- Ensure all dependencies are installed: `npm install`
- Authenticate with Apify: `apify login`
- Deploy the actor: `apify push`
- Configure input in the Apify Console
- Schedule runs or trigger via API / webhooks
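For the last step, triggering a run over HTTP, a request could be assembled as sketched below. It targets the Apify API's run-actor endpoint (`POST /v2/acts/{actorId}/runs`); the actor ID shown is a placeholder, since the real identifier is not given in this document, and `buildRunRequest` is a hypothetical helper.

```javascript
// Hypothetical helper: builds the URL and fetch options for starting a run.
// actorId uses the "username~actor-name" form; the value below is a placeholder.
function buildRunRequest(actorId, token, input) {
  return {
    url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(input), // the actor input, as in Sample Input
    },
  };
}

// Usage (Node 18+ ships a global fetch):
// const { url, options } = buildRunRequest(
//   'your-username~bundesagentur-scraper', // placeholder actor ID
//   process.env.APIFY_TOKEN,
//   { keyword: 'data analyst', location: 'Berlin', maxItems: 25 },
// );
// const response = await fetch(url, options);
```

Separating request construction from the network call keeps the token handling in one place and makes the request easy to inspect before sending.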
Limitations
- Results depend on the portal's current HTML structure; layout changes may require selector updates
- Some job details (salary, benefits) may not be available for all listings
- Rate limiting by the portal may reduce throughput during high-volume scrapes
- CAPTCHA challenges may interrupt scraping on heavily protected pages
- Bundesagentur für Arbeit may modify their anti-bot measures, requiring periodic updates
- Maximum items per run is capped at 1000 to prevent excessive resource usage
- Proxy costs apply when using Apify residential or datacenter proxies