Arbeitsagentur Scraper | Bundesagentur für Arbeit
Slug: fatihtahta/arbeitsagentur-scraper
Pricing: from $1.00 / 1,000 results

Scrape jobs from Germany’s Bundesagentur für Arbeit. Paste official search URLs and get full, clean, structured data including title, company, locations, contract/work type, dates, salary, and more. Ideal for field research and job search.
Overview
This actor collects structured public job-listing data from the Bundesagentur für Arbeit job portal, including listing identifiers, titles, employers, descriptions, categories, locations, work arrangements, contract signals, publication dates, and related application links. It is built for repeatable collection of job-market records that can be exported directly into analytics, reporting, enrichment, or operational systems.

The portal at https://www.arbeitsagentur.de is one of Germany's most important public employment platforms, which makes it a useful source for labor-market monitoring, hiring research, and structured job intelligence. The actor produces normalized JSON output designed for recurring runs, downstream ingestion, and consistent handling across workflows, so it suits ongoing data acquisition where teams need clear inputs, stable output patterns, and automation-ready delivery.
Why Use This Actor
- Market research and analytics teams: collect structured public job-demand signals by keyword, geography, publication window, and work arrangement for market intelligence and operational reporting.
- Product and content teams: monitor hiring activity, employer positioning, job taxonomy coverage, and regional trends to support editorial planning, benchmark pages, and content gap analysis.
- Developers and data engineering teams: feed normalized job records into downstream systems, ETL pipelines, warehouses, and internal services without designing custom parsing around raw listings.
- Lead generation and enrichment teams: build employer or opportunity datasets from public listings, then join them with CRM, account, or prospect data in enrichment pipelines.
- Monitoring and competitive tracking teams: run repeatable collection on schedules to observe changes in posting volume, role mix, remote availability, and hiring movement over time.
Common Use Cases
- Market intelligence: track hiring volume, role categories, remote-work patterns, and geographic distribution across selected job segments.
- Lead generation: assemble targeted employer and opportunity lists based on keywords, locations, and contract preferences.
- Competitive monitoring: compare how employers or regions change job demand, publication cadence, and role mix over repeated runs.
- Catalog and directory building: populate internal job databases or searchable directories with structured public records.
- Data enrichment: append current public job attributes to CRM, BI, or recruiting datasets.
- Recurring reporting: schedule periodic runs for dashboards, alerts, and labor-market trend summaries.
- Fresh-post monitoring: focus on recently published listings to support outreach, editorial coverage, or high-frequency review workflows.
Quick Start
- Choose one input mode: build a search with keyword and filter fields, or provide direct Bundesagentur search-result URLs in startUrls.
- For a first validation run, set a small limit, such as 25 or 50.
- Configure the filters that define your target scope, such as location, radius, publication_date, or work_schedule.
- Run the actor in Apify Console.
- Inspect the first dataset records to confirm the structure, field coverage, and values match your use case.
- Increase limit, refine filters, or add a schedule once the output is validated.
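If you prefer to start validation runs from code instead of the Console, a minimal sketch using the apify-client Python package is shown below; the input mirrors the small-limit validation run described above, and the token is a placeholder:

```python
# Minimal validation run via the Apify API. Sketch only: assumes the
# apify-client package (pip install apify-client); the token and input
# values are placeholders.
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

run_input = {
    "keyword": "Data Engineer",
    "location": "Berlin",
    "radius": "25_km",
    "publication_date": "last_1_week",
    "limit": 25,  # keep the first run small to validate the output shape
}

# Start the actor and wait for the run to finish.
run = client.actor("fatihtahta/arbeitsagentur-scraper").call(run_input=run_input)

# Inspect the first records before scaling up.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["id"], item["title"], item.get("company"))
```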
Input Parameters
This actor supports two input modes: query-building with filters, or direct Bundesagentur search-result URLs.
| Parameter | Type | Description | Default |
|---|---|---|---|
| keyword | string | Words or phrases used to narrow results, such as job titles, skills, or technologies. Leave empty when your direct search URL already contains the intended search terms. | – |
| location | string | City, district, postal code, or region to search around. When combined with direct URLs, this value overrides the location embedded in those URLs. | – |
| employment_type | string | Offer type used by the built-in query builder. Allowed values: job, education, early_career, self_employed. These map to the Bundesagentur angebotsart search parameter. | job |
| radius | string | Search radius around location. Allowed values: 10_km, 15_km, 25_km, 50_km, 100_km, 200_km. Units are kilometers. | – |
| is_remote | boolean | When true, limits results to jobs marked as remote or home-office friendly. | – |
| is_suitable_for_career_change | boolean | When true, focuses on roles marked as suitable for career changers. | – |
| beginning_date | array of strings | One or more preferred start periods. Allowed values: from_now_on, may_2026, june_2026, july_2026, august_2026, september_2026, october_2026, november_2026, december_2026, january_2027, february_2027, march_2027, april_2027, may_2027, june_2027, july_2027, august_2027, september_2027, october_2027, february_2028, april_2028. | – |
| fixed_term | array of strings | Contract duration filters. Allowed values: temporary, indefinite. | – |
| include_agency_jobs | boolean | When true, includes listings from private recruitment agencies in addition to direct employer postings. | – |
| exclude_external_jobs | boolean | When true, removes listings that point to external job boards or third-party sources. | – |
| exclude_temporary_work | boolean | When true, removes temporary-work or labor-leasing roles from results. | – |
| disabled_only_jobs | boolean | When true, limits results to listings marked as suitable for disabled applicants. | – |
| publication_date | string | Recency filter for published listings. Allowed values: today, yesterday, last_1_week, last_2_weeks, last_4_weeks. | – |
| work_schedule | array of strings | One or more schedule filters. Allowed values: full_time, part_time, shift_night_weekend, part_time_job. | – |
| startUrls | array of strings | Direct Bundesagentur search-result URLs. Use this as an alternative to the query-builder fields when you want to reuse searches configured on the target site. | – |
| limit | integer | Maximum number of job listings to collect in a run. Useful for both small validation runs and larger recurring collection jobs. | 50000 |
Choosing Inputs
Use the query-builder fields when you want a controlled, repeatable search definition that can be adjusted through structured inputs such as keyword, location, employment_type, radius, publication_date, and the schedule or contract filters. Use employment_type to switch the base Bundesagentur offer category before the other filters are applied. Use startUrls when you already have Bundesagentur search-result URLs and want to reuse that exact search scope without rebuilding it manually. Narrower filters usually produce more targeted datasets with less manual cleanup, while broader filters improve discovery and coverage. location and radius directly shape geographic scope; publication_date shapes freshness and monitoring cadence. Start with a small limit to validate record quality, then increase coverage once the dataset structure matches your intended workflow.
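As a sketch of choosing between the two modes, the hypothetical helper below assembles either input shape; the field names come from the parameter table above, while the helper and its example values are illustrative only:

```python
# Hypothetical helper for picking one input mode per run; the field
# names match the parameter table, the helper itself is illustrative.
def build_input(start_urls=None, *, keyword=None, location=None, limit=50, **filters):
    if start_urls:
        # URL mode: reuse an exact search scope configured on the site.
        return {"startUrls": start_urls, "limit": limit, **filters}
    # Query-builder mode: a controlled, repeatable search definition.
    return {"keyword": keyword, "location": location, "limit": limit, **filters}

narrow = build_input(keyword="Pflegefachkraft", location="München",
                     radius="10_km", publication_date="last_1_week")
broad = build_input(keyword="Pflege", limit=5000)
```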
Example Inputs
Scenario: Keyword and location search
{
  "keyword": "Data Engineer",
  "location": "Berlin",
  "employment_type": "job",
  "radius": "25_km",
  "is_remote": true,
  "publication_date": "last_1_week",
  "limit": 50
}
Scenario: Direct URL run
{
  "startUrls": ["https://www.arbeitsagentur.de/jobsuche/suche?angebotsart=1&was=softwareentwickler&wo=Hamburg&umkreis=15"],
  "exclude_external_jobs": true,
  "include_agency_jobs": false,
  "work_schedule": ["full_time"],
  "limit": 100
}
Scenario: Targeted accessibility and contract filtering
{
  "keyword": "Sachbearbeiter",
  "location": "Köln",
  "radius": "10_km",
  "disabled_only_jobs": true,
  "fixed_term": ["indefinite"],
  "exclude_temporary_work": true,
  "publication_date": "last_2_weeks",
  "limit": 75
}
Output
Output destination
The actor writes results to an Apify dataset as JSON records. The dataset is designed for direct consumption by analytics tools, ETL pipelines, and downstream APIs without post-processing.
Each item contains a stable record envelope plus a type-specific payload when the output has multiple entity types. For this actor, the primary record type is a job listing.
Record envelope (all items)
- type (string, required): record type. For this actor, the value is job.
- id (string, required): stable source record identifier for the job listing.
- url (string, required): canonical record URL used as the public record reference in downstream systems.
Recommended idempotency key: type + ":" + id
Use the idempotency key for deduplication and upserts when syncing records into warehouses, CRMs, search indexes, or internal data stores. The envelope makes records easier to merge, deduplicate, and synchronize across repeated runs.
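A minimal sketch of that pattern, assuming records are iterated from the dataset; the in-memory dict stands in for whatever warehouse, CRM, or index you sync to:

```python
# Sketch: dedup/upsert across repeated runs with the recommended key.
# `store` stands in for a warehouse table, CRM object, or search index.
def idempotency_key(record: dict) -> str:
    return f"{record['type']}:{record['id']}"

store: dict[str, dict] = {}

def upsert(records) -> None:
    for record in records:
        store[idempotency_key(record)] = record  # last write wins
```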
Examples
Example: job (type = "job")
{
  "type": "job",
  "id": "13644-307198-S",
  "url": "https://www.get-in-it.de/jobsuche/p307198?utm_source=arbeitsagentur&utm_medium=organic&utm_campaign=launch-basic",
  "source": "arbeitsagentur",
  "sourceContext": {
    "seed": {"type": "start_url", "value": "https://www.arbeitsagentur.de/jobsuche/suche?angebotsart=1&was=software&wo=Berlin&umkreis=10"},
    "request": {"searchUrl": "https://www.arbeitsagentur.de/jobsuche/suche?angebotsart=1&was=software&wo=Berlin&umkreis=10", "page": 1},
    "detailUrl": "https://www.arbeitsagentur.de/jobsuche/suche?angebotsart=1&was=software&wo=Berlin&umkreis=10&id=13644-307198-S",
    "scrapedAt": "2026-04-26T15:16:32.498Z"
  },
  "title": "Software Engineer mit Schwerpunkt im Data Engineering (m/w/d)",
  "company": "CONTACT Software",
  "description": "## Ihre Aufgaben\n\n- Konzeption und Entwicklung der Architektur eines Data-Warehouses sowie Integration von Daten aus verschiedenen Unternehmensbereichen\n- Betrieb der erforderlichen Softwaresysteme auf Kubernetes-Clustern unter Anwendung von DevOps-Praktiken\n- Definition einer Methodik zur Performance-Überwachung und Unterstützung datengetriebener Entscheidungsprozesse unserer Fachbereiche\n- Vermittlung von Datenanalyse-Know-how und Förderung einer „data-driven“-Kultur im Unternehmen\n\n## Das sollten Sie mitbringen\n\n- abgeschlossenes Studium der Informatik, der Digitalen Medien oder eines verwandten technischen Fachbereiches mit ausgewiesenem Informatik-Schwerpunkt, alternativ eine vergleichbare IT-Ausbildung mit mehrjähriger Berufserfahrung \n- fundierte Kenntnisse in Theorie und Praxis des Data- und Software-Engineerings\n analytische und kommunikative Fähigkeiten sowie eine offene Persönlichkeit\n- Erfahrung im Betrieb von Softwaresystemen unter Anwendung von DevOps-Praktiken\n- Freude an Coaching und Wissensvermittlung im Team\n- Erfahrung mit mindestens einem Teil unseres Technologie- und Werkzeug-Stacks (Python, Linux/Docker/Kubernetes/Helm, PostgreSQL, dbt, Apache Superset, ggf. Apache Airbyte und Apache Airflow)\n- sicheres Verfassen anspruchsvoller technischer Texte auf Englisch sowie sehr gute Deutschkenntnisse in Wort und Schrift\n\n## Unser Angebot\n\n- spannende Projekte und abwechslungsreiche Aufgaben im IT-Umfeld auf Basis der leistungsfähigsten Plattform für PLM, Projektmanagement, IoT und Public Sector\n- ein marktgerechtes Gehalt und flache Hierarchien innerhalb einer stark wachsenden Organisation mit „Du“-Kultur bis zur Führungsebene\n- individuelles und professionelles Onboarding mit Mentoring-Programm und Unterstützung durch Ihre Teamkolleg\\*innen\n- Auswahl aus diversen Weiterbildungen aus unserem Schulungskatalog, zur fachlichen und persönlichen Weiterentwicklung\n- eigene Wahl der Entwicklungsumgebung\n- Vergünstigungen für Firmenfitness sowie die Teilnahme an verschiedenen Sportgruppen, regelmäßigen Team- und Firmen-Events\n- Wahl zwischen der Arbeit an einem unserer Standorte, im Homeoffice oder einer hybriden Tätigkeit\n- offene und wertschätzende Unternehmenskultur, in der eigene Ideen nicht nur erlaubt, sondern auch gern gehört werden\n- flexible Arbeitszeiten mit Zeiterfassung und 30 Tagen Urlaub (bei einer 5-Tage-Woche)\n- Zuschuss zum Deutschlandticket\n- Angebote zur mentalen Gesundheit\n- frisches Obst und diverse Heiß- und Kaltgetränke an unseren Standorten",
  "category": "Anwendungsprogrammierer/in",
  "categories": ["Anwendungsprogrammierer/in"],
  "listingType": "ARBEIT",
  "contractType": "KEINE_ANGABE",
  "workModes": ["HOME_OFFICE"],
  "miniJob": false,
  "careerChangeFriendly": false,
  "disabilityFriendly": false,
  "privatePlacementService": false,
  "temporaryAgency": false,
  "managedListing": false,
  "dates": {"publishedAt": "2026-04-23", "updatedAt": "2026-04-23T07:01:15.210", "startsAt": "2026-04-23"},
  "compensation": {"payType": "KEINE_ANGABEN"},
  "locations": [
    {"city": "Karlsruhe, Baden", "postalCode": "76133", "region": "BADEN_WUERTTEMBERG", "country": "DEUTSCHLAND", "lat": 49.0130301, "lon": 8.3905711},
    {"city": "Kaiserslautern", "postalCode": "67655", "region": "RHEINLAND_PFALZ", "country": "DEUTSCHLAND", "lat": 49.4414369, "lon": 7.7661552},
    {"city": "Bad Vilbel", "postalCode": "61118", "region": "HESSEN", "country": "DEUTSCHLAND", "lat": 50.1906016, "lon": 8.7449458},
    {"city": "Augsburg, Bayern", "postalCode": "86150", "region": "BAYERN", "country": "DEUTSCHLAND", "lat": 48.3648495, "lon": 10.8925892},
    {"city": "Berlin", "postalCode": "10178", "region": "BERLIN", "country": "DEUTSCHLAND", "lat": 52.5210906, "lon": 13.4099785},
    {"city": "Paderborn", "postalCode": "33098", "region": "NORDRHEIN_WESTFALEN", "country": "DEUTSCHLAND", "lat": 51.7087892, "lon": 8.7514619},
    {"city": "Köln", "postalCode": "50676", "region": "NORDRHEIN_WESTFALEN", "country": "DEUTSCHLAND", "lat": 50.9307423, "lon": 6.952315},
    {"city": "Ingolstadt, Donau", "postalCode": "85055", "region": "BAYERN", "country": "DEUTSCHLAND", "lat": 48.7862559, "lon": 11.4454862},
    {"city": "München", "postalCode": "80331", "region": "BAYERN", "country": "DEUTSCHLAND", "lat": 48.1355827, "lon": 11.5739758},
    {"city": "Bremen", "postalCode": "28195", "region": "BREMEN", "country": "DEUTSCHLAND", "lat": 53.079582, "lon": 8.805886}
  ],
  "distanceKm": 2,
  "remoteWork": {"available": true, "arrangementType": "NACH_VEREINBARUNG"},
  "contactReference": "307198",
  "applicationUrl": "https://www.get-in-it.de/jobsuche/p307198?utm_source=arbeitsagentur&utm_medium=organic&utm_campaign=launch-basic",
  "postingPartner": {"name": "get in GmbH", "url": "https://www.get-in-IT.de"},
  "referenceNumber": "13644-307198-S",
  "sourceData": {
    "search": {
      "title": "Software Engineer mit Schwerpunkt im Data Engineering (m/w/d)",
      "company": "CONTACT Software",
      "listingType": "ARBEIT",
      "category": "Anwendungsprogrammierer/in",
      "categories": ["Anwendungsprogrammierer/in"],
      "dates": {"publishedAt": "2026-04-23", "updatedAt": "2026-04-23T07:01:15.210", "startsAt": "2026-04-23"},
      "compensation": {"payType": "KEINE_ANGABEN"},
      "contract": {"type": "KEINE_ANGABE", "miniJob": false},
      "workModes": ["HOME_OFFICE"],
      "workSchedule": {"fullTime": false, "partTimeMorning": false, "partTimeAfternoon": false, "partTimeEvening": false, "partTimeFlexible": false, "shiftNightWeekend": false},
      "remoteWork": {"available": true, "arrangementType": "NACH_VEREINBARUNG"},
      "locations": [
        {"city": "Berlin", "postalCode": "10178", "region": "BERLIN", "country": "DEUTSCHLAND", "lat": 52.5210906, "lon": 13.4099785}
      ],
      "distanceKm": 2,
      "contactReference": "307198",
      "applicationUrl": "https://www.get-in-it.de/jobsuche/p307198?utm_source=arbeitsagentur&utm_medium=organic&utm_campaign=launch-basic",
      "referenceNumber": "13644-307198-S",
      "flags": {"careerChangeFriendly": false}
    },
    "detail": {
      "title": "Software Engineer mit Schwerpunkt im Data Engineering (m/w/d)",
      "description": "## Ihre Aufgaben\n\n- Konzeption und Entwicklung der Architektur eines Data-Warehouses sowie Integration von Daten aus verschiedenen Unternehmensbereichen\n- Betrieb der erforderlichen Softwaresysteme auf Kubernetes-Clustern unter Anwendung von DevOps-Praktiken\n- Definition einer Methodik zur Performance-Überwachung und Unterstützung datengetriebener Entscheidungsprozesse unserer Fachbereiche\n- Vermittlung von Datenanalyse-Know-how und Förderung einer „data-driven“-Kultur im Unternehmen\n\n## Das sollten Sie mitbringen\n\n- abgeschlossenes Studium der Informatik, der Digitalen Medien oder eines verwandten technischen Fachbereiches mit ausgewiesenem Informatik-Schwerpunkt, alternativ eine vergleichbare IT-Ausbildung mit mehrjähriger Berufserfahrung \n- fundierte Kenntnisse in Theorie und Praxis des Data- und Software-Engineerings\n analytische und kommunikative Fähigkeiten sowie eine offene Persönlichkeit\n- Erfahrung im Betrieb von Softwaresystemen unter Anwendung von DevOps-Praktiken\n- Freude an Coaching und Wissensvermittlung im Team\n- Erfahrung mit mindestens einem Teil unseres Technologie- und Werkzeug-Stacks (Python, Linux/Docker/Kubernetes/Helm, PostgreSQL, dbt, Apache Superset, ggf. Apache Airbyte und Apache Airflow)\n- sicheres Verfassen anspruchsvoller technischer Texte auf Englisch sowie sehr gute Deutschkenntnisse in Wort und Schrift\n\n## Unser Angebot\n\n- spannende Projekte und abwechslungsreiche Aufgaben im IT-Umfeld auf Basis der leistungsfähigsten Plattform für PLM, Projektmanagement, IoT und Public Sector\n- ein marktgerechtes Gehalt und flache Hierarchien innerhalb einer stark wachsenden Organisation mit „Du“-Kultur bis zur Führungsebene\n- individuelles und professionelles Onboarding mit Mentoring-Programm und Unterstützung durch Ihre Teamkolleg\\*innen\n- Auswahl aus diversen Weiterbildungen aus unserem Schulungskatalog, zur fachlichen und persönlichen Weiterentwicklung\n- eigene Wahl der Entwicklungsumgebung\n- Vergünstigungen für Firmenfitness sowie die Teilnahme an verschiedenen Sportgruppen, regelmäßigen Team- und Firmen-Events\n- Wahl zwischen der Arbeit an einem unserer Standorte, im Homeoffice oder einer hybriden Tätigkeit\n- offene und wertschätzende Unternehmenskultur, in der eigene Ideen nicht nur erlaubt, sondern auch gern gehört werden\n- flexible Arbeitszeiten mit Zeiterfassung und 30 Tagen Urlaub (bei einer 5-Tage-Woche)\n- Zuschuss zum Deutschlandticket\n- Angebote zur mentalen Gesundheit\n- frisches Obst und diverse Heiß- und Kaltgetränke an unseren Standorten",
      "company": "CONTACT Software",
      "listingType": "ARBEIT",
      "category": "Anwendungsprogrammierer/in",
      "categories": ["Anwendungsprogrammierer/in"],
      "dates": {"publishedAt": "2026-04-23", "updatedAt": "2026-04-23T07:01:15.210", "startsAt": "2026-04-23"},
      "compensation": {"payType": "KEINE_ANGABEN"},
      "contract": {"type": "KEINE_ANGABE", "miniJob": false},
      "workModes": ["HOME_OFFICE"],
      "workSchedule": {"fullTime": false, "partTimeMorning": false, "partTimeAfternoon": false, "partTimeEvening": false, "partTimeFlexible": false, "shiftNightWeekend": false},
      "remoteWork": {"available": true, "arrangementType": "NACH_VEREINBARUNG"},
      "locations": [
        {"city": "Karlsruhe, Baden", "postalCode": "76133", "region": "BADEN_WUERTTEMBERG", "country": "DEUTSCHLAND", "lat": 49.0130301, "lon": 8.3905711},
        {"city": "Kaiserslautern", "postalCode": "67655", "region": "RHEINLAND_PFALZ", "country": "DEUTSCHLAND", "lat": 49.4414369, "lon": 7.7661552},
        {"city": "Bad Vilbel", "postalCode": "61118", "region": "HESSEN", "country": "DEUTSCHLAND", "lat": 50.1906016, "lon": 8.7449458},
        {"city": "Augsburg, Bayern", "postalCode": "86150", "region": "BAYERN", "country": "DEUTSCHLAND", "lat": 48.3648495, "lon": 10.8925892},
        {"city": "Berlin", "postalCode": "10178", "region": "BERLIN", "country": "DEUTSCHLAND", "lat": 52.5210906, "lon": 13.4099785},
        {"city": "Paderborn", "postalCode": "33098", "region": "NORDRHEIN_WESTFALEN", "country": "DEUTSCHLAND", "lat": 51.7087892, "lon": 8.7514619},
        {"city": "Köln", "postalCode": "50676", "region": "NORDRHEIN_WESTFALEN", "country": "DEUTSCHLAND", "lat": 50.9307423, "lon": 6.952315},
        {"city": "Ingolstadt, Donau", "postalCode": "85055", "region": "BAYERN", "country": "DEUTSCHLAND", "lat": 48.7862559, "lon": 11.4454862},
        {"city": "München", "postalCode": "80331", "region": "BAYERN", "country": "DEUTSCHLAND", "lat": 48.1355827, "lon": 11.5739758},
        {"city": "Bremen", "postalCode": "28195", "region": "BREMEN", "country": "DEUTSCHLAND", "lat": 53.079582, "lon": 8.805886}
      ],
      "contactReference": "307198",
      "applicationUrl": "https://www.get-in-it.de/jobsuche/p307198?utm_source=arbeitsagentur&utm_medium=organic&utm_campaign=launch-basic",
      "postingPartner": {"name": "get in GmbH", "url": "https://www.get-in-IT.de"},
      "referenceNumber": "13644-307198-S",
      "flags": {"careerChangeFriendly": false, "disabilityFriendly": false, "privatePlacementService": false, "temporaryAgency": false, "managedListing": false}
    }
  }
}
Field Reference
Record type: job
- type (string, required): Record type. Use job.
- id (string, required): Stable source identifier for the job record.
- url (string, required): Canonical URL used to reference the job in downstream systems.
- source (string, required): Source system label.
- sourceContext.seed.type (string, required): Input seed type used for the run.
- sourceContext.seed.value (string, required): Public Arbeitsagentur search page used to discover the record.
- sourceContext.request.searchUrl (string, required): Public Arbeitsagentur search page associated with the result set.
- sourceContext.request.page (integer, required): Result page number associated with discovery.
- sourceContext.detailUrl (string, optional): Public Arbeitsagentur detail page captured during collection.
- sourceContext.scrapedAt (string, required): Record collection timestamp in ISO format.
- title (string, required): Job title.
- company (string, optional): Employer or organization name.
- description (string, optional): Full public listing description.
- category (string, optional): Primary occupational category.
- categories (array of strings, optional): Occupational categories associated with the listing.
- listingType (string, optional): Listing classification from the source.
- contractType (string, optional): Contract duration or contract-type signal.
- workModes (array of strings, optional): Work-mode indicators such as remote or schedule-related tags.
- miniJob (boolean, optional): Whether the listing is marked as a mini-job.
- careerChangeFriendly (boolean, optional): Whether the listing is marked as suitable for career changers.
- disabilityFriendly (boolean, optional): Whether the listing is marked as suitable for disabled applicants.
- privatePlacementService (boolean, optional): Whether the listing is associated with a private placement service.
- temporaryAgency (boolean, optional): Whether the listing is associated with temporary agency work.
- managedListing (boolean, optional): Whether the listing is marked as managed by a partner source.
- dates.publishedAt (string, optional): Public publication date.
- dates.updatedAt (string, optional): Last update timestamp from the source.
- dates.startsAt (string, optional): Advertised start date.
- compensation.payType (string, optional): Compensation availability or pay-type label.
- locations (array of objects, optional): Structured job locations.
- locations[].city (string, optional): City or locality.
- locations[].postalCode (string, optional): Postal code.
- locations[].region (string, optional): Region or state code/name from the source.
- locations[].country (string, optional): Country label.
- locations[].lat (number, optional): Latitude coordinate.
- locations[].lon (number, optional): Longitude coordinate.
- distanceKm (number, optional): Distance from the searched location in kilometers.
- remoteWork.available (boolean, optional): Whether remote work is available.
- remoteWork.arrangementType (string, optional): Remote-work arrangement label.
- contactReference (string, optional): Source contact reference.
- applicationUrl (string, optional): External or direct application URL.
- postingPartner.name (string, optional): Partner or posting platform name.
- postingPartner.url (string, optional): Partner or posting platform URL.
- referenceNumber (string, optional): Source reference number for the listing.
- sourceData.search.title (string, optional): Title as seen in the search-layer data.
- sourceData.search.company (string, optional): Company as seen in the search-layer data.
- sourceData.search.listingType (string, optional): Listing type from the search-layer data.
- sourceData.search.category (string, optional): Primary category from the search-layer data.
- sourceData.search.categories (array of strings, optional): Categories from the search-layer data.
- sourceData.search.dates.publishedAt (string, optional): Search-layer publication date.
- sourceData.search.dates.updatedAt (string, optional): Search-layer update timestamp.
- sourceData.search.dates.startsAt (string, optional): Search-layer start date.
- sourceData.search.compensation.payType (string, optional): Search-layer compensation label.
- sourceData.search.contract.type (string, optional): Search-layer contract type.
- sourceData.search.contract.miniJob (boolean, optional): Search-layer mini-job flag.
- sourceData.search.workModes (array of strings, optional): Search-layer work-mode labels.
- sourceData.search.workSchedule.fullTime (boolean, optional): Search-layer full-time flag.
- sourceData.search.workSchedule.partTimeMorning (boolean, optional): Search-layer morning part-time flag.
- sourceData.search.workSchedule.partTimeAfternoon (boolean, optional): Search-layer afternoon part-time flag.
- sourceData.search.workSchedule.partTimeEvening (boolean, optional): Search-layer evening part-time flag.
- sourceData.search.workSchedule.partTimeFlexible (boolean, optional): Search-layer flexible part-time flag.
- sourceData.search.workSchedule.shiftNightWeekend (boolean, optional): Search-layer shift, night, or weekend flag.
- sourceData.search.remoteWork.available (boolean, optional): Search-layer remote-work availability.
- sourceData.search.remoteWork.arrangementType (string, optional): Search-layer remote-work arrangement label.
- sourceData.search.locations (array of objects, optional): Locations from the search-layer data.
- sourceData.search.locations[].city (string, optional): Search-layer city.
- sourceData.search.locations[].postalCode (string, optional): Search-layer postal code.
- sourceData.search.locations[].region (string, optional): Search-layer region.
- sourceData.search.locations[].country (string, optional): Search-layer country.
- sourceData.search.locations[].lat (number, optional): Search-layer latitude.
- sourceData.search.locations[].lon (number, optional): Search-layer longitude.
- sourceData.search.distanceKm (number, optional): Search-layer distance in kilometers.
- sourceData.search.contactReference (string, optional): Search-layer contact reference.
- sourceData.search.applicationUrl (string, optional): Search-layer application URL.
- sourceData.search.referenceNumber (string, optional): Search-layer reference number.
- sourceData.search.flags.careerChangeFriendly (boolean, optional): Search-layer career-change flag.
- sourceData.detail.title (string, optional): Title from the detail-layer data.
- sourceData.detail.description (string, optional): Description from the detail-layer data.
- sourceData.detail.company (string, optional): Company from the detail-layer data.
- sourceData.detail.listingType (string, optional): Listing type from the detail-layer data.
- sourceData.detail.category (string, optional): Primary category from the detail-layer data.
- sourceData.detail.categories (array of strings, optional): Categories from the detail-layer data.
- sourceData.detail.dates.publishedAt (string, optional): Detail-layer publication date.
- sourceData.detail.dates.updatedAt (string, optional): Detail-layer update timestamp.
- sourceData.detail.dates.startsAt (string, optional): Detail-layer start date.
- sourceData.detail.compensation.payType (string, optional): Detail-layer compensation label.
- sourceData.detail.contract.type (string, optional): Detail-layer contract type.
- sourceData.detail.contract.miniJob (boolean, optional): Detail-layer mini-job flag.
- sourceData.detail.workModes (array of strings, optional): Detail-layer work-mode labels.
- sourceData.detail.workSchedule.fullTime (boolean, optional): Detail-layer full-time flag.
- sourceData.detail.workSchedule.partTimeMorning (boolean, optional): Detail-layer morning part-time flag.
- sourceData.detail.workSchedule.partTimeAfternoon (boolean, optional): Detail-layer afternoon part-time flag.
- sourceData.detail.workSchedule.partTimeEvening (boolean, optional): Detail-layer evening part-time flag.
- sourceData.detail.workSchedule.partTimeFlexible (boolean, optional): Detail-layer flexible part-time flag.
- sourceData.detail.workSchedule.shiftNightWeekend (boolean, optional): Detail-layer shift, night, or weekend flag.
- sourceData.detail.remoteWork.available (boolean, optional): Detail-layer remote-work availability.
- sourceData.detail.remoteWork.arrangementType (string, optional): Detail-layer remote-work arrangement label.
- sourceData.detail.locations (array of objects, optional): Locations from the detail-layer data.
- sourceData.detail.locations[].city (string, optional): Detail-layer city.
- sourceData.detail.locations[].postalCode (string, optional): Detail-layer postal code.
- sourceData.detail.locations[].region (string, optional): Detail-layer region.
- sourceData.detail.locations[].country (string, optional): Detail-layer country.
- sourceData.detail.locations[].lat (number, optional): Detail-layer latitude.
- sourceData.detail.locations[].lon (number, optional): Detail-layer longitude.
- sourceData.detail.contactReference (string, optional): Detail-layer contact reference.
- sourceData.detail.applicationUrl (string, optional): Detail-layer application URL.
- sourceData.detail.postingPartner.name (string, optional): Detail-layer partner name.
- sourceData.detail.postingPartner.url (string, optional): Detail-layer partner URL.
- sourceData.detail.referenceNumber (string, optional): Detail-layer reference number.
- sourceData.detail.flags.careerChangeFriendly (boolean, optional): Detail-layer career-change flag.
- sourceData.detail.flags.disabilityFriendly (boolean, optional): Detail-layer disability-accessibility flag.
- sourceData.detail.flags.privatePlacementService (boolean, optional): Detail-layer private-placement flag.
- sourceData.detail.flags.temporaryAgency (boolean, optional): Detail-layer temporary-agency flag.
- sourceData.detail.flags.managedListing (boolean, optional): Detail-layer managed-listing flag.
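Since most fields outside the envelope are optional and locations is an array, downstream flattening should null-check as it goes. A hypothetical helper illustrating one row per location (the field names follow the reference above; the helper itself is not part of the actor):

```python
# Hypothetical flattener: one output row per location, null-checking
# optional fields. Field names follow the reference above.
def flatten(record: dict):
    base = {
        "key": f"{record['type']}:{record['id']}",
        "title": record["title"],
        "company": record.get("company"),
        "published_at": (record.get("dates") or {}).get("publishedAt"),
        "remote": (record.get("remoteWork") or {}).get("available"),
    }
    for loc in record.get("locations") or [{}]:
        yield {
            **base,
            "city": loc.get("city"),
            "postal_code": loc.get("postalCode"),
            "region": loc.get("region"),
        }
```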
Data Quality, Guarantees, And Handling
- Structured records: results are normalized into predictable JSON objects for downstream use.
- Best-effort extraction: fields may vary by region, session, availability, or target-site presentation changes.
- Optional fields: null-check in downstream code.
- Deduplication: use the recommended key type + ":" + id.
- Freshness: results reflect the publicly available data at run time.
- Repeated runs: use the recommended idempotency key when syncing data into warehouses, CRMs, or search indexes.
Tips For Best Results
- Start with a small limit to validate the output shape before scaling up.
- Use one geography or search segment per run when you need cleaner downstream segmentation.
- Leave optional filters empty when your goal is broad discovery rather than narrow targeting.
- Add filters gradually so you can see how location, publication_date, work_schedule, and other fields affect coverage.
- Use startUrls when you already have reliable search-result URLs you want to reuse consistently.
- Schedule recurring runs for monitoring workflows instead of relying on manual one-off collection.
- Use type + ":" + id as your stable deduplication key across repeated runs.
How to Run on Apify
- Open the actor in Apify Console.
- Configure the available input fields for the target scope.
- Set the maximum number of outputs to collect with limit.
- Click Start and wait for the run to finish.
- Review the dataset and download results in JSON, CSV, Excel, or other supported formats.
Scheduling & Automation
Scheduling
Automated Data Collection
You can schedule this actor to keep job datasets current without running it manually each time. Scheduled runs are useful for recurring monitoring, reporting, and enrichment workflows.
- Navigate to Schedules in Apify Console
- Create a new schedule (daily, weekly, or custom cron)
- Configure input parameters
- Enable notifications for run completion
- Add webhooks for automated processing (see the receiver sketch below)
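For the webhook step, Apify sends a JSON POST when the run finishes. A minimal receiver sketch using only the Python standard library follows; the eventType and resource.defaultDatasetId fields reflect Apify's default webhook payload template, so verify them against your own webhook configuration:

```python
# Sketch: minimal webhook receiver for run-completion events, standard
# library only. Field names follow Apify's default webhook payload
# template; verify against your own webhook configuration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class ApifyWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        payload = json.loads(body or b"{}")
        resource = payload.get("resource", {})
        dataset_id = resource.get("defaultDatasetId")
        print("run finished:", payload.get("eventType"), "dataset:", dataset_id)
        # ...trigger ingestion of dataset_id here...
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ApifyWebhookHandler).serve_forever()
```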
Integration Options
- CRM enrichment: sync employer, location, and listing attributes into account, lead, or recruiting records.
- Google Sheets or Airtable: review small to medium job datasets in lightweight operational workflows.
- Webhooks: trigger ingestion, alerting, validation, or follow-on processing after each completed run.
- Data warehouses and ETL pipelines: load normalized job records into historical tables for reporting and analysis.
- BI dashboards: monitor hiring volume, regional activity, remote-work patterns, and publication trends over time.
- API-driven applications: consume dataset output programmatically in internal tools, search services, or data products.
Export Formats And Downstream Use
Apify datasets can be exported directly or consumed programmatically, which makes the actor suitable for both manual review and automated delivery workflows.
- JSON: for APIs, applications, and data pipelines
- CSV or Excel: for spreadsheet workflows and manual review
- API access: for automated ingestion into internal systems
- BI and warehouses: for reporting, dashboards, and historical analysis
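As one example of API access, the Apify dataset items endpoint accepts a format parameter (json, csv, xlsx, and others). A sketch that downloads a finished run's dataset as CSV; the dataset ID and token are placeholders:

```python
# Sketch: download run results as CSV straight from the Apify API.
# The /v2/datasets/{id}/items endpoint supports format=json|csv|xlsx
# among others; the dataset ID and token below are placeholders.
import requests

DATASET_ID = "<DATASET_ID>"
resp = requests.get(
    f"https://api.apify.com/v2/datasets/{DATASET_ID}/items",
    params={"format": "csv"},
    headers={"Authorization": "Bearer <YOUR_APIFY_TOKEN>"},
    timeout=60,
)
resp.raise_for_status()
with open("arbeitsagentur_jobs.csv", "wb") as f:
    f.write(resp.content)
```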
Performance
Estimated run times:
- Small runs (< 1,000 outputs): ~3–5 minutes
- Medium runs (1,000–5,000 outputs): ~5–15 minutes
- Large runs (5,000+ outputs): ~15–30 minutes
Execution time varies based on filters, result volume, and how much information is returned per record. Highly filtered runs can finish faster, while broad discovery or detail-rich records may take longer.
Limitations
- Availability depends on what https://www.arbeitsagentur.de publicly exposes at run time.
- Some optional fields may be missing on sparse or minimally populated listings.
- Very broad searches may take longer or require a higher limit to capture the desired volume.
- Changes on the target site can affect field availability, labels, or record shape over time.
- Regional differences and source-specific listing variations can change which records and attributes are visible.
Troubleshooting
- No results returned: check your filters, location spelling, direct URLs, and whether the target site currently has matching public listings.
- Fewer results than expected: broaden the filters, increase limit, or verify that enough matching listings exist for the selected scope.
- Some fields are empty: optional fields depend on what each listing publicly provides.
- Run takes longer than expected: reduce the scope, lower limit for validation, or split broad collection into smaller segments.
- Output changed: compare the current output with the field reference and include a small sample if support is needed.
FAQ
What data does this actor collect?
It collects structured public job-listing data from Bundesagentur für Arbeit, including identifiers, titles, employers, descriptions, categories, dates, work-arrangement signals, locations, and related application references when available.
Can I filter by location, date, work schedule, or contract type?
Yes. The input schema supports filters such as location, radius, publication_date, work_schedule, fixed_term, remote-work flags, and several inclusion or exclusion options.
Can I use direct search URLs instead of structured filters?
Yes. Use startUrls when you already have Bundesagentur search-result URLs and want to collect from that exact public search scope.
Why did I receive fewer results than my limit?
limit is a maximum, not a guarantee. If the selected search scope contains fewer matching public listings, the dataset will contain fewer records.
Can I schedule recurring runs?
Yes. Apify schedules can run the actor daily, weekly, or on a custom cron to support monitoring and automated refresh workflows.
How do I avoid duplicates across runs?
Use the recommended idempotency key type + ":" + id in your downstream system for deduplication and upserts.
Can I export the data to CSV, Excel, or JSON?
Yes. Apify datasets support export and download in JSON, CSV, Excel, and other supported formats.
Does this actor collect private data?
It is intended to collect publicly available job-listing information exposed on the target site. Users remain responsible for reviewing and handling any personal or sensitive information appropriately.
What should I include when reporting an issue?
Include the input used in redacted form, the Apify run ID, expected versus actual behavior, and optionally a small output sample that shows the issue.
Compliance & Ethics
Responsible Data Collection
This actor collects publicly available job listing information from https://www.arbeitsagentur.de for legitimate business purposes, including:
- labor market research and market analysis
- recruiting operations and enrichment
- reporting and monitoring workflows
Users are responsible for ensuring their use of the data complies with applicable laws, regulations, and contractual obligations. This section is informational and not legal advice.
Best Practices
- Use collected data in accordance with applicable laws, regulations, and the target site's terms
- Respect individual privacy and personal information
- Use data responsibly and avoid disruptive or excessive collection
- Do not use this actor for spamming, harassment, or other harmful purposes
- Follow relevant data protection requirements where applicable (e.g., GDPR, CCPA)
Support
For help, use the actor page or repository Issues. When reporting a problem, include the input used in redacted form, the run ID, expected versus actual behavior, and an optional small output sample so the issue can be reproduced and triaged efficiently.