Pracuj Pl Jobs Search Scraper

Pricing

from $2.50 / 1,000 results

Efficiently scrape job listings from Pracuj.pl, Poland's leading employment platform. Extract comprehensive data including job titles, salaries, company profiles, remote work options, and AI summaries. Perfect for recruitment agencies, market researchers, and HR analytics in the Polish job market.

Developer

Austin Powers

Maintained by Community

Pracuj.pl Job Scraper

Scrapes job listings from pracuj.pl — Poland's largest job board. Extracts structured data including job title, company, salary, location, work modes, and more.

What does Pracuj.pl Job Scraper do?

This actor takes one or more pracuj.pl search URLs and returns structured job offer data. It follows pagination automatically, so results are not limited to the first page; the number of offers collected per URL is capped by max_items_per_url.

It works by opening each URL in a headless browser, extracting embedded JSON data from the page (fast path), and falling back to DOM parsing if needed.
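The "embedded JSON first, DOM second" idea can be sketched with the Python standard library alone. This is a minimal illustration, not the actor's actual code: the `application/json` script type, the `offers` key, and the `dom_fallback` callable are all assumptions made for the example.

```python
import json
from html.parser import HTMLParser


class ScriptJsonParser(HTMLParser):
    """Collects the parsed contents of <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._chunks = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._inside = True
            self._chunks = []

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._inside:
            self._inside = False
            try:
                self.payloads.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # not valid JSON, ignore this tag


def extract_offers(html, dom_fallback, key="offers"):
    """Fast path: embedded JSON. Slow path: the caller's DOM parser."""
    parser = ScriptJsonParser()
    parser.feed(html)
    parser.close()
    for payload in parser.payloads:
        # "offers" is a hypothetical key; the real payload layout may differ.
        if isinstance(payload, dict) and isinstance(payload.get(key), list):
            return payload[key]
    return dom_fallback(html)
```

In a real run the `html` string would come from the rendered browser page, and `dom_fallback` would hold the CSS-selector logic.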

How much will it cost?

Pricing: $7 per 1,000 results.

The actor uses Playwright with Firefox, which requires ~1 GB of memory. A typical run:

  • ~5 seconds per page (each page contains ~20 offers)
  • Scraping 1,000 offers means roughly 50 pages, or about 4 minutes of compute (50 × 5 s ≈ 250 s)
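
As a rough sanity check on the figures above, the arithmetic can be written out in Python. The constants are taken from this section; the function name `estimate_run` is only illustrative, and the price is a parameter because the store listing also quotes a "from $2.50" rate.

```python
import math

OFFERS_PER_PAGE = 20   # approximate, from the text above
SECONDS_PER_PAGE = 5   # approximate, from the text above


def estimate_run(offers, price_per_1000=7.00):
    """Return (pages, minutes of compute, result cost in USD) for a run."""
    pages = math.ceil(offers / OFFERS_PER_PAGE)
    minutes = pages * SECONDS_PER_PAGE / 60
    cost = offers / 1000 * price_per_1000
    return pages, minutes, cost


pages, minutes, cost = estimate_run(1000)
print(f"{pages} pages, ~{minutes:.1f} min, ${cost:.2f}")
```

The estimate ignores proxy traffic and browser start-up time, so treat it as a lower bound.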

Proxy costs depend on your configuration — the actor works without proxies locally, but on the Apify platform you may want to use datacenter or residential proxies for reliability.

Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| urls | string[] | Yes | | Pracuj.pl search URLs to scrape |
| max_items_per_url | integer | No | 100 | Max offers per URL (0 = unlimited) |
| proxy | object | No | | Apify proxy configuration |

How to build search URLs

Pracuj.pl search URLs follow this pattern:

https://www.pracuj.pl/praca/{keyword};kw/{location};wp?{params}

Examples:

| What you want | URL |
| --- | --- |
| All jobs from last 24h | https://www.pracuj.pl/praca?rd=0 |
| Python jobs | https://www.pracuj.pl/praca/python;kw |
| Python jobs, last 24h | https://www.pracuj.pl/praca/python;kw?rd=0 |
| DevOps in Warszawa | https://www.pracuj.pl/praca/devops;kw/warszawa;wp |
| Java in Krakow, last 24h | https://www.pracuj.pl/praca/java;kw/krakow;wp?rd=0 |
| By category (IT) | https://www.pracuj.pl/praca?cc=5015001 |

URL parameters:

  • rd=0 — last 24 hours only
  • cc=XXXXX — category code
  • pn=N — page number (handled automatically by the actor)
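
The pattern and parameters above can be assembled programmatically. The helper below is a small sketch; `build_search_url` and its argument names are made up for illustration, and percent-encoding of unusual keywords is an assumption about what the site accepts.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.pracuj.pl/praca"


def build_search_url(keyword=None, location=None, last_24h=False, category=None):
    """Build a pracuj.pl search URL from optional filters."""
    url = BASE_URL
    if keyword:
        url += "/" + quote(keyword) + ";kw"    # keyword segment
    if location:
        url += "/" + quote(location) + ";wp"   # location segment
    params = {}
    if last_24h:
        params["rd"] = 0                       # last 24 hours only
    if category is not None:
        params["cc"] = category                # category code
    if params:
        url += "?" + urlencode(params)
    return url


print(build_search_url("java", "krakow", last_24h=True))
```

The page parameter pn is left out on purpose, since the actor adds it itself.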

Example input

{
    "urls": [
        "https://www.pracuj.pl/praca/python;kw?rd=0",
        "https://www.pracuj.pl/praca/devops;kw/warszawa;wp"
    ],
    "max_items_per_url": 50,
    "proxy": {
        "useApifyProxy": true
    }
}
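
Before starting a run, it can help to check an input object against the Input table. The snippet below is a minimal sketch, not the actor's own validation: `validate_input` is a hypothetical helper, and the URL prefix check is an assumption based on the examples in this README.

```python
def validate_input(actor_input):
    """Check an input dict against the Input table and fill in defaults."""
    urls = actor_input.get("urls")
    if not isinstance(urls, list) or not urls:
        raise ValueError("'urls' is required and must be a non-empty list")
    for url in urls:
        # Assumption: every search URL starts with this prefix.
        if not isinstance(url, str) or not url.startswith("https://www.pracuj.pl/praca"):
            raise ValueError(f"not a pracuj.pl search URL: {url!r}")

    max_items = actor_input.get("max_items_per_url", 100)  # documented default
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("'max_items_per_url' must be an integer >= 0 (0 = unlimited)")

    proxy = actor_input.get("proxy")
    if proxy is not None and not isinstance(proxy, dict):
        raise ValueError("'proxy' must be an object")

    return {"urls": urls, "max_items_per_url": max_items, "proxy": proxy}
```

Calling it on the example input above returns the same values unchanged; calling it with only "urls" fills in max_items_per_url = 100.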

Output

Each item in the dataset is a JSON object with these fields:

| Field | Type | Description |
| --- | --- | --- |
| group_id | string | Unique group ID (same job in multiple locations shares this) |
| job_title | string | Position title |
| company_name | string | Employer name |
| company_id | integer | Employer ID on pracuj.pl |
| company_url | string | Link to employer profile |
| company_logo_url | string | Company logo image URL |
| offer_url | string | Direct link to the job listing |
| offer_id | integer | Unique offer ID |
| location | string | Primary work location |
| is_whole_poland | boolean | Whether the job is available nationwide |
| all_locations | array | All locations with individual offer URLs |
| salary | string | Salary range with currency (if disclosed) |
| job_description | string | Short description / responsibilities |
| employment_type | string[] | Contract types (e.g., "Umowa o pracę", "Kontrakt B2B") |
| work_modes | string[] | Work modes (e.g., "Praca zdalna", "Praca hybrydowa") |
| experience_level | string[] | Required experience level |
| work_schedule | string[] | Schedule type (e.g., "Pełny etat") |
| posted_at | string | Publication date (ISO 8601) |
| expiration_date | string | Offer expiration date (ISO 8601) |
| is_remote | boolean | Whether remote work is allowed |
| is_super_offer | boolean | Promoted/featured offer flag |
| is_one_click_apply | boolean | Quick-apply available |
| ai_summary | string | AI-generated offer summary (HTML) |

Example output

{
    "group_id": "a8080000-56be-0050-2e22-08de6e285a7a",
    "job_title": "Backend Python Lead",
    "company_name": "ITEAMLY SP. Z O.O.",
    "company_id": 1074164809,
    "company_url": "https://pracodawcy.pracuj.pl/company/1074164809",
    "company_logo_url": "https://logos.gpcdn.pl/loga-firm/1074164809/..._280x280.png",
    "offer_url": "https://www.pracuj.pl/praca/backend-python-lead-warszawa,oferta,1004652624",
    "offer_id": 1004652624,
    "location": "Warszawa",
    "is_whole_poland": true,
    "all_locations": [
        {
            "location": "Warszawa",
            "url": "https://www.pracuj.pl/praca/backend-python-lead-warszawa,oferta,1004652624",
            "offer_id": 1004652624
        }
    ],
    "salary": "24 000–32 000 zł netto (+ VAT) / mies.",
    "job_description": "Design and develop RESTful APIs using Python...",
    "employment_type": ["Kontrakt B2B"],
    "work_modes": ["Praca zdalna"],
    "experience_level": ["Kierownik / Kierowniczka"],
    "work_schedule": ["Pełny etat"],
    "posted_at": "2026-02-28T13:52:00Z",
    "expiration_date": "2026-03-19T22:59:59Z",
    "is_remote": true,
    "is_super_offer": true,
    "is_one_click_apply": true,
    "ai_summary": "<ul><li>Masz 8+ lat doświadczenia w rozwoju REST API...</li></ul>"
}
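
Because salary is a free-text string and the dates are ISO 8601 strings, some post-processing is usually needed before analysis. The helpers below are a hedged sketch: the names are hypothetical, and the salary parser assumes the format shown in the example (space-separated thousands, optional decimal comma), which other offers may not follow.

```python
import re
from datetime import datetime

# A number with optional whitespace thousands separators and a decimal comma.
_NUMBER = re.compile(r"\d[\d\s]*(?:,\d+)?")


def parse_salary_range(salary):
    """Return (low, high) as floats, or None if no number is found."""
    if not salary:
        return None
    numbers = [
        float(re.sub(r"\s", "", match).replace(",", "."))
        for match in _NUMBER.findall(salary)
    ]
    if not numbers:
        return None
    low = numbers[0]
    high = numbers[1] if len(numbers) > 1 else low
    return low, high


def parse_timestamp(value):
    """Parse an ISO 8601 string with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def offer_lifetime_days(offer):
    """Whole days between posted_at and expiration_date."""
    posted = parse_timestamp(offer["posted_at"])
    expires = parse_timestamp(offer["expiration_date"])
    return (expires - posted).days
```

The parser does not read the currency, the net/gross marker, or the pay period; those would need separate handling.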

Tips

  • Get only fresh offers: Append rd=0 to the query string (?rd=0, or &rd=0 if the URL already has parameters) to keep only offers from the last 24 hours.
  • Multiple searches: Pass multiple URLs in the urls array — they run concurrently.
  • Limit results: Set max_items_per_url to control costs. Each page has ~20 offers.
  • Proxies: The actor works without proxies for small runs. For large-scale scraping, enable Apify proxy to avoid rate limiting.

How it works (technical)

  1. Opens each URL in a headless Firefox browser via Playwright
  2. Waits for the results section to render
  3. Extracts job data from embedded JSON in <script> tags (primary path — fast and reliable)
  4. Falls back to DOM parsing with CSS selectors if no JSON is found
  5. Automatically follows pagination until max_items_per_url is reached or no more pages exist
  6. Pushes each offer to the Apify dataset
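
The pagination step (5) can be sketched independently of the browser. This is an illustration under stated assumptions, not the actor's source: `fetch_page` is a hypothetical callable that returns the list of offers for one page URL, an empty page is taken to mean "no more pages", and `max_pages` is an added safety limit.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_query_param(url, key, value):
    """Return url with the query parameter key set to value."""
    parts = urlsplit(url)  # urlsplit keeps ';kw' and ';wp' inside the path
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != key]
    query.append((key, str(value)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def collect_offers(search_url, fetch_page, max_items=100, max_pages=500):
    """Walk pn=1, 2, ... until the limit is hit or a page comes back empty."""
    offers = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(set_query_param(search_url, "pn", page))
        if not batch:
            break  # assumption: an empty page marks the end of the results
        offers.extend(batch)
        if max_items and len(offers) >= max_items:  # 0 means unlimited
            return offers[:max_items]
    return offers
```

With ~20 offers per page, max_items=50 would stop after the third page and trim the result to 50 items.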

Why Playwright?

Pracuj.pl is a React SPA behind Cloudflare managed challenge protection. Simple HTTP requests receive a 403 with a JavaScript challenge. A real browser is required to render the page and extract data.