Pracuj Pl Jobs Search Scraper
Pricing
from $2.50 / 1,000 results
Efficiently scrape job listings from Pracuj.pl, Poland's leading employment platform. Extract comprehensive data including job titles, salaries, company profiles, remote work options, and AI summaries. Perfect for recruitment agencies, market researchers, and HR analytics in the Polish job market.
Developer

Austin Powers
3 days ago
Last modified
Pracuj.pl Job Scraper
Scrapes job listings from pracuj.pl — Poland's largest job board. Extracts structured data including job title, company, salary, location, work modes, and more.
What does Pracuj.pl Job Scraper do?
This actor takes one or more pracuj.pl search URLs and returns structured job offer data. It handles pagination automatically, so you get results beyond the first page, up to the `max_items_per_url` limit (default 100, 0 = unlimited).
It works by opening each URL in a headless browser, extracting embedded JSON data from the page (fast path), and falling back to DOM parsing if needed.
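The fast-path/fallback idea can be sketched with the standard library alone. This is an illustration under assumptions, not the actor's actual code: `extract_embedded_json`, `extract_offers`, the `parse_dom` callback, and the `"offers"` key are all hypothetical, and the real payload layout on pracuj.pl is not documented here.

```python
import json
from html.parser import HTMLParser


class _ScriptCollector(HTMLParser):
    """Collects the text content of every <script> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def extract_embedded_json(html):
    """Return the first <script> body that parses as a JSON object, else None."""
    collector = _ScriptCollector()
    collector.feed(html)
    collector.close()
    for body in collector.scripts:
        try:
            parsed = json.loads(body)
        except ValueError:
            continue  # plain JavaScript or an empty tag, not JSON
        if isinstance(parsed, dict):
            return parsed
    return None


def extract_offers(html, parse_dom):
    """Try the JSON fast path first; otherwise call the hypothetical DOM parser."""
    data = extract_embedded_json(html)
    if data is not None:
        # "offers" is an assumed key used only for this illustration.
        return data.get("offers", [])
    return parse_dom(html)
```

Keeping the DOM parser behind a callback means the fallback is only exercised when no JSON payload is found, which mirrors the order described above.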
How much will it cost?
Pricing: $7 per 1,000 results.
The actor uses Playwright with Firefox, which requires ~1 GB of memory. A typical run:
- ~5 seconds per page (each page contains ~20 offers)
- Scraping 1,000 offers takes roughly 50 pages = ~4 minutes of compute
Proxy costs depend on your configuration — the actor works without proxies locally, but on the Apify platform you may want to use datacenter or residential proxies for reliability.
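For a rough budget, the figures above can be turned into a small estimator. This is a back-of-the-envelope sketch: the constants are copied from this section, `estimate_run` is a hypothetical helper, and proxy costs and platform compute pricing are deliberately left out.

```python
import math

OFFERS_PER_PAGE = 20       # "~20 offers" per page, from this section
SECONDS_PER_PAGE = 5       # "~5 seconds per page", from this section
PRICE_PER_1000_USD = 7.0   # "$7 per 1,000 results", from this section


def estimate_run(offers, price_per_1000=PRICE_PER_1000_USD):
    """Estimate pages, compute time, and result cost for a number of offers."""
    pages = math.ceil(offers / OFFERS_PER_PAGE)
    seconds = pages * SECONDS_PER_PAGE
    return {
        "pages": pages,
        "compute_minutes": round(seconds / 60, 1),
        "result_cost_usd": round(offers / 1000 * price_per_1000, 2),
    }


print(estimate_run(1000))
```

For 1,000 offers this gives 50 pages and a little over 4 minutes of compute, matching the estimate above.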
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `urls` | string[] | Yes | none | Pracuj.pl search URLs to scrape |
| `max_items_per_url` | integer | No | 100 | Max offers per URL (0 = unlimited) |
| `proxy` | object | No | none | Apify proxy configuration |
How to build search URLs
Pracuj.pl search URLs follow this pattern:
https://www.pracuj.pl/praca/{keyword};kw/{location};wp?{params}
Examples:
| What you want | URL |
|---|---|
| All jobs from last 24h | https://www.pracuj.pl/praca?rd=0 |
| Python jobs | https://www.pracuj.pl/praca/python;kw |
| Python jobs, last 24h | https://www.pracuj.pl/praca/python;kw?rd=0 |
| DevOps in Warszawa | https://www.pracuj.pl/praca/devops;kw/warszawa;wp |
| Java in Krakow, last 24h | https://www.pracuj.pl/praca/java;kw/krakow;wp?rd=0 |
| By category (IT) | https://www.pracuj.pl/praca?cc=5015001 |
URL parameters:
- `rd=0`: last 24 hours only
- `cc=XXXXX`: category code
- `pn=N`: page number (handled automatically by the actor)
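The pattern and parameters above can be assembled programmatically. The helper below is a sketch, and `build_search_url` is a hypothetical name. It reproduces the example URLs from the table, but two things are assumptions: that multi-word keywords are percent-encoded, and that `rd` and `cc` can be combined in one URL.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.pracuj.pl/praca"


def build_search_url(keyword=None, location=None, last_24h=False, category=None):
    """Build a pracuj.pl search URL from optional filters."""
    url = BASE_URL
    if keyword:
        # Percent-encoding of unusual characters is an assumption.
        url += "/" + quote(keyword, safe="") + ";kw"
    if location:
        url += "/" + quote(location, safe="") + ";wp"
    params = {}
    if last_24h:
        params["rd"] = 0
    if category is not None:
        params["cc"] = category
    if params:
        url += "?" + urlencode(params)
    return url


print(build_search_url("java", "krakow", last_24h=True))
```

The resulting strings can be passed directly in the `urls` input field.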
Example input
{
  "urls": [
    "https://www.pracuj.pl/praca/python;kw?rd=0",
    "https://www.pracuj.pl/praca/devops;kw/warszawa;wp"
  ],
  "max_items_per_url": 50,
  "proxy": {"useApifyProxy": true}
}
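Before starting a run, it can help to check an input object against the rules in the input table. The sketch below is an assumption-laden illustration: `normalize_input` is a hypothetical helper, and the substring check for `pracuj.pl/praca` is a guess at what counts as a search URL, not the actor's real validation.

```python
def normalize_input(raw):
    """Validate an input dict and fill in the documented default."""
    urls = raw.get("urls")
    if not isinstance(urls, list) or not urls:
        raise ValueError("'urls' must be a non-empty list of strings")
    for url in urls:
        # Assumed heuristic: search URLs contain this path prefix.
        if not isinstance(url, str) or "pracuj.pl/praca" not in url:
            raise ValueError(f"not a pracuj.pl search URL: {url!r}")

    max_items = raw.get("max_items_per_url", 100)  # documented default
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("'max_items_per_url' must be a non-negative integer")

    normalized = {"urls": urls, "max_items_per_url": max_items}
    if "proxy" in raw:
        normalized["proxy"] = raw["proxy"]  # passed through unchanged
    return normalized
```

A value of 0 for `max_items_per_url` is accepted because the input table defines it as "unlimited".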
Output
Each item in the dataset is a JSON object with these fields:
| Field | Type | Description |
|---|---|---|
| `group_id` | string | Unique group ID (same job in multiple locations shares this) |
| `job_title` | string | Position title |
| `company_name` | string | Employer name |
| `company_id` | integer | Employer ID on pracuj.pl |
| `company_url` | string | Link to employer profile |
| `company_logo_url` | string | Company logo image URL |
| `offer_url` | string | Direct link to the job listing |
| `offer_id` | integer | Unique offer ID |
| `location` | string | Primary work location |
| `is_whole_poland` | boolean | Whether the job is available nationwide |
| `all_locations` | array | All locations with individual offer URLs |
| `salary` | string | Salary range with currency (if disclosed) |
| `job_description` | string | Short description / responsibilities |
| `employment_type` | string[] | Contract types (e.g., "Umowa o pracę", "Kontrakt B2B") |
| `work_modes` | string[] | Work modes (e.g., "Praca zdalna", "Praca hybrydowa") |
| `experience_level` | string[] | Required experience level |
| `work_schedule` | string[] | Schedule type (e.g., "Pełny etat") |
| `posted_at` | string | Publication date (ISO 8601) |
| `expiration_date` | string | Offer expiration date (ISO 8601) |
| `is_remote` | boolean | Whether remote work is allowed |
| `is_super_offer` | boolean | Promoted/featured offer flag |
| `is_one_click_apply` | boolean | Quick-apply available |
| `ai_summary` | string | AI-generated offer summary (HTML) |
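When several search URLs overlap, the same offer may plausibly show up more than once in a dataset. A minimal way to collapse such repeats, assuming `group_id` is stable across searches (the field table suggests this, but it is not guaranteed here), is sketched below. `dedupe_by_group` is a hypothetical helper.

```python
def dedupe_by_group(items):
    """Keep the first item seen for each group_id (or offer_id as a fallback)."""
    seen = set()
    unique = []
    for item in items:
        # Fall back to offer_id when group_id is missing or empty.
        key = item.get("group_id") or item.get("offer_id")
        if key in seen:
            continue
        seen.add(key)
        unique.append(item)
    return unique
```

Order is preserved, so the first occurrence of each offer wins.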
Example output
{
  "group_id": "a8080000-56be-0050-2e22-08de6e285a7a",
  "job_title": "Backend Python Lead",
  "company_name": "ITEAMLY SP. Z O.O.",
  "company_id": 1074164809,
  "company_url": "https://pracodawcy.pracuj.pl/company/1074164809",
  "company_logo_url": "https://logos.gpcdn.pl/loga-firm/1074164809/..._280x280.png",
  "offer_url": "https://www.pracuj.pl/praca/backend-python-lead-warszawa,oferta,1004652624",
  "offer_id": 1004652624,
  "location": "Warszawa",
  "is_whole_poland": true,
  "all_locations": [
    {
      "location": "Warszawa",
      "url": "https://www.pracuj.pl/praca/backend-python-lead-warszawa,oferta,1004652624",
      "offer_id": 1004652624
    }
  ],
  "salary": "24 000–32 000 zł netto (+ VAT) / mies.",
  "job_description": "Design and develop RESTful APIs using Python...",
  "employment_type": ["Kontrakt B2B"],
  "work_modes": ["Praca zdalna"],
  "experience_level": ["Kierownik / Kierowniczka"],
  "work_schedule": ["Pełny etat"],
  "posted_at": "2026-02-28T13:52:00Z",
  "expiration_date": "2026-03-19T22:59:59Z",
  "is_remote": true,
  "is_super_offer": true,
  "is_one_click_apply": true,
  "ai_summary": "<ul><li>Masz 8+ lat doświadczenia w rozwoju REST API...</li></ul>"
}
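Because `salary` is a display string and the dates are ISO 8601 strings, downstream analysis usually needs a parsing step. The helpers below are a sketch based only on the single example above. `parse_salary_range`, `parse_timestamp`, and `listing_lifetime_days` are hypothetical names, and other salary formats on the site may need different handling.

```python
import re
from datetime import datetime

# A number with optional whitespace as thousands separator and optional decimals.
_AMOUNT = re.compile(r"\d[\d\s]*(?:[.,]\d+)?")


def parse_salary_range(salary):
    """Return (low, high) as floats, or None when no amount is present."""
    if not salary:
        return None
    amounts = [
        float(re.sub(r"\s", "", match).replace(",", "."))
        for match in _AMOUNT.findall(salary)
    ]
    if not amounts:
        return None
    return (amounts[0], amounts[1] if len(amounts) > 1 else amounts[0])


def parse_timestamp(value):
    """Parse an ISO 8601 string; the 'Z' suffix is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def listing_lifetime_days(item):
    """Whole days between publication and expiration of an offer."""
    delta = parse_timestamp(item["expiration_date"]) - parse_timestamp(item["posted_at"])
    return delta.days
```

The currency and the net/gross qualifier are ignored here; keep the original string alongside the parsed numbers if those details matter.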
Tips
- Get only fresh offers: Add `rd=0` to any URL to filter for the last 24 hours (`?rd=0`, or `&rd=0` if the URL already has a query string).
- Multiple searches: Pass multiple URLs in the `urls` array; they run concurrently.
- Limit results: Set `max_items_per_url` to control costs. Each page has ~20 offers.
- Proxies: The actor works without proxies for small runs. For large-scale scraping, enable Apify proxy to avoid rate limiting.
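Appending the freshness filter by hand is easy to get wrong when a URL already carries query parameters. A small sketch using `urllib.parse` handles both cases. `with_fresh_filter` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_fresh_filter(url):
    """Return the URL with rd=0 set, preserving any other query parameters."""
    parts = urlsplit(url)
    # Drop any existing rd value so the result has exactly one.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "rd"
    ]
    params.append(("rd", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_fresh_filter("https://www.pracuj.pl/praca?cc=5015001"))
```

`urlsplit` is used rather than `urlparse` so that the `;kw` and `;wp` path suffixes stay part of the path untouched.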
How it works (technical)
- Opens each URL in a headless Firefox browser via Playwright
- Waits for the results section to render
- Extracts job data from embedded JSON in `<script>` tags (primary path, fast and reliable)
- Falls back to DOM parsing with CSS selectors if no JSON is found
- Automatically follows pagination until `max_items_per_url` is reached or no more pages exist
- Pushes each offer to the Apify dataset
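The pagination step in the list above can be sketched as a plain loop, independent of the browser. This is an illustration under assumptions rather than the actor's code: `page_url` and `scrape_all_pages` are hypothetical names, `fetch_offers` stands in for the browser-based extraction, pages are assumed to be numbered from 1 via `pn`, and an empty page is taken to mean the results are exhausted.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_url(url, page_number):
    """Return the search URL with the pn parameter set to page_number."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "pn"
    ]
    params.append(("pn", str(page_number)))
    return urlunsplit(parts._replace(query=urlencode(params)))


def scrape_all_pages(url, fetch_offers, max_items=100):
    """Collect offers page by page; max_items=0 means no limit."""
    collected = []
    page_number = 1  # assumed first page index
    while True:
        offers = fetch_offers(page_url(url, page_number))
        if not offers:
            break  # assumed end-of-results signal
        collected.extend(offers)
        if max_items and len(collected) >= max_items:
            return collected[:max_items]
        page_number += 1
    return collected
```

Passing the fetcher as a callable keeps the loop testable with a stub, and the same function works whether the offers came from the JSON path or the DOM fallback.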
Why Playwright?
Pracuj.pl is a React SPA behind Cloudflare managed challenge protection. Simple HTTP requests receive a 403 with a JavaScript challenge. A real browser is required to render the page and extract data.