Njuskalo.hr Property Scraper
Pricing
from $3.50 / 1,000 results
Scrape real estate listings from Njuskalo.hr, Croatia's #1 classifieds portal. Extract apartments, houses, land, commercial space and garages (sale or rent) by city or county, with price (EUR), surface area, rooms, parsed Croatian address (city → district → micro-location), image and listing labels.
Developer
Logiover
Scrape real estate listings from Njuskalo.hr — Croatia's #1 classifieds portal — into a clean, structured Apify dataset.
This actor extracts apartments, houses, land, commercial space, garages, rooms and vacation homes (sale or rent), filtered by location, price (EUR), area and room count. Each record includes price, surface area, parsed Croatian address (city → municipality → district → micro-location), advertiser snippet, image and listing labels.
Features
- 🇭🇷 Croatia-wide coverage: every Njuskalo SEO city slug is supported (Zagreb, Split, Rijeka, Osijek, Pula, Zadar, Šibenik, Dubrovnik, …).
- 💰 Native EUR pricing with auto-computed `pricePerSqm`.
- 🏘️ Location chain parsing: city, municipality, district and neighborhood split from the listing's location string.
- 🏷️ Sponsored / VauVau detection: premium card flags exposed via `isExclusive` and `labels`.
- 🛡️ Anti-bot bypass via Playwright: headless Chromium with fingerprint rotation, residential proxy, cookie-banner handling and session retirement on block detection.
- 🔁 De-duplication & dynamic pagination: stops automatically when pages return no new IDs.
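The de-duplication and stop rule from the last bullet can be sketched roughly as follows. This is a minimal illustration, not the actor's source: `DedupTracker` and `shouldContinue` are hypothetical names, and the caps mirror the `maxPagesPerTask` and `maxListings` inputs described later.

```typescript
// Hypothetical sketch: the class and function names are illustrative only.
interface Card {
  adId: number; // Njuskalo numeric listing ID
}

class DedupTracker {
  private readonly seen = new Set<number>();

  /** Records a page of cards and returns only those not seen before. */
  addPage(cards: Card[]): Card[] {
    const fresh: Card[] = [];
    for (const card of cards) {
      if (!this.seen.has(card.adId)) {
        this.seen.add(card.adId);
        fresh.push(card);
      }
    }
    return fresh;
  }

  /** Number of unique listings collected so far. */
  size(): number {
    return this.seen.size;
  }
}

/** Decides whether page N+1 should be enqueued after page N was parsed. */
function shouldContinue(
  freshCount: number,
  page: number,
  maxPagesPerTask: number,
  totalCollected: number,
  maxListings: number,
): boolean {
  if (freshCount === 0) return false; // the page returned no new IDs
  if (page >= maxPagesPerTask) return false; // per-location depth cap reached
  if (maxListings > 0 && totalCollected >= maxListings) return false; // 0 = unlimited
  return true;
}
```

Tracking IDs in a `Set` keeps the check constant-time per card, so a page consisting only of already-seen sponsored cards ends the crawl for that location without extra requests.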
Architecture
Njuskalo deploys aggressive bot detection (Cloudflare-grade). Plain HTTP scraping does not work — the actor uses a real headless Chromium browser via Crawlee's PlaywrightCrawler:
- Browser session with rotated fingerprints (Chrome, Windows/macOS, hr-HR locale) and residential proxy.
- Heavy-asset blocking: images, media and fonts are aborted at the network layer to make pages 3–5× faster.
- Cookie banner handling: the Didomi consent button is auto-clicked on first navigation.
- Listing extraction: full HTML is retrieved once per page, then parsed with `cheerio`. Cards live under `.EntityList--ListItemRegularAd .EntityList-item` (regular) and `.EntityList--VauVau .EntityList-item` (sponsored).
- Field parsing: title, detail URL, price (`strong.price--hrk`, a legacy class kept after the EUR migration), area & rooms (from `dl/dt/dd` or free-text "Lokacija: …" / "X m²"), date posted (`<time>`), image, location chain.
- Block detection: short responses, captcha/Cloudflare titles or zero `EntityList-item` matches retire the session and trigger a retry with a fresh proxy IP.
- Pagination: dynamic. Each successful page enqueues `?page=N+1` until a page comes back empty or the cap is reached.
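The three block-detection signals listed above could be combined along these lines. This is a sketch under assumptions: the length threshold and the title patterns are illustrative guesses, and a plain substring test stands in for the real selector match on `.EntityList-item`.

```typescript
// Hypothetical sketch: the threshold and title patterns are assumptions,
// not values taken from the actor.
interface PageSnapshot {
  title: string; // document <title> text
  html: string; // full page HTML
}

const MIN_HTML_LENGTH = 5000; // assumed cut-off for a "short response"
const BLOCK_TITLE_PATTERNS: RegExp[] = [
  /captcha/i,
  /cloudflare/i,
  /just a moment/i,
  /attention required/i,
];

/** Returns true when the page looks like an anti-bot interstitial. */
function looksBlocked(page: PageSnapshot): boolean {
  // Signal 1: the response is too short to be a real listing page.
  if (page.html.length < MIN_HTML_LENGTH) return true;
  // Signal 2: the title matches a known challenge page.
  if (BLOCK_TITLE_PATTERNS.some((re) => re.test(page.title))) return true;
  // Signal 3: no listing cards at all (substring approximation of the selector).
  return !page.html.includes("EntityList-item");
}
```

When a check like this returns `true`, the caller would retire the current session and retry the same URL through a fresh proxy IP, as described in the list above.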
Input
| Field | Type | Default | Notes |
|---|---|---|---|
| `locationSlugs` | string[] | `["zagreb"]` | Croatian city slugs appended to the category URL. Examples: `zagreb`, `split`, `rijeka`, `osijek`, `zadar`, `pula`, `sibenik`, `dubrovnik`, `karlovac`, `varazdin`, `slavonski-brod`, `velika-gorica`. Empty array = nationwide search. |
| `transaction` | `sale` / `rent` | `sale` | Maps to `prodaja` / `iznajmljivanje`. |
| `propertyType` | enum | `apartment` | `apartment` (stan) / `house` (kuća) / `land` (zemljište) / `commercial` (poslovni prostor) / `garage` (garaža) / `room` (soba) / `vacation` (vikendica). |
| `priceMin` / `priceMax` | int (EUR) | `0` | 0 = no bound. |
| `areaMin` / `areaMax` | int (m²) | `0` | 0 = no bound. |
| `roomsMin` / `roomsMax` | int | `0` | 0 = no bound. |
| `maxListings` | int | `200` | Total cap across all tasks. 0 = unlimited. |
| `maxPagesPerTask` | int | `10` | Pagination depth per location (≈ 25 listings per page). |
| `requestDelay` | int (ms) | `2500` | Inter-page delay. ≥ 2000 ms strongly recommended, because Njuskalo throttles aggressively. |
| `maxRetries` | int | `3` | Retries per page on errors / block detection (rotates session). |
| `proxyConfiguration` | proxy | RESIDENTIAL + HR | Required. Datacenter IPs are blocked instantly; residential `country=HR` is mandatory. |
Example input

    {
      "locationSlugs": ["zagreb", "split"],
      "transaction": "sale",
      "propertyType": "apartment",
      "priceMin": 100000,
      "priceMax": 350000,
      "areaMin": 40,
      "areaMax": 100,
      "roomsMin": 2,
      "roomsMax": 4,
      "maxListings": 500,
      "maxPagesPerTask": 25,
      "requestDelay": 2500,
      "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "HR"
      }
    }

URL pattern
https://www.njuskalo.hr/{prodaja|iznajmljivanje}-{stanova|kuca|zemljista|...}[-{location-slug}]?cijenaMin=<int>&cijenaMax=<int>&povrsinaMin=<int>&povrsinaMax=<int>&sobeMin=<int>&sobeMax=<int>&page=<N>
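As a rough illustration, the pattern above could be assembled like this. The function name `buildSearchUrl` is hypothetical, only the three category slugs spelled out in the pattern are mapped (the remaining ones are elided in the source and left out here), and always sending `page` is an assumption.

```typescript
// Hypothetical sketch: buildSearchUrl is an illustrative name, and only the
// category slugs that appear in the documented pattern are mapped.
type Transaction = "sale" | "rent";
type MappedPropertyType = "apartment" | "house" | "land";

const TRANSACTION_SLUG: Record<Transaction, string> = {
  sale: "prodaja",
  rent: "iznajmljivanje",
};

const CATEGORY_SLUG: Record<MappedPropertyType, string> = {
  apartment: "stanova",
  house: "kuca",
  land: "zemljista",
};

interface SearchFilters {
  priceMin?: number;
  priceMax?: number;
  areaMin?: number;
  areaMax?: number;
  roomsMin?: number;
  roomsMax?: number;
}

function buildSearchUrl(
  transaction: Transaction,
  propertyType: MappedPropertyType,
  locationSlug: string | null, // null = nationwide search
  filters: SearchFilters = {},
  page: number = 1,
): string {
  const pathParts: string[] = [TRANSACTION_SLUG[transaction], CATEGORY_SLUG[propertyType]];
  if (locationSlug) pathParts.push(locationSlug);

  // Parameter names follow the documented pattern; 0 or undefined means "no bound".
  const pairs: Array<[string, number | undefined]> = [
    ["cijenaMin", filters.priceMin],
    ["cijenaMax", filters.priceMax],
    ["povrsinaMin", filters.areaMin],
    ["povrsinaMax", filters.areaMax],
    ["sobeMin", filters.roomsMin],
    ["sobeMax", filters.roomsMax],
    ["page", page],
  ];
  const query = pairs
    .filter(([, value]) => value !== undefined && value > 0)
    .map(([key, value]) => key + "=" + String(value))
    .join("&");

  return "https://www.njuskalo.hr/" + pathParts.join("-") + "?" + query;
}
```

For example, `buildSearchUrl("sale", "apartment", "zagreb", { priceMin: 100000, priceMax: 350000 }, 2)` yields `https://www.njuskalo.hr/prodaja-stanova-zagreb?cijenaMin=100000&cijenaMax=350000&page=2`.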
URL filter parameters (cijenaMin, etc.) are best-effort — the actor also runs the same min/max filters in-memory after parsing each card, so result correctness does not depend on Njuskalo respecting the query string.
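A minimal sketch of such an in-memory re-check, using the "0 = no bound" convention from the input table, might look like the following. The names `inRange` and `matchesFilters` are hypothetical, and rejecting cards whose value is missing while a bound is active is an assumption about the desired behaviour.

```typescript
// Hypothetical sketch: function names and the handling of missing values
// are assumptions, not the actor's confirmed behaviour.
interface Range {
  min: number; // 0 = no lower bound
  max: number; // 0 = no upper bound
}

function inRange(value: number | null, range: Range): boolean {
  if (range.min === 0 && range.max === 0) return true; // filter inactive
  if (value === null) return false; // assumption: unknown values fail an active filter
  if (range.min > 0 && value < range.min) return false;
  if (range.max > 0 && value > range.max) return false;
  return true;
}

interface ParsedCard {
  price: number | null;
  areaSqm: number | null;
  rooms: number | null;
}

interface Bounds {
  price: Range;
  area: Range;
  rooms: Range;
}

/** Re-applies the min/max inputs after a card has been parsed. */
function matchesFilters(card: ParsedCard, bounds: Bounds): boolean {
  return (
    inRange(card.price, bounds.price) &&
    inRange(card.areaSqm, bounds.area) &&
    inRange(card.rooms, bounds.rooms)
  );
}
```

Running the same bounds twice (once in the query string, once here) means a silently ignored URL parameter costs only bandwidth, not correctness.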
Output
One Apify dataset record per listing. Headline fields:
| Field | Description |
|---|---|
| `adId` | Njuskalo numeric listing ID |
| `detailUrl` | Full URL to the listing page |
| `title` | Listing title |
| `shortDescription` | Card-level description snippet |
| `transactionType` | `sale` / `rent` |
| `propertyType` | `apartment` / `house` / … |
| `price` | Price in EUR (number) |
| `priceCurrency` | `EUR` (rare legacy `HRK` listings handled) |
| `pricePerSqm` | Price per m² (parsed or computed) |
| `areaSqm` | Usable surface area (m²) |
| `terrainAreaSqm` | Plot area (houses, land) |
| `rooms` | Number of rooms |
| `floor` / `totalFloors` | When present in the card |
| `buildingType` / `heatingType` / `yearBuilt` | When present in the card |
| `country` | Hrvatska |
| `city` / `municipality` / `district` / `microLocation` | Parsed from the listing's "Lokacija" string |
| `fullAddress` | Composed from the four location parts |
| `mainImageUrl` / `imageUrls` / `imageCount` | Cover photo (gallery requires detail-page fetch) |
| `isExclusive` | `true` for sponsored / VauVau cards |
| `labels` | Tier tags (VauVau, Premium, Featured, Top, …) |
| `datePosted` | Date string from `<time>` |
| `searchTransaction` / `searchPropertyType` / `searchLocation` / `searchUrl` | Echo of the input search parameters |
| `scrapedAt` | ISO-8601 scrape timestamp |
Two dataset views are pre-configured: Overview (compact) and Full Detail.
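Two of the derived fields in the table, `pricePerSqm` (when computed rather than parsed) and `fullAddress`, can be illustrated with a short sketch. Rounding to two decimals, the comma separator and the broad-to-narrow ordering are all assumptions; the helper names are hypothetical.

```typescript
// Hypothetical sketch: rounding, separator and part order are assumptions.
function computePricePerSqm(price: number | null, areaSqm: number | null): number | null {
  if (price === null || areaSqm === null || areaSqm <= 0) return null;
  return Math.round((price / areaSqm) * 100) / 100; // two decimal places
}

interface LocationParts {
  city: string | null;
  municipality: string | null;
  district: string | null;
  microLocation: string | null;
}

/** Joins whichever of the four location parts are present. */
function composeFullAddress(parts: LocationParts): string {
  const ordered: Array<string | null> = [
    parts.city,
    parts.municipality,
    parts.district,
    parts.microLocation,
  ];
  const present: string[] = [];
  for (const part of ordered) {
    if (part !== null && part.trim().length > 0) present.push(part.trim());
  }
  return present.join(", ");
}
```

With these helpers, a 250,000 EUR listing of 62.5 m² gets a `pricePerSqm` of 4000, and a card missing its municipality still produces a clean address without stray separators.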
Detail-only fields: `latitude`, `longitude`, the full `imageUrls` gallery, `roomsLabel`, `street`, `advertiserId`, `advertiserName`, `advertiserUrl` and the feature flags (`hasElevator`, `hasParking`, `hasGarage`, `hasTerrace`, `hasBalcony`, `isFurnished`, `isRegistered`) are populated from Njuskalo's individual listing pages, not the search results. The list-endpoint scraper leaves them as `null`; open a feature request if you need detail-page enrichment.
Important notes
- Anti-bot is aggressive. This is the most heavily protected site in this scraper family. Even with residential HR proxies, expect occasional `[BLOCKED]` warnings; the actor handles them by retiring the session and retrying with a new IP. Keep `requestDelay` ≥ 2000 ms and concurrency at 1.
- Throughput is browser-bound. Plan for ~6–10 listings per second on a clean session, dropping to ~1–2/s when bot detection fires. A 200-listing run typically takes 1–3 minutes; a 1,000-listing run can take 10–20 minutes.
- EUR is the standard currency. Croatia adopted the euro in January 2023. The legacy CSS class `price--hrk` was kept on the price element after migration; the actor treats its content as EUR by default and only flags `HRK` if the symbol explicitly appears.
- Rooms semantics: Njuskalo uses integer room counts (1, 2, 3, …), unlike Halooglasi (which uses 0.5 increments). Half-room "garsonjera" listings appear under `rooms: 1` or `rooms: null`.
- Pagination cap: Njuskalo paginates indefinitely, but new content is rare past page 50. For wide queries, narrow with `priceMin`/`priceMax` and `areaMin`/`areaMax` to slice the inventory rather than crawling deep.
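For planning purposes, the inter-page delay alone sets a floor on run time, and the pagination depth sets a ceiling on how many listings a run can reach. The arithmetic below is a back-of-the-envelope sketch that assumes the ≈ 25 listings per page quoted in the input table, one delay per page and concurrency 1; page load time, retries and blocks come on top. The function names are hypothetical.

```typescript
// Hypothetical sketch: a planning aid, not part of the actor.
const LISTINGS_PER_PAGE = 25; // approximate figure from the input table

/** Seconds spent purely on requestDelay to collect the given number of listings. */
function minDelaySeconds(listings: number, requestDelayMs: number): number {
  const pages = Math.ceil(listings / LISTINGS_PER_PAGE);
  return (pages * requestDelayMs) / 1000;
}

/** Upper bound on listings reachable before the per-location page cap bites. */
function maxReachableListings(locationCount: number, maxPagesPerTask: number): number {
  return locationCount * maxPagesPerTask * LISTINGS_PER_PAGE;
}
```

With the defaults (`requestDelay` 2500 ms, one location, `maxPagesPerTask` 10), `minDelaySeconds(200, 2500)` gives 20 seconds of pure waiting and `maxReachableListings(1, 10)` gives 250, so the default `maxListings` of 200 is reachable. If a run needs more listings than the second figure allows, raise `maxPagesPerTask` or add location slugs.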
Common location slugs
Major cities — zagreb, split, rijeka, osijek, zadar, pula, slavonski-brod, karlovac, varazdin, sibenik, dubrovnik, bjelovar, kastav, koprivnica, vinkovci, velika-gorica, vukovar, samobor, sisak, crikvenica, pozega, metkovic, cakovec
Coastal/Adriatic — opatija, rovinj, porec, umag, medulin, fazana, crikvenica, novi-vinodolski, omis, trogir, kastela, makarska, nin, biograd-na-moru, sukosan, vodice, primosten, cavtat
Islands — krk, cres, mali-losinj, rab, pag, brac, hvar, vis, korcula, mljet
To find more, browse a category page on njuskalo.hr and copy the city slug from the URL between {category-slug}- and ?.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| `[BLOCKED] suspected anti-bot page` repeated | IP or fingerprint flagged | Increase `requestDelay` to 4000+ ms, raise `maxRetries` |
| 0 listings parsed but page loads | Cookie banner re-prompted | Already handled; verify that the Didomi click selector is still `#didomi-notice-agree-button` |
| Only sponsored ("VauVau") cards saved | Regular ad list selector changed | The selector list in `parseEntity` covers known variants; open an issue if the site updates |
| `latitude` / `longitude` always `null` | List endpoint doesn't expose coordinates | Expected; would require a per-ad detail fetch |
| Throughput much lower than expected | Heavy anti-bot triggering | Drop concurrency to 1 (default), increase delay, ensure RESIDENTIAL+HR proxy |
License
Apache-2.0