Njuskalo.hr Property Scraper

Pricing

from $3.50 / 1,000 results

Scrape real estate listings from Njuskalo.hr, Croatia's #1 classifieds portal. Extract apartments, houses, land, commercial space and garages (sale or rent) by city or county, with price (EUR), surface area, rooms, parsed Croatian address (city → municipality → district → micro-location), image and listing labels.

Developer

Logiover

Maintained by Community

Scrape real estate listings from Njuskalo.hr — Croatia's #1 classifieds portal — into a clean, structured Apify dataset.

This actor extracts apartments, houses, land, commercial space, garages and rooms (sale or rent), filtered by location, price (EUR), area and room count. Each record includes price, surface area, parsed Croatian address (city → municipality → district → micro-location), a card-level description snippet, image and listing labels.


Features

  • 🇭🇷 Croatia-wide coverage — every Njuskalo SEO city slug is supported (Zagreb, Split, Rijeka, Osijek, Pula, Zadar, Šibenik, Dubrovnik, …).
  • 💰 Native EUR pricing with auto-computed pricePerSqm.
  • 🏘️ Location chain parsing — city, municipality, district and neighborhood split from the listing's location string.
  • 🏷️ Sponsored / VauVau detection — premium card flags exposed via isExclusive and labels.
  • 🛡️ Anti-bot bypass via Playwright — headless Chromium with fingerprint rotation, residential proxy, cookie-banner handling and session retirement on block detection.
  • 🔁 De-duplication & dynamic pagination — stops automatically when pages return no new IDs.
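
The de-duplication and stop rule from the last bullet can be sketched as follows. This is a minimal Python illustration, not the actor's own code: `crawl_pages` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever loads and parses one result page into card dictionaries carrying an `adId`.

```python
from typing import Callable, Dict, List


def crawl_pages(
    fetch_page: Callable[[int], List[Dict]],
    max_pages: int,
    max_listings: int = 0,
) -> List[Dict]:
    """Collect cards page by page; stop when a page adds no new adId."""
    seen_ids = set()
    results: List[Dict] = []
    for page in range(1, max_pages + 1):
        new_on_page = 0
        for card in fetch_page(page):
            ad_id = card["adId"]
            if ad_id in seen_ids:
                continue  # already collected on an earlier page
            seen_ids.add(ad_id)
            results.append(card)
            new_on_page += 1
            # 0 means "unlimited", mirroring the maxListings input.
            if max_listings and len(results) >= max_listings:
                return results
        if new_on_page == 0:
            break  # nothing new on this page: assume the inventory is exhausted
    return results


# Tiny in-memory stand-in for the site: page 3 only repeats a known ID.
fake_site = {
    1: [{"adId": 1}, {"adId": 2}],
    2: [{"adId": 2}, {"adId": 3}],
    3: [{"adId": 3}],
}
listings = crawl_pages(lambda n: fake_site.get(n, []), max_pages=10)
print([card["adId"] for card in listings])  # [1, 2, 3]
```

Stopping on "no new IDs" rather than on an empty page matters when a site keeps serving repeated cards on deep pages instead of returning an empty list.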

Architecture

Njuskalo deploys aggressive bot detection (Cloudflare-grade). Plain HTTP scraping does not work — the actor uses a real headless Chromium browser via Crawlee's PlaywrightCrawler:

  1. Browser session with rotated fingerprints (Chrome, Windows/macOS, hr-HR locale) and residential proxy.
  2. Heavy-asset blocking — images, media and fonts are aborted at the network layer to make pages 3–5× faster.
  3. Cookie banner handling — Didomi consent button auto-clicked on first navigation.
  4. Listing extraction — full HTML retrieved once per page, then parsed with cheerio. Cards live under .EntityList--ListItemRegularAd .EntityList-item (regular) and .EntityList--VauVau .EntityList-item (sponsored).
  5. Field parsing — title, detail URL, price (strong.price--hrk — legacy class kept after EUR migration), area & rooms (from dl/dt/dd or free-text "Lokacija: …" / "X m²"), date posted (<time>), image, location chain.
  6. Block detection — short responses, captcha/Cloudflare titles or zero EntityList-item matches retire the session and trigger a retry with a fresh proxy IP.
  7. Pagination — dynamic: each successful page enqueues ?page=N+1 until empty or cap.
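
Step 6 can be read as a three-part heuristic. The sketch below shows one way to express it in Python; the length threshold, the title hints and the function name `looks_blocked` are assumptions chosen for illustration, and the actor's real values are not documented here.

```python
import re

# Assumed values for illustration only.
MIN_HTML_LENGTH = 5000
BLOCK_TITLE_HINTS = ("captcha", "just a moment", "attention required", "access denied")
ITEM_CLASS_RE = re.compile(r'class="[^"]*\bEntityList-item\b')


def looks_blocked(html: str, title: str) -> bool:
    """Return True when a response resembles an anti-bot page."""
    if len(html) < MIN_HTML_LENGTH:
        return True  # suspiciously short response
    lowered = title.lower()
    if any(hint in lowered for hint in BLOCK_TITLE_HINTS):
        return True  # captcha or challenge page title
    # A genuine result page contains at least one EntityList-item card.
    return ITEM_CLASS_RE.search(html) is None


ok_html = '<li class="EntityList-item EntityList-item--Regular">' + "x" * 6000
print(looks_blocked(ok_html, "Stanovi Zagreb"))    # False
print(looks_blocked(ok_html, "Just a moment..."))  # True
```

When such a check returns True, the caller would retire the session and retry with a fresh proxy IP, as described in step 6.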

Input

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| locationSlugs | string[] | ["zagreb"] | Croatian city slugs appended to the category URL. Examples: zagreb, split, rijeka, osijek, zadar, pula, sibenik, dubrovnik, karlovac, varazdin, slavonski-brod, velika-gorica. Empty array = nationwide search. |
| transaction | sale / rent | sale | Maps to prodaja / iznajmljivanje. |
| propertyType | enum | apartment | apartment (stan) / house (kuća) / land (zemljište) / commercial (poslovni prostor) / garage (garaža) / room (soba) / vacation (vikendica). |
| priceMin / priceMax | int (EUR) | 0 | 0 = no bound. |
| areaMin / areaMax | int (m²) | 0 | 0 = no bound. |
| roomsMin / roomsMax | int | 0 | 0 = no bound. |
| maxListings | int | 200 | Total cap across all tasks. 0 = unlimited. |
| maxPagesPerTask | int | 10 | Pagination depth per location (≈ 25 listings per page). |
| requestDelay | int (ms) | 2500 | Inter-page delay. ≥ 2000 ms strongly recommended, as Njuskalo throttles aggressively. |
| maxRetries | int | 3 | Retries per page on errors / block detection (rotates session). |
| proxyConfiguration | proxy | RESIDENTIAL + HR | Required. Datacenter IPs are blocked instantly; residential country=HR is mandatory. |

Example input

{
  "locationSlugs": ["zagreb", "split"],
  "transaction": "sale",
  "propertyType": "apartment",
  "priceMin": 100000,
  "priceMax": 350000,
  "areaMin": 40,
  "areaMax": 100,
  "roomsMin": 2,
  "roomsMax": 4,
  "maxListings": 500,
  "maxPagesPerTask": 25,
  "requestDelay": 2500,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "HR"
  }
}

URL pattern

https://www.njuskalo.hr/{prodaja|iznajmljivanje}-{stanova|kuca|zemljista|...}[-{location-slug}]
?cijenaMin=<int>
&cijenaMax=<int>
&povrsinaMin=<int>
&povrsinaMax=<int>
&sobeMin=<int>
&sobeMax=<int>
&page=<N>

URL filter parameters (cijenaMin, etc.) are best-effort — the actor also runs the same min/max filters in-memory after parsing each card, so result correctness does not depend on Njuskalo respecting the query string.
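
To make the pattern and the double filtering concrete, here is a small Python sketch that builds a search URL and re-checks bounds in memory. The function names are hypothetical, only the three category slugs visible in the pattern are mapped (their pairing with `apartment` / `house` / `land` is inferred, not confirmed), and treating a missing value as failing an active bound is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.njuskalo.hr"
TRANSACTION_SLUGS = {"sale": "prodaja", "rent": "iznajmljivanje"}
# Only the category slugs shown in the pattern; the mapping is assumed.
CATEGORY_SLUGS = {"apartment": "stanova", "house": "kuca", "land": "zemljista"}
# Actor input name -> Njuskalo query parameter.
QUERY_PARAMS = [
    ("priceMin", "cijenaMin"), ("priceMax", "cijenaMax"),
    ("areaMin", "povrsinaMin"), ("areaMax", "povrsinaMax"),
    ("roomsMin", "sobeMin"), ("roomsMax", "sobeMax"),
]


def build_search_url(transaction, property_type, location_slug="", page=1, **bounds):
    """Compose a search URL; bounds equal to 0 (no bound) are omitted."""
    path = f"{TRANSACTION_SLUGS[transaction]}-{CATEGORY_SLUGS[property_type]}"
    if location_slug:
        path += f"-{location_slug}"
    params = {query: bounds[name] for name, query in QUERY_PARAMS if bounds.get(name)}
    params["page"] = page
    return f"{BASE_URL}/{path}?{urlencode(params)}"


def in_range(value, low=0, high=0):
    """In-memory check: 0 means no bound; a missing value fails an active bound."""
    if not low and not high:
        return True
    if value is None:
        return False
    return (not low or value >= low) and (not high or value <= high)


url = build_search_url("sale", "apartment", "zagreb", page=2,
                       priceMin=100000, priceMax=350000)
print(url)
# https://www.njuskalo.hr/prodaja-stanova-zagreb?cijenaMin=100000&cijenaMax=350000&page=2
print(in_range(120, 40, 100))  # False: the card is dropped even if the site returned it
```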


Output

One Apify dataset record per listing. Headline fields:

| Field | Description |
| --- | --- |
| adId | Njuskalo numeric listing ID |
| detailUrl | Full URL to the listing page |
| title | Listing title |
| shortDescription | Card-level description snippet |
| transactionType | sale / rent |
| propertyType | apartment / house / … |
| price | Price in EUR (number) |
| priceCurrency | EUR (rare legacy HRK listings handled) |
| pricePerSqm | Price per m² (parsed or computed) |
| areaSqm | Usable surface area (m²) |
| terrainAreaSqm | Plot area (houses, land) |
| rooms | Number of rooms |
| floor / totalFloors | When present in the card |
| buildingType / heatingType / yearBuilt | When present in the card |
| country | Hrvatska |
| city / municipality / district / microLocation | Parsed from the listing's "Lokacija" string |
| fullAddress | Composed from the four location parts |
| mainImageUrl / imageUrls / imageCount | Cover photo (gallery requires detail-page fetch) |
| isExclusive | true for sponsored / VauVau cards |
| labels | Tier tags (VauVau, Premium, Featured, Top, …) |
| datePosted | Date string from <time> |
| searchTransaction / searchPropertyType / searchLocation / searchUrl | Echo of the input search parameters |
| scrapedAt | ISO-8601 scrape timestamp |

Two dataset views are pre-configured: Overview (compact) and Full Detail.
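
As an illustration of how the location fields and `pricePerSqm` might be derived, the sketch below splits a location string and computes the price per square metre. It assumes the "Lokacija" text is a comma-separated chain in city, municipality, district, micro-location order, which is not confirmed by this page; the function names and the sample string are hypothetical.

```python
from typing import Dict, Optional

LOCATION_KEYS = ("city", "municipality", "district", "microLocation")


def parse_location(raw: str) -> Dict[str, Optional[str]]:
    """Split an assumed comma-separated location chain into four named parts."""
    text = raw.strip()
    if text.lower().startswith("lokacija:"):
        text = text[len("lokacija:"):]
    parts = [part.strip() for part in text.split(",") if part.strip()]
    parsed: Dict[str, Optional[str]] = {
        key: (parts[i] if i < len(parts) else None)
        for i, key in enumerate(LOCATION_KEYS)
    }
    # fullAddress is composed from whichever of the four parts are present.
    parsed["fullAddress"] = ", ".join(parts[:4]) or None
    return parsed


def price_per_sqm(price: Optional[float], area_sqm: Optional[float]) -> Optional[float]:
    """Compute EUR per m²; None when either input is missing or zero."""
    if not price or not area_sqm:
        return None
    return round(price / area_sqm, 2)


location = parse_location("Lokacija: Zagreb, Trešnjevka - Sjever, Ljubljanica")
print(location["city"], "|", location["municipality"], "|", location["district"])
print(price_per_sqm(185000, 62))  # 2983.87
```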

Detail-only fields: latitude, longitude, the full imageUrls gallery, roomsLabel, street, advertiserId, advertiserName, advertiserUrl and the feature flags (hasElevator, hasParking, hasGarage, hasTerrace, hasBalcony, isFurnished, isRegistered) are populated only from Njuskalo's individual listing pages, not from the search results. The list-endpoint scraper leaves them as null; open a feature request if you need detail-page enrichment.


Important notes

  • Anti-bot is aggressive. This is the most heavily protected site in this scraper family. Even with residential HR proxies, expect occasional [BLOCKED] warnings — the actor handles them by retiring the session and retrying with a new IP. Keep requestDelay ≥ 2000 ms and concurrency at 1.
  • Throughput is browser-bound. Plan for ~6–10 listings per second on a clean session, dropping to ~1–2/s when bot detection fires. A 200-listing run typically takes 1–3 minutes; a 1,000-listing run can take 10–20 minutes.
  • EUR is the standard currency — Croatia adopted EUR in January 2023. The legacy CSS class price--hrk was kept on the price element after migration; the actor treats its content as EUR by default and only flags HRK if the symbol explicitly appears.
  • Rooms semantics — Njuskalo uses integer room counts (1, 2, 3, …) unlike Halooglasi (which uses 0.5 increments). Half-room "garsonjera" listings appear under rooms: 1 or rooms: null.
  • Pagination cap — Njuskalo paginates indefinitely but new content is rare past page 50. For wide queries, narrow with priceMin/Max and areaMin/Max to slice the inventory rather than crawling deep.
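
The EUR-by-default rule from the notes above can be sketched like this. It is a hedged Python illustration, not the actor's parser: `parse_price` is a hypothetical name, the sample strings are invented, and the only firm assumption is Croatian number formatting (dot as thousands separator, comma as decimal separator).

```python
import re
from typing import Optional, Tuple

HRK_RE = re.compile(r"\b(kn|hrk)\b", re.IGNORECASE)
NUMBER_RE = re.compile(r"\d[\d.\s]*(?:,\d+)?")


def parse_price(text: str) -> Tuple[Optional[float], str]:
    """Return (amount, currency); currency is EUR unless an HRK marker appears."""
    currency = "HRK" if HRK_RE.search(text) else "EUR"
    match = NUMBER_RE.search(text)
    if not match:
        return None, currency  # e.g. price on request
    # Drop thousands separators and spaces, then turn the decimal comma into a dot.
    digits = re.sub(r"[.\s]", "", match.group(0)).replace(",", ".")
    return float(digits), currency


print(parse_price("185.000 €"))     # (185000.0, 'EUR')
print(parse_price("1.250.000 kn"))  # (1250000.0, 'HRK')
```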

Common location slugs

Major cities: zagreb, split, rijeka, osijek, zadar, pula, slavonski-brod, karlovac, varazdin, sibenik, dubrovnik, bjelovar, kastav, koprivnica, vinkovci, velika-gorica, vukovar, samobor, sisak, crikvenica, pozega, metkovic, cakovec

Coastal/Adriatic: opatija, rovinj, porec, umag, medulin, fazana, crikvenica, novi-vinodolski, omis, trogir, kastela, makarska, nin, biograd-na-moru, sukosan, vodice, primosten, cavtat

Islands: krk, cres, mali-losinj, rab, pag, brac, hvar, vis, korcula, mljet

To find more, browse a category page on njuskalo.hr and copy the city slug from the URL between {category-slug}- and ?.
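
The same copy step can be done programmatically. The helper below is a small Python sketch with a hypothetical name (`location_slug_from_url`); it assumes you already know the category slug (for example `prodaja-stanova`) and simply strips it from the URL path.

```python
from urllib.parse import urlparse


def location_slug_from_url(url: str, category_slug: str) -> str:
    """Return the part of the path after '{category-slug}-', or '' if absent."""
    path = urlparse(url).path.strip("/")  # the query string is discarded here
    prefix = category_slug + "-"
    if not path.startswith(prefix):
        return ""  # nationwide category page or a different category
    return path[len(prefix):]


print(location_slug_from_url(
    "https://www.njuskalo.hr/prodaja-stanova-velika-gorica?page=2",
    "prodaja-stanova",
))  # velika-gorica
```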


Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| [BLOCKED] suspected anti-bot page repeated | IP fingerprint flagged | Increase requestDelay to 4000+ ms, raise maxRetries |
| 0 listings parsed but page loads | Cookie banner re-prompted | Already handled; verify that the Didomi click selector is still #didomi-notice-agree-button |
| Only sponsored ("VauVau") cards saved | Regular ad list selector changed | The selector list in parseEntity covers known variants; open an issue if the site updates |
| latitude / longitude always null | List endpoint doesn't expose coordinates | Expected; coordinates would require a per-ad detail fetch |
| Throughput much lower than expected | Heavy anti-bot triggering | Drop concurrency to 1 (default), increase delay, ensure RESIDENTIAL+HR proxy |

License

Apache-2.0