Otomoto.pl Scraper

Pricing

from $4.99 / 1,000 results

Developer

Scraper Engine

Maintained by Community

🚗 Otomoto.pl Scraper

Scrape vehicle listings from Otomoto.pl, Poland's #1 automotive marketplace. Pull full ad details, photos, equipment lists, seller info, and price metadata into a clean, structured dataset in real time.


✨ Why Choose This Actor?

  • 🛡️ Anti-bot ready: built on Rust-backed impit for genuine browser-grade TLS + header fingerprints.
  • 🚦 Smart proxy ladder: starts direct, auto-escalates to datacenter, then residential (with 3 retries) if Otomoto pushes back. Once on residential, it stays sticky for the rest of the run.
  • ⚡ Async + concurrent: semaphore-bounded fan-out; tunable concurrency.
  • 💾 Real-time output: every ad is pushed to the dataset as it's parsed. A crash mid-run still leaves you the partial results.
  • 🔎 Two ways to target: paste full URLs or build a search from category / query / location / price / year filters.
  • 📊 Four dataset views: Overview, Seller, Equipment & Details, and Images.
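
The "semaphore-bounded fan-out" mentioned above can be pictured with a minimal asyncio sketch. This is an illustration of the general technique, not the actor's actual source: `fan_out` and the `fetch` callable are hypothetical names, and the real actor may structure its concurrency differently.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fan_out(
    urls: List[str],
    fetch: Callable[[str], Awaitable[object]],
    concurrency: int = 10,
) -> list:
    """Run fetch(url) for every URL, with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(url: str) -> object:
        # Each task waits here until one of the `concurrency` slots is free.
        async with semaphore:
            return await fetch(url)

    # gather() preserves input order in its result list.
    return await asyncio.gather(*(bounded(u) for u in urls))
```

All tasks are created up front, but the semaphore ensures only `concurrency` of them are inside `fetch` at any moment, which is what the `concurrency` input would plausibly control.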

🚀 Key Features

  • Full advert payload (id, title, price, currency, images, description, mainFeatures, equipment groups, details, parameters dict, seller, category, packages, value-added services, price drops).
  • Pagination handled automatically (cap with maxItems).
  • Both listing URLs and individual ad URLs are supported in startUrls.
  • Polish category aliases (osobowe, dostawcze, …) and English aliases (cars, vans, …).
  • Graceful handling of Apify Store LIMITED_PERMISSIONS storages.
  • Real-time logs that show each scraped ad and every proxy switch as the run progresses.

📥 Input

{
  "startUrls": [
    { "url": "https://www.otomoto.pl/osobowe/volvo" }
  ],
  "category": "cars",
  "query": "volvo",
  "location": "Warszawa",
  "sort": "relevance_web",
  "minPrice": 50000,
  "maxPrice": 1000000,
  "minProductionYear": 1999,
  "maxProductionYear": 2015,
  "maxItems": 100,
  "concurrency": 10,
  "requestDelay": 0,
  "proxyConfiguration": { "useApifyProxy": false }
}

| Field | Type | Meaning |
| --- | --- | --- |
| startUrls | array | Listing or detail URLs. When non-empty, overrides all the search filters below. |
| category | string | cars / vans / trucks / motorcycles / trailers / campers / construction / agricultural |
| query | string | Brand or model keyword (e.g. volvo). |
| location | string | Free-text city filter (e.g. Warszawa). |
| sort | string | relevance_web, created_at:desc, price:asc, … |
| minPrice / maxPrice | integer | Price range in PLN. |
| minProductionYear / maxProductionYear | integer | Year range. |
| maxItems | integer | Hard cap on ads to scrape. |
| concurrency | integer | Parallel HTTP requests (1-50, default 10). |
| requestDelay | number | Float seconds between requests; random jitter added. |
| proxyConfiguration | object | Initial proxy choice. Defaults to no proxy; auto-escalates on blocks. |
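
Before starting a run, it can help to sanity-check an input object against the constraints in the field list above. The sketch below is a hypothetical client-side helper (`check_input` and `SEARCH_FILTERS` are names invented here); it only encodes rules stated in this README, namely the 1-50 concurrency range, min/max ordering, and the fact that `startUrls` overrides the search filters.

```python
from typing import List

# Search filters that, per the field list, are ignored when startUrls is non-empty.
SEARCH_FILTERS = (
    "category", "query", "location", "sort",
    "minPrice", "maxPrice", "minProductionYear", "maxProductionYear",
)


def check_input(raw: dict) -> List[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []

    concurrency = raw.get("concurrency", 10)  # documented default is 10
    if not 1 <= concurrency <= 50:
        problems.append("concurrency must be between 1 and 50")

    for low, high in (("minPrice", "maxPrice"),
                      ("minProductionYear", "maxProductionYear")):
        if low in raw and high in raw and raw[low] > raw[high]:
            problems.append(f"{low} must not exceed {high}")

    if raw.get("startUrls") and any(key in raw for key in SEARCH_FILTERS):
        problems.append("startUrls is set, so the search filters will be ignored")

    return problems
```

Run against the sample input above, this would flag only the last point, since that sample sets both `startUrls` and search filters.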

📤 Output

Each row in the dataset matches the shape pushed by Actor.push_data:

{
  "id": "6147721991",
  "status": "ACTIVE",
  "title": "Volvo V40 Cross Country T4 AWD Drive-E Momentum",
  "url": "https://www.otomoto.pl/osobowe/oferta/volvo-...-ID6I3dkj.html",
  "price": 52900,
  "priceList": { "value": "52900", "currency": "PLN", "labels": [], "isUnderBudget": false },
  "primaryImageUrl": "https://ireland.apollo.olxcdn.com/...",
  "images": ["https://...", "..."],
  "createdAt": "2026-05-18T11:59:02Z",
  "updatedAt": "2026-05-18T06:13:22.329980Z",
  "mainFeatures": ["2015", "85 000 km", "1 969 cm3", "Benzyna"],
  "description": "<p>Witam ...</p>",
  "seller": { "name": "Patryk", "type": "PRIVATE", "location": { "city": "Warszawa", ... } },
  "equipment": [ { "key": "audio_and_computing", "values": [ ... ] } ],
  "details": [ ... ],
  "detailsGroups": [ ... ],
  "parametersDict": { ... },
  "category": { "code": "609", "label": "Osobowe" },
  "isUsedCar": true,
  "verifiedCar": false,
  "priceDrop": null
}

The dataset ships with four prebuilt views in the Apify Console:

  • 🚗 Overview: title, price, status, dates, primary image
  • 👤 Seller: seller block per ad
  • 🛠️ Equipment & Details: full equipment list + parameters dict
  • 📸 Images: primary + all image URLs
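
If you download the raw rows rather than using the Console views, a small flattening step is often convenient. The following is a minimal sketch, assuming rows have the shape shown in the sample above; `to_overview` is a hypothetical helper, and it treats every nested field as optional since not every ad necessarily carries all of them.

```python
def to_overview(row: dict) -> dict:
    """Flatten one dataset row into a compact record for tabular analysis."""
    price_list = row.get("priceList") or {}
    seller = row.get("seller") or {}
    location = seller.get("location") or {}
    return {
        "id": row.get("id"),
        "title": row.get("title"),
        "price": row.get("price"),
        "currency": price_list.get("currency"),
        "sellerType": seller.get("type"),
        "city": location.get("city"),
        "imageCount": len(row.get("images") or []),
    }
```

The `or {}` guards mean a row with `"seller": null` or a missing `priceList` yields `None` values instead of raising.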

🚀 How to Use (Apify Console)

  1. Log in at https://console.apify.com → Actors.
  2. Find Otomoto.pl Scraper and open it.
  3. Configure your input β€” paste Start URLs or set category / query / location / price / year filters.
  4. (Optional) Adjust Max items, Concurrency, and Proxy if needed.
  5. Click Start.
  6. Watch the Log tab in real time: you'll see ads stream in as 🟩 lines.
  7. Open the Output tab when the run completes (or while it's running).
  8. Export to JSON / CSV / XLSX with one click.

🤖 Use via API

Start a run via the Apify API:

curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/runs?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "startUrls": [{"url": "https://www.otomoto.pl/osobowe/volvo"}],
    "maxItems": 50
  }'

Or run synchronously and stream back the dataset:

curl -X POST \
  "https://api.apify.com/v2/acts/<ACTOR_ID>/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"category": "cars", "query": "volvo", "maxItems": 20}'
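
The same two calls can be prepared from Python with only the standard library. This is a sketch rather than an official client: `build_run_request` is a hypothetical helper that mirrors the two endpoints used in the curl examples above and builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2/acts"


def build_run_request(actor_id: str, token: str, actor_input: dict,
                      sync: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run.

    sync=False targets .../runs; sync=True targets
    .../run-sync-get-dataset-items, as in the curl examples.
    """
    endpoint = "run-sync-get-dataset-items" if sync else "runs"
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}/{actor_id}/{endpoint}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually start the run, pass the returned object to `urllib.request.urlopen(...)` and decode the JSON response body.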

🌐 Proxy Strategy

The actor implements a 3-stage proxy ladder:

  1. 🟢 No proxy: the first request always tries direct.
  2. 🟡 Datacenter: on HTTP 403/429/5xx or an empty payload, escalate to Apify datacenter proxies.
  3. 🔴 Residential: if datacenter also blocks, escalate to residential. Up to 3 retries here; once residential is in play, it stays sticky for the rest of the run.

You can pre-set a starting stage from proxyConfiguration (e.g. {"useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"]} to start residential), but escalation always runs if Otomoto blocks you.
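
The ladder described above can be modelled as a small state machine. The sketch below is one possible reading of those rules, not the actor's actual implementation: `Stage`, `Ladder`, `is_blocked`, and `fetch_with_ladder` are hypothetical names, and `fetch` is a stand-in callable that returns a `(status, body)` pair for a given URL and stage.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Optional, Tuple


class Stage(IntEnum):
    DIRECT = 0
    DATACENTER = 1
    RESIDENTIAL = 2


def is_blocked(status: int, body: str) -> bool:
    """Block signals named in the README: 403, 429, any 5xx, or an empty payload."""
    return status in (403, 429) or 500 <= status < 600 or not body.strip()


@dataclass
class Ladder:
    stage: Stage = Stage.DIRECT   # shared across the run, so escalation is sticky
    residential_retries: int = 3


def fetch_with_ladder(url: str,
                      fetch: Callable[[str, Stage], Tuple[int, str]],
                      ladder: Ladder) -> Optional[str]:
    """Fetch one URL, escalating the shared ladder whenever a block is seen."""
    retries_left = ladder.residential_retries
    while True:
        status, body = fetch(url, ladder.stage)
        if not is_blocked(status, body):
            return body
        if ladder.stage < Stage.RESIDENTIAL:
            ladder.stage = Stage(ladder.stage + 1)  # step up one rung
        elif retries_left > 0:
            retries_left -= 1                        # retry on residential
        else:
            return None                              # give up on this URL
```

Because the `Ladder` object outlives any single request, a later URL starts at whatever stage an earlier one escalated to, which matches the "sticky" behaviour described above. Passing `Ladder(stage=Stage.RESIDENTIAL)` corresponds to pre-setting the starting stage.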


💰 Pricing

This actor uses Pay-Per-Event (PPE) monetization:

  • Run start: a flat startup charge (covered by Apify's first-5-seconds-free policy, subject to its documented rules).
  • Per result item: one chargeable event per ad successfully written to the dataset.

You only pay for ads that actually land in your dataset. Failed pages, blocked requests, and empty payloads are not billed.
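
For budgeting, the per-result charge is simple arithmetic. The sketch below assumes the "from $4.99 / 1,000 results" rate listed for this actor; since "from" suggests the rate can vary by plan, and the run-start charge is not stated here, both are left as parameters (`estimate_cost` and its argument names are hypothetical).

```python
def estimate_cost(items: int,
                  per_thousand_usd: float = 4.99,
                  start_fee_usd: float = 0.0) -> float:
    """Rough cost in USD for a run that writes `items` ads to the dataset."""
    if items < 0:
        raise ValueError("items must be non-negative")
    # Only ads that land in the dataset are billed, so cost scales with items.
    return round(start_fee_usd + items * per_thousand_usd / 1000, 4)
```

For example, capping a run with `maxItems: 100` bounds the per-result part at roughly one tenth of the per-thousand rate.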


🎯 Best Use Cases

  • Price-trend monitoring across vehicle categories and regions.
  • Used-car dealer competitor watch.
  • Building a Polish vehicle classifieds dataset for ML / analytics.
  • Lead generation from dealer / private-seller contact metadata.
  • Comparative valuation for fleet sales.

❓ Frequently Asked Questions

Q: Does this actor render JavaScript? A: No. Otomoto embeds the full ad payload in the page's __NEXT_DATA__ script, and we parse that directly. Faster, lighter, and more reliable than a headless browser for this site.

Q: What happens if Otomoto blocks the run? A: We auto-escalate to datacenter, then residential proxies. The log clearly shows every proxy switch (🚦 lines). Residential is sticky once activated.

Q: Can I scrape a single ad without setting up a search? A: Yes. Paste the full …/oferta/...-ID*.html URL into startUrls.

Q: Are partial results saved if the run crashes? A: Yes. Every ad is pushed live to the dataset, so a crash mid-run still leaves everything scraped up to that point.

Q: Does it work outside Poland? A: It targets otomoto.pl specifically (Polish marketplace). The proxy ladder is geo-agnostic, but the data is Polish-language by nature.
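
The first answer above says the ad payload is read from the page's __NEXT_DATA__ script instead of a rendered DOM. A minimal sketch of that general technique follows; `extract_next_data` is a hypothetical helper, and the layout of the JSON inside the script is not documented here, so the sketch stops at returning the parsed object.

```python
import json
import re
from typing import Optional

# Matches <script id="__NEXT_DATA__" ...>{...}</script> as emitted by Next.js pages.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ JSON, or None if the script is absent."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))
```

A `None` result is a useful signal in its own right: a page without the script is likely a block or challenge page, which fits the "empty payload" trigger in the proxy strategy.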


  • Data is collected only from publicly available Otomoto.pl pages.
  • The end user is responsible for legal compliance (GDPR, target site Terms of Service, anti-spam regulations).
  • Respect Otomoto's rate limits: keep concurrency and requestDelay reasonable. Good citizens get blocked less.

💬 Support & Feedback

Found a bug or have a feature request? Open an issue in the Actor's Apify Console Issues tab, or contact the developer via the Store profile.