OLX PL Pro Scraper
Under maintenance
Pricing: from $0.00005 / actor start
Developer: Jacek
Maintained by Community
OLX.pl Advanced Scraper
A robust web scraper built with Crawlee for Python and Playwright, specifically designed to bypass anti-bot mechanisms using the Camoufox stealth browser. This tool allows you to perform exact keyword searches on OLX.pl, filter by location and price range, and extract detailed, structured data from offer pages.
✨ Features
- Anti-Bot Evasion: Utilizes the `CamoufoxPlugin` to successfully navigate and scrape OLX without getting blocked.
- Advanced Search Constraints: Filter offers by custom keywords, location, minimum price, and maximum price.
- Rich Data Extraction: Extracts core details from each offer:
- Title
- Price
- Description
- Location
- Seller Name
- Availability of Security Package ("Przedmiot z Pakietem Ochronnym")
- Dynamic Parameter Parsing: Automatically parses the detailed specification list (e.g., Condition, Screen Size, RAM, Drive type) into structured data.
- CSV Export: Automatically combines all scraped JSON records into a single, easy-to-read `results.csv` file with dynamically generated columns for all parameters.
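To illustrate the dynamic parameter parsing described above, here is a minimal sketch. It assumes the specification list has already been extracted from the page as plain strings of the form "Label: value"; the function name `parse_parameters` and the handling of label-only entries are hypothetical and not taken from the scraper's source.

```python
def parse_parameters(items: list[str]) -> dict[str, str]:
    """Turn 'Label: value' strings into a dict of structured parameters."""
    params: dict[str, str] = {}
    for raw in items:
        text = raw.strip()
        if not text:
            continue  # skip empty list entries
        # Split on the first colon only, so values may contain colons.
        label, sep, value = text.partition(":")
        if sep:
            params[label.strip()] = value.strip()
        else:
            # Assumed convention: a label without a value is stored as a flag.
            params[text] = "true"
    return params


example = ["Stan: Używane", "Wielkość pamięci RAM: 8 GB", "Model: MacBook Air"]
print(parse_parameters(example))
```

Because the keys come from the page itself, different offers can produce different sets of keys, which is why the CSV columns have to be generated dynamically.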
Usage
uv run python -m my_crawler --query "your keyword" [OPTIONS]
Command Line Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
| `--query` | str | `'rower'` | The exact keyword you want to search for (spaces will automatically be formatted). |
| `--location` | str | `None` | Restrict search to a specific city/location (e.g., warszawa, krakow). |
| `--min-price` | int | `None` | Minimum price filter (in PLN). |
| `--max-price` | int | `None` | Maximum price filter (in PLN). |
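The table above maps naturally onto Python's standard `argparse` module. The following is a sketch of how such a parser could be declared, using the argument names, types, and defaults from the table; the function name `build_parser` is hypothetical, and the project's actual parsing code may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the four CLI options with the types and defaults from the table."""
    parser = argparse.ArgumentParser(prog="my_crawler")
    parser.add_argument("--query", type=str, default="rower",
                        help="Exact keyword to search for.")
    parser.add_argument("--location", type=str, default=None,
                        help="Restrict the search to a city/location.")
    parser.add_argument("--min-price", type=int, default=None,
                        help="Minimum price filter (in PLN).")
    parser.add_argument("--max-price", type=int, default=None,
                        help="Maximum price filter (in PLN).")
    return parser


# argparse exposes --min-price as args.min_price (dashes become underscores).
args = build_parser().parse_args(
    ["--query", "macbook m1", "--min-price", "2000", "--max-price", "4000"]
)
print(args.query, args.location, args.min_price, args.max_price)
```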
Example Commands
Search for a Macbook M1 between 2000 and 4000 PLN:
uv run python -m my_crawler --query "macbook m1" --min-price 2000 --max-price 4000
Search for a mountain bike in Warsaw:
uv run python -m my_crawler --query "rower górski" --location warszawa
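For orientation, the sketch below shows one way the arguments from these commands could be turned into a search URL. The URL layout used here (a `q-<keywords>` path segment with spaces replaced by hyphens, an optional location segment, and `search[filter_float_price:from]` / `search[filter_float_price:to]` query parameters) is an assumption about OLX.pl and is not confirmed by this README; `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.olx.pl"


def build_search_url(query, location=None, min_price=None, max_price=None):
    """Build a search URL under the assumed OLX.pl URL layout."""
    # Assumption: spaces in the keyword become hyphens in the path.
    slug = quote("-".join(query.split()))
    # Assumption: a location replaces the generic 'oferty' path segment.
    path = f"/{location}/q-{slug}/" if location else f"/oferty/q-{slug}/"
    filters = {}
    if min_price is not None:
        filters["search[filter_float_price:from]"] = min_price
    if max_price is not None:
        filters["search[filter_float_price:to]"] = max_price
    url = BASE_URL + path
    return f"{url}?{urlencode(filters)}" if filters else url


print(build_search_url("macbook m1", min_price=2000, max_price=4000))
print(build_search_url("rower górski", location="warszawa"))
```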
📂 Output
Once the crawler finishes running, it outputs the data in two formats:
- JSON Datasets: Raw JSON files for each offer are stored in `storage/datasets/default/`.
- CSV Export: A fully compiled `results.csv` is automatically generated in the root directory.
CSV Structure
The generated results.csv is flattened for easy analysis in Excel, Google Sheets, or Pandas. It starts with the standard fields:
url, title, price, has_security_package, seller_name, location, description
Following the standard fields, it dynamically appends columns for any custom parameters found across the dataset (e.g., Stan, Model, Wielkość pamięci RAM).
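The column ordering described above can be sketched with the standard `csv` module. This example assumes each record is a flat dictionary in which the custom parameters sit next to the standard fields; the helper names `build_columns` and `records_to_csv` are hypothetical, and the real JSON records may be structured differently.

```python
import csv
import io

STANDARD_FIELDS = [
    "url", "title", "price", "has_security_package",
    "seller_name", "location", "description",
]


def build_columns(records):
    """Standard fields first, then custom keys in order of first appearance."""
    extra = []
    for record in records:
        for key in record:
            if key not in STANDARD_FIELDS and key not in extra:
                extra.append(key)
    return STANDARD_FIELDS + extra


def records_to_csv(records):
    """Write all records to CSV text; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=build_columns(records), restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = [
    {"url": "https://example.com/1", "title": "A", "price": "100", "Stan": "Nowe"},
    {"url": "https://example.com/2", "title": "B", "Model": "X"},
]
print(records_to_csv(records))
```

Collecting the extra keys across the whole dataset before writing means an offer that lacks a given parameter simply gets an empty cell in that column.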
Project skeleton generated by Crawlee (Playwright-camoufox template).