# Polish Real Estate Multi-Portal Aggregator (`czub_w/pl-real-estate-scraper`) Actor

Scrape Otodom and Morizon in one run. Automatic cross-portal deduplication, unified schema, incremental mode. $2.50/1,000 results.

- **URL:** https://apify.com/czub_w/pl-real-estate-scraper.md
- **Developed by:** [Wiktor](https://apify.com/czub_w) (community)
- **Categories:** Agents, Automation, Real estate
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating:** No ratings yet

## Pricing

from $2.50 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, built for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
"Actor" is always written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Polish Real Estate Multi-Portal Aggregator

Scrape **Otodom** and **Morizon** — Poland's two largest real estate portals — in a single run. Get a unified, deduplicated dataset of property listings across all major cities, property types, and transaction types. No coding required.

### What does Polish Real Estate Multi-Portal Aggregator do?

This Actor simultaneously collects property listings from multiple Polish real estate portals and merges them into one clean dataset. It handles sale and rental listings for apartments, houses, plots, and commercial spaces across any Polish city. Cross-portal duplicates are automatically removed using an MD5 fingerprint, so you never pay for or process the same listing twice.

Results are available as structured JSON via the Apify dataset API, downloadable as CSV or Excel, and ready for integration with any tool or workflow.

### Why choose this over single-portal scrapers?

Most Apify Store scrapers target one portal at a time. That means you need to:

- Run and pay for two or three separate Actors
- Merge datasets manually with custom code
- Deduplicate listings yourself — the same property often appears on all major portals simultaneously
- Maintain separate schedules, inputs, and API calls

**This Actor solves all of that in one run.**

| | Single-portal scraper × 2 | This Actor |
|---|---|---|
| Portals covered | 1 each | Otodom + Morizon (Gratka coming soon) |
| Deduplication | Manual | Automatic (MD5 fingerprint) |
| Unified schema | No | Yes |
| Incremental mode | Rarely | Built-in |
| Cost for 2,000 results | ~$5–8 | **$5.00** |
| Setup complexity | High | One input form |

### Why use Polish Real Estate Multi-Portal Aggregator?

- **Automatic cross-portal deduplication** — listings appearing on both Otodom and Morizon are fingerprinted using MD5(address + area + price) and stored only once, giving you a true count of unique properties on the market.
- **Incremental mode** — on repeated runs, only listings not seen in the previous run are saved. Perfect for daily monitoring without re-processing the entire market.
- **Unified output schema** — every listing has the same fields regardless of source portal. Your downstream pipeline never needs portal-specific logic.
- **Fast HTTP crawling** — uses CheerioCrawler (no browser) for Otodom and Morizon, processing pages 10× faster than Playwright-based scrapers.
- **Use cases**: market price analysis, investment scouting, agency lead generation, automated new-listing alerts, academic research, competitive intelligence dashboards.
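
The fingerprinting idea can be sketched in a few lines of Python. This is a minimal illustration, not the Actor's actual code: the README only states that the fingerprint is MD5(address + area + price), so the normalization (lowercasing, collapsing whitespace), the `|` separator, the use of `street` plus `city` as the address, and the helper names `fingerprint` and `dedupe` are all assumptions.

```python
import hashlib


def fingerprint(listing: dict) -> str:
    """Return an MD5 hex digest over address, area and price (assumed normalization)."""
    # Assumption: the address is street + city, lowercased, with whitespace collapsed.
    raw_address = f"{listing.get('street') or ''} {listing.get('city') or ''}"
    address = " ".join(raw_address.lower().split())
    area = f"{float(listing['areaM2']):.1f}"  # 50 and 50.0 produce the same key
    price = str(int(listing["priceTotal"]))
    key = "|".join([address, area, price])
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def dedupe(listings: list[dict]) -> list[dict]:
    """Keep the first listing seen for each fingerprint, preserving input order."""
    seen: set[str] = set()
    unique = []
    for listing in listings:
        fp = fingerprint(listing)
        if fp not in seen:
            seen.add(fp)
            unique.append(listing)
    return unique


# Hypothetical sample data: the first two rows describe the same flat on two portals.
listings = [
    {"id": "otodom-1", "street": "Józefa", "city": "Kraków", "areaM2": 50.0, "priceTotal": 649000},
    {"id": "morizon-9", "street": "józefa ", "city": "Kraków", "areaM2": 50, "priceTotal": 649000},
    {"id": "otodom-2", "street": "Długa", "city": "Kraków", "areaM2": 38.5, "priceTotal": 520000},
]
unique_ids = [item["id"] for item in dedupe(listings)]
print(unique_ids)
```

An exact-match fingerprint like this only catches duplicates whose address, area, and price agree after normalization; listings that differ in any of the three hash to different values.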

### How to use Polish Real Estate Multi-Portal Aggregator

1. Open the Actor on [Apify Console](https://console.apify.com) and click **Try for free**.
2. In the **Input** tab, set your search criteria: city, property type, transaction type, and optional price/area filters.
3. Set `maxListings` to limit results per portal per run (default: 100).
4. Click **Start**. A typical run collecting 100 listings takes under 2 minutes.
5. Open the **Output** tab to preview results, or download as JSON, CSV, or Excel.
6. Use the **API** tab to pull results into your application, spreadsheet, or BI tool.

For recurring monitoring, schedule the Actor via the [Apify Scheduler](https://docs.apify.com/scheduler) and enable `incrementalMode` to receive only new listings each time.

### Input

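Every field is optional. Fields without a listed default (`priceMin`, `priceMax`, `areaMin`, `proxyConfiguration`) are left unset unless you provide them. In practice, `location` is the one field you need to set for a meaningful run.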

| Field | Type | Default | Description |
|---|---|---|---|
| `portals` | array | `["otodom","morizon"]` | Portals to scrape. Supported values: `"otodom"`, `"morizon"` |
| `transactionType` | string | `"sprzedaz"` | `"sprzedaz"` (for sale) or `"wynajem"` (for rent) |
| `propertyType` | string | `"mieszkanie"` | `"mieszkanie"` (apartment), `"dom"` (house), `"dzialka"` (plot), `"lokal"` (commercial) |
| `location` | string | `"warszawa"` | Polish city name, lowercase, no diacritics (e.g. `"krakow"`, `"wroclaw"`, `"gdansk"`) |
| `priceMin` | integer | — | Minimum total price in PLN |
| `priceMax` | integer | — | Maximum total price in PLN |
| `areaMin` | integer | — | Minimum area in m² |
| `maxListings` | integer | `100` | Maximum listings to collect per portal per run |
| `incrementalMode` | boolean | `false` | When `true`, only listings not seen in the previous run are saved |
| `proxyConfiguration` | object | — | Apify proxy settings. Recommended for high-volume production runs |

**Example input — apartments for sale in Kraków, 400k–800k PLN:**

```json
{
  "portals": ["otodom", "morizon"],
  "transactionType": "sprzedaz",
  "propertyType": "mieszkanie",
  "location": "krakow",
  "priceMin": 400000,
  "priceMax": 800000,
  "areaMin": 40,
  "maxListings": 100,
  "incrementalMode": false
}
````
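
Because `location` must be lowercase and free of diacritics, it can help to normalize city names before building the input. The sketch below is an illustration under assumptions: `normalize_location` and `build_input` are hypothetical helpers, not part of the Actor or the Apify client, and the allowed values are copied from the input table. How the Actor expects multi-word city names to be written is not documented here, so spaces are left untouched.

```python
# Polish diacritics mapped to ASCII. An explicit table is used because
# "ł" has no Unicode decomposition, so NFKD normalization alone would miss it.
_PL_TO_ASCII = str.maketrans("ąćęłńóśźż", "acelnoszz")

TRANSACTION_TYPES = {"sprzedaz", "wynajem"}
PROPERTY_TYPES = {"mieszkanie", "dom", "dzialka", "lokal"}


def normalize_location(city: str) -> str:
    """Lowercase a city name and replace Polish diacritics with ASCII letters."""
    return city.strip().lower().translate(_PL_TO_ASCII)


def build_input(city, transaction_type="sprzedaz", property_type="mieszkanie", **filters):
    """Build an Actor input dict, rejecting values the input table does not list."""
    if transaction_type not in TRANSACTION_TYPES:
        raise ValueError(f"unsupported transactionType: {transaction_type!r}")
    if property_type not in PROPERTY_TYPES:
        raise ValueError(f"unsupported propertyType: {property_type!r}")
    actor_input = {
        "portals": ["otodom", "morizon"],
        "transactionType": transaction_type,
        "propertyType": property_type,
        "location": normalize_location(city),
    }
    # Optional fields (priceMin, priceMax, areaMin, maxListings, ...) are included
    # only when the caller sets them, so the Actor's defaults apply otherwise.
    actor_input.update({k: v for k, v in filters.items() if v is not None})
    return actor_input


krakow_input = build_input("Kraków", priceMin=400_000, priceMax=800_000, areaMin=40)
print(krakow_input)
```

The resulting dict has the same shape as the example input and can be serialized to JSON or passed to a client call as the run input.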

### Output

Each item in the dataset represents one unique property listing. Duplicates across portals are removed before saving. You can download the dataset in various formats such as JSON, CSV, or Excel.

**Example output item:**

```json
{
  "id": "otodom-66174382",
  "portal": "otodom",
  "url": "https://www.otodom.pl/pl/oferta/nowoczesne-mieszkanie-3-pokoje-kazimierz-66174382",
  "title": "Nowoczesne mieszkanie 3 pokoje, Kazimierz, widok na Wisłę",
  "priceTotal": 649000,
  "priceCurrency": "PLN",
  "pricePerM2": 12980,
  "areaM2": 50.0,
  "rooms": 3,
  "floor": 2,
  "totalFloors": 5,
  "isPrivate": false,
  "agencyName": "Kraków Premium Nieruchomości",
  "street": "Józefa",
  "district": "Kazimierz",
  "city": "Kraków",
  "lat": 50.0516,
  "lng": 19.9476,
  "photos": [
    "https://ireland.apollo.olxcdn.com/v1/files/abc123/image;s=1080x720",
    "https://ireland.apollo.olxcdn.com/v1/files/def456/image;s=1080x720"
  ],
  "datePosted": "2025-05-10T08:22:00.000Z",
  "scrapedAt": "2025-05-21T10:34:22.000Z"
}
```

### Data table

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique listing ID, prefixed by portal (e.g. `otodom-12345`, `morizon-67890`) |
| `portal` | string | Source portal: `otodom` or `morizon` |
| `url` | string | Direct URL to the full listing page |
| `title` | string | Listing title as displayed on the portal |
| `priceTotal` | integer | Total asking price in PLN |
| `priceCurrency` | string | Always `"PLN"` |
| `pricePerM2` | integer | Price per square metre in PLN |
| `areaM2` | float | Total usable area in m² |
| `rooms` | integer | Number of rooms (not including kitchen/bathroom) |
| `floor` | integer | Floor number (`0` = ground floor, `null` if not specified) |
| `totalFloors` | integer | Total number of floors in the building |
| `isPrivate` | boolean | `true` = private owner listing, `false` = agency |
| `agencyName` | string | Name of the listing agency, or `null` for private listings |
| `street` | string | Street name |
| `district` | string | District or neighbourhood |
| `city` | string | City |
| `lat` | float | Latitude (GPS) |
| `lng` | float | Longitude (GPS) |
| `photos` | array | Array of photo URLs (up to 3 per listing) |
| `datePosted` | string (ISO 8601) | Date the listing was originally published |
| `scrapedAt` | string (ISO 8601) | Timestamp when the listing was collected |
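
When consuming the dataset in Python, it can be convenient to load each item into a typed record and parse the timestamps. The following is a minimal sketch covering a subset of the fields above. The `Listing` class and `parse_iso` helper are hypothetical names, and which fields may be `null` beyond those the README mentions is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_iso(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as '2025-05-10T08:22:00.000Z'; pass None through."""
    if value is None:
        return None
    # Rewrite a trailing 'Z' as an explicit UTC offset so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


@dataclass
class Listing:
    id: str
    portal: str
    price_total: int
    area_m2: float
    price_per_m2: Optional[int]
    lat: Optional[float]  # may be null, e.g. on some Morizon listings
    lng: Optional[float]
    date_posted: Optional[datetime]
    scraped_at: Optional[datetime]

    @classmethod
    def from_item(cls, item: dict) -> "Listing":
        """Map a raw dataset item (camelCase keys) onto the typed record."""
        return cls(
            id=item["id"],
            portal=item["portal"],
            price_total=item["priceTotal"],
            area_m2=item["areaM2"],
            price_per_m2=item.get("pricePerM2"),
            lat=item.get("lat"),
            lng=item.get("lng"),
            date_posted=parse_iso(item.get("datePosted")),
            scraped_at=parse_iso(item.get("scrapedAt")),
        )


# Hypothetical item with the fields that may be missing set to None.
item = {
    "id": "morizon-67890",
    "portal": "morizon",
    "priceTotal": 649000,
    "areaM2": 50.0,
    "pricePerM2": 12980,
    "lat": None,
    "lng": None,
    "datePosted": None,
    "scrapedAt": "2025-05-21T10:34:22.000Z",
}
listing = Listing.from_item(item)
print(listing)
```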

### Pricing

**$2.50 per 1,000 results** — cheaper than running two separate single-portal scrapers.

| Scenario | Listings | Estimated cost |
|---|---|---|
| Quick city snapshot | 200 (2 portals × 100) | ~$0.50 |
| Standard market scan | 1,000 | ~$2.50 |
| Full dual-portal sweep | 2,000 | **$5.00** |
| Daily incremental monitor | 20–50 new listings/day | <$0.15/day |

Incremental mode is the most cost-efficient setup for ongoing monitoring: only genuinely new listings are saved each run, so you are never charged for data you already have.
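
The table's arithmetic can be reproduced with a short helper. This sketch covers only the per-result price of $2.50 per 1,000 results; Apify platform usage is charged on top of it and is not modelled here. The function name `estimate_result_cost` is hypothetical.

```python
PRICE_PER_1000_RESULTS_USD = 2.50  # per-result price stated in this README


def estimate_result_cost(listings: int) -> float:
    """Estimate the per-result charge in USD for a number of saved listings."""
    if listings < 0:
        raise ValueError("listings must be non-negative")
    return listings * PRICE_PER_1000_RESULTS_USD / 1000


scenarios = [
    ("Quick city snapshot", 200),
    ("Standard market scan", 1000),
    ("Full dual-portal sweep", 2000),
    ("Daily incremental monitor (upper bound)", 50),
]
for name, count in scenarios:
    print(f"{name}: ${estimate_result_cost(count):.3f}")
```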

### Tips and advanced options

**Use incremental mode for daily alerts** — set `incrementalMode: true` and schedule the Actor to run each morning. You will only receive listings that appeared since the previous run. Pair with an [Apify webhook](https://docs.apify.com/integrations/webhooks) to push new results directly to Slack, email, or your CRM.
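
The idea behind incremental filtering can be illustrated as follows. This is a sketch of the concept, not the Actor's implementation: the README does not say how seen listings are stored or whether the comparison covers only the previous run or all earlier runs. Here, IDs are accumulated across runs, and `filter_new` is a hypothetical helper.

```python
def filter_new(current: list[dict], seen_ids: set[str]) -> tuple[list[dict], set[str]]:
    """Return the listings whose id is not in seen_ids, plus the updated set of seen IDs."""
    new_items = [item for item in current if item["id"] not in seen_ids]
    # Assumption: IDs accumulate across runs rather than being reset each run.
    updated = seen_ids | {item["id"] for item in current}
    return new_items, updated


# Two simulated daily runs with hypothetical listing IDs.
day1 = [{"id": "otodom-1"}, {"id": "morizon-2"}]
day2 = [{"id": "otodom-1"}, {"id": "otodom-3"}]

new_day1, seen = filter_new(day1, set())
new_day2, seen = filter_new(day2, seen)
print([item["id"] for item in new_day2])
```

The same pattern is useful on the receiving side of a webhook if your downstream system needs its own guard against re-processing a listing.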

**Set realistic `maxListings` values** — in major cities like Warsaw or Kraków there can be thousands of listings. Start with `100`–`500` to validate your filters before running a full scan.

**Use proxies for high-volume runs** — configuring `proxyConfiguration` with Apify Residential Proxies prevents rate limiting when collecting thousands of listings. For runs under ~200 pages this is usually not necessary.
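
As an illustration, a run input with residential proxies enabled might look like the dict below. The `useApifyProxy` and `apifyProxyGroups` keys follow the proxy input format commonly used by Apify Actors; whether this Actor accepts exactly this shape is an assumption, so check the input form in Apify Console before relying on it.

```python
import json

# Assumed shape of the proxy settings; the other values are illustrative.
run_input = {
    "location": "warszawa",
    "maxListings": 500,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Serialize to the JSON you would paste into the Input tab or send via the API.
print(json.dumps(run_input, indent=2))
```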

**Combine with Google Sheets or Airtable** — use [Apify Integrations](https://apify.com/integrations) to automatically sync new results into a spreadsheet after each run, with zero code.

**Filter aggressively to reduce cost** — combining `priceMin`, `priceMax`, and `areaMin` narrows results to exactly your target segment, reducing both run time and compute units consumed.

### FAQ, disclaimers, and support

**Is scraping Otodom and Morizon legal?**
This Actor collects publicly available listing data in the same way a user would browse the sites. Always review each portal's Terms of Service before using data for commercial purposes. The Actor implements rate limiting and respects `robots.txt` to avoid placing undue load on target servers.

**Why are some fields `null`?**
Not all portals expose every data point. GPS coordinates and `datePosted` are reliably available on Otodom but may be absent on Morizon for certain listing types.

**When is Gratka.pl support coming?**
Gratka.pl requires a JavaScript-capable crawler (Playwright) because its listing pages are rendered client-side. This is actively in development and will be added in an upcoming release.

**The Actor returned fewer results than `maxListings`. Why?**
This can happen if the portal has fewer listings than requested for your search criteria, or if some pages were blocked and retries were exhausted. Try enabling `proxyConfiguration` for more reliable results.

**I need a custom solution or have a feature request.**
Open an issue on the [Issues tab](../../issues). For enterprise use cases or custom portal integrations, contact us directly.

# Actor input Schema

## `startUrls` (type: `array`):

URLs to start with.

## `maxRequestsPerCrawl` (type: `integer`):

Maximum number of requests that can be made by this crawler.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://crawlee.dev"
    }
  ],
  "maxRequestsPerCrawl": 100
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://crawlee.dev"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("czub_w/pl-real-estate-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [{ "url": "https://crawlee.dev" }] }

# Run the Actor and wait for it to finish
run = client.actor("czub_w/pl-real-estate-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://crawlee.dev"
    }
  ]
}' |
apify call czub_w/pl-real-estate-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=czub_w/pl-real-estate-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Polish Real Estate Multi-Portal Aggregator",
        "description": "Scrape Otodom and Morizon in one run. Automatic cross-portal deduplication, unified schema, incremental mode. $2.50/1,000 results.",
        "version": "0.0",
        "x-build-id": "Mn4ru4PT9itirZdSh"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/czub_w~pl-real-estate-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-czub_w-pl-real-estate-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/czub_w~pl-real-estate-scraper/runs": {
            "post": {
                "operationId": "runs-sync-czub_w-pl-real-estate-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/czub_w~pl-real-estate-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-czub_w-pl-real-estate-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "URLs to start with.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxRequestsPerCrawl": {
                        "title": "Max Requests per Crawl",
                        "type": "integer",
                        "description": "Maximum number of requests that can be made by this crawler.",
                        "default": 100
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
