Extract real estate listing data from sreality.cz. Scrape structured data from the website sitemap or from arbitrary real estate listing URLs on sreality.cz.
Default value of this property is "https://www.sreality.cz/hledani/filtr"
Sitemap start URL
sitemapStartUrl string Optional
Sitemap URL to crawl from
Default value of this property is "https://www.sreality.cz/sitemap.xml"
Parse sitemap
parseSitemap boolean Optional
If true, the crawler will only parse the articles from the sitemap. If false, the crawler will parse the article URLs from the pagination pages.
Default value of this property is false
Parse articles
parseArticleUrls array Optional
Overrides the default behavior of extracting article URLs from the sitemap. If provided, the crawler will only parse provided article URLs.
Default value of this property is []
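As a minimal sketch of how the three crawl-mode properties above fit together, the snippet below builds an input object in Python starting from the documented defaults. The helper name `build_input` and the listing URL are hypothetical placeholders, not part of the scraper.

```python
# Defaults copied from the property descriptions above.
DEFAULT_INPUT = {
    "sitemapStartUrl": "https://www.sreality.cz/sitemap.xml",
    "parseSitemap": False,
    "parseArticleUrls": [],
}


def build_input(**overrides):
    """Return the defaults with the given properties overridden (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULT_INPUT)
    if unknown:
        raise KeyError(f"Unknown input properties: {sorted(unknown)}")
    return {**DEFAULT_INPUT, **overrides}


# Placeholder URL for illustration only; substitute a real listing URL.
run_input = build_input(parseArticleUrls=["https://www.sreality.cz/detail/EXAMPLE"])
```

With a non-empty `parseArticleUrls`, the description above says the crawler parses only those URLs, so the sitemap settings are left at their defaults here.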
Include all countries
includeAllCountries boolean Optional
If true, all countries will be included. If false, only the countries in the includeCountries list plus Czechia will be included.
Default value of this property is false
Countries to include
includeCountries array Optional
Countries to include. By default, all countries except Czechia are ignored. See ISO_3166-1_alpha-2 for the list of country codes. WARNING: This property won't speed up the search for the required articles, since the data still has to be scraped from each URL.
Default value of this property is ["CZ"]
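The interaction between includeAllCountries and includeCountries can be sketched as a small predicate. This is only an illustration of the documented rule (Czechia is always kept); the function name and the case-insensitive comparison are assumptions, not the scraper's actual code.

```python
def is_country_included(country_code, include_all_countries=False, include_countries=("CZ",)):
    """Decide whether a listing's country passes the country filter (illustrative)."""
    if include_all_countries:
        return True
    # Documented rule: the includeCountries list plus Czechia.
    # Assumption: codes are compared case-insensitively.
    allowed = {code.upper() for code in include_countries} | {"CZ"}
    return country_code.upper() in allowed
```

For example, `is_country_included("SK")` is false under the defaults, while `is_country_included("SK", include_countries=["SK"])` is true.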
Ignore URLs
ignoreUrls array Optional
If provided, the crawler will ignore these article URLs.
Default value of this property is []
Dataset Name
datasetName string Optional
Name of the dataset to store the scraped data in.
Constraint estate types
constraintEstateTypes array Optional
If provided, the crawler will only parse articles with these estate types. Possible values (mapped to sreality native enums): APARTMENT (byt), HOUSE (dům), LAND (pozemek), GARAGE (ostatní -> garáž), OFFICE (komerční), NON_RESIDENTIAL (ostatní -> all except for garáž and mobilní domek), COTTAGE (dům -> chalupa), SUMMER_HOUSE (dům -> chata), OTHER (ostatní -> jiné nemovitosti or mobilní domek).
Default value of this property is []
Constraint offer types
constraintOfferTypes array Optional
If provided, the crawler will only parse articles with these offer types. Possible values (mapped to sreality native enums): SALE (prodej), RENT (pronájem), COLIVING (pronájem -> pokoj).
Default value of this property is []
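Because both constraint properties accept only a fixed set of values, it can help to check an input before starting a run. The sketch below uses the value names listed above; the helper `validate_constraints` is hypothetical and not part of the scraper.

```python
# Allowed values as listed in the property descriptions above.
ESTATE_TYPES = {
    "APARTMENT", "HOUSE", "LAND", "GARAGE", "OFFICE",
    "NON_RESIDENTIAL", "COTTAGE", "SUMMER_HOUSE", "OTHER",
}
OFFER_TYPES = {"SALE", "RENT", "COLIVING"}


def validate_constraints(estate_types=(), offer_types=()):
    """Raise ValueError if any constraint value is not a documented one."""
    bad_estate = sorted(set(estate_types) - ESTATE_TYPES)
    bad_offer = sorted(set(offer_types) - OFFER_TYPES)
    if bad_estate or bad_offer:
        raise ValueError(f"Unknown estate types {bad_estate}, offer types {bad_offer}")
    return {
        "constraintEstateTypes": list(estate_types),
        "constraintOfferTypes": list(offer_types),
    }


constraints = validate_constraints(["APARTMENT", "HOUSE"], ["RENT"])
```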
Constraint layouts
constraintLayouts array Optional
If provided, the crawler will only parse apartment articles with these layouts. Possible values (mapped to sreality native enums): ONE (mapped to both "1+kk" and "pokoj"), ONE_PLUS_ONE, TWO, TWO_PLUS_ONE, THREE, THREE_PLUS_ONE, FOUR, FOUR_PLUS_ONE, FIVE, FIVE_PLUS_ONE, SIX, SIX_PLUS_ONE, SEVEN, SEVEN_PLUS_ONE (the last four are all mapped to "6 a více"), OTHER (mapped to "atypický").
Default value of this property is []
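The layout mapping can be illustrated with a small lookup function. Only the mappings for ONE, SIX through SEVEN_PLUS_ONE, and OTHER are stated above; the labels produced for the remaining values ("2+kk", "3+1", and so on) are an assumption extrapolated from the ONE -> "1+kk" pattern, and the function name is hypothetical.

```python
NUMBER_WORDS = {"ONE": 1, "TWO": 2, "THREE": 3, "FOUR": 4, "FIVE": 5}


def native_layouts(layout):
    """Map an input layout value to sreality-style layout labels (illustrative)."""
    if layout == "ONE":
        return ["1+kk", "pokoj"]      # stated in the description
    if layout == "OTHER":
        return ["atypický"]           # stated in the description
    if layout.startswith(("SIX", "SEVEN")):
        return ["6 a více"]           # stated in the description
    # Assumption: X maps to "N+kk" and X_PLUS_ONE maps to "N+1".
    base, _, suffix = layout.partition("_PLUS_")
    rooms = NUMBER_WORDS[base]        # KeyError for values outside the documented set
    return [f"{rooms}+1" if suffix == "ONE" else f"{rooms}+kk"]
```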
Location
location string Optional
WARNING: Set parseSitemap to false to use this option. Location to filter the listings by. The value is used to fuzzy-search the location on the landing page, and the first suggested location is used.
Min price
minPrice integer Optional
Min price to filter the listings by. If set, listings without a listed price are ignored.
Max price
maxPrice integer Optional
Max price to filter the listings by.
Min area
minArea integer Optional
Min area to filter the listings by. If set, listings without a listed area are ignored.
Max area
maxArea integer Optional
Max area to filter the listings by.
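The price and area filters follow the same pattern, which the sketch below expresses as one range check. The documented part is that a set minimum drops listings without a value; treating the bounds as inclusive and keeping value-less listings when only a maximum is set are assumptions, and `in_range` is a hypothetical name.

```python
from typing import Optional


def in_range(value: Optional[int],
             minimum: Optional[int] = None,
             maximum: Optional[int] = None) -> bool:
    """Return True if a listing's price or area passes the min/max filter."""
    if value is None:
        # Documented: with a minimum set, listings without a value are ignored.
        # Assumption: without a minimum, a missing value is kept.
        return minimum is None
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True
```

For example, `in_range(None, minimum=1_000_000)` is false, while `in_range(3_500_000, minimum=1_000_000, maximum=5_000_000)` is true.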
Published from
publishedFrom string Optional
Published from date to filter the listings by. Format: YYYY-MM-DD
Published to
publishedTo string Optional
Published to date to filter the listings by. Format: YYYY-MM-DD
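A minimal sketch of the publication-date window, using the YYYY-MM-DD format given above. Whether the scraper treats the bounds as inclusive is not stated, so inclusivity here is an assumption, as is the function name.

```python
from datetime import date
from typing import Optional


def published_in_window(published: str,
                        published_from: Optional[str] = None,
                        published_to: Optional[str] = None) -> bool:
    """Check a YYYY-MM-DD publication date against optional from/to bounds."""
    day = date.fromisoformat(published)
    # Assumption: both bounds are inclusive.
    if published_from and day < date.fromisoformat(published_from):
        return False
    if published_to and day > date.fromisoformat(published_to):
        return False
    return True
```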
Max articles
maxArticles integer Optional
Max number of articles to scrape. If set, the crawler will stop after scraping this number of articles. Set to -1 to scrape all articles.
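The maxArticles semantics, including the -1 sentinel, can be sketched as follows. `limit_articles` is a hypothetical helper operating on any iterable of listings; it is not the scraper's actual stopping logic.

```python
from itertools import islice


def limit_articles(articles, max_articles):
    """Return at most max_articles items; -1 means no limit (as documented)."""
    if max_articles == -1:
        return list(articles)
    # islice stops after max_articles items without consuming the rest.
    return list(islice(articles, max_articles))
```

Because `islice` stops early, a lazily generated sequence of listings is not fully consumed once the limit is reached.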