Zillow Real Estate Scraper
Deprecated
This Actor is deprecated

This Actor is unavailable because the developer has decided to deprecate it.

petr_cermak/zillow-api-scraper

Our free Zillow scraper lets you extract data about properties for sale and rent on Zillow using the Zillow API, but with no daily call limits. Scrape millions of listings and download your data as HTML, JSON, CSV, Excel, XML, and RSS feed.

Finding full results, but not extracting

Closed

immovable_leopard opened this issue
2 years ago

I'm having trouble getting all of the listings out of the scraper. It indicates all results are found, and then only returns a portion of the listings:

2022-02-23T15:59:41.038Z INFO Found 324 results, pagination pages will be enqueued.
2022-02-23T15:59:41.040Z INFO Extracted total 233
2022-02-23T15:59:41.159Z INFO PuppeteerCrawler: All the requests from request list and/or request queue have been processed, the crawler will shut down.
2022-02-23T15:59:41.515Z INFO PuppeteerCrawler: Final request statistics: {"requestsFinished":3,"requestsFailed":0,"retryHistogram":[3],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":60204,"requestsFinishedPerMinute":2,"requestsFailedPerMinute":0,"requestTotalDurationMillis":180611,"requestsTotal":3,"crawlerRuntimeMillis":105913}
2022-02-23T15:59:41.517Z INFO Done with 233 listings!

I am starting from a search URL and have jacked up the zoom levels. I previously had good success getting most of the listings, but a few weeks ago Zillow's ReCaptcha caused some infinite-loop runs that wiped out my CU hard limits, as I didn't have timeouts set. Since then, I have struggled to get a good return on listings.
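The gap reported in the log (324 found, 233 extracted) can be checked automatically after each run. The following is a minimal sketch, assuming the log keeps the exact "Found N results" and "Extracted total M" wording shown in this issue; `summarizeRun` is a hypothetical helper, not part of the Actor.

```javascript
// Hypothetical helper: compare the "Found N results" and "Extracted total M"
// log lines to measure how many listings went missing in a run.
function summarizeRun(logText) {
  const found = logText.match(/Found (\d+) results/);
  const extracted = logText.match(/Extracted total (\d+)/);
  // Return null when the log does not contain both lines in the assumed format.
  if (!found || !extracted) return null;

  const foundCount = Number(found[1]);
  const extractedCount = Number(extracted[1]);
  return {
    found: foundCount,
    extracted: extractedCount,
    missing: foundCount - extractedCount,
    // Fraction of found listings that were actually extracted.
    coverage: foundCount === 0 ? 1 : extractedCount / foundCount,
  };
}

// The two relevant lines from the log in this issue.
const sampleLog = [
  '2022-02-23T15:59:41.038Z INFO Found 324 results, pagination pages will be enqueued.',
  '2022-02-23T15:59:41.040Z INFO Extracted total 233',
].join('\n');

const summary = summarizeRun(sampleLog);
console.log(summary);
```

For the run above, this reports 91 missing listings, or roughly 72% coverage.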

immovable_leopard

2 years ago

{
  "maxItems": 100000,
  "simple": true,
  "maxLevel": 1000000,
  "proxyConfiguration": { "useApifyProxy": true },
  "maxRetries": 1000,
  "handlePageTimeoutSecs": 7200,
  "stealth": false,
  "debugLog": false,
  "extendOutputFunction": "async ({ item, data}) => {\nitem.lotSize=data.lotSize\nitem.isListedByOwner=data.isListedByOwner\nitem.photos = undefined;\nitem.currency = undefined;\nitem.address.community = undefined;\nitem.address.subdivision = undefined;\nitem.address.neighborhood = undefined;\nitem.description = undefined;\nreturn item;\n}",
  "extendScraperFunction": "async ({ label, page, request, customData, Apify }) => {\n if (label === 'SETUP') {\n // before crawler.run()\n } else if (label === 'GOTO') {\n // inside handleGotoFunction\n } else if (label === 'HANDLE') {\n // inside handlePageFunction\n } else if (label === 'FINISH') {\n // after crawler.run()\n }\n}",
  "startUrls": [
    { "url": "https://www.zillow.com/wi/?searchQueryState=%7B%22pagination%22%3A%7B%7D%2C%22usersSearchTerm%22%3A%22WI%22%2C%22mapBounds%22%3A%7B%22west%22%3A-97.89712721874999%2C%22east%22%3A-80.62661940624999%2C%22south%22%3A41.83898737154406%2C%22north%22%3A50.77443091010166%7D%2C%22regionSelection%22%3A%5B%7B%22regionId%22%3A60%2C%22regionType%22%3A2%7D%5D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22lot%22%3A%7B%22min%22%3A4356000%7D%2C%22sort%22%3A%7B%22value%22%3A%22globalrelevanceex%22%7D%2C%22ah%22%3A%7B%22value%22%3Atrue%7D%7D%2C%22isListVisible%22%3Atrue%2C%22mapZoom%22%3A6%7D" }
  ],
  "type": "all",
  "customData": {}
}
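The start URL in the input above carries its search filters as percent-encoded JSON in the `searchQueryState` query parameter, which makes it hard to see what is actually being requested. Here is a minimal sketch for decoding it, using only the standard `URL` and `JSON` globals. `decodeSearchQueryState` is a hypothetical helper, and the example URL is a shortened stand-in built from two values in the input (`mapZoom` and the `lot` minimum), not the full URL.

```javascript
// Hypothetical helper: pull the searchQueryState parameter out of a Zillow
// search URL and parse it into a plain object for inspection.
function decodeSearchQueryState(searchUrl) {
  // searchParams.get() already percent-decodes the value.
  const raw = new URL(searchUrl).searchParams.get('searchQueryState');
  if (raw === null) return null; // the URL has no searchQueryState parameter
  return JSON.parse(raw);
}

// Shortened stand-in for the start URL above (assumption: same encoding scheme).
const exampleState = { filterState: { lot: { min: 4356000 } }, mapZoom: 6 };
const exampleUrl =
  'https://www.zillow.com/wi/?searchQueryState=' +
  encodeURIComponent(JSON.stringify(exampleState));

const decoded = decodeSearchQueryState(exampleUrl);
console.log(decoded.mapZoom, decoded.filterState.lot.min);
```

Running the same helper on the full start URL would expose the map bounds, region selection, and filter state used for the run, which can help when comparing runs that return different listing counts.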

User avatar

Hey, I checked and it does the same for me. I am asking the dev to look into it.

illuminating_spider

2 years ago

I am having the same issue, and I'm starting from a specified city instead of a URL.

User avatar

Hey, we are looking into it. Hopefully we will find a solution soon.

current_hatchet

2 years ago

I had this problem last night on two sample runs. Thanks!

conancallahan

2 years ago

I ran into the same issue today.

User avatar

This should be fixed in the latest version, released yesterday.

Developer
Maintained by Community