Zoopla.co.uk Scraper
3 days trial then $30.00/month - No credit card required now


dhrumil/zoopla-scraper

Scrape Zoopla.co.uk to crawl millions of sale and rental property listings from the United Kingdom. The scraper also lets you monitor specific search listings for new or updated properties. You can provide multiple search result listings to scrape or monitor.

Commercial URLs not parsing

Closed

quiche opened this issue
2 months ago

Hi Dhrumil - thanks for jumping on this so quickly. Unfortunately you have closed the previous issue so I cannot reply in the same ticket. The following runs have both failed (input is a single commercial URL, for sale in the first and to rent in the second...

https://console.apify.com/actors/teOj85DgYAQSZUKeJ/runs/bkFm3FI891Bi9BndQ
https://console.apify.com/actors/teOj85DgYAQSZUKeJ/runs/qtmvz6lPjXxKV6qcL

User avatar

This was an update specific to commercial properties. I have applied it and it should work now. Please confirm and I will close the issue after that.

quiche

2 months ago

I'm still getting the same error. I assume I don't need to do anything to update the scraper and can just start a new run, correct?

2024-03-26T21:15:36.400Z INFO Page opened. {"url":"https://www.zoopla.co.uk/to-rent/commercial/property/AB10/?page_size=100"}
2024-03-26T21:15:44.673Z ERROR PuppeteerCrawler: handleRequestFunction failed, reclaiming failed request back to the list or queue {"url":"https://www.zoopla.co.uk/to-rent/commercial/property/AB10/?page_size=100","retryCount":1,"id":"wARZrT3jp0i6HRJ"}
2024-03-26T21:15:44.676Z Error: Evaluation failed: TypeError: Cannot read properties of undefined (reading 'split')
2024-03-26T21:15:44.678Z at

https://console.apify.com/actors/teOj85DgYAQSZUKeJ/runs/SNQxCB9PkdPu7dBeN#output
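
A `Cannot read properties of undefined (reading 'split')` error usually means `.split` was called on a value that the page did not provide. The actor's parsing code is not shown in this thread, so the following is only a minimal sketch of a defensive guard. The helper name `safeSplit` and the `priceText` field are hypothetical, and the idea that commercial pages lack a field present on residential pages is an assumption, not something confirmed here.

```javascript
// Hypothetical helper: the actor's real parsing code is not shown in this thread.
function safeSplit(value, separator) {
  // Return an empty array instead of throwing when the value is missing.
  if (typeof value !== 'string') return [];
  return value.split(separator).map((part) => part.trim());
}

// Assumed data shapes, for illustration only.
const residential = { priceText: '£1,200 pcm' };
const commercial = {}; // no priceText field

console.log(safeSplit(residential.priceText, ' '));
console.log(safeSplit(commercial.priceText, ' '));
```

With a guard like this, a missing field yields an empty result that the caller can log or skip, rather than failing the whole request.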

User avatar

Sorry, a new version containing this fix was still pending publication. It is published now. No, you don't need to do anything specific; by default, a run always uses the latest version.

quiche

2 months ago

Ok - that's improved it but still not quite there. No longer getting the split error but it isn't picking up any properties on the list (there should be 51):

2024-03-27T08:25:35.848Z ACTOR: Pulling Docker image of build yoNTqkuuhY6bt7Wwt from repository.
2024-03-27T08:26:10.679Z ACTOR: Creating Docker container.
2024-03-27T08:26:11.218Z ACTOR: Starting Docker container.
2024-03-27T08:26:11.813Z Starting X virtual framebuffer using: Xvfb :99 -ac -screen 0 1920x1080x24+32 -nolisten tcp
2024-03-27T08:26:11.815Z Executing main command
2024-03-27T08:26:13.571Z INFO System info {"apifyVersion":"2.3.1","apifyClientVersion":"2.6.1","osType":"Linux","nodeVersion":"v16.20.2"}
2024-03-27T08:26:14.113Z INFO Starting the crawl.
2024-03-27T08:26:14.165Z INFO PuppeteerCrawler:AutoscaledPool: state {"currentConcurrency":0,"desiredConcurrency":2,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":null},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":null},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":null},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":null}}}
2024-03-27T08:26:18.019Z INFO Page opened. {"url":"https://www.zoopla.co.uk/to-rent/commercial/property/AB10/?page_size=100"}
2024-03-27T08:26:18.024Z INFO Total pages: null
2024-03-27T08:26:18.196Z INFO properties on list page : 0
2024-03-27T08:26:18.293Z INFO PuppeteerCrawler: All the requests from request list and/or request queue have been processed, the crawler will shut down.
2024-03-27T08:26:18.741Z INFO PuppeteerCrawler: Final request statistics: {"requestsFinished":1,"requestsFailed":0,"retryHistogram":[1],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":4006,"requestsFinishedPerMinute":13,"requestsFailedPerMinute":0,"requestTotalDurationMillis":4006,"requestsTotal":1,"crawlerRuntimeMillis":4630}
2024-03-27T08:26:18.743Z INFO Crawl finished.

User avatar

The page was getting blocked. The scraper now rotates proxies and retries when this happens. Please try again.
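
The rotate-and-retry approach described in this reply can be sketched roughly as follows. This is not the actor's actual code: `scrapeWithProxyRotation`, `fetchListings`, and the proxy URLs are hypothetical stand-ins, and treating an empty listing page as a sign of blocking is an assumption based on the "properties on list page : 0" log line earlier in the thread.

```javascript
// Sketch only: fetchListings and the proxy URLs are hypothetical stand-ins,
// not the actor's real code or any Apify API.
async function scrapeWithProxyRotation(url, proxies, fetchListings, maxAttempts = proxies.length) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Cycle through the proxy list, one proxy per attempt.
    const proxy = proxies[attempt % proxies.length];
    const listings = await fetchListings(url, proxy);
    // Assumption: an empty list means the page was blocked, so try the next proxy.
    if (listings.length > 0) {
      return { listings, proxy, attempts: attempt + 1 };
    }
  }
  throw new Error(`No listings found for ${url} after ${maxAttempts} attempts`);
}

// Fake fetcher for demonstration: the first proxy is treated as blocked.
const fakeFetch = async (url, proxy) =>
  (proxy === 'http://proxy-a.example' ? [] : ['listing-1', 'listing-2']);

scrapeWithProxyRotation(
  'https://www.zoopla.co.uk/to-rent/commercial/property/AB10/?page_size=100',
  ['http://proxy-a.example', 'http://proxy-b.example'],
  fakeFetch,
).then((result) => console.log(result.attempts, result.proxy));
```

One caveat with this heuristic: a search that genuinely has no properties would also exhaust all attempts, so a real implementation would probably want a more specific block signal than an empty list.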

quiche

2 months ago

Great - 51 pages processed as expected. I'll try a larger batch tomorrow. Many thanks - will close this issue.

Developer
Maintained by Community
Actor metrics
  • 6 monthly users
  • 91.7% runs succeeded
  • 0.6 days response time
  • Created in Dec 2022
  • Modified 10 days ago