Web Scraper
Crawls arbitrary websites using the Chrome browser and extracts data from pages using JavaScript code. The Actor supports both recursive crawling and lists of URLs and automatically manages concurrency for maximum performance. This is Apify's basic tool for web crawling and scraping.
It looks like it's crawling the page correctly, but I can't figure out why this error is occurring, and I'd prefer to preserve my usage.
https://console.apify.com/actors/moJRLRc85AitArpNN/runs/An5339u0xNa1UKP7D#log
Hello, and thank you for your interest in this Actor!
The issue you describe seems to appear randomly. It might be related to the asynchronous requests you are making inside of the Page Function. Unfortunately, I cannot provide much more help with your custom code, as I don't know what you are trying to achieve.
As a quick remedy, you can also bump the requestHandler timeout by increasing the value in the Performance and Limits > Page Function timeout input option.
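Raising the limit buys time, but a single hung request can still use all of it. One way to keep an asynchronous call inside the Page Function from stalling the whole handler is to race it against a timer. The sketch below is only an illustration under assumptions: `withTimeout` is a hypothetical helper name, not part of the Actor's API.

```javascript
// Hypothetical helper: rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
    });
    // Whichever promise settles first wins; the timer is always cleared afterwards.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Note that the race only stops waiting; it does not cancel the underlying request. In Web Scraper the Page Function runs in the browser, so a helper like this would need to be declared inside the Page Function itself.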
By the way - the website you are scraping seems completely server-side rendered (and static, i.e., without client-side JS). This means you can process it with our Cheerio Scraper as well. This Actor is much faster than Web Scraper, as it doesn't use web browsers to load the page (it uses a simple HTTP request and an HTML parser instead). I see that most of your custom code uses jQuery - migrating this to Cheerio should be fairly easy, as Cheerio supports a fairly comprehensive subset of jQuery syntax.
Migrating to Cheerio Scraper should give you your results much faster (up to 20x speed improvement) and definitely save you some platform credits as well.
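To give a sense of how little changes in such a migration, here is a rough sketch of a Cheerio Scraper Page Function. The `$` and `request` properties of `context` are provided by the Actor; the selectors (`h1.entry-title`, `.event-address`) are made-up placeholders, since the site's actual markup is not shown in this thread.

```javascript
// Sketch of a Cheerio Scraper Page Function; the selectors are hypothetical placeholders.
async function pageFunction(context) {
    // `$` is a Cheerio handle to the parsed HTML and is used much like jQuery.
    const { $, request } = context;

    const title = $('h1.entry-title').first().text().trim();
    const address = $('.event-address').first().text().trim();

    // The returned object is stored as the result item for this page.
    return { url: request.url, title, address };
}
```

Most jQuery-style traversal and text extraction carries over unchanged; anything that relied on the live browser DOM (events, computed styles, client-side rendering) does not.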
I'll keep this issue open - feel free to ask additional questions if you have any - or close this issue, if you don't. Cheers!
I am scraping this WordPress blog for events around my city and converting them into events with scheduling details and lat/long coordinates to display the events on a map. The async request I'm making geocodes each event's address so it can be placed on the map.
I will look into parsing the site with Cheerio, but I am guessing I'd then have to use another Actor to geocode the address, as I would not be able to do this in Cheerio, right?
Hello, I figured out how to do it with Cheerio (their documentation is horrible) and it works great! Thanks.
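For readers with the same question: Cheerio Scraper's Page Function runs in Node.js, so it can await an HTTP request of its own. The following is a minimal sketch of such a geocoding call, not the code used in this thread. It assumes a runtime with a global `fetch` (Node.js 18 or later); the `geocodeAddress` name, the `geocoder.example.com` endpoint, and the `{ lat, lon }` response shape are all hypothetical placeholders for whichever geocoding service is actually used.

```javascript
// Hypothetical geocoding helper; the endpoint and its { lat, lon } response are placeholders.
async function geocodeAddress(address, fetchImpl = fetch) {
    const url = `https://geocoder.example.com/search?q=${encodeURIComponent(address)}`;
    const response = await fetchImpl(url);
    if (!response.ok) {
        // Return null instead of throwing, so one failed lookup does not fail the whole page.
        return null;
    }
    const { lat, lon } = await response.json();
    return { lat: Number(lat), lon: Number(lon) };
}
```

Inside the Page Function this could be called as `const coords = await geocodeAddress(address);` and the coordinates merged into the returned result. The `fetchImpl` parameter exists only so the request function can be swapped out, for example in a test.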
Actor Metrics
2.6k monthly users
220 stars
>99% runs succeeded
44 days response time
Created in Mar 2019
Modified 3 months ago