Cheerio Scraper

apify/cheerio-scraper

Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.

Add waiting time

Closed

raphael123 opened this issue
6 months ago

Hi there, would it be possible to add a waiting time, in seconds, before the page is crawled? Some content on the site I'm trying to scrape takes about 5 s to load, and the scraper isn't picking it up currently.

thanks!

Hello @raphael123 and thank you for your interest in this Actor!

Unfortunately, Cheerio Scraper will never load the dynamic content you are describing, because it cannot execute client-side JavaScript. This is what makes Cheerio Scraper exceptionally fast, but it also means it can only crawl pages that don't rely on client-side rendering (e.g., pages that fetch extra data via asynchronous requests after the initial page load).
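
To illustrate why a waiting time would not help here, below is a minimal, dependency-free sketch. The page markup, the "products" id, and the innerTextById helper are all invented for illustration, and the regular expression is only a crude stand-in for a real HTML parser such as Cheerio. The point is that the HTML returned by a raw HTTP request is a fixed string: the script inside it is never executed, so the container stays empty no matter how long you wait.

```javascript
// The HTML as an HTTP-only scraper would receive it. The container is empty
// and would only be filled once a browser executes the script.
// (Hypothetical page, invented for this example.)
const rawHtml = `
<html>
  <body>
    <h1>Product list</h1>
    <div id="products"></div>
    <script>
      setTimeout(() => {
        document.getElementById('products').textContent = 'Loaded after 5 s';
      }, 5000);
    </script>
  </body>
</html>`;

// Crude stand-in for an HTML parser (assumption: real code would use Cheerio).
// Returns the inner text of the first element with the given id, or null.
function innerTextById(html, id) {
  const pattern = new RegExp(`<(\\w+)[^>]*\\bid="${id}"[^>]*>([\\s\\S]*?)</\\1>`);
  const match = html.match(pattern);
  return match ? match[2].trim() : null;
}

// Static content is present in the raw response...
const staticHeading = rawHtml.match(/<h1>([\s\S]*?)<\/h1>/)[1];

// ...but the dynamically rendered content is not: the element exists, yet it is empty.
const dynamicContent = innerTextById(rawHtml, 'products');

console.log({ staticHeading, dynamicContent });
```

Parsing the same string again five seconds later would give the same result, because nothing ever runs the script.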

Your best bet is to use one of our other Actors, Web Scraper. It's very similar to Cheerio Scraper, with one big difference: under the hood, Web Scraper uses an actual instance of Google Chrome, so it can render the web page as if you had just opened it in your own browser. With this Actor, you can wait for some time in the custom Page function to make sure all the content has loaded.
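
As a rough sketch of what such waiting could look like, here is a hypothetical Page function that polls the page until an element has content. The "#products" selector, the timeout values, and the waitUntil helper are assumptions made for this example, as is the use of context.request.url for the output. Web Scraper's context object also offers its own waiting helpers, which are usually preferable; check the Actor's README for their exact signatures. The helper is defined inside the function because the Page function runs in the browser and cannot see code defined outside it.

```javascript
// Hypothetical Page function for Web Scraper. It runs inside the browser,
// so the global `document` is available.
async function pageFunction(context) {
  // Polls `predicate` until it returns a truthy value or the timeout expires.
  const waitUntil = async (predicate, { timeoutMs = 10000, intervalMs = 250 } = {}) => {
    const deadline = Date.now() + timeoutMs;
    while (Date.now() < deadline) {
      const result = predicate();
      if (result) return result;
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
    throw new Error(`Condition not met within ${timeoutMs} ms`);
  };

  // Wait until the (assumed) container exists and has non-empty text.
  const element = await waitUntil(
    () => {
      const el = document.querySelector('#products');
      return el && el.textContent.trim() ? el : null;
    },
    { timeoutMs: 15000 },
  );

  // The returned object is what ends up in the results.
  return {
    url: context.request.url,
    text: element.textContent.trim(),
  };
}
```

Waiting for a specific condition is generally more reliable than sleeping for a fixed 5 seconds, since it finishes as soon as the content appears and fails loudly if it never does.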

Did this answer your question? I'll close this issue now, but feel free to ask any additional questions in case of any problems. Thanks!

Developer
Maintained by Apify
Actor metrics
  • 402 monthly users
  • 99.8% runs succeeded
  • 0.4 days response time
  • Created in Apr 2019
  • Modified about 1 month ago