
Cheerio Scraper
Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.
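To make "extracts data from the pages using Node.js code" concrete, here is a minimal sketch of what such a page function might look like. It assumes the actor calls a user-supplied async function with a context object exposing `$` (the Cheerio handle for the parsed page) and `request` (the current request); those names come from the actor's public documentation as best recalled, not from this thread, and the output fields `url` and `title` are purely illustrative.

```javascript
// Sketch only: assumes the context object carries `$` (Cheerio handle)
// and `request` (current request with a `url` property).
async function pageFunction(context) {
    const { $, request } = context;

    // Read the text of the <title> element and strip surrounding whitespace.
    const title = $('title').text().trim();

    // The returned object is what would be stored as one result record.
    return {
        url: request.url,
        title,
    };
}
```

Because the page is parsed from raw HTML and never rendered in a browser, selectors like the one above only see markup present in the initial response.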
Various Bugs
Good morning. I found various bugs in the latest Cheerio Scraper and have already spoken with Tsveta in chat. Besides these, forcing free accounts to use your proxy is a deal-breaker that will drive users away, but anyway:
If I use one of my custom HTTP proxies, I always get an error, even though the proxy is working correctly. The proxy returns error 400, so it seems that Apify is not making the proper calls.
Besides this, the interface blocks adding the socks5:// protocol, even though your documentation says it is supported: https://apify.com/apify/cheerio-scraper#proxy-configuration

Hello, we are aware of this problem and are looking into it now.
As for mandatory proxies, that will stay, but we will soon provide Apify proxies to free users indefinitely.
ollie10
Another bug I spotted: globs are not shown when you reload the page after saving, although they are saved correctly. If you can't see them and you don't know how to use the JSON editor, it's tricky for most users.

Thanks, this is a one-time issue since the Glob input type was just changed. It will be fixed ASAP.
ollie10
You're welcome, regards

SOCKS proxies are not supported; we will fix the docs. As I said, free accounts will very soon be granted free proxies permanently.
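Since SOCKS URLs are rejected, it can help to check custom proxy URLs before pasting them into the input. The sketch below is a hypothetical helper, not part of Apify or Crawlee: `checkProxyUrls` and its `allowedSchemes` parameter are invented for illustration, and the default of accepting only `http:` is an assumption based on this thread, so adjust it if other schemes turn out to be accepted.

```javascript
// Hypothetical helper (not an Apify API): reports which custom proxy URLs
// use a scheme outside the allowed list. The default list is an assumption.
function checkProxyUrls(proxyUrls, allowedSchemes = ['http:']) {
    return proxyUrls.map((raw) => {
        let protocol;
        try {
            // The built-in URL class exposes the scheme as e.g. "http:" or "socks5:".
            protocol = new URL(raw).protocol;
        } catch (err) {
            return { url: raw, ok: false, reason: 'not a valid URL' };
        }
        if (!allowedSchemes.includes(protocol)) {
            return { url: raw, ok: false, reason: `unsupported scheme ${protocol}` };
        }
        return { url: raw, ok: true, reason: null };
    });
}

// Example usage with placeholder hosts.
const report = checkProxyUrls([
    'http://user:pass@proxy.example.com:8000',
    'socks5://proxy.example.com:1080',
]);
console.log(report);
```

This only catches scheme problems up front; it does not explain the 400 responses reported for working HTTP proxies earlier in the thread.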
ollie10
A bit of a mess over there, sorry to say: features removed, documentation incorrect... I'm a hobbyist, but how can companies build a reliable product on this kind of footing?
Many thanks, I look forward to the developments.

You are right. Most devs build new actors with Crawlee and the Apify SDK directly, so the doc issue escaped us. As for the changes, we are aware that they were executed badly; this is not the norm.
Actor Metrics
599 monthly users
120 bookmarks
>99% runs succeeded
36 days response time
Created in Apr 2019
Modified 4 months ago