Cheerio Scraper

apify/cheerio-scraper

Crawls websites using raw HTTP requests, parses the HTML with the Cheerio library, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a high-performance alternative to apify/web-scraper for websites that do not require JavaScript.

Various Bugs

Closed

ollie10 opened this issue
a year ago

Good morning. I found various bugs in the latest Cheerio Scraper and spoke in chat with Tsveta about them. Besides this, forcing free accounts to use your proxy is a deal-breaker that will drive users away, but anyway:

If I try to use one of my custom HTTP proxies, I always get an error, even though the proxy is working correctly. The proxy returns error 400, so it seems that Apify is not making the proper calls.

Besides this, the interface blocks adding the socks5:// protocol, even though your documentation says it is supported: https://apify.com/apify/cheerio-scraper#proxy-configuration

User avatar

Hello, we are aware of this problem and looking into it now.

As for mandatory proxies, that will stay, but we will soon provide Apify proxies to free users indefinitely.

ollie10

a year ago

Another bug I spotted is that globs are not shown when you reload the page after saving, although they are saved correctly. If you can't see them and you don't know how to use the JSON editor, this is tricky for most users.

User avatar

Thanks, this is a one-time issue since the Glob input type was just changed. It will be fixed ASAP.

ollie10

a year ago

You're welcome, regards

User avatar

SOCKS proxies are not supported; we will fix the docs. As I said, very soon free accounts will be granted free proxies forever.
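
Since SOCKS proxies are not supported, it may help to check custom proxy URLs before pasting them into the Actor input. The snippet below is a minimal sketch, not part of Cheerio Scraper or the Apify SDK: the helper name `splitProxyUrls` is hypothetical, and the set of accepted schemes (`http:` and `https:`) is an assumption based on this thread rather than on a confirmed list.

```javascript
// Hypothetical helper; not part of Cheerio Scraper or the Apify SDK.
// Assumption: only http:// and https:// proxy URLs are accepted.
const SUPPORTED_PROXY_PROTOCOLS = new Set(['http:', 'https:']);

// Splits a list of proxy URLs into those with a supported scheme
// and those that are unparseable or use another scheme (e.g. socks5://).
function splitProxyUrls(proxyUrls) {
    const supported = [];
    const rejected = [];
    for (const raw of proxyUrls) {
        let protocol;
        try {
            protocol = new URL(raw).protocol;
        } catch {
            rejected.push(raw); // not a parseable URL
            continue;
        }
        if (SUPPORTED_PROXY_PROTOCOLS.has(protocol)) {
            supported.push(raw);
        } else {
            rejected.push(raw);
        }
    }
    return { supported, rejected };
}

// Example with placeholder proxy addresses.
const result = splitProxyUrls([
    'http://user:pass@proxy.example.com:8000',
    'socks5://127.0.0.1:1080',
]);
console.log('usable:', result.supported);
console.log('rejected:', result.rejected);
```

Only the URLs in `supported` would then go into the custom proxy list; anything in `rejected` would need an HTTP equivalent.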

ollie10

a year ago

A bit of a mess over there, sorry to say: features removed, documentation incorrect... I'm only a hobbyist, but how can companies build a reliable product on this kind of footing...

Many thanks, I look forward to the developments.

User avatar

You are right. Most devs build new Actors with Crawlee and the Apify SDK directly, so the doc issue slipped through. As for the changes, we are aware that they were executed badly; this is not the norm.

Developer
Maintained by Apify
Actor metrics
  • 400 monthly users
  • 99.8% runs succeeded
  • 0.4 days response time
  • Created in Apr 2019
  • Modified about 1 month ago