Similarweb Scraper

tri_angle/similarweb-scraper

A simple but powerful scraper for similarweb.com. Retrieve website popularity information and get it in JSON, XML, CSV, Excel, or HTML table format. Get data such as total visits, traffic sources, competitors, top countries, company info, etc.

Crawling errors

Closed

gainful_governor opened this issue
a month ago

Hi there, I've raised this previously and was told that it's not a bug, but I can't see how it isn't a bug when the run fails this many times in a row and takes 15+ minutes (when it usually takes ~1 minute).

tri_angle

Hi there, if you check the log of this run, you can see that the Actor is hitting captchas. You can decrease the number of retries in the input; the default is 10 retries.

gainful_governor

a month ago

Thanks for your reply. What is the best strategy to get around the captcha? Would repeated runs ever solve this? Is the captcha based on the fingerprint of the current IP + container?

tri_angle

You can start another run with a residential proxy, which works well (at least for now). Here is more info about dealing with anti-scraping protection, in case you consider building your own Actors in the future: https://docs.apify.com/academy/anti-scraping, and here is our Discord community: https://discord.com/invite/jyEM2PRvMU
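
For anyone starting runs from code, the two suggestions in this thread (fewer retries, residential proxy) could be combined roughly as in the sketch below. It is only an illustration, not the Actor's documented input: the field names `websites`, `maxRequestRetries`, and `proxyConfiguration`, the retry value of 3, and the `APIFY_TOKEN` environment variable are assumptions, so check the Actor's input tab for the real names. The `useApifyProxy` / `apifyProxyGroups` keys follow the usual Apify proxy input convention, and the run itself uses the `apify-client` Python package.

```python
import os

ACTOR_ID = "tri_angle/similarweb-scraper"


def build_run_input(websites, max_retries=3):
    """Build a run input with fewer retries and a residential proxy.

    "websites", "maxRequestRetries" and "proxyConfiguration" are assumed
    field names; verify them against the Actor's input schema.
    """
    return {
        "websites": list(websites),        # assumed field name
        "maxRequestRetries": max_retries,  # assumed field name; the thread says the default is 10
        "proxyConfiguration": {            # assumed field name
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }


def run_scraper(token, run_input):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    run_input = build_run_input(["example.com"], max_retries=3)
    token = os.environ.get("APIFY_TOKEN")  # assumed variable name
    if token:
        for item in run_scraper(token, run_input):
            print(item)
    else:
        # Without a token, just show the input that would be sent.
        print(run_input)
```

Residential proxy traffic is typically billed separately on Apify, so it may be worth trying a lower retry count first and switching proxies only if captchas persist.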

gainful_governor

a month ago

Thanks for your help :-) I'll try a residential proxy.

Developer
Maintained by Apify
Actor metrics
  • 122 monthly users
  • 24 stars
  • 99.9% runs succeeded
  • 6 hours response time
  • Created in May 2022
  • Modified 2 days ago