Advanced Glassdoor Scraper
This Actor is unavailable because the developer has decided to deprecate it.
The most advanced Glassdoor Scraper that you would ever need. Extract millions of companies, salaries, interviews, jobs, and reviews from Glassdoor. You can specify search terms, filters, list pages, and more! Extremely fast, with no limits. Super easy to use!
Hello. The crawler is constantly getting 403 responses.
Some run IDs: EYurJeeMnx7Qrbgco, AQC9zu6coBvVkV8I9, XBFey5hQXoeILA1tz
I've noticed that Cloudflare returns 403 by default based on the request headers. A curl request with no headers gets an instant 403.
However, this works: curl -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.70 Safari/537.36' -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3' -H 'accept-language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7' https://www.glassdoor.co.uk/index.htm
Could you please add headers like the ones above, or add an option to pass custom headers to your crawler?
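For anyone who wants to reproduce the check outside of curl, here is a minimal Python sketch of the same request using only the standard library. The helper names `build_request` and `fetch_status` are made up for illustration and are not part of the Actor. The header values are taken from the curl command above, and whether Cloudflare accepts them can change at any time.

```python
import urllib.error
import urllib.request

# Browser-like headers taken from the curl command in the post.
# The Accept value uses the standard "*/*" wildcard entry.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/78.0.3904.70 Safari/537.36"
    ),
    "Accept": (
        "text/html,application/xhtml+xml,application/xml;q=0.9,"
        "image/webp,image/apng,*/*;q=0.8,"
        "application/signed-exchange;v=b3"
    ),
    "Accept-Language": "de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7",
}


def build_request(url: str) -> urllib.request.Request:
    """Hypothetical helper: attach the browser-like headers to a GET request."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Hypothetical helper: return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 403 and similar; the code is still what we want to see.
        return err.code
```

Calling `fetch_status("https://www.glassdoor.co.uk/index.htm")` should show whether the headers alone get past the 403 from a given IP. It does not rotate proxies or fingerprints, so it only tests the header part of the problem.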
Hey there,
Thank you very much for reaching out and letting us know about your issue. The actor already supports custom headers, IP rotation, and fingerprint changes. The problem was that the actor focused on the .com domain only and did not support other Glassdoor domains. We just deployed a new version, and the actor can now handle the other Glassdoor domains without any problems. Please keep in mind that this actor requires qualified proxies; proxy quality is the main issue when dealing with Glassdoor's security measures.
Best