
Glassdoor Scraper (Pay Per Result)
Pay $5.00 for 1,000 results

Extract comprehensive employee reviews from Glassdoor effortlessly. Gather ratings, comments, and company responses to power your HR insights and market research.
Too few results
I am getting far fewer results than I had anticipated. The entity that I am trying to get data for has 1,000 pages under pagination. What is the best way to access all results? I tried increasing the max results to 5000, but that only got me 160. I tried putting in all direct links from pagination, but that errored out. Looking for some guidance here. Thanks!

Hey mate, send the run URL so I can take a look at your 160 results and see what the issue could be. Do you have a paid subscription?
arham.choudhury
I have a paid subscription to Apify. I have also subscribed to the $10/month "memo23/apify-glassdoor-reviews-scraper" product. This specific issue refers to the pay-per-use version of your actor. But I am having trouble getting a large volume of data from both your monthly rental and pay-per-use versions of the actor.
Here is a link to the task. https://console.apify.com/actors/tasks/4uDtdPLRQ2ImWRd7j/

Hey mate, thank you for your support and for using my actors. Don't send me the task, send me the 'run URL': go to your runs, click on the run that is 'problematic', and then copy the URL of that run...
arham.choudhury
Is this the run URL? https://console.apify.com/actors/runs/blrie05VyTNErfmnP#log

Yes. I see you had more luck with this run: https://console.apify.com/actors/runs/RXpu6WxTH5NECDvK6#log Maybe just resurrect it? I would say the key part here is the proxies: if they fail, you get a 403. I'll see if I can improve this.

I pushed some changes. Put a big number for 'maxItems' and let's see... :)
arham.choudhury
Thank you. I am going to give it another try tomorrow.
arham.choudhury
I tried another run. This time I asked for 7,000 rows of data. The actor timed out after ~5,800 rows.
Here is a link to that run: https://console.apify.com/view/runs/01ATRmh4bn0ew4ZY2

Yes, it timed out after one hour, which is the default max time of the actor. You can change that, but in this case you can simply continue the run once the hour is up: just click on the new ‘resurrect’ button and it will continue to scrape from where it stopped before… 😊
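For anyone sizing the timeout up front instead of resurrecting, here is a rough sketch of the arithmetic in Python. It uses only the numbers from this thread (about 5,800 rows in the one-hour default). The function name and the 25% safety margin are assumptions made for illustration, not settings of the actor, and real throughput will vary with proxy failures.

```python
import math


def estimate_timeout_secs(target_rows, observed_rows, observed_secs, safety_margin=1.25):
    """Estimate a run timeout (in seconds) from the throughput of an earlier run.

    safety_margin is a hypothetical buffer for slow pages and proxy retries.
    """
    if target_rows < 0 or observed_rows <= 0 or observed_secs <= 0:
        raise ValueError("row counts and duration must be positive")
    rows_per_sec = observed_rows / observed_secs
    return math.ceil(target_rows / rows_per_sec * safety_margin)


# Numbers from the run above: ~5,800 rows scraped before the 3,600 s timeout,
# with 7,000 rows requested.
timeout = estimate_timeout_secs(target_rows=7000, observed_rows=5800, observed_secs=3600)
print(f"Suggested timeout: {timeout} s (about {timeout / 60:.0f} minutes)")
```

With these inputs the estimate comes out at roughly an hour and a half, so a 7,000-row request would not be expected to fit in the default hour.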
arham.choudhury
Just to be clear... 'resurrect' will pick up where it left off? It won't start from the beginning? I don't want duplicate data.

Yes, it will continue from where it stopped 😊

I was thinking of adding 'logic' to this scraper so that it skips the items it has already scraped. Let me know if you would be interested in this. Also, if you have other requests or projects, please let me know...

Also, how was the run? Did you get everything?
arham.choudhury
I was able to get the 7,000 records that I asked for. So that's great! However, I do see a few duplicates (about 90 of them). So, I'll need to do some cleaning on the data. I think this will be workable. Thanks.
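A minimal sketch of that cleanup step in Python, assuming the dataset has been exported as a list of JSON objects. The field names in the sample (`reviewId`, `rating`) are hypothetical placeholders, not the actor's documented output schema. By default the function treats two records as duplicates only when every field matches; pass a `key` function if the export has a reliable unique identifier.

```python
import json


def dedupe_records(records, key=None):
    """Return records with duplicates removed, keeping the first occurrence.

    key: optional function mapping a record to a hashable identifier.
    Without it, the whole record (serialized with sorted keys) is compared.
    """
    seen = set()
    unique = []
    for record in records:
        fingerprint = key(record) if key else json.dumps(record, sort_keys=True, default=str)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


# Hypothetical sample rows; real exports will have different fields.
rows = [
    {"reviewId": 1, "rating": 4},
    {"rating": 4, "reviewId": 1},  # same content, different key order
    {"reviewId": 2, "rating": 5},
]
cleaned = dedupe_records(rows)
print(f"{len(rows) - len(cleaned)} duplicate(s) removed, {len(cleaned)} kept")
```

Comparing whole records is the conservative choice: it only drops rows that are identical in every field, so two distinct reviews are never merged by accident.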

great news my friend :)
Actor Metrics
25 monthly users
2 bookmarks
>99% runs succeeded
1.2 hours response time
Created in Dec 2024
Modified 2 days ago