
Puppeteer Scraper
Pricing
Pay per usage

Puppeteer Scraper
Crawls websites with headless Chrome and the Puppeteer library using provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and lists of URLs. Supports logging in to websites.
5.0 (5)
191
Total users: 8.5K
Monthly users: 988
Runs succeeded: >99%
Issues response: 42 days
Last modified: 2 months ago
Puppeteer Scraper Actor Not Executing Requests (Despite Valid Start URL & Page Function)
Closed
Hi team, I’m encountering a persistent issue with the Apify Puppeteer Scraper actor where no requests are being successfully executed, despite a valid startUrls input and a working pageFunction.
Here’s a summary of what I’ve tried:
• I’m entering my URL manually under startUrls, e.g.: https://www.glassdoor.com/Reviews/Google-Reviews-E9079.htm
• The pageFunction is simple and valid (e.g., page.title()).
• I’ve also attempted to configure headers and user-agent via preNavigationHooks.
• Despite this, I either get 0 requests processed, or in some cases, 403 Forbidden errors (even when trying with Apify Proxy).
• I already have login logic and key-value store in place but I’m unsure if this is being properly connected during execution.
• There’s no indication that the crawler is navigating or queuing additional URLs, even though the actor says it’s configured correctly.
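For reference, a minimal sketch of the kind of pre-navigation hook described in the list above. It assumes the Actor's preNavigationHooks input takes an array of async functions that receive the crawling context, and it uses Puppeteer's page.setUserAgent() and page.setExtraHTTPHeaders(). The buildHeaders helper, the setHeadersHook name, and the user-agent string are made up for illustration and are not taken from the original setup.

```javascript
// Hypothetical helper (not part of the Actor): builds extra headers for each navigation.
function buildHeaders(locale = 'en-US') {
    const lang = locale.split('-')[0];
    return {
        'Accept-Language': `${locale},${lang};q=0.9`,
    };
}

// Placeholder user-agent string (an assumption); use one that matches the browser you run.
const USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    + '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';

// A pre-navigation hook runs before the page navigates to the request URL.
async function setHeadersHook({ page }) {
    await page.setUserAgent(USER_AGENT);
    await page.setExtraHTTPHeaders(buildHeaders());
}

// Assumed shape of the input: an array of hook functions.
const preNavigationHooks = [setHeadersHook];
```

Setting headers this way only changes what the browser sends. It does not by itself get past anti-bot checks that look at browser fingerprints or IP reputation.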
Here’s the log excerpt from my most recent run:
requestsFinished: 0
requestsFailed: 1
403 status code
I’m not sure if it’s:
• A problem with the actor’s config parsing (e.g., from JSON input)
• A bug with pre-navigation hooks
• Or if Glassdoor is aggressively blocking even with proxies and login steps.
Could someone on the team help debug this setup, or confirm what I might be missing?
Hello, and thank you for the detailed report!
A 403 status code generally means the target website is actively blocking the request, even if you're using proxies or have login logic in place. Glassdoor is known to have strong anti-bot protection.
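As a rough illustration of how to read a failed run like the one above, here is a small sketch that maps an HTTP status code to its most likely cause. The classifyStatus function and its labels are hypothetical and not part of the Actor. The mapping follows general HTTP semantics, so treat it as a first guess rather than a diagnosis.

```javascript
// Hypothetical triage helper: maps a response status code to a likely cause.
function classifyStatus(status) {
    // 401, 403 and 429 are the codes sites most often return when refusing a client.
    if (status === 401 || status === 403 || status === 429) return 'likely blocked';
    if (status === 404) return 'check the URL';
    if (status >= 500) return 'target server error';
    if (status >= 200 && status < 400) return 'ok';
    return 'other';
}

// The run in this thread failed with a 403.
console.log(classifyStatus(403));
```

A single failed request with a 403 and no finished requests points to blocking by the site rather than to a problem with input parsing or the hooks.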
The Puppeteer Scraper Actor might simply not be strong enough in its stealth and anti-bot capabilities to reliably access Glassdoor. In cases like this, a third-party solution may work better.
We recommend checking out some ready-made solutions in the Apify Store that are already designed for Glassdoor and similar sites (link to store).
These may include tested workarounds like proper headers, session handling, or stealth browser behavior.
Since this is not an issue with the Actor itself, we’ll go ahead and close this ticket. Feel free to open a new one if you need help with another setup!