
Puppeteer Scraper
Crawls websites with headless Chrome and the Puppeteer library using provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and lists of URLs. Supports logging in to websites.
Ignore URLs with certain query strings
The site I want to scrape is available in multiple languages, all denoted with the query '?=hl'. Can I get the crawler to ignore these?

By ignoring them, do you mean you want to skip enqueuing such URLs if they were already processed? Are you sure the URLs are otherwise the same? If so, you could trim the query parameter manually and provide uniqueKey explicitly when adding new requests to the queue. You would need to do that manually, inside your page function, with automatic enqueueing disabled, e.g. by setting the selector option to some gibberish.
Alternatively, you could skip the pages inside the request handler.
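
A rough sketch of both ideas, in case it helps. It assumes the language is selected by a query parameter named `hl` (adjust the list if your site uses a different name). The helper names `stripIgnoredParams`, `toRequest` and `hasIgnoredParam` are made up for this example and are not part of the scraper; they only rely on the standard `URL` class available in Node.js.

```javascript
// Assumption: the language variant is selected by a query parameter named `hl`.
const IGNORED_PARAMS = ['hl'];

// Return the URL with the ignored query parameters removed.
function stripIgnoredParams(rawUrl, ignored = IGNORED_PARAMS) {
    const url = new URL(rawUrl);
    for (const name of ignored) {
        url.searchParams.delete(name);
    }
    return url.toString();
}

// Build a request object whose uniqueKey ignores the language parameter,
// so all language variants of a page collapse into one queue entry.
function toRequest(rawUrl) {
    return { url: rawUrl, uniqueKey: stripIgnoredParams(rawUrl) };
}

// Alternative approach: detect a language variant so the handler
// can return early instead of processing the page.
function hasIgnoredParam(rawUrl, ignored = IGNORED_PARAMS) {
    const url = new URL(rawUrl);
    return ignored.some((name) => url.searchParams.has(name));
}
```

Inside the page function you would collect the links yourself, map each one through `toRequest`, and pass the result to whatever call you use to add requests to the queue. Since the queue deduplicates on uniqueKey, only the first language variant of each page should get processed. For the second approach, checking `hasIgnoredParam(request.url)` at the top of the handler and returning early would skip those pages, although they would still be loaded by the browser first.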
Pricing
Pricing model: Pay per usage
This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.