Puppeteer Scraper
Crawls websites with headless Chrome and the Puppeteer library, using server-side Node.js code that you provide. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. It supports both recursive crawling and lists of URLs, as well as logging in to websites.
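The server-side code you provide takes the form of a page function that the Actor calls for every loaded page. A minimal sketch of its shape, assuming the standard context object with the Puppeteer `page` and the current `request` (the mock context below is a hypothetical stand-in so the sketch runs outside the Actor):

```javascript
// Sketch of a pageFunction for Puppeteer Scraper.
// The Actor invokes it once per page with a context object.
async function pageFunction(context) {
    const { page, request } = context;
    const title = await page.title(); // full Puppeteer Page API is available here
    return { url: request.url, title }; // the returned object is saved to the dataset
}

// Hypothetical mock context, so the sketch can run without the Actor:
const mockContext = {
    page: { title: async () => 'Example Domain' },
    request: { url: 'https://example.com' },
};
pageFunction(mockContext).then((result) => console.log(result));
```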
I set a request queue name, and now all URLs are saved in the request queue list. How can I exclude the start URLs from the request queue list?
Hey there! I don't quite understand what you are trying to achieve. Could you please elaborate?
When I set a request queue name, all URLs are saved in the request queue list. When I run a new task, it shows that the URLs have already been processed, so I can't get the new URLs and new data (all the new detail page URLs on the start page). How can I exclude the start URLs from the request queue list?
I think for your use case you can just leave the request queue name empty. That way each run will use the default request queue, which is empty at the beginning of each run.
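The behavior behind this advice can be sketched with a simplified in-memory model (hypothetical, for illustration only, not the Actor's actual implementation): Apify's request queue deduplicates requests by a unique key, which defaults to the URL, and a named queue persists its handled keys between runs, so re-adding an already-processed start URL is silently skipped.

```javascript
// Toy model of request-queue deduplication (hypothetical names throughout).
class ToyRequestQueue {
    constructor(handledKeys = new Set()) {
        // `handledKeys` simulates state persisted from earlier runs of a named queue
        this.handledKeys = handledKeys;
        this.pending = [];
    }
    addRequest(url) {
        if (this.handledKeys.has(url)) return false; // already handled: skipped
        this.pending.push(url);
        return true;
    }
}

// A named queue carries handled keys from a previous run:
const persisted = new Set(['https://example.com/start']);
const namedQueue = new ToyRequestQueue(persisted);
console.log(namedQueue.addRequest('https://example.com/start'));    // false: skipped
console.log(namedQueue.addRequest('https://example.com/detail/1')); // true: enqueued

// An unnamed (default) queue starts empty on every run:
const defaultQueue = new ToyRequestQueue();
console.log(defaultQueue.addRequest('https://example.com/start'));  // true: enqueued
```

This is why a freshly named task run reports all requests as processed: the queue remembers the URLs from earlier runs, while the default queue does not.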
Now I leave the request queue name empty, but when I run again it shows the error: "All requests from the queue have been processed, the crawler will shut down." I have confirmed there are new URLs on the start page.
I see your latest runs are getting some results. Did you find the problem yourself?
- 344 monthly users
- 58 stars
- 99.7% runs succeeded
- Created in Apr 2019
- Modified 5 months ago