Puppeteer Scraper
Crawls websites with headless Chrome and the Puppeteer library, driven by user-provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. It supports both recursive crawling and lists of URLs, as well as logging in to websites.
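For context, a run of this actor is driven by a user-supplied page function that receives the Puppeteer page and the current request. The sketch below is a minimal, hypothetical example of such a function; the returned field names are illustrative and not taken from this thread:

```javascript
// Hypothetical pageFunction for a Puppeteer-based scraper.
// `context` carries the Puppeteer `page` object and the current `request`.
async function pageFunction(context) {
    const { page, request } = context;

    // Read the document title via Puppeteer's standard page.title() API.
    const title = await page.title();

    // Return one result object per page; the scraper stores it in the dataset.
    return {
        url: request.url,
        title,
    };
}
```

Because the function only depends on the `context` object it is handed, it can be exercised locally with a mocked `page` before being pasted into the actor's input.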
I set a request queue name, and now all URLs are saved in that request queue. How can I exclude the start URLs from the request queue?
Hey there! I don't quite understand what you are trying to achieve, could you please elaborate?
When I set a request queue name, all URLs get saved in that request queue. When I run a new task, it shows that the URLs have already been processed, so I can't get the new URLs and new data. All the new URLs (detail page URLs) are on the start page, so how can I exclude the start URLs from the request queue?
I think for your use case you should just leave the request queue name empty. That way each run will use the default request queue, which is empty at the beginning of every run.
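The difference matters because a request queue deduplicates by URL: once a URL is marked as handled in a persistent named queue, re-adding it in a later run is a no-op, so the crawler finds nothing to do. The following is a rough in-memory sketch of that behavior, not the actual Apify implementation:

```javascript
// Simplified model of request-queue deduplication (illustrative only).
class RequestQueue {
    constructor() {
        this.handled = new Set(); // URLs already processed in earlier runs
        this.pending = [];        // URLs waiting to be crawled
    }

    addRequest(url) {
        // A URL that was already handled is silently skipped, which is
        // why a reused named queue yields no new work on the next run.
        if (this.handled.has(url)) return false;
        this.pending.push(url);
        return true;
    }

    markHandled(url) {
        this.pending = this.pending.filter((u) => u !== url);
        this.handled.add(url);
    }
}

// A named queue persists between runs: re-adding the start URL is ignored.
const named = new RequestQueue();
named.addRequest('https://example.com/start');
named.markHandled('https://example.com/start');
console.log(named.addRequest('https://example.com/start')); // false: already handled

// Leaving the queue name empty gives each run a fresh default queue.
const fresh = new RequestQueue();
console.log(fresh.addRequest('https://example.com/start')); // true: queue starts empty
```

This is why an empty queue name fixes the "URL has been processed" symptom: the state that marked the start URLs as handled is simply not carried over.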
Now I leave the request queue name empty, but when I run again, it shows the error: "All requests from the queue have been processed, the crawler will shut down." I can confirm there are new URLs on the start page.
I see your latest runs are getting some results. Did you manage to find the problem yourself?
- 287 monthly users
- 99.8% runs succeeded
- 15 days response time
- Created in Apr 2019
- Modified about 1 month ago