Puppeteer Scraper

apify/puppeteer-scraper

Crawls websites with headless Chrome and the Puppeteer library using provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and lists of URLs. Supports logging in to websites.

How can I exclude the start URL from the request queue?

Closed
pizicai36 opened this issue
2 years ago

I set a request queue name, and now all URLs are saved in the request queue. How can I exclude the start URL from the request queue?

Andrey_Bykov

Hey there! I don't quite understand what you are trying to achieve. Could you please elaborate?

pizicai36

2 years ago

When I set a request queue name, all URLs are saved in the request queue. When I run a new task, it reports that the URL has already been processed, so I can't get the new URLs and new data. All the new URLs (detail page URLs) are on the start page, so how can I exclude the start URL from the request queue?
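
For readers who want to keep a named queue and still revisit the start page, one possible approach is to give the start request a uniqueKey that changes on every run, since requests in a queue are deduplicated by uniqueKey (which defaults to a value derived from the URL). The following is only a minimal sketch: makeStartRequest and the example URL are hypothetical, and whether the actor's Start URLs input passes a custom uniqueKey through is an assumption that should be checked against the actor's documentation.

```javascript
// Hypothetical helper (not part of the actor): builds a start request whose
// uniqueKey differs between runs, so a persisted queue sees it as a new request.
function makeStartRequest(url, runLabel) {
    return {
        url,
        // Assumption: the queue deduplicates by uniqueKey, so a fresh key per run
        // means the start page is crawled again while detail pages stay deduplicated.
        uniqueKey: `${url}#start-${runLabel}`,
    };
}

// Example: label the start request with the current timestamp.
const startRequest = makeStartRequest('https://example.com/list', new Date().toISOString());
console.log(startRequest);
```

Detail page URLs would keep their default keys, so pages handled in earlier runs would still be skipped and only newly listed ones would be crawled.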

Andrey_Bykov

I think for your use case, you can just leave the request queue name empty. This way, each run will use its own default request queue, which is empty at the beginning of each run.
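
The difference between a named queue and the default queue can be illustrated with a toy model. This is a simplified in-memory sketch, not the Apify implementation: ToyRequestQueue, simulateRun, and the example URL are made up for illustration, and the only behavior assumed is that a queue skips requests it has already handled.

```javascript
// Simplified in-memory model of a request queue; NOT the Apify implementation.
class ToyRequestQueue {
    constructor() {
        this.handled = new Set(); // keys of requests already processed
        this.pending = [];        // requests waiting to be processed
    }

    // Adds a request unless one with the same key is already known to the queue.
    addRequest({ url, uniqueKey = url }) {
        const known = this.handled.has(uniqueKey)
            || this.pending.some((r) => r.uniqueKey === uniqueKey);
        if (!known) this.pending.push({ url, uniqueKey });
        return { wasAlreadyPresent: known };
    }

    // Processes every pending request and returns the URLs that were visited.
    processAll() {
        const visited = [];
        while (this.pending.length > 0) {
            const req = this.pending.shift();
            this.handled.add(req.uniqueKey);
            visited.push(req.url);
        }
        return visited;
    }
}

// One "run": enqueue the start URL, then process whatever is pending.
function simulateRun(queue, startUrl) {
    queue.addRequest({ url: startUrl });
    return queue.processAll();
}

const START = 'https://example.com/list';

// Named queue: the same queue object persists across runs.
const named = new ToyRequestQueue();
const firstNamed = simulateRun(named, START);
const secondNamed = simulateRun(named, START); // start URL is skipped as already handled

// Default queue: every run starts with a fresh, empty queue.
const firstDefault = simulateRun(new ToyRequestQueue(), START);
const secondDefault = simulateRun(new ToyRequestQueue(), START);

console.log({ firstNamed, secondNamed, firstDefault, secondDefault });
```

In this model the second run on the named queue visits nothing, which matches the "already processed" behavior described above, while each run on a fresh queue visits the start page again.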

pizicai36

2 years ago

I have now left the request queue name empty, but when I run it again, it shows this error: "All requests from the queue have been processed, the crawler will shut down". I have confirmed that there are new URLs on the start page.

adamek

I see your latest runs are getting some results. Did you find the problem yourself?

pizicai36

2 years ago

It's OK now, thanks.

Developer
Maintained by Apify

Actor Metrics

  • 446 monthly users

  • No bookmarks yet

  • >99% runs succeeded

  • Created in Apr 2019

  • Modified 8 months ago