Puppeteer Scraper

apify/puppeteer-scraper

Crawls websites with headless Chrome and the Puppeteer library using provided server-side Node.js code. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and lists of URLs. Supports logging in to websites.


How can I exclude the start URL from the request queue list?

Closed

pizicai36 opened this issue
2 years ago

I set a request queue name, and now every URL is saved in the request queue list. How can I exclude the start URL from the request queue list?


Hey there! I don't quite understand what you are trying to achieve. Could you please elaborate?


pizicai36

2 years ago

When I set a request queue name, all URLs are saved in the request queue list. When I run a new task, it reports that the URL has already been processed, so I can't get the new URLs and new data. All the new URLs (detail page URLs) are on the start page. So how can I exclude the start URL from the request queue list?


I think for your use case, just leave the request queue name empty. This way, each run will use the default request queue, which will be empty at the beginning of each run.
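
To illustrate why a named queue skips the start URL on a second run while the default queue does not, here is a minimal in-memory sketch. It is a simplified model and not the Apify SDK: `ModelQueue`, `openQueue`, and `run` are hypothetical names, and the only behavior assumed is what is described above (a named queue persists between runs; an unnamed one starts empty).

```javascript
// Simplified model of request-queue deduplication.
// NOT the Apify SDK: all names here are hypothetical and for illustration only.
class ModelQueue {
    constructor() {
        this.handled = new Set(); // URLs that were already processed
        this.pending = [];        // URLs waiting to be processed
    }

    // Adds a URL unless the queue has already seen it.
    addRequest(url) {
        if (this.handled.has(url) || this.pending.includes(url)) {
            return { url, wasAlreadyPresent: true };
        }
        this.pending.push(url);
        return { url, wasAlreadyPresent: false };
    }

    // Returns the next pending URL and marks it handled, or null if none is left.
    fetchNext() {
        const url = this.pending.shift();
        if (url === undefined) return null;
        this.handled.add(url);
        return url;
    }
}

// Named queues persist across runs; an empty name yields a fresh queue each time.
const namedQueues = new Map();
function openQueue(name) {
    if (!name) return new ModelQueue();
    if (!namedQueues.has(name)) namedQueues.set(name, new ModelQueue());
    return namedQueues.get(name);
}

// One "run": enqueue the start URL and process everything that is pending.
function run(queueName, startUrl) {
    const queue = openQueue(queueName);
    queue.addRequest(startUrl);
    const processed = [];
    let url;
    while ((url = queue.fetchNext()) !== null) processed.push(url);
    return processed;
}
```

In this model, calling `run('my-queue', startUrl)` twice processes the start URL only on the first call, because the named queue still remembers it as handled. Calling `run('', startUrl)` twice processes it both times, which matches the advice to leave the queue name empty.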


pizicai36

a year ago

I now leave the request queue name empty, but when I run it again, it shows an error: "All requests from the queue have been processed, the crawler will shut down". I have confirmed that there are new URLs on the start page.


I see your latest runs are getting some results, did you find the problem yourself?


pizicai36

a year ago

It works now, thanks.


Developer
Maintained by Apify
Actor metrics
  • 245 monthly users
  • 99.8% runs succeeded
  • 13 days response time
  • Created in Apr 2019
  • Modified about 1 month ago