Puppeteer Scraper
Crawls websites with headless Chrome and the Puppeteer library, using server-side Node.js code that you provide. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. It supports both recursive crawling and lists of URLs, as well as logging in to websites.
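The server-side code you provide takes the form of a page function that the Actor calls for every loaded page. A minimal sketch of its shape, assuming the standard context object with the Puppeteer `page` and the current `request` (the mock context below is a hypothetical stand-in so the sketch runs outside the Actor):

```javascript
// Sketch of a pageFunction for Puppeteer Scraper.
// The Actor invokes it once per page with a context object.
async function pageFunction(context) {
    const { page, request } = context;
    const title = await page.title(); // full Puppeteer Page API is available here
    return { url: request.url, title }; // the returned object is saved to the dataset
}

// Hypothetical mock context, so the sketch can run without the Actor:
const mockContext = {
    page: { title: async () => 'Example Domain' },
    request: { url: 'https://example.com' },
};
pageFunction(mockContext).then((result) => console.log(result));
```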
I set a request queue name, and now all URLs are saved in the request queue list. How can I exclude the start URLs from the request queue list?
Hey there! I don't quite understand what you are trying to achieve. Could you please elaborate?
When I set a request queue name, all URLs are saved in the request queue list. When I run a new task, it shows that the URLs have already been processed, so I can't get the new URLs and new data (all the new detail page URLs on the start page). How can I exclude the start URLs from the request queue list?
I think for your use case you can just leave the request queue name empty. That way each run will use the default request queue, which is empty at the beginning of each run.
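The behavior behind this advice can be sketched with a simplified in-memory model (hypothetical, for illustration only, not the Actor's actual implementation): Apify's request queue deduplicates requests by a unique key, which defaults to the URL, and a named queue persists its handled keys between runs, so re-adding an already-processed start URL is silently skipped.

```javascript
// Toy model of request-queue deduplication (hypothetical names throughout).
class ToyRequestQueue {
    constructor(handledKeys = new Set()) {
        // `handledKeys` simulates state persisted from earlier runs of a named queue
        this.handledKeys = handledKeys;
        this.pending = [];
    }
    addRequest(url) {
        if (this.handledKeys.has(url)) return false; // already handled: skipped
        this.pending.push(url);
        return true;
    }
}

// A named queue carries handled keys from a previous run:
const persisted = new Set(['https://example.com/start']);
const namedQueue = new ToyRequestQueue(persisted);
console.log(namedQueue.addRequest('https://example.com/start'));    // false: skipped
console.log(namedQueue.addRequest('https://example.com/detail/1')); // true: enqueued

// An unnamed (default) queue starts empty on every run:
const defaultQueue = new ToyRequestQueue();
console.log(defaultQueue.addRequest('https://example.com/start'));  // true: enqueued
```

This is why a freshly named task run reports all requests as processed: the queue remembers the URLs from earlier runs, while the default queue does not.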
Now I leave the request queue name empty, but when I run again it shows the error: "All requests from the queue have been processed, the crawler will shut down." I have confirmed there are new URLs on the start page.
I see your latest runs are getting some results. Did you find the problem yourself?
- 344 monthly users
- 58 stars
- 99.7% runs succeeded
- Created in Apr 2019
- Modified 5 months ago