Similarweb Scraper

Pricing

Pay per event

Developed by

Tri⟁angle

Maintained by Apify

A simple but powerful scraper for similarweb.com. It retrieves website popularity information and exports it as JSON, XML, CSV, Excel, or an HTML table. Available data includes total visits, traffic sources, competitors, top countries, and company info.

4.6 (7)

81

Total users: 3.2K
Monthly users: 190
Runs succeeded: >99%
Issues response: 2.5 days
Last modified: 22 days ago

Crawler retries a valid URL more than 10 times and returns no result after 20 min

Open

rivalout opened this issue 3 days ago

2025-07-14T14:09:55.360Z ACTOR: Pulling Docker image of build JYiCBoRQ3RS1kpx41 from registry.
2025-07-14T14:09:55.362Z ACTOR: Creating Docker container.
2025-07-14T14:09:55.405Z ACTOR: Starting Docker container.
2025-07-14T14:09:55.574Z Will run command: xvfb-run -a -s "-ac -screen 0 1920x1080x24+32 -nolisten tcp" /bin/sh -c ./start_xvfb_and_run_cmd.sh && npm run start:prod --silent
2025-07-14T14:09:57.911Z INFO System info {"apifyVersion":"3.4.2","apifyClientVersion":"2.12.5","crawleeVersion":"3.13.7","osType":"Linux","nodeVersion":"v20.19.2"}
2025-07-14T14:09:58.574Z INFO Starting the crawl.
2025-07-14T14:09:58.787Z INFO PlaywrightCrawler: Starting the crawler.
2025-07-14T14:10:08.666Z INFO Waiting for script...
2025-07-14T14:10:38.990Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Waiting for page load timed out. Retrying... (debug key: WAIT_FOR_PAGE_LOAD-1752502238666)
2025-07-14T14:10:38.992Z at file:///home/myuser/dist/utils.js:23:19 {"id":"mHZzAeWPIS3slfy","url":"https://similarweb.com/website/readdy.ai","retryCount":1}
2025-07-14T14:10:58.788Z INFO PlaywrightCrawler:Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":60538,"retryHistogram":[]}
2025-07-14T14:10:58.852Z INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":1,"desiredConcurrency":3,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0.019},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0.172},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
2025-07-14T14:11:10.942Z INFO Waiting for script...
2025-07-14T14:11:41.338Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Waiting for page load timed out. Retrying... (debug key: WAIT_FOR_PAGE_LOAD-1752502300943)
2025-07-14T14:11:41.340Z at file:///home/myuser/dist/utils.js:23:19 {"id":"mHZzAeWPIS3slfy","url":"https://similarweb.com/website/readdy.ai","retryCount":2}
2025-07-14T14:11:52.797Z INFO Waiting for script...
2025-07-14T14:11:58.787Z INFO PlaywrightCrawler:Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":120538,"retryHistogram":[]}
2025-07-14T14:11:58.853Z INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":1,"desiredConcurrency":3,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0.172},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
2025-07-14T14:12:23.804Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Waiting for page load timed out. Retrying... (debug key: WAIT_FOR_PAGE_LOAD-1752502342797)
2025-07-14T14:12:23.806Z at file:///home/myuser/dist/utils.js:23:19 {"id":"mHZzAeWPIS3slfy","url":"https://similarweb.com/website/readdy.ai","retryCount":3}
2025-07-14T14:12:58.787Z INFO PlaywrightCrawler:Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":180538,"retryHistogram":[]}
2025-07-14T14:12:58.856Z INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":1,"desiredConcurrency":3,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
2025-07-14T14:13:11.343Z INFO Waiting for script...
2025-07-14T14:13:41.528Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Waiting for page load timed out. Retrying... (debug key: WAIT_FOR_PAGE_LOAD-1752502421343)
2025-07-14T14:13:41.530Z at file:///home/myuser/dist/utils.js:23:19 {"id":"mHZzAeWPIS3slfy","url":"https://similarweb.com/website/readdy.ai","retryCount":4}
2025-07-14T14:13:58.788Z INFO PlaywrightCrawler:Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":240538,"retryHistogram":[]}
2025-07-14T14:13:58.860Z INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":1,"desiredConcurrency":3,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0.019},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0.17},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
2025-07-14T14:14:10.976Z INFO Waiting for script...
2025-07-14T14:14:41.170Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Waiting for page load timed out. Retrying... (debug key: WAIT_FOR_PAGE_LOAD-1752502480977)
2025-07-14T14:14:41.172Z at file:///home/myuser/dist/utils.js:23:19 {"id":"mHZzAeWPIS3slfy","url":"https://similarweb.com/website/readdy.ai","retryCount":5}
2025-07-14T14:14:58.788Z INFO PlaywrightCrawler:Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":300538,"retryHistogram":[]}
2025-07-14T14:14:58.862Z INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":1,"desiredConcurrency":3,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0.139},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
2025-07-14T14:15:21.049Z INFO Waiting for script...
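
For anyone triaging a similar run, the log above can be summarized by pulling out the JSON payload that ends each reclaimed-request line and keeping the highest retryCount per URL. The following is only a minimal sketch of that idea. The function name max_retries_per_url and the sample string are hypothetical and are not part of the actor. The regular expression assumes the payload always has the id, url, retryCount shape seen in this log.

```python
import json
import re
from collections import defaultdict

# Assumption: every reclaimed-request line ends with a payload shaped like
# {"id":"...","url":"...","retryCount":N}, as in the log pasted in this issue.
RETRY_PAYLOAD = re.compile(r'\{"id":"[^"]*","url":"[^"]*","retryCount":\d+\}')


def max_retries_per_url(log_text: str) -> dict[str, int]:
    """Return the highest retryCount found for each URL in the log text.

    Hypothetical helper for illustration. It scans the whole text, so it
    also works when the log was pasted as one unbroken line.
    """
    highest: dict[str, int] = defaultdict(int)
    for match in RETRY_PAYLOAD.finditer(log_text):
        payload = json.loads(match.group(0))
        url = payload["url"]
        highest[url] = max(highest[url], payload["retryCount"])
    return dict(highest)


# Two lines copied from the log above, used as sample input.
sample = (
    '2025-07-14T14:10:38.992Z at file:///home/myuser/dist/utils.js:23:19 '
    '{"id":"mHZzAeWPIS3slfy","url":"https://similarweb.com/website/readdy.ai","retryCount":1}\n'
    '2025-07-14T14:14:41.172Z at file:///home/myuser/dist/utils.js:23:19 '
    '{"id":"mHZzAeWPIS3slfy","url":"https://similarweb.com/website/readdy.ai","retryCount":5}\n'
)

if __name__ == "__main__":
    print(max_retries_per_url(sample))
```

Run against the full log of a failing run, a summary like this makes it easy to report how many times each URL was retried before the run was stopped.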