
Web Scraper
Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.
4.4 (23)
Pricing: Pay per usage
926
Total users: 91K
Monthly users: 5K
Runs succeeded: >99%
Issues response: 7.8 days
Last modified: 2 months ago
Crawling not working well
Closed
I attempted to crawl the website https://jcyared.com, setting the maximum number of pages per crawl (maxPagesPerCrawl parameter) to 20. However, I only managed to retrieve 2 pages. Could someone explain why this might have occurred?

There are two problems in your input:
- You set maxCrawlingDepth to 0, which means nothing nested will be enqueued.
- You set the globs to https://jcyared.com/*, which means no nesting as well (this accepts https://jcyared.com/foo but not https://jcyared.com/foo/bar). You want https://jcyared.com/** to allow multiple slashes in the URL path.
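The difference between the two glob patterns can be sketched with a tiny matcher. This is only an illustration, assuming the common convention that * stays within one path segment while ** may cross slashes. The helper name globToRegExp is hypothetical, and this is not the matching code the Actor itself uses.

```javascript
// Simplified glob matcher, for illustration only (hypothetical helper,
// not the Actor's real implementation).
// Assumed semantics:
//   *   matches any run of characters except "/"
//   **  matches any run of characters, including "/"
function globToRegExp(glob) {
  let pattern = '';
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === '*') {
      if (glob[i + 1] === '*') {
        pattern += '.*'; // ** may cross path segments
        i++; // skip the second asterisk
      } else {
        pattern += '[^/]*'; // * stays inside a single path segment
      }
    } else {
      // Escape characters that are special in regular expressions.
      pattern += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp(`^${pattern}$`);
}

const sampleUrls = ['https://jcyared.com/foo', 'https://jcyared.com/foo/bar'];
for (const glob of ['https://jcyared.com/*', 'https://jcyared.com/**']) {
  for (const url of sampleUrls) {
    console.log(`${glob} vs ${url}: ${globToRegExp(glob).test(url)}`);
  }
}
```

Under these assumptions, the single-asterisk pattern accepts only the first URL, while the double-asterisk pattern accepts both, so only the latter lets nested pages be enqueued.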
The second one is the important bit. Here is a run with those two fixed, which seems to work as expected (I aborted it after a few minutes, but it had already gone through more than 40 pages):