Web Scraper
Crawls arbitrary websites using the Chrome browser and extracts data from pages using JavaScript code. The Actor supports both recursive crawling and lists of URLs and automatically manages concurrency for maximum performance. This is Apify's basic tool for web crawling and scraping.
https://www.loom.com/share/0feaf14b3b18436ebb8752389381b63e
Issue Overview:
We encountered significant challenges while attempting to scrape multiple websites. Despite submitting a batch of around 40 URLs, only two were successfully processed, even though the system indicated success. The key issues are as follows:
- Bulk Processing Limitation: The current setup does not support efficient bulk processing of websites. Handling a large input, such as thousands of URLs, is infeasible without creating individual tasks for each, which is both impractical and costly.
- Error Tracking and Transparency: The system does not provide a way to map errors to specific URLs, making it difficult to identify and address issues for individual websites.
- Processing Failures: Most of the submitted URLs were ignored, with no clear indication of why this occurred, despite there being no apparent limitations (e.g., task limits).
Resolution Needed: A more scalable solution is required to process bulk website inputs effectively and provide detailed feedback for each URL, including handling errors systematically.
Attaching input
Hello, and thank you for your interest in this Actor!
Note that you're setting the maxResultsPerCrawl input to 1. This means the Actor will stop after producing one result. You can also see this in the Actor log ("User set limit of 1 results was reached. Finishing the crawl.").
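To illustrate the setting described above, here is a minimal Python sketch that builds an input object from a list of URLs and sets maxResultsPerCrawl to the number of URLs rather than 1. It assumes the input fields are named startUrls, pageFunction and maxResultsPerCrawl. The helper build_input, the example.com URLs and the page function body are hypothetical and only for illustration.

```python
import json

# Hypothetical page function: returns the request URL and the page title.
# The body is an illustrative assumption, not the one from the original run.
PAGE_FUNCTION = """
async function pageFunction(context) {
    return { url: context.request.url, title: document.title };
}
"""


def build_input(urls, page_function):
    """Build an input dict whose result limit covers every submitted URL."""
    # Drop blank entries and duplicates while preserving the original order.
    unique = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    return {
        "startUrls": [{"url": u} for u in unique],
        "pageFunction": page_function,
        # One result per start URL; raise this if link following is enabled.
        "maxResultsPerCrawl": len(unique),
    }


if __name__ == "__main__":
    example_urls = ["https://example.com", "https://example.org"]
    print(json.dumps(build_input(example_urls, PAGE_FUNCTION), indent=2))
```

If the crawl also follows links, the number of results can exceed the number of start URLs, so the limit would need to be higher than this sketch sets it.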
See my "fixed" run here, where I just set the maxResultsPerCrawl option to 100. Even though some pages are still missing (the servers were unreachable, even from my own computer), the Actor produces 46 results.
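One possible way to see which submitted URLs are still missing is to compare the input list with the URLs present in the results. The sketch below assumes each result item carries a url field holding the requested URL, which depends on what the page function returns. The helper missing_urls and the sample data are hypothetical.

```python
def missing_urls(submitted, results, url_field="url"):
    """Return the submitted URLs that have no matching result item."""
    # Collect every URL that appears in the results (assumed field name).
    seen = {item.get(url_field) for item in results}
    # Keep the original submission order for the missing entries.
    return [u for u in submitted if u not in seen]


if __name__ == "__main__":
    submitted = ["https://example.com", "https://example.org"]
    results = [{"url": "https://example.com", "title": "Example"}]
    for url in missing_urls(submitted, results):
        print("No result for:", url)
```

This is an exact string comparison, so a page that redirects or normalizes its URL (for example by adding a trailing slash) would be reported as missing even if it was scraped.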
I'll close this issue now, but feel free to ask additional questions if you have any. Cheers!
Actor Metrics
2.5k monthly users
331 stars
>99% runs succeeded
37 days response time
Created in Mar 2019
Modified 5 months ago