Website Content Crawler

apify/website-content-crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Crawling takes much too long

Open
dnavadiya401 opened this issue
7 days ago

Recently, crawling runs have been taking considerably longer (20-30 minutes), and by the time I get back to Apify, my limit is used up.

jakub.kopecky

Hi, thank you for using the Website Content Crawler.

The crawler can take longer because of network or website-related issues. In addition, you are crawling the entire site without limiting the number of results, which significantly increases processing time and can exceed your limits, especially on the free tier. To manage your usage more effectively, we recommend limiting the number of results (please see "Crawler settings" -> "Max pages") and narrowing the crawl to specific pages (please see "Crawler settings" -> "Include URLs" and "Exclude URLs").
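As a rough illustration of the settings mentioned above, here is a minimal Python sketch that builds a restricted run input. The field names `startUrls`, `maxCrawlPages`, `includeUrlGlobs`, and `excludeUrlGlobs` are assumptions about how the "Max pages", "Include URLs", and "Exclude URLs" settings map to the Actor's JSON input, so please check them against the Actor's input schema before relying on them. The helper `build_crawler_input` and the example URLs are hypothetical.

```python
import json


def build_crawler_input(start_url, max_pages, include_globs=(), exclude_globs=()):
    """Build a run input that keeps the crawl small and focused.

    All field names below are assumed, not confirmed by this thread.
    """
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return {
        # Where the crawl starts (assumed field name).
        "startUrls": [{"url": start_url}],
        # Upper bound on crawled pages, i.e. "Max pages" (assumed field name).
        "maxCrawlPages": max_pages,
        # Only follow links matching these globs, i.e. "Include URLs" (assumed).
        "includeUrlGlobs": [{"glob": g} for g in include_globs],
        # Skip links matching these globs, i.e. "Exclude URLs" (assumed).
        "excludeUrlGlobs": [{"glob": g} for g in exclude_globs],
    }


if __name__ == "__main__":
    run_input = build_crawler_input(
        "https://example.com/docs",
        max_pages=50,
        include_globs=["https://example.com/docs/**"],
        exclude_globs=["https://example.com/docs/archive/**"],
    )
    print(json.dumps(run_input, indent=2))
```

The printed JSON could then be pasted into the Actor's JSON input editor or passed as the run input through an API client. Starting with a low page limit and raising it gradually makes it easier to see how much of your quota a run consumes.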

Please try to run the Actor again with more specific crawler settings and let me know if you encounter any issues.

Jakub Kopecky

Developer
Maintained by Apify

Actor Metrics

  • 5.4k monthly users

  • 990 bookmarks

  • >99% runs succeeded

  • 1 day response time

  • Created in Mar 2023

  • Modified 13 days ago