Website Content Crawler

Pricing

Pay per usage

Developed by

Apify

Maintained by Apify

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

3.7 (41)

1526

Total users: 59K

Monthly users: 7.8K

Runs succeeded: >99%

Issues response: 7.6 days

Last modified: 3 days ago

Crawling takes much longer than before

Closed

dnavadiya401 opened this issue
5 months ago

Recently, my crawl runs have been taking considerably longer (20-30 minutes). By the time I come back to Apify, my usage limit has been used up.

jakub.kopecky

Hi, thank you for using the Website Content Crawler.

The crawler might take longer due to network or website-related issues. Since you're crawling the entire site without limiting the results, the processing time can grow significantly and exceed your limits, especially on the free tier. To manage your usage more effectively, we recommend limiting the number of results (please see "Crawler settings" -> "Max pages") and narrowing the crawl to the specific pages you need (please see "Crawler settings" -> "Include URLs" and "Exclude URLs").
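
As a rough illustration of the settings above, here is a minimal Python sketch that builds a narrowed-down input and starts the Actor through the Apify Python client. The input field names (`startUrls`, `maxCrawlPages`, `includeUrlGlobs`, `excludeUrlGlobs`) are assumed to correspond to "Max pages", "Include URLs", and "Exclude URLs" and should be checked against the Actor's Input tab. The `example.com` URLs, the 20-page limit, the `APIFY_TOKEN` environment variable, and the helper names are hypothetical.

```python
import os


def build_run_input(start_url, max_pages=50, include_globs=(), exclude_globs=()):
    """Build a scoped-down crawl input.

    Field names are assumptions; verify them against the Actor's input schema.
    """
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    run_input = {
        "startUrls": [{"url": start_url}],
        "maxCrawlPages": max_pages,  # assumed to back "Max pages"
    }
    if include_globs:
        # assumed to back "Include URLs"
        run_input["includeUrlGlobs"] = [{"glob": g} for g in include_globs]
    if exclude_globs:
        # assumed to back "Exclude URLs"
        run_input["excludeUrlGlobs"] = [{"glob": g} for g in exclude_globs]
    return run_input


def run_crawl(run_input):
    """Start the Actor, wait for it to finish, and return the dataset items."""
    # Imported here so the input builder works without the client installed
    # (pip install apify-client).
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor("apify/website-content-crawler").call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Hypothetical usage: crawl at most 20 pages under /docs, skipping the archive.
# items = run_crawl(build_run_input(
#     "https://example.com/docs",
#     max_pages=20,
#     include_globs=["https://example.com/docs/**"],
#     exclude_globs=["https://example.com/docs/archive/**"],
# ))
```

Keeping the page limit low for the first run makes it easier to check that the include and exclude patterns match the intended pages before spending more of the usage limit.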

Please try to run the Actor again with more specific crawler settings and let me know if you encounter any issues.

Jakub Kopecky