
Website Content Crawler
Pricing
Pay per usage

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
This crawler took too much time and too much compute power
This crawler ran for more than 12 hours. It probably went into a loop and kept running for hours until the billing limit was reached.

Hi, thank you for using Website Content Crawler.
From the Actor run logs, there is no indication that the crawler went into a loop: it skipped all pages that it had already crawled. It seems that the website you were trying to crawl is simply very large and contains too many pages, which can make crawling take a long time.
To prevent this, we recommend setting the maxPages parameter or specifying the Actor run timeout option. You can also exclude some pages from being crawled by setting excludeUrlGlobs.
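As a rough illustration of the limits mentioned above, here is a minimal sketch of an Actor input that caps the number of pages and skips part of a site. The field names maxPages and excludeUrlGlobs are taken from this reply. The startUrls field, the {"glob": ...} object shape, the example URLs, and the build_run_input helper are assumptions for illustration only, so check the Actor's input schema for the exact names before relying on them.

```python
import json


def build_run_input(start_url, max_pages, exclude_globs):
    """Assemble a crawl input that caps the page count and skips unwanted URLs.

    Hypothetical helper: the field names follow the reply above, but the exact
    input schema of the Actor is an assumption here.
    """
    if max_pages <= 0:
        raise ValueError("max_pages must be a positive integer")
    return {
        # Assumed shape: a list of objects with a "url" key.
        "startUrls": [{"url": start_url}],
        # Hard cap on pages, so a very large site cannot run indefinitely.
        "maxPages": max_pages,
        # Assumed shape: a list of objects with a "glob" key.
        "excludeUrlGlobs": [{"glob": pattern} for pattern in exclude_globs],
    }


run_input = build_run_input(
    "https://example.com/docs",
    500,
    ["https://example.com/docs/archive/**"],
)
print(json.dumps(run_input, indent=2))
```

The run timeout is a separate run option, not part of this input, so it would be set wherever the run is started (for example in the run options of the Apify Console).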
Sorry that this happened to you. Please set the limits next time or keep an eye on the Actor run.
Jakub
Pricing
Pricing model
Pay per usage
This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.