
Website Content Crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
4.0 (40)
Pricing: Pay per usage
1391
Total users: 53K
Monthly users: 7.9K
Runs succeeded: >99%
Issues response: 6.8 days
Last modified: 4 days ago
2 failed crawled websites
Open
I was crawling 5 different URLs, and the whole process took almost 7 minutes. 2 of them failed, and I aborted the last one because it was just taking too long (it would probably have failed as well).

Hi, thank you for using the Website Content Crawler.
There are actually two separate issues:
1. Reddit
Scraping Reddit is extremely challenging due to its dynamic structure and aggressive anti-bot measures. The Website Content Crawler is not optimized for Reddit; we recommend using dedicated Actors specifically designed for that platform.
2. https://www.rei.com/
For REI, the crawler attempted to click and expand elements matching the selector "[aria-expanded=\"false\"]". However, because many such elements existed, some of which weren't actually expandable, this caused the crawler to fail.
I've adjusted the configuration to prevent unnecessary clicks by changing the selector to "[aria-expanded=\"true\"]". You can see the successful run here; the content was scraped in just 36 seconds.
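For anyone wanting to try the same change, here is a rough sketch of how the adjusted input might be built. It assumes the selector is controlled by an input field named clickElementsCssSelector and that start URLs are passed as startUrls; neither name is stated in this thread, so check them against the Actor's input schema before relying on this. The helper build_run_input is hypothetical and only assembles the JSON, it does not start a run.

```python
import json

# Assumed input field name; the reply above does not name the setting it changed.
CLICK_SELECTOR_FIELD = "clickElementsCssSelector"


def build_run_input(start_url: str, click_selector: str) -> dict:
    """Build a hypothetical Actor input that overrides the click selector."""
    return {
        # Assumed shape for start URLs: a list of {"url": ...} objects.
        "startUrls": [{"url": start_url}],
        CLICK_SELECTOR_FIELD: click_selector,
    }


# Selector value taken from the reply above (shown there in JSON-escaped form).
run_input = build_run_input("https://www.rei.com/", '[aria-expanded="true"]')

# Serializing to JSON escapes the inner quotes, which is why the selector
# appears as "[aria-expanded=\"true\"]" when viewed as raw input JSON.
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the Actor's input editor or passed to whichever API client you use to start runs.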
I hope this helps! Best regards, Jiri