
Website Content Crawler
Pricing
Pay per usage

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
4.0 (41)
1593
Total users: 62K
Monthly users: 8.2K
Runs succeeded: >99%
Issues response: 7.9 days
Last modified: 15 hours ago
Crawler does not work any longer... I tried with multiple links about 30 minutes ago and it was working and it randomly stopped!!
Closed
The crawler was working perfectly and it randomly stopped scraping!

Hi, thank you for using the Website Content Crawler.
I checked your Actor run and did not find any issues in the logs. Sometimes the Website Content Crawler can run longer than expected due to network or website-related issues. I also noticed that you are not limiting the maximum number of results in the Actor input, which can cause the Actor to run for a long time until it times out if there are many pages to crawl. Try setting Crawler settings -> Max pages (or maxCrawlPages in JSON) to a more reasonable number.
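For reference, here is a minimal sketch of what a capped input could look like when built in Python. Only maxCrawlPages comes from the reply above; the startUrls field shape, the helper name build_run_input, the example URL, and the limit of 50 are assumptions chosen for illustration, so check them against the Actor's input schema before relying on them.

```python
import json


def build_run_input(start_urls, max_crawl_pages=50):
    """Build a crawler input dict that caps the number of crawled pages.

    Hypothetical helper: the default of 50 is an arbitrary illustration,
    not a recommended value.
    """
    if max_crawl_pages < 1:
        raise ValueError("max_crawl_pages must be a positive integer")
    return {
        # Assumed field name and shape for the start URLs.
        "startUrls": [{"url": url} for url in start_urls],
        # JSON counterpart of "Crawler settings -> Max pages".
        "maxCrawlPages": max_crawl_pages,
    }


run_input = build_run_input(["https://example.com/docs"], max_crawl_pages=50)

# Print the JSON so it can be compared with the Actor's JSON input editor.
print(json.dumps(run_input, indent=2))
```

With a cap like this in place, the run should stop once the limit is reached instead of continuing until it times out.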
I tried to run the crawler with your input, but limited the maximum number of results, and it finished successfully: https://console.apify.com/view/runs/Rg9KfyeCxZPucqUdV
Please try to run the Actor again and let me know if you encounter any issues.
Thank you, Jakub