Website Content Crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
Listening to blog changes
I am using a web content crawler API, and I want the crawler to run every time the website (blog) I am crawling adds new content or edits existing content.
Hello, and thank you for your interest in Website Content Crawler! I see two options here. You could set up a Schedule for your crawling task and write a script that compares each new result with the previous one. Or you could use the content-checker Actor, have it trigger a webhook when it finishes, and in the webhook handler inspect the result and then optionally call Website Content Crawler.
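For the first option, the comparison script could look roughly like the sketch below. It is only an illustration, not part of the Actor: the function names `fingerprint` and `diff_crawls` are made up here, and it assumes each crawl result is a list of items that carry a `url` field and a `text` field, already loaded from wherever you store the results of two consecutive scheduled runs.

```python
import hashlib


def fingerprint(items):
    """Map each page URL to a SHA-256 hash of its extracted text.

    Assumes every item is a dict with a "url" key and an optional "text" key.
    """
    return {
        item["url"]: hashlib.sha256(
            (item.get("text") or "").encode("utf-8")
        ).hexdigest()
        for item in items
    }


def diff_crawls(previous_items, current_items):
    """Report which URLs were added, removed, or changed between two crawls."""
    old = fingerprint(previous_items)
    new = fingerprint(current_items)
    return {
        "added": sorted(url for url in new if url not in old),
        "removed": sorted(url for url in old if url not in new),
        "changed": sorted(
            url for url in new if url in old and new[url] != old[url]
        ),
    }


if __name__ == "__main__":
    # Hypothetical sample data standing in for two consecutive crawl results.
    previous = [
        {"url": "https://example.com/blog/a", "text": "First post"},
        {"url": "https://example.com/blog/b", "text": "Second post"},
    ]
    current = [
        {"url": "https://example.com/blog/a", "text": "First post, edited"},
        {"url": "https://example.com/blog/c", "text": "Third post"},
    ]
    print(diff_crawls(previous, current))
```

Hashing the extracted text instead of the raw HTML keeps the comparison from firing on markup-only changes, but it will still flag any edit to the visible text, however small.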
Actor Metrics
3.9k monthly users
709 stars
>99% runs succeeded
2.1 days response time
Created in Mar 2023
Modified 17 days ago