
Website Content Crawler
Pricing
Pay per usage

Website Content Crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
3.9 (41)
Pricing
Pay per usage
1546
Total users
60K
Monthly users
7.8K
Runs succeeded
>99%
Issues response
7.9 days
Last modified
4 days ago
Include `Last-Modified`
Closed
We'd like to use Last-Modified
from the response header. Can you please include it in the output?

Hi, thanks for reaching out, we'll look into it and let you know.
Hello and thank you for your interest in this Actor!
We've just released a new WCC version (v0.3.22
), which includes the response HTTP headers in the result records (under metadata.headers
). Note that this stores the HTTP/2 pseudoheaders too, when applicable.
Thank you for the idea (and your patience)! Cheers!