
Website Content Crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
4.2 (40)
Pricing: Pay per usage
1398
Total users: 54K
Monthly users: 8K
Runs succeeded: >99%
Issues response: 6.8 days
Last modified: 5 days ago
Website Content Crawler stuck - cost keeps increasing
Opened 4 days ago by digtital_moose, last comment a day ago by Jan Buchar (janbuchar)
Http website inaccessible
Opened 5 days ago by souheil, last comment 3 hours ago by souheil
Avoid query parameters when crawling websites
Opened 7 days ago by innovum_admin, last comment 7 days ago by innovum_admin
Getting 403 from public page
Opened 8 days ago by formidable_quagmire, last comment 7 days ago by formidable_quagmire
2 failed crawled websites
Opened 10 days ago by bor.cerlini, last comment 10 days ago by Jiří Spilka (jiri.spilka)
crawling cannot be done with arabic website in english
Opened 12 days ago by aswinthazhath, last comment 8 days ago by Jindřich Bär (jindrich.bar)
Timeout and no data
Opened 17 days ago by Autocom, last comment 10 days ago by Jiří Spilka (jiri.spilka)
2025-05-13T10:00:05.221Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. page.evaluate: Execution context was destroyed, most likely because of a navigation.
Opened 21 days ago by formidable_quagmire, last comment 10 days ago by Jiří Spilka (jiri.spilka)
Error: Cannot run Actor (Network Error)
Opened 22 days ago by dawieharmse, last comment 22 days ago by dawieharmse
CORS Error
Opened 22 days ago by fmateen, last comment 20 days ago by fmateen
Is there a way to crawl URL from the visible HTML after removing "removeElementsCssSelector"
Opened 22 days ago by formidable_quagmire, last comment 5 hours ago by Jindřich Bär (jindrich.bar)
Cookie Banner is not removed
Opened a month ago by Joe11, last comment 20 days ago by Jakub Kopecký (jakub.kopecky)
Crawling is stuck (10h+)
Opened a month ago by jauns-ai, last comment a month ago by Jakub Kopecký (jakub.kopecky)
Crawling a small list of pdf urls hangs and crashes the crawler repetitively
Opened a month ago by uglyrobot, last comment a month ago by Jakub Kopecký (jakub.kopecky)
Execution context was destroyed
Opened 2 months ago by benjaminprevot, last comment 7 days ago by conv_ai_account
can we get the images on the pages too?
Opened 2 months ago by disarming_rutabaga, last comment a month ago by disarming_rutabaga-owner
Issue Crawling Content from Paid Websites Like New York Times
Opened 2 months ago by onlinereach, last comment 2 months ago by Jakub Kopecký (jakub.kopecky)
Adsterra .com
Opened 2 months ago by Tijjeboy, last comment 2 months ago by Jiří Spilka (jiri.spilka)
Large number of requests fail
Opened 2 months ago by cirez_d, last comment 10 days ago by Jiří Spilka (jiri.spilka)
Add Full File Name to the Key-Value-Stores
Opened 2 months ago by CtrlAltElite, last comment a month ago by Jakub Kopecký (jakub.kopecky)