
Website Content Crawler
Pricing
Pay per usage

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
3.7 (41)
1499
Total users: 58K
Monthly users: 8.1K
Runs succeeded: >99%
Issues response: 7.6 days
Last modified: 25 minutes ago
crawler got hung up
Opened 13 hours ago by Tmoney97, last comment 13 hours ago by Tmoney97
Lack of Notice
Opened 3 days ago by impeccable_niche, last comment 3 days ago by impeccable_niche
Glob Patterns are ignored when using Sitemap
Opened 7 days ago by cirez_d, last comment 6 days ago by Jindřich Bär (jindrich.bar)
Memory issue
Opened 9 days ago by acarter, last comment 9 days ago by Jindřich Bär (jindrich.bar)
Website Content Crawler stuck - cost keeps increasing
Opened a month ago by digtital_moose, last comment 12 days ago by jfnrj2ui
Avoid query parameters when crawling websites
Opened a month ago by innovum_admin, last comment 21 days ago by Jindřich Bär (jindrich.bar)
Getting 403 from public page
Opened a month ago by formidable_quagmire, last comment 21 days ago by formidable_quagmire
Crawling cannot be done with Arabic website in English
Opened a month ago by aswinthazhath, last comment a month ago by Jindřich Bär (jindrich.bar)
Timeout and no data
Opened a month ago by Autocom, last comment a month ago by Jiří Spilka (jiri.spilka)
2025-05-13T10:00:05.221Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. page.evaluate: Execution context was destroyed, most likely because of a navigation.
Opened a month ago by formidable_quagmire, last comment a month ago by Jiří Spilka (jiri.spilka)
CORS Error
Opened a month ago by fmateen, last comment 21 days ago by Jindřich Bär (jindrich.bar)
Is there a way to crawl URL from the visible HTML after removing "removeElementsCssSelector"
Opened a month ago by formidable_quagmire, last comment 22 days ago by Jindřich Bär (jindrich.bar)
Crawling is stuck (10h+)
Opened 2 months ago by jauns-ai, last comment 2 months ago by Jakub Kopecký (jakub.kopecky)
can we get the images on the pages too?
Opened 2 months ago by disarming_rutabaga, last comment 21 days ago by Jiří Spilka (jiri.spilka)
Issue Crawling Content from Paid Websites Like New York Times
Opened 2 months ago by onlinereach, last comment 2 months ago by Jakub Kopecký (jakub.kopecky)
Adsterra .com
Opened 2 months ago by Tijjeboy, last comment 2 months ago by Jiří Spilka (jiri.spilka)
Add Full File Name to the Key-Value-Stores
Opened 3 months ago by CtrlAltElite, last comment 2 months ago by Jakub Kopecký (jakub.kopecky)
Scraper doesn't scrape all website content, such as product descriptions
Opened 5 months ago by maabada.shivok, last comment 5 months ago by maabada.shivok
Crawl hung at finished
Opened 5 months ago by mcantrell, last comment 5 months ago by mykola_scrapes
Decode non-UTF-8 text in crawlerType cheerio
Opened a year ago by consoling_knock, last comment a year ago by Jindřich Bär (jindrich.bar)