Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

issue in one run

Opened 2 days ago by The25th, last comment 2 days ago by The25th

Actor process was aborted

Opened 2 days ago by ultimateai-dev, last comment 2 days ago by ultimateai-dev

Crawling takes much too long

Opened 4 days ago by dnavadiya401, last comment 2 days ago by Jakub Kopecký (jakub.kopecky)

Key listing details were ignored.

Opened 5 days ago by YanivPRZ, last comment 4 days ago by Jakub Kopecký (jakub.kopecky)

Error "Execution context was destroyed"

Opened 8 days ago by iadvize, last comment 6 days ago by Jakub Kopecký (jakub.kopecky)

Crawler does not work any longer... I tried with multiple links about 30 minutes ago and it was working and it randomly stopped!!

Opened 9 days ago by Capture_Marketing, last comment 8 days ago by Jakub Kopecký (jakub.kopecky)

Scraper doesn't scrape all the website content, such as product descriptions

Opened 10 days ago by maabada.shivok, last comment 10 days ago by maabada.shivok

Passwords Set

Opened 11 days ago by entertained_geode, last comment 10 days ago by Jiří Spilka (jiri.spilka)

Limit Actor run to just the exact input URL

Opened 17 days ago by thom_vd_donk, last comment 17 days ago by Jiří Spilka (jiri.spilka)

Crawl hung at finished

Opened 24 days ago by mcantrell, last comment 17 days ago by mykola_scrapes

Crawler timeout setting doesn't work

Opened 24 days ago by bill007, last comment 19 days ago by Jakub Drobník (drobnikj)

Remove browser deprecation warnings

Opened 25 days ago by cirez_d, last comment 21 days ago by Jiří Spilka (jiri.spilka)

Trying to scrape OpenAI documentation

Opened a month ago by Digital_Mole, last comment a month ago by Jiří Spilka (jiri.spilka)

Request for Assistance: Actor Timeout Issue & Custom API Output

Opened a month ago by RaulDC, last comment 19 days ago by Jakub Drobník (drobnikj)

Integration for watching Actor runs with Make.com not working

Opened a month ago by smart.lean.ideas, last comment a month ago by Jakub Drobník (drobnikj)

Decode non-UTF-8 text in crawlerType cheerio

Opened a year ago by consoling_knock, last comment a year ago by Jindřich Bär (jindrich.bar)

Developer
Maintained by Apify

Actor Metrics

  • 5.2k monthly users

  • 933 stars

  • >99% runs succeeded

  • 1 day response time

  • Created in Mar 2023

  • Modified 9 days ago