Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Crawler doesn't follow any links in Start URL

Opened 2 days ago by nhanna, last comment 11 hours ago by Jakub Kopecký (jakub.kopecky)

issue in one run

Opened 6 days ago by The25th, last comment 6 days ago by The25th

Actor process was aborted

Opened 6 days ago by ultimateai-dev, last comment 2 days ago by Jakub Kopecký (jakub.kopecky)

Crawling takes too much time

Opened 7 days ago by dnavadiya401, last comment 6 days ago by Jakub Kopecký (jakub.kopecky)

Key listing details were ignored.

Opened 9 days ago by YanivPRZ, last comment 8 days ago by Jakub Kopecký (jakub.kopecky)

No data saved

Opened 9 days ago by entertained_geode, last comment 9 days ago by entertained_geode

Error "Execution context was destroyed"

Opened 12 days ago by iadvize, last comment 10 days ago by Jakub Kopecký (jakub.kopecky)

Crawler does not work any longer... I tried with multiple links about 30 minutes ago and it was working and it randomly stopped!!

Opened 12 days ago by Capture_Marketing, last comment 12 days ago by Jakub Kopecký (jakub.kopecky)

Scraper doesn't scrape all the website content, such as product descriptions

Opened 14 days ago by maabada.shivok, last comment 14 days ago by maabada.shivok

Passwords Set

Opened 14 days ago by entertained_geode, last comment 2 days ago by Jiří Spilka (jiri.spilka)

Limit Actor run to just the exact input URL

Opened 21 days ago by thom_vd_donk, last comment 2 days ago by Jiří Spilka (jiri.spilka)

Bug in Run?

Opened 22 days ago by brave_easel, last comment 15 days ago by Jiří Spilka (jiri.spilka)

Trying to scrape the target URL for an "Apply" button

Opened 24 days ago by ctcoach, last comment 23 days ago by Jakub Kopecký (jakub.kopecky)

It crawls only the bottom half of the page.

Opened a month ago by dkampien, last comment 24 days ago by Jiří Spilka (jiri.spilka)

Crawl hung at finished

Opened a month ago by mcantrell, last comment 21 days ago by mykola_scrapes

Crawler timeout setting doesn't work

Opened a month ago by bill007, last comment 2 days ago by Jiří Spilka (jiri.spilka)

crawling takes longer

Opened a month ago by yener.yasin030, last comment a month ago by Jiří Spilka (jiri.spilka)

Remove browser deprecation warnings

Opened a month ago by cirez_d, last comment 25 days ago by Jiří Spilka (jiri.spilka)

Trying to scrape OpenAI documentation

Opened a month ago by Digital_Mole, last comment 2 days ago by Jiří Spilka (jiri.spilka)

Unable to retrieve key-value stores via API

Opened a month ago by shawnveltman, last comment a month ago by Jiří Spilka (jiri.spilka)

Developer
Maintained by Apify

Actor Metrics

  • 5.4k monthly users

  • 990 bookmarks

  • >99% runs succeeded

  • 1 day response time

  • Created in Mar 2023

  • Modified 13 days ago