Website Content Crawler

apify/website-content-crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Crawler does not extract content, with no useful logs to debug

Opened 6 days ago by nimble_caretaker, last comment 5 days ago by nimble_caretaker

Crawler overcharges by several times

Opened 10 days ago by hyperlace, last comment 10 days ago by Jiří Spilka (jiri.spilka)

12 mins in and nothing has been crawled

Opened 12 days ago by callumdownie, last comment 5 days ago by Jiří Spilka (jiri.spilka)

crawling stuck

Opened 12 days ago by greenforestpath8, last comment 12 days ago by Jindřich Bär (jindrich.bar)

Too complex to start this process

Opened 14 days ago by rongwroom, last comment 13 days ago by Jiří Spilka (jiri.spilka)

Custom user agent

Opened 14 days ago by civic-roundtable, last comment 12 days ago by Jiří Spilka (jiri.spilka)

I can't send data from Clay to Apify.

Opened 19 days ago by romeoman, last comment 10 days ago by Jiří Spilka (jiri.spilka)

Access blocked every time

Opened 23 days ago by saidur297, last comment 18 days ago by Jiří Spilka (jiri.spilka)

Various questions about operation and optimization of website content crawler

Opened 24 days ago by David Haddad (davhad), last comment 17 days ago by Jiří Spilka (jiri.spilka)

Actor simply doesn't work

Opened 25 days ago by xylonic_gloves, last comment 22 days ago by Jiří Spilka (jiri.spilka)

Insane usage time

Opened 25 days ago by xylonic_gloves, last comment 25 days ago by Jiří Spilka (jiri.spilka)

Doesn't scrape table on website

Opened a month ago by esouthwick, last comment a month ago by Jindřich Bär (jindrich.bar)

Sitemap discovery takes long time (15 minutes)

Opened a month ago by Jiří Spilka (jiri.spilka), last comment 11 days ago by Jindřich Bär (jindrich.bar)

Website Scrape doesn't include Social Media Links

Opened a month ago by Synergize_AI, last comment a month ago by Jindřich Bär (jindrich.bar)

How grab all the urls from https://www.ung.no/oss/

Opened a month ago by cgoul, last comment 25 days ago by cgoul

Crawling not returning the text from pages

Opened a month ago by cgoul, last comment a month ago by cgoul

Didn't time out like it's supposed to. :)

Opened a month ago by drippingfist, last comment a month ago by Jindřich Bär (jindrich.bar)

Would not stop

Opened a month ago by jbone209, last comment a month ago by jbone209

The crawler hung at 100% and I had to kill it

Opened a month ago by pumpkin_protractor, last comment a month ago by Oscar Rodriguez (Oscardz)

faild!!!!!!!!!!!!!!!!

Opened a month ago by malachite_malachite, last comment a month ago by Jindřich Bär (jindrich.bar)

Developer
Maintained by Apify
Actor metrics
  • 3.8k monthly users
  • 635 stars
  • 100.0% runs succeeded
  • 2.7 days response time
  • Created in Mar 2023
  • Modified 7 days ago