Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
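As a rough illustration of feeding crawled text into a downstream pipeline, the sketch below builds a request for the Apify API's "Run actor synchronously with input and get dataset items" endpoint using only the Python standard library. The input field names `startUrls` and `maxCrawlPages` are assumptions about this Actor's input schema and should be checked against its input documentation; the helper names `build_run_input`, `build_request`, and `crawl` are hypothetical.

```python
import json
import urllib.request

# The Actor "apify/website-content-crawler", with "/" written as "~" for use in an API path.
ACTOR_ID = "apify~website-content-crawler"
API_BASE = "https://api.apify.com/v2"


def build_run_input(start_url: str, max_pages: int = 10) -> dict:
    """Build the Actor input. Field names are assumptions; verify them in the input schema."""
    return {
        "startUrls": [{"url": start_url}],
        "maxCrawlPages": max_pages,
    }


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that runs the Actor synchronously."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def crawl(token: str, start_url: str, max_pages: int = 10) -> list:
    """Send the request and return the dataset items as parsed JSON."""
    request = build_request(token, build_run_input(start_url, max_pages))
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `crawl(token, "https://example.com")` would need a valid Apify API token and network access. Synchronous runs suit small crawls; a larger crawl is probably better started asynchronously and read from its dataset afterwards.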

No data saved

Opened 9 days ago by entertained_geode, last comment 9 days ago by entertained_geode

Passwords Set

Opened 14 days ago by entertained_geode, last comment 2 days ago by Jiří Spilka (jiri.spilka)

Limit Actor run to just the exact input URL

Opened 21 days ago by thom_vd_donk, last comment 2 days ago by Jiří Spilka (jiri.spilka)

Bug in Run?

Opened 22 days ago by brave_easel, last comment 15 days ago by Jiří Spilka (jiri.spilka)

Trying to scrape the target URL for an "Apply" button

Opened 24 days ago by ctcoach, last comment 23 days ago by Jakub Kopecký (jakub.kopecky)

It crawls only the bottom half of the page.

Opened a month ago by dkampien, last comment 24 days ago by Jiří Spilka (jiri.spilka)

Crawler timeout setting doesn't work

Opened a month ago by bill007, last comment 2 days ago by Jiří Spilka (jiri.spilka)

crawling takes longer

Opened a month ago by yener.yasin030, last comment a month ago by Jiří Spilka (jiri.spilka)

Trying to scrape OpenAI documentation

Opened a month ago by Digital_Mole, last comment 2 days ago by Jiří Spilka (jiri.spilka)

Unable to retrieve key-value stores via API

Opened a month ago by shawnveltman, last comment a month ago by Jiří Spilka (jiri.spilka)

Can't scrape, my request has failed

Opened a month ago by engaging_integrity, last comment a month ago by Jiří Spilka (jiri.spilka)

scraped data is redundant

Opened a month ago by visable, last comment 15 days ago by Jiří Spilka (jiri.spilka)

Not able to scrape the website, even with residential proxy

Opened a month ago by visable, last comment a month ago by Jiří Spilka (jiri.spilka)

Request for Assistance: Actor Timeout Issue & Custom API Output

Opened a month ago by RaulDC, last comment 2 days ago by Jiří Spilka (jiri.spilka)

Integration for watch actor runs with make.com not working

Opened a month ago by smart.lean.ideas, last comment 2 days ago by Jiří Spilka (jiri.spilka)

Hung crawl

Opened a month ago by mcantrell-owner, last comment a month ago by Jiří Spilka (jiri.spilka)

Poor results

Opened a month ago by Digital_Mole, last comment 25 days ago by Jiří Spilka (jiri.spilka)

Integration with OpenAI's Assistants through their function calling tool

Opened a month ago by Cirula, last comment a month ago by Jiří Spilka (jiri.spilka)

"Run actor synchronously with input and get dataset items" supported?

Opened a month ago by Kilian, last comment a month ago by Dušan Vystrčil (dusan.vystrcil)

Blocked by cookies

Opened a month ago by andrewtshannon, last comment a month ago by andrewtshannon

Developer
Maintained by Apify

Actor Metrics

  • 5.4k monthly users

  • 990 bookmarks

  • >99% runs succeeded

  • 1 day response time

  • Created in Mar 2023

  • Modified 13 days ago