Website Content Crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
Combining startUrl and includeUrlGlob
Opened a day ago by nauticallygreat, last comment a day ago by nauticallygreat
Crawling logic
Opened a day ago by nauticallygreat, last comment a day ago by nauticallygreat
Why did it run for 8 hours? Isn't there a hard limit of 9 minutes?
Opened 3 days ago by stevecasey1213, last comment 3 days ago by Jiří Spilka (jiri.spilka)
Does not extract anything from the provided website
Opened 5 days ago by ennsharma, last comment 5 days ago by ennsharma
page scrolling
Opened 6 days ago by kempt_trophy, last comment 3 days ago by kempt_trophy
Crawler does not extract content, with no useful logs to debug
Opened 6 days ago by nimble_caretaker, last comment 5 days ago by nimble_caretaker
Not much of value scraped.
Opened 6 days ago by methodical, last comment 6 days ago by Jiří Spilka (jiri.spilka)
Crawler overcharges by several times
Opened 10 days ago by hyperlace, last comment 10 days ago by Jiří Spilka (jiri.spilka)
12 mins in and nothing has been crawled
Opened 12 days ago by callumdownie, last comment 5 days ago by Jiří Spilka (jiri.spilka)
crawling stuck
Opened 12 days ago by greenforestpath8, last comment 12 days ago by Jindřich Bär (jindrich.bar)
Too complex to start this process
Opened 14 days ago by rongwroom, last comment 13 days ago by Jiří Spilka (jiri.spilka)
Custom user agent
Opened 14 days ago by civic-roundtable, last comment 12 days ago by Jiří Spilka (jiri.spilka)
I can't send data from Clay to Apify.
Opened 19 days ago by romeoman, last comment 10 days ago by Jiří Spilka (jiri.spilka)
Access blocked every time
Opened 23 days ago by saidur297, last comment 18 days ago by Jiří Spilka (jiri.spilka)
Various questions about operation and optimization of website content crawler
Opened 24 days ago by David Haddad (davhad), last comment 17 days ago by Jiří Spilka (jiri.spilka)
Actor simply doesn't work
Opened 25 days ago by xylonic_gloves, last comment 22 days ago by Jiří Spilka (jiri.spilka)
Insane usage time
Opened 25 days ago by xylonic_gloves, last comment 25 days ago by Jiří Spilka (jiri.spilka)
Doesn't scrape table on website
Opened a month ago by esouthwick, last comment a month ago by Jindřich Bär (jindrich.bar)
Sitemap discovery takes a long time (15 minutes)
Opened a month ago by Jiří Spilka (jiri.spilka), last comment 11 days ago by Jindřich Bär (jindrich.bar)
Website Scrape doesn't include Social Media Links
Opened a month ago by Synergize_AI, last comment a month ago by Jindřich Bär (jindrich.bar)
- 3.8k monthly users
- 635 stars
- 100.0% runs succeeded
- 2.7 days response time
- Created in Mar 2023
- Modified 7 days ago