Website Content Crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
After the crawl completes successfully, the WCC data is not pushed to Pinecone. I set up the WCC and Pinecone integration and selected auto-push to Pinecone. I'm getting errors and there is no data in Pinecone.
Hello, we investigated this issue and it seems like it's an Apify console issue and we're working on fixing it. Thank you for reporting it!
The fix for this issue was deployed today. Let us know if you have any other problems.
This is what I see under run. But when I check Pinecone it shows zero records.
This is what Pinecone shows.
Hello, still no data in Pinecone. See the comments I added. It looks like it's not triggering.
No Data.
Hello, the problem is that Pinecone stores all the text in the record metadata, so records can fail to be saved when the content is large. It's recommended to chunk the data: set 'performChunking' to true in the Pinecone integration, which will fix the issue.
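To illustrate why chunking helps, here is a minimal Python sketch of the general idea: one large page is split into small overlapping pieces, so each record carries only a short piece of text in its metadata. This is not the integration's actual implementation. The function names `chunk_text` and `to_records`, the parameters `chunk_size` and `overlap`, and the record layout are all hypothetical and chosen only for this example.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly.

    Hypothetical helper: the real integration may split on sentences or
    tokens rather than on raw characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def to_records(url: str, text: str, chunk_size: int = 1000, overlap: int = 100) -> list[dict]:
    """Build one record per chunk, keeping the chunk text in the metadata.

    The record layout here is an assumption for illustration only.
    """
    return [
        {"id": f"{url}#chunk-{i}", "metadata": {"url": url, "text": chunk}}
        for i, chunk in enumerate(chunk_text(text, chunk_size, overlap))
    ]


if __name__ == "__main__":
    page_text = "lorem ipsum " * 500  # stand-in for a large crawled page
    records = to_records("https://example.com/docs", page_text)
    largest = max(len(r["metadata"]["text"]) for r in records)
    print(f"{len(records)} records, largest metadata text: {largest} characters")
```

With chunking, no single record's metadata text exceeds `chunk_size` characters, whereas without it the whole page would be stored in one record.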
Actor Metrics
4k monthly users
839 stars
>99% runs succeeded
1 day response time
Created in Mar 2023
Modified 17 hours ago