Website Content Crawler
No credit card required
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.
Using the crawler overcharges by more than 10x the stated amount for the runs. I verified this with an empty account (this one). The Actor run prices / CUs don't add up to what's billed against the account.
Can you please clarify the situation / pricing?
Hi, thank you for using the Website Content Crawler!
I understand your confusion; however, we have a data retention policy in place, which is detailed here: Apify Data Retention Policy.
Apify securely stores your ten most recent runs indefinitely, ensuring your records are always accessible. Unnamed datasets and runs beyond the latest ten will be automatically deleted after 7 days unless otherwise specified. Named datasets are retained indefinitely.
In your case, you executed 110 runs, but only the most recent 10 are visible in the console. The proxy usage shows that approximately 550 MB was transferred on October 23rd and 24th, which corresponds to the remaining 100 runs.
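To make the retention rule above concrete, here is a minimal Python sketch of how 110 runs can shrink to 10 visible ones. It is only an illustration of the rule as described in this reply, not Apify's implementation: the function name `visible_runs`, its parameters, the year, the run timestamps, and the inspection date are all hypothetical.

```python
from datetime import datetime, timedelta


def visible_runs(started_at, now, keep_latest=10, retention_days=7):
    """Return the run start times still visible under the rule described above.

    Assumption: the `keep_latest` most recent runs are always kept, and any
    older run is kept only while it is younger than `retention_days`.
    """
    newest_first = sorted(started_at, reverse=True)
    cutoff = now - timedelta(days=retention_days)
    kept = newest_first[:keep_latest]
    kept += [t for t in newest_first[keep_latest:] if t >= cutoff]
    return kept


# Hypothetical data: 110 runs spread over roughly two days,
# inspected two weeks after the first run.
first_run = datetime(2024, 10, 23, 0, 0)
runs = [first_run + timedelta(minutes=20 * i) for i in range(110)]
now = first_run + timedelta(days=14)

still_visible = visible_runs(runs, now)
hidden = len(runs) - len(still_visible)
print(len(still_visible), hidden)
```

Under these assumptions, only the ten newest runs remain listed once the retention window has passed, while the usage generated by the other 100 runs still counts toward the bill.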
I’ll close this issue now, but please feel free to reach out with any further questions.
- 3.8k monthly users
- 636 stars
- 100.0% runs succeeded
- 2.7 days response time
- Created in Mar 2023
- Modified 7 days ago