Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.


Crawler overcharges by several times

Closed

hyperlace opened this issue
10 days ago

Using the crawler results in charges more than 10x the amount stated on the runs. I verified this with an empty account (this one). The Actor run prices / CUs don't add up to what's billed against the account.

Can you please clarify the situation / pricing?

jiri.spilka

Hi, thank you for using the Website Content Crawler!

I understand your confusion; however, we have a data retention policy in place, which is detailed here: Apify Data Retention Policy.

Apify securely stores your ten most recent runs indefinitely, ensuring your records are always accessible. Unnamed datasets and runs beyond the latest ten will be automatically deleted after 7 days unless otherwise specified. Named datasets are retained indefinitely.

In your case, you executed 110 runs, but only the most recent 10 are visible in the console. In the proxies' usage, you can see that approximately 550 MB was transferred on October 23rd and 24th, corresponding to the remaining 100 runs.
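For readers checking the numbers, here is a rough sketch of the arithmetic behind this reply. It uses only the figures quoted above (110 total runs, 10 visible, about 550 MB of proxy traffic for the hidden runs). The assumption that every run costs roughly the same is mine, purely for illustration; real per-run costs will vary.

```python
# Figures quoted in the reply above.
TOTAL_RUNS = 110        # runs actually executed
VISIBLE_RUNS = 10       # runs still shown in the console
HIDDEN_PROXY_MB = 550   # approximate proxy traffic attributed to the hidden runs

# Runs that were billed but are no longer listed in the console.
hidden_runs = TOTAL_RUNS - VISIBLE_RUNS

# ASSUMPTION (illustrative only): each run costs about the same.
# Under that assumption, the bill is this many times what the
# visible runs alone would suggest.
billed_vs_visible = TOTAL_RUNS / VISIBLE_RUNS

# Average proxy traffic per hidden run, from the quoted 550 MB.
avg_mb_per_hidden_run = HIDDEN_PROXY_MB / hidden_runs

print(f"hidden runs: {hidden_runs}")
print(f"billed vs. visible ratio: {billed_vs_visible:.1f}x")
print(f"avg proxy traffic per hidden run: {avg_mb_per_hidden_run:.1f} MB")
```

Under that equal-cost assumption the ratio comes out at 11x, which is in line with the "> 10x" discrepancy reported in the original post.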

I’ll close this issue now, but please feel free to reach out with any further questions.

Developer
Maintained by Apify
Actor metrics
  • 3.8k monthly users
  • 636 stars
  • 100.0% runs succeeded
  • 2.7 days response time
  • Created in Mar 2023
  • Modified 7 days ago