LLM Token Counter & Cost Estimator (Claude/GPT/Gemini/Llama) avatar

LLM Token Counter & Cost Estimator (Claude/GPT/Gemini/Llama)

Pricing

Pay per usage

Go to Apify Store
LLM Token Counter & Cost Estimator (Claude/GPT/Gemini/Llama)

LLM Token Counter & Cost Estimator (Claude/GPT/Gemini/Llama)

Count tokens for any text across 16+ models (Claude Opus/Sonnet/Haiku, GPT-4o, o3, Gemini 1.5, Llama, Mistral) and estimate per-million-token cost. Claude via Anthropic API (BYO key), GPT via tiktoken, others via heuristic. $0.001 per text counted.

Pricing

Pay per usage

Rating

0.0

(0)

Developer

Hojun Lee

Hojun Lee

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

a day ago

Last modified

Share

LLM Token Counter & Cost Estimator

Count tokens for any text across 16+ models (Claude Opus 4.7, Sonnet 4.6, Haiku 4.5, GPT-4o, o3, Gemini 1.5, Llama 3, Mistral) and estimate per-call cost in USD. Claude via Anthropic SDK (BYO key, free endpoint), GPT via tiktoken (no key needed), others via heuristic. $0.001 per text counted.


Why this exists

Every LLM engineer asks the same question 50 times a day: "if I send this prompt, how many tokens — and how much will it cost?"

Existing solutions:

  • OpenAI tokenizer tool: web-only, OpenAI models only
  • Anthropic count_tokens API: free but requires writing the SDK call
  • Chargify / LLM-monitor SaaS: $50+/mo

This actor wraps all the major tokenizers and prices in one call. Drop a prompt in, get a sorted cost comparison across every model that could handle it.


What you get

Summary row

{
"_type": "summary",
"char_count": 5230,
"models_compared": 16,
"cheapest_model": "gemini-1.5-flash",
"cheapest_total_usd": 0.000405,
"most_expensive_model": "claude-opus-4-5"
}

Per-model row (one per model)

{
"_type": "model_count",
"model": "claude-opus-4-7",
"input_tokens": 1320,
"chars": 5230,
"method": "anthropic_count_tokens",
"input_price_per_1m": 5.00,
"output_price_per_1m": 25.00,
"max_output_tokens": 1024,
"input_cost_usd": 0.0066,
"max_output_cost_usd": 0.0256,
"total_cost_usd": 0.0322
}

Rows are sorted by input cost ascending — so the cheapest option is the first model row.


Supported models

ModelTokenizerPricing source
claude-opus-4-7Anthropic count_tokens (or cl100k proxy)$5 / $25 per 1M
claude-opus-4-6same$5 / $25
claude-sonnet-4-6same$3 / $15
claude-haiku-4-5same$1 / $5
gpt-4otiktoken o200k_base$2.50 / $10
gpt-4o-minisame$0.15 / $0.60
gpt-4-turbocl100k_base$10 / $30
gpt-3.5-turbocl100k_base$0.50 / $1.50
o3o200k_base$15 / $60
gemini-1.5-proheuristic ~4 chars/token$1.25 / $5
gemini-1.5-flashheuristic$0.075 / $0.30
llama-3-70bheuristic$0.59 / $0.79
mistral-largeheuristic$2 / $6

Quick start

Single text

{
"text": "Hello, world. Today I am going to..."
}

From URL

{
"textUrl": "https://en.wikipedia.org/wiki/Bitcoin"
}
{
"text": "...",
"anthropicApiKey": "sk-ant-..."
}

Compare only models you care about

{
"text": "...",
"models": ["claude-opus-4-7", "claude-sonnet-4-6", "gpt-4o", "gemini-1.5-pro"]
}

Realistic cost (set expected output length)

{
"text": "...",
"maxOutputTokens": 4000
}

Use cases

  1. Cost-sensitive prompt design — Try Opus 4.7 vs Sonnet 4.6 vs Haiku for the same prompt; pick cheapest that still works
  2. RAG chunk sizing — Check if your retrieval chunks + system prompt fit in the context window
  3. Batch budget forecasting — Run on 100 sample inputs → multiply for full corpus
  4. Provider comparison — At what input length does Claude become cheaper than GPT? This actor answers in one call
  5. Education — Show students why "translate this novel" hits the wall fast

Pricing

Pay-Per-Event: $0.001 per text counted (regardless of how many models).

Vs Helicone ($25-100/mo for cost tracking), LangSmith, or building your own.


How counting works per provider

  • OpenAI (gpt-4o family) — exact via tiktoken library
  • Claude — exact via Anthropic SDK count_tokens (free endpoint, requires anthropicApiKey). Without the key, falls back to cl100k tokenizer as a proxy (typically within ±5%).
  • Gemini / Llama / Mistral — heuristic: ~4 chars per token. Accuracy varies by language. For Korean / Japanese / Chinese this overestimates by ~20%.


Feedback

A short review helps LLM engineers find it: Leave a review on Apify Store