Prompt Token Estimator
Paste a prompt, pick a model family, see roughly how many tokens it will use. Different tokenisers split text differently, so the same paragraph can cost meaningfully more on Llama than on GPT. Drop in a price per 1K tokens and you also get a rough cost. All in your browser, nothing sent anywhere.
Explain like I'm 5 (what even is this calculator?)
Big AI models charge you per token, which is roughly a chunk of a word. Different models chop text up slightly differently, which means the same prompt can cost a different amount depending on whose model is reading it. This tool uses each family's published rule of thumb to give you a sensible estimate before you press send.
Estimate the tokens
Heuristic, browser-only. No API calls, no sign-up, nothing leaves the page.
Tokens (approx)
0
Words
0
Characters
0
Estimated input cost
—
Prove it
Estimate a prompt above to see the working out.
How the estimate works
The maths is a single division. Take the character count of your prompt, divide by the family's chars-per-token figure, round to the nearest whole token. GPT models sit at roughly 4.0, Claude at 3.8, Gemini at 4.1, Mistral at 3.7, and Llama at 3.5. Those numbers come from each provider's own back-of-envelope figures plus a fair bit of public SentencePiece and BPE tokeniser inspection. They are not exact for every prompt, because real tokenisers split words differently depending on what comes before and after, but they land within about 15% for normal English prose.
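For the curious, the whole calculation fits in a few lines. Here is a minimal TypeScript sketch using the ratios quoted above; the function and family names are illustrative labels, not part of any provider's API.

```typescript
// Heuristic chars-per-token ratios quoted above (illustrative labels).
const CHARS_PER_TOKEN = {
  gpt: 4.0,
  claude: 3.8,
  gemini: 4.1,
  mistral: 3.7,
  llama: 3.5,
} as const;

type Family = keyof typeof CHARS_PER_TOKEN;

// Character count divided by the family's ratio, rounded to the nearest token.
function estimateTokens(prompt: string, family: Family): number {
  return Math.round(prompt.length / CHARS_PER_TOKEN[family]);
}

// The same 200-character prompt lands on different counts per family.
const sample = "x".repeat(200);
console.log(estimateTokens(sample, "gpt"));   // 50
console.log(estimateTokens(sample, "llama")); // 57
```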
Why the families differ
OpenAI's tiktoken splits English fairly cleanly into common subwords. Anthropic's Claude tokeniser is similar but slightly tighter, which is why the same paragraph tends to cost a few percent more in tokens on Claude. Google's Gemini tokeniser is a little looser on English, which makes it look cheaper per token but does not always translate to a cheaper bill once you compare the actual per-token rates. The Llama and Mistral tokenisers are tuned for multilingual coverage, which means they split English into smaller pieces: more tokens per word, and therefore higher token counts for the same text.
What the estimate does not include
Three honest caveats. First, this is input tokens only. Output tokens are billed at a higher rate and depend on how much the model writes back, which is impossible to predict from the prompt alone. Second, code, JSON and non-Latin scripts tokenise very differently from English prose, often badly, so the rule of thumb drifts further on those. Third, it is only an estimate: the providers publish exact tokenisers (tiktoken, Anthropic's count-tokens endpoint, Gemini's count-tokens API). If you need the precise figure for a contract or a bill, use those tools, not this one.
How to use this in practice
Paste a representative prompt, pick the model you are actually planning to use, and look at the token figure. If it surprises you on the high side, the usual fix is to trim the system prompt: long, hand-crafted prompts with twelve few-shot examples are reassuring to write but expensive to run a million times. If you have the per-1K input price to hand, drop it in for a sanity-check on the per-call cost. Multiply by your call volume and you have the rough monthly figure. For the full daily, monthly and annual breakdown across every major model, the LLM Token Usage Calculator does the comparison-table version of the same job.
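The sums involved are tiny. A sketch of the per-call and monthly arithmetic, with made-up prices and volumes for illustration:

```typescript
// Input cost per call: tokens divided by 1,000, times the per-1K price.
function inputCostPerCall(tokens: number, pricePer1K: number): number {
  return (tokens / 1000) * pricePer1K;
}

// Illustrative numbers only: a 1,200-token prompt at $0.0025 per 1K
// input tokens, called 500,000 times a month.
const perCall = inputCostPerCall(1200, 0.0025); // $0.003 per call
const monthly = perCall * 500_000;              // $1,500 per month
console.log(perCall, monthly);
```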
Related calculators
Token count covers a single prompt. These zoom out to the workload around it.
Frequently asked questions
Why does the token count change when I switch model?
Each model family uses a different tokeniser. GPT models split English prose at roughly 4 characters per token. Anthropic's Claude tokeniser runs slightly tighter at about 3.8. Gemini's is a touch looser at around 4.1. Llama and Mistral split English more finely, at roughly 3.5 and 3.7 characters per token respectively, so they produce more tokens for the same text. Same prompt, different bills.
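To make that concrete, here is the same 1,000-character prompt run through each heuristic ratio (a sketch using the figures above, not real tokenisers):

```typescript
// Same 1,000 characters, five heuristic estimates.
const chars = 1000;
const ratios = { gpt: 4.0, claude: 3.8, gemini: 4.1, mistral: 3.7, llama: 3.5 };

for (const [family, ratio] of Object.entries(ratios)) {
  console.log(family, Math.round(chars / ratio));
}
// gpt 250, claude 263, gemini 244, mistral 270, llama 286
```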
How accurate is this estimate?
Within about 15% of the real tokeniser count for normal English prose. Code, JSON, non-Latin scripts and emoji-heavy text drift further. If you need the exact figure, paste the text into the provider's official tokeniser tool or call their count-tokens endpoint. For quick budget sense-checks, the heuristic is fine.
What goes in the price field?
The input price per 1,000 tokens for the model you picked, in dollars. You will find it on the provider's pricing page (often listed per 1M, in which case divide by 1000). The cost shown is for input only: output tokens are billed at a higher rate and depend on how much the model writes back, which this tool does not try to predict.
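A quick sketch of that conversion, with an illustrative price:

```typescript
// Providers often quote input prices per 1M tokens; the field wants per 1K.
const pricePer1M = 3.0;               // illustrative: $3.00 per 1M input tokens
const pricePer1K = pricePer1M / 1000; // $0.003 per 1K

// Estimated input cost for an 800-token prompt.
const cost = (800 / 1000) * pricePer1K; // $0.0024
console.log(cost);
```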
Does this tool send my prompt anywhere?
No. Everything runs in your browser. The text you paste does not leave your device, and there is no API call to a paid service. The estimate is a small piece of arithmetic, not a model invocation.
Why estimate at all, instead of running the real tokeniser?
The proper tokeniser bundles (tiktoken for OpenAI, Anthropic's tokeniser, Gemini's count-tokens endpoint) are either heavy enough to slow the page down meaningfully, locked behind a paid API, or both. The chars-per-token rule is the same shorthand the providers use in their own back-of-envelope docs. Good enough for a budget conversation, not good enough for a final invoice.