AI calculators

Tools for working with AI APIs and large language models. Cost projection, token estimation, and the sort of sense-checking you do before signing off a budget. Browser-based, no sign-up, and no API keys required (because nothing here calls anyone's API).

What AI calculators actually solve for

Every team that starts using a language model in production hits the same uncomfortable surprise around month two: the bill. The demo prompt that cost half a cent is now a feature, the feature has users, and someone in finance wants to know why a single product feature is running up four-figure invoices with OpenAI. This is the territory these tools are built for. Not the philosophical "should we use AI" question, the practical "what is this going to cost on Tuesday" question. You can plan a launch around an estimate. You cannot plan a launch around a vibe.

The other thing they solve is the comparison problem. Provider A charges per million input tokens, provider B charges per character, provider C bundles input and output, and provider D has a tiered discount that only kicks in past a usage threshold. Sticking these on the same chart in your head is harder than it should be. A calculator that holds the unit conversions and the pricing tables in one place lets you compare apples to apples in about thirty seconds, which is the only honest way to choose between Claude, GPT, Gemini and the rest.
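The unit-conversion half of that comparison is mechanical once you pick a common denominator. A minimal sketch, using entirely made-up prices and the usual four-characters-per-token rule of thumb, of normalising a per-character price sheet into USD per million tokens so two providers land on the same axis:

```python
# Sketch: normalise hypothetical provider price sheets to one unit
# (USD per million tokens) so they can be compared directly.
# All prices below are made-up placeholders, not real rates.

CHARS_PER_TOKEN = 4  # rough average for English text

def per_million_tokens_from_per_char(usd_per_char: float) -> float:
    """Convert a per-character price to USD per million tokens."""
    return usd_per_char * CHARS_PER_TOKEN * 1_000_000

# Provider A already quotes per million input tokens (hypothetical).
provider_a = 0.50
# Provider B quotes per character (hypothetical figure).
provider_b = per_million_tokens_from_per_char(0.000_000_15)

print(f"A: ${provider_a:.2f}/M tokens, B: ${provider_b:.2f}/M tokens")
```

The four-characters-per-token constant is an approximation that drifts for code, CJK text, and heavily formatted prompts, which is exactly why a calculator with real tokeniser-aware conversions beats doing this in your head.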

Where the inputs come from

Good estimates start with realistic inputs, and most teams underestimate at least one of them. Tokens per request is the obvious one: paste a representative prompt into the Prompt Token Estimator rather than guessing, because rich-text prompts with examples can be five or ten times longer than people remember. Output length is the other half of the bill, and output tokens are usually charged at a higher rate than input. If you are streaming long responses, that side dominates.
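To see why the output side dominates for streamed responses, here is the asymmetry in miniature. The rates below are assumptions picked only to illustrate the common pattern of output tokens costing several times more than input tokens:

```python
# Illustration (assumed rates, not any provider's price list): why the
# output side of the bill dominates when responses run long.

INPUT_RATE = 0.15 / 1_000_000   # USD per input token (assumed)
OUTPUT_RATE = 0.60 / 1_000_000  # USD per output token (assumed, 4x input)

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one call: each side billed at its own rate."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

short = call_cost(2_000, 200)    # terse answer
long = call_cost(2_000, 2_000)   # streamed long-form answer
print(f"short: ${short:.6f}, long: ${long:.6f}")
```

With these assumed rates the long response costs over three times the short one despite the identical prompt, and output accounts for most of the bill.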

Volume is where assumptions get sloppy. "A few thousand a day" sounds modest until you multiply it by twelve months of growth. Pull a real number from your analytics, your support inbox, or whatever channel actually triggers the call. For retrieval-augmented setups, the RAG Pipeline Cost Calculator separates the embedding cost (one-off, then refresh) from the per-query cost (every single search), which is where most back-of-envelope budgets fall apart.
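The embedding-versus-query split is worth seeing in numbers. A back-of-envelope sketch, with every rate an assumption, showing why the recurring per-query side usually dwarfs the one-off embedding pass:

```python
# Back-of-envelope RAG budget: one-off embedding cost vs recurring
# per-query cost. All rates and volumes below are assumptions.

EMBED_RATE = 0.02 / 1_000_000   # USD per embedded token (assumed)
QUERY_LLM_COST = 0.002          # USD per answered query (assumed)

corpus_tokens = 5_000_000       # embed the whole corpus once
queries_per_month = 50_000      # every search pays the per-query cost

one_off = corpus_tokens * EMBED_RATE
monthly = queries_per_month * QUERY_LLM_COST

print(f"one-off embedding: ${one_off:.2f}")
print(f"monthly queries:   ${monthly:.2f}")
```

Under these assumptions the entire five-million-token embedding pass costs pennies while the monthly query bill runs to three figures, which is the shape of result most back-of-envelope budgets miss.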

Common mistakes when budgeting AI

The first one is treating the cheapest model as the obvious choice. A small model that needs three retries to get a usable answer is not cheaper than a larger model that gets it right first time. Output quality has a cost too, and it shows up in user trust, support tickets, and the engineering hours spent writing prompt patches.

The second is forgetting about embeddings and storage. If you are building anything with a vector database, the embedding cost is often a smaller line item than people fear, but the vector store hosting and the re-embedding when you change models can sneak up on you. The Embedding Cost Calculator will get you a one-off and a recurring number for the same corpus.

The third is ignoring hallucination risk in the budget conversation. A wrong answer that goes to a customer is a future support ticket, sometimes a refund, occasionally a regulatory letter. Bake that into the cost case, not just the per-token rate. The Hallucination Risk Calculator gives you a structured way to score it before you ship.

If you are picking between tools, start with the one closest to the question you actually have. Budgeting a chatbot launch? Token usage. Building a knowledge base? RAG pipeline. Comparing freelance writers to a model? The AI vs human writer tool. The numbers all line up; the differences are in which inputs each tool asks for.

  • LLM Token Usage Calculator

    Estimate the per-call, daily, monthly and annual cost of running a language model workload. Side-by-side comparison across Claude, GPT, Gemini and DeepSeek. Pricing baked in and dated, so you know how fresh the figures are.

  • AI Image Generation Cost Calculator

    Estimate the per-image, monthly and annual cost of generating AI images. Side-by-side comparison across DALL-E 3, Midjourney, Stable Diffusion XL, Imagen 3 and Flux Pro, with pricing dated for transparency.

  • RAG Pipeline Cost Calculator

    Estimate the monthly cost of running a Retrieval-Augmented Generation pipeline. Embedding, vector database, and LLM query costs broken out, with a month-by-month projection and an LLM swap comparison.

  • AI vs Human Writer Cost Calculator

    Compare the monthly cost of producing written content with an LLM against paying a human writer. Per-word or per-hour pricing, every model side by side, and an honest section on the editorial trade-offs the raw bill does not show.

  • AI Fine-Tuning Cost Calculator

    Estimate the cost of fine-tuning an LLM. Training cost across GPT-4o, GPT-4o-mini, Llama-3 on Together, and Mistral-7B, with the per-1000-call hosted inference cost for the tuned model.

  • AI Hallucination Risk Calculator

    Score the relative risk that an LLM output will hallucinate, given the model class, task, grounding and verification you have in place. Heuristic, browser-only, with the maths shown.

  • Embedding Cost Calculator

    Estimate the cost of generating embeddings for a corpus across OpenAI, Cohere, Voyage or a custom price. One-off, monthly, weekly or daily refresh, with total over the period and per-1000-docs cost.

  • Prompt Token Estimator

    Paste a prompt, pick a model family, see roughly how many tokens it will use across GPT-4o, Claude, Gemini, Llama and Mistral. Optional cost at your $/1K rate.

Frequently asked questions

How are tokens different from words and characters?

A token is roughly three to four characters of English, or about 0.75 of a word. "Hello world" is two tokens. "Antidisestablishmentarianism" is closer to seven, because the tokeniser splits long unfamiliar words into chunks. The Prompt Token Estimator gives you a quick count for any prompt across the major model families.

Roughly what does a GPT-class API call cost?

For mid-tier models (GPT-4o-mini, Claude Haiku, Gemini Flash) in 2026, a typical 2,000-token input plus 500-token output costs about half a US cent. For top-tier models (GPT-4o, Claude Opus, Gemini Pro) the same call is closer to three to five cents. The LLM Token Usage Calculator works it out per call and per month for your specific usage.
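The arithmetic behind those ballpark figures is a one-liner. The per-million-token rates below are illustrative assumptions chosen to land in the ranges quoted above, not any provider's actual price list:

```python
# Arithmetic behind the ballpark per-call figures, with illustrative
# per-million-token rates (assumptions, not real prices).

def per_call_usd(in_tok, out_tok, in_rate_per_m, out_rate_per_m):
    """Cost of one call given rates quoted in USD per million tokens."""
    return (in_tok * in_rate_per_m + out_tok * out_rate_per_m) / 1_000_000

mid = per_call_usd(2_000, 500, 1.0, 4.0)    # assumed mid-tier rates
top = per_call_usd(2_000, 500, 10.0, 40.0)  # assumed top-tier rates
print(f"mid-tier: {mid * 100:.2f} cents, top-tier: {top * 100:.2f} cents")
```

With these assumed rates the mid-tier call comes out at 0.4 cents and the top-tier call at 4 cents, consistent with the ranges above; swap in the rates on your provider's pricing page for a real figure.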

When does fine-tuning beat a longer prompt?

Fine-tuning earns its keep when you have at least a few hundred good examples of the input-output pattern you want, when you need shorter prompts at inference time (saving tokens on every call), and when you can amortise the up-front training cost across enough calls. Below a few thousand calls a month, prompting almost always wins. The Fine-Tuning Cost Calculator shows the break-even point.
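The amortisation logic is simple enough to sketch. Every number below is an assumption for illustration; the point is the shape of the calculation, not the figures:

```python
# Break-even sketch: an up-front training cost is worth it once the
# per-call saving (from a shorter prompt) has paid it back.
# All numbers below are assumptions.

training_cost = 25.0            # USD, one-off (assumed)
tokens_saved_per_call = 1_500   # instructions the tuned model no longer needs
input_rate = 1.0 / 1_000_000    # USD per input token (assumed)

saving_per_call = tokens_saved_per_call * input_rate
break_even_calls = training_cost / saving_per_call
print(f"break-even after {break_even_calls:,.0f} calls")
```

Under these assumptions you need roughly 17,000 calls to recoup the training cost, which is why a workload doing only a few thousand calls a month takes months to break even and prompting usually wins.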

What is RAG and when should I use it?

RAG (Retrieval-Augmented Generation) means looking up relevant chunks from your own documents and stuffing them into the prompt at runtime. Use it when answers must come from a specific corpus (your docs, your knowledge base) rather than the model's training data. The RAG Pipeline Cost Calculator models the embedding and inference costs for a typical setup.

How do I budget for an AI feature in my app?

Estimate your per-call token usage (input + output), multiply by your model's per-token rate, multiply by expected calls per user per month, then multiply by your active user count. Add a 30 to 50 per cent buffer for retries, longer-than-expected prompts, and growth. The tools in this category each handle one slice of that calculation.
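That recipe fits in one function. Every input below is a placeholder assumption you should replace with your own analytics numbers:

```python
# The budgeting recipe above as a single function. Every argument is
# an assumption to be replaced with your own numbers.

def monthly_budget_usd(tokens_per_call, usd_per_token,
                       calls_per_user_per_month, active_users,
                       buffer=0.4):
    """Base cost times a 30-50% buffer for retries, long prompts, growth."""
    base = (tokens_per_call * usd_per_token
            * calls_per_user_per_month * active_users)
    return base * (1 + buffer)

estimate = monthly_budget_usd(
    tokens_per_call=2_500,          # input + output, per call (assumed)
    usd_per_token=2.0 / 1_000_000,  # blended rate (assumed)
    calls_per_user_per_month=20,    # from your analytics
    active_users=1_000,
)
print(f"${estimate:,.2f}/month")
```

With these placeholder inputs the base cost is $100 a month and the 40 per cent buffer lifts the budget line to $140, which is the number you take to finance rather than the unbuffered one.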