Estimate Your LLM Cost Per Chatbot Conversation
Model the true per-conversation token spend of an LLM chatbot — including multi-turn context replay — then scale it to a monthly bill across your active user base.
Calculator
How the cost is calculated
Most cost calculators multiply a single prompt by its price and stop. Real chatbots carry conversation state, but each model call is typically stateless: every turn resends the system prompt plus the entire prior conversation. The input sent per turn therefore grows linearly, and the cumulative input tokens billed across a session grow quadratically with the number of turns. This tool models that replay explicitly, which is why its per-conversation numbers run higher, and closer to the real bill, than a naive single-call estimate.
The replay formula
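For a conversation of T turns, turn n sends the system prompt plus all previous user and bot messages plus the new user message. Writing sys for the system-prompt size, inTok for the average user-message size, and outTok for the average bot-reply size (all in tokens), billed input tokens accumulate as: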
inputTokens = Σ(n=1..T) [ sys + (n−1)·(inTok + outTok) + inTok ]
Output tokens are simpler, because the model generates only a fresh reply each turn: outputTokens = T · outTok. The closed form for the growing history term is (inTok+outTok)·T·(T−1)/2, the classic arithmetic-series sum, which is what the script evaluates. Adding the per-turn constants gives the full total: inputTokens = T·(sys + inTok) + (inTok+outTok)·T·(T−1)/2.
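The replay sum and its closed form can be checked against each other with a short sketch. This is an illustration, not the calculator's actual script: the function names and the sample values (a 500-token system prompt, 50-token user messages, 200-token replies, 10 turns) are assumptions chosen for the example.

```python
def input_tokens_loop(turns: int, sys_tok: int, in_tok: int, out_tok: int) -> int:
    """Sum billed input tokens turn by turn, replaying the full history."""
    total = 0
    for n in range(1, turns + 1):
        history = (n - 1) * (in_tok + out_tok)  # all earlier user + bot messages
        total += sys_tok + history + in_tok     # system prompt + history + new message
    return total


def input_tokens_closed(turns: int, sys_tok: int, in_tok: int, out_tok: int) -> int:
    """Same total via the arithmetic-series closed form."""
    # turns * (turns - 1) is always even, so integer division is exact.
    history = (in_tok + out_tok) * turns * (turns - 1) // 2
    return turns * (sys_tok + in_tok) + history


def output_tokens(turns: int, out_tok: int) -> int:
    """The model generates one fresh reply per turn."""
    return turns * out_tok


# Hypothetical sample conversation.
T, SYS, IN_TOK, OUT_TOK = 10, 500, 50, 200
print(input_tokens_loop(T, SYS, IN_TOK, OUT_TOK))    # 16750
print(input_tokens_closed(T, SYS, IN_TOK, OUT_TOK))  # 16750
print(output_tokens(T, OUT_TOK))                     # 2000
```

With these sample numbers a naive single-call estimate would count only 550 input tokens, while the replayed conversation bills 16,750, about thirty times as many.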
Pricing and caching
Cost is (inputTokens/1e6)·inPrice + (outputTokens/1e6)·outPrice, where inPrice and outPrice are prices per million tokens. When you enable a cached-input discount, the repeated system-prompt and history portion is re-priced at the reduced rate. This models prompt-caching features in which unchanged prefix tokens bill at a fraction of the standard input price. Only the genuinely new user tokens each turn stay at full price.
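A minimal sketch of this pricing rule follows. It reads the description literally: the system prompt and history are billed at the cached rate on every turn, and only each turn's new user message is billed at full price. The function name, the `cached_multiplier` parameter, and the prices ($3 and $15 per million tokens, cached input at 10% of the standard rate) are hypothetical values for illustration, and real providers may bill the first, uncached pass over a prefix differently.

```python
def conversation_cost(
    turns: int,
    sys_tok: int,
    in_tok: int,
    out_tok: int,
    in_price: float,                 # USD per million input tokens
    out_price: float,                # USD per million output tokens
    cached_multiplier: float = 1.0,  # 1.0 = no caching; 0.1 = cached input at 10% of in_price
) -> tuple[float, float]:
    """Return (input_dollars, output_dollars) for one conversation."""
    # Replayed prefix: system prompt every turn plus the growing history.
    replayed = turns * sys_tok + (in_tok + out_tok) * turns * (turns - 1) // 2
    # Genuinely new input: one user message per turn, always full price.
    fresh = turns * in_tok

    input_cost = (fresh / 1e6) * in_price + (replayed / 1e6) * in_price * cached_multiplier
    output_cost = (turns * out_tok / 1e6) * out_price
    return input_cost, output_cost


# Hypothetical conversation and prices.
no_cache = conversation_cost(10, 500, 50, 200, in_price=3.0, out_price=15.0)
with_cache = conversation_cost(10, 500, 50, 200, in_price=3.0, out_price=15.0,
                               cached_multiplier=0.1)
print(f"input without cache: ${no_cache[0]:.5f}, with cache: ${with_cache[0]:.6f}")
print(f"output (unaffected by caching): ${no_cache[1]:.2f}")
```

Under these assumptions the input cost drops from about $0.050 to about $0.006 per conversation, while the $0.03 output cost is unchanged, because caching applies only to input tokens.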
Scaling to a monthly bill
Monthly spend is per-conversation cost times conversations per user per month times active users. The breakdown table separates input from output dollars so you can see whether your bot is context-heavy (cut the system prompt or add caching) or generation-heavy (cap max_tokens or use a cheaper output tier). Trim turns, shrink the RAG window, or raise the cache hit rate and watch the monthly figure move in real time.
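The scaling step and the input/output split can be sketched as below. The function name, the dictionary keys, and the sample figures (about $0.05 input and $0.03 output per conversation, 20 conversations per user per month, 1,000 active users) are assumptions for illustration rather than the tool's actual output format.

```python
def monthly_bill(
    input_cost: float,      # USD of input per conversation
    output_cost: float,     # USD of output per conversation
    convs_per_user: float,  # conversations per user per month
    active_users: int,
) -> dict[str, float]:
    """Scale per-conversation cost to a monthly bill, split by input and output."""
    conversations = convs_per_user * active_users
    input_total = input_cost * conversations
    output_total = output_cost * conversations
    total = input_total + output_total
    return {
        "input": input_total,
        "output": output_total,
        "total": total,
        # Share of spend on input; a high value suggests a context-heavy bot.
        "input_share": input_total / total if total else 0.0,
    }


# Hypothetical per-conversation costs and user base.
bill = monthly_bill(0.05025, 0.03, convs_per_user=20, active_users=1000)
print(f"input ${bill['input']:,.0f}  output ${bill['output']:,.0f}  "
      f"total ${bill['total']:,.0f}  input share {bill['input_share']:.0%}")
```

With these sample figures the bill comes to roughly $1,605 a month, with close to two thirds of it spent on input. That marks the bot as context-heavy, so trimming the system prompt or enabling caching would likely save more than capping reply length.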