Convert LLM Tokens Per Second Into Words Per Minute

Translate raw model throughput into a human reading-speed number. Enter a generation rate and see how fast the stream feels, plus how long a full response takes.

Converter

Fields: Tokens / second, Words / minute, Time for response

Reference                        Words / minute   Feel
Average human reading            ~238             baseline
Comfortable streaming            ~300–600         natural
Skim-only / too fast to read     900+             instant

How the conversion works

Large language models emit text one token at a time, not one word at a time. A token is often a sub-word fragment, so on average a word costs more than one token. Across typical English text the empirical average is about 1.33 tokens per word (roughly 0.75 words per token), though code, JSON, and non-Latin scripts push that ratio higher. This converter lets you set the ratio so the result matches your own workload instead of a generic constant.
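To pick a ratio for your own workload, one option is to measure it on a representative sample. The sketch below is only an illustration: the function name is hypothetical, it assumes a "word" is any whitespace-separated chunk, and it expects a token count you have already obtained from the tokenizer of the model you are measuring.

```python
def tokens_per_word(sample_text: str, token_count: int) -> float:
    """Estimate the tokens-per-word ratio for a text sample.

    token_count must come from the tokenizer of the model in question;
    this helper does not tokenize anything itself.
    Assumption: words are whitespace-separated chunks.
    """
    words = sample_text.split()
    if not words:
        raise ValueError("sample_text contains no words")
    return token_count / len(words)


# Hypothetical sample: 4 words that some tokenizer split into 5 tokens.
ratio = tokens_per_word("the quick brown fox", 5)
print(ratio)  # 1.25
```

A longer sample drawn from real responses gives a steadier ratio than a single sentence.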

The core formula is a unit conversion. To go from tokens per second to words per minute, divide the token rate by the tokens-per-word ratio to get words per second, then multiply by 60:

wpm = (tokens_per_sec ÷ tokens_per_word) × 60

The inverse direction simply rearranges the same equation:

tokens_per_sec = (wpm × tokens_per_word) ÷ 60
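The two formulas above translate directly into code. This is a minimal sketch, not the page's own implementation; the function names are illustrative, and the default ratio of 1.33 is just the English average quoted earlier.

```python
def tokens_per_sec_to_wpm(tokens_per_sec: float, tokens_per_word: float = 1.33) -> float:
    """wpm = (tokens_per_sec / tokens_per_word) * 60"""
    if tokens_per_word <= 0:
        raise ValueError("tokens_per_word must be positive")
    words_per_sec = tokens_per_sec / tokens_per_word
    return words_per_sec * 60


def wpm_to_tokens_per_sec(wpm: float, tokens_per_word: float = 1.33) -> float:
    """tokens_per_sec = (wpm * tokens_per_word) / 60"""
    if tokens_per_word <= 0:
        raise ValueError("tokens_per_word must be positive")
    return wpm * tokens_per_word / 60


print(round(tokens_per_sec_to_wpm(40)))      # 1805
print(round(wpm_to_tokens_per_sec(238), 2))  # 5.28
```

The second call shows the inverse direction: at 1.33 tokens per word, a stream of roughly 5.3 tokens per second matches a ~238 wpm reader.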

For example, a model streaming 40 tokens per second at 1.33 tokens per word produces about (40 ÷ 1.33) × 60 ≈ 1,805 words per minute. That is far faster than the ~238 wpm an average adult reads, which is why fast models feel like they finish before you can follow along.

The response-timing field estimates wall-clock generation time for a full answer, ignoring the separate time-to-first-token latency:

seconds = (response_words × tokens_per_word) ÷ tokens_per_sec
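As a worked example of the response-timing estimate, the sketch below applies seconds = (response_words × tokens_per_word) ÷ tokens_per_sec. The function name and the 500-word response are illustrative assumptions, and like the field itself it leaves out time-to-first-token latency.

```python
def response_seconds(response_words: float, tokens_per_sec: float,
                     tokens_per_word: float = 1.33) -> float:
    """Estimate generation time for a response of a given word count.

    Excludes time-to-first-token latency, which has to be added separately.
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    total_tokens = response_words * tokens_per_word
    return total_tokens / tokens_per_sec


# Hypothetical 500-word answer streamed at 40 tokens per second.
print(round(response_seconds(500, 40), 3))  # 16.625
```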

Why bother converting at all? Tokens per second is the number vendors quote and the unit that drives cost and GPU sizing, but words per minute is the number that maps to perceived speed and reading comfort. Holding the two side by side helps you judge whether a cheaper, slower endpoint still streams faster than a user can read. If it does, extra throughput adds nothing the reader can perceive while following the stream. The progress bar above scales perceived speed against a 600 wpm "comfortable streaming" reference so you can eyeball where a model lands.
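The comparison described here can be sketched in a few lines. The names below are hypothetical, the 238 and 600 wpm constants come from the reference table, and clamping the bar at 100% is an assumption, since the page does not say how values above the reference are drawn.

```python
AVERAGE_READING_WPM = 238    # average human reading speed from the reference table
COMFORT_REFERENCE_WPM = 600  # "comfortable streaming" reference used by the bar


def perceived_speed(tokens_per_sec: float, tokens_per_word: float = 1.33) -> dict:
    """Summarize how a token rate compares with reading speed."""
    wpm = tokens_per_sec / tokens_per_word * 60
    return {
        "wpm": wpm,
        # True when the stream already runs ahead of an average reader.
        "outpaces_reader": wpm > AVERAGE_READING_WPM,
        # Assumption: the bar is clamped at 1.0 once the reference is reached.
        "bar_fraction": min(wpm / COMFORT_REFERENCE_WPM, 1.0),
    }


slow = perceived_speed(8)   # about 361 wpm
fast = perceived_speed(40)  # about 1,805 wpm
print(slow["outpaces_reader"], round(slow["bar_fraction"], 2))  # True 0.6
print(fast["outpaces_reader"], fast["bar_fraction"])            # True 1.0
```

In this example both endpoints already outpace an average reader, so the choice between them would rest on cost or total response time rather than on how the stream reads.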
