LLM Pricing History — How API Costs Dropped 100x Since 2023
Complete timeline of LLM API pricing changes across every major provider. Track how costs collapsed from $30/1M tokens to $0.15/1M tokens in just three years.
By Michael Lip · Updated April 2026
Methodology
All prices are sourced from official API pricing pages, provider announcements, and archived pricing snapshots. Prices shown are per 1 million tokens (input unless noted). Self-hosted costs are estimated based on cloud GPU pricing (A100/H100 at market rates) and published throughput benchmarks. Stack Overflow pricing discussions queried via public API on April 11, 2026. All amounts in USD.
| Date | Model | Provider | Input $/1M | Output $/1M | Event | Drop from Prior |
|---|---|---|---|---|---|---|
| 2023-03 | GPT-4 (8K) | OpenAI | $30.00 | $60.00 | Launch | — |
| 2023-03 | GPT-3.5 Turbo | OpenAI | $2.00 | $2.00 | Launch | — |
| 2023-03 | Claude 1 | Anthropic | $11.02 | $32.68 | Launch | — |
| 2023-07 | Claude 2 | Anthropic | $8.00 | $24.00 | Launch | -27% |
| 2023-07 | Llama 2 70B | Meta | Free* | Free* | Open-source release | — |
| 2023-11 | GPT-4 Turbo | OpenAI | $10.00 | $30.00 | Launch (3x cheaper than GPT-4) | -67% |
| 2023-11 | GPT-3.5 Turbo (new) | OpenAI | $1.00 | $2.00 | Price cut | -50% |
| 2024-02 | Gemini 1.0 Pro | $0.50 | $1.50 | Launch | — | |
| 2024-03 | Claude 3 Opus | Anthropic | $15.00 | $75.00 | Launch (frontier) | — |
| 2024-03 | Claude 3 Sonnet | Anthropic | $3.00 | $15.00 | Launch | — |
| 2024-03 | Claude 3 Haiku | Anthropic | $0.25 | $1.25 | Launch (budget tier) | — |
| 2024-04 | Llama 3 70B | Meta | Free* | Free* | Open-source release | — |
| 2024-05 | GPT-4o | OpenAI | $5.00 | $15.00 | Launch (50% cheaper than Turbo) | -50% |
| 2024-05 | Gemini 1.5 Pro | $3.50 | $10.50 | Launch | — | |
| 2024-05 | Gemini 1.5 Flash | $0.35 | $1.05 | Launch | — | |
| 2024-07 | GPT-4o mini | OpenAI | $0.15 | $0.60 | Launch (replaces 3.5 Turbo) | -97% vs GPT-4 |
| 2024-07 | Llama 3.1 405B | Meta | Free* | Free* | Largest open model | — |
| 2024-09 | o1-preview | OpenAI | $15.00 | $60.00 | Reasoning model launch | — |
| 2024-10 | GPT-4o (reduced) | OpenAI | $2.50 | $10.00 | Price cut | -50% |
| 2024-10 | Claude 3.5 Sonnet | Anthropic | $3.00 | $15.00 | Launch (same price, better quality) | — |
| 2024-11 | Gemini 1.5 Flash (reduced) | $0.075 | $0.30 | Price cut | -79% | |
| 2024-11 | Gemini 1.5 Pro (reduced) | $1.25 | $5.00 | Price cut | -64% | |
| 2024-12 | DeepSeek V3 | DeepSeek | $0.27 | $1.10 | Launch (frontier quality at flash price) | — |
| 2025-01 | DeepSeek R1 | DeepSeek | $0.55 | $2.19 | Open reasoning model | — |
| 2025-02 | GPT-4.5 | OpenAI | $75.00 | $150.00 | Launch (premium research tier) | — |
| 2025-03 | Gemini 2.5 Pro | $1.25 | $10.00 | Launch | — | |
| 2025-03 | Gemini 2.5 Flash | $0.15 | $0.60 | Launch | — | |
| 2025-04 | o3 | OpenAI | $10.00 | $40.00 | Launch | -33% vs o1 |
| 2025-04 | o4-mini | OpenAI | $1.10 | $4.40 | Launch (budget reasoning) | — |
| 2025-06 | Claude Opus 4.6 | Anthropic | $15.00 | $75.00 | Launch (frontier) | — |
| 2025-06 | Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | Launch | — |
| 2025-10 | Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | Launch | — |
| 2025-10 | Mistral Small 3.1 | Mistral | $0.10 | $0.30 | Launch | — |
Key Milestones
| Milestone | Date | Significance |
|---|---|---|
| GPT-4 Launch | 2023-03 | Set the $30/1M input benchmark for frontier models |
| Llama 2 Open-Source | 2023-07 | First competitive open-weights model, triggered price war |
| GPT-3.5 Turbo price halved | 2023-11 | Signal that budget tier would race to zero |
| GPT-4o mini at $0.15 | 2024-07 | 97% cheaper than original GPT-4 with comparable quality |
| DeepSeek V3 at $0.27 | 2024-12 | Frontier-quality open model at flash prices |
| Gemini 2.5 Flash at $0.15 | 2025-03 | Google matches GPT-4o mini price with 1M context |
| GPT-4.5 at $75 | 2025-02 | Premium research tier creates new price ceiling |
Frequently Asked Questions
How much have LLM API prices dropped since 2023?
LLM API prices have dropped approximately 100x at the frontier tier and over 500x at the budget tier since early 2023. GPT-3.5 Turbo launched at $2.00/1M input tokens and equivalent-capability models now cost $0.01-0.03/1M tokens. GPT-4 launched at $30/1M input; today's comparable-quality GPT-4o costs $2.50/1M (12x cheaper). The biggest drops came from competition, hardware improvements, and distillation.
Why did LLM prices drop so dramatically?
Four main factors: (1) Competition from open-source models (Llama, Mistral) forced proprietary providers to lower prices; (2) Hardware improvements including custom chips reduced inference costs; (3) Architectural innovations like Mixture-of-Experts reduced compute per token; (4) Distillation and quantization enabled smaller models to match larger model quality at a fraction of the cost.
Which LLM API is cheapest in 2026?
As of April 2026, the cheapest APIs are: Mistral Small 3.1 at $0.10/1M input, Gemini 2.5 Flash and GPT-4o mini at $0.15/1M, DeepSeek V3 at $0.27/1M. For self-hosted, Llama 3.3 70B on Groq or Together AI can cost under $0.50/1M tokens. The cheapest option depends on quality requirements, latency needs, and volume.
Will LLM prices continue to drop?
Prices are expected to continue declining but at a slower rate. The rapid 100x drops from 2023-2025 were driven by one-time factors (competition entry, architecture shifts). Future reductions will come from incremental hardware improvements, better quantization, and speculative decoding. Reasoning models may cost more per token due to extended compute, creating a two-tier structure.
How do I estimate my LLM API costs?
Use KickLLM's free calculator at kickllm.com. Key inputs: average input tokens per request, average output tokens, requests per day, and your chosen model. Benchmarks: a typical chatbot message is 100-500 input tokens. A RAG query with context is 2,000-8,000 input tokens. Batch processing with long documents can be 50,000-128,000 input tokens per request.