Original Research

LLM Pricing History — How API Costs Dropped 100x Since 2023

Complete timeline of LLM API pricing changes across every major provider. Track how costs collapsed from $30/1M tokens to $0.15/1M tokens in just three years.

By Michael Lip · Updated April 2026

Methodology

All prices are sourced from official API pricing pages, provider announcements, and archived pricing snapshots. Prices shown are per 1 million tokens (input unless noted). Self-hosted costs are estimated based on cloud GPU pricing (A100/H100 at market rates) and published throughput benchmarks. Stack Overflow pricing discussions queried via public API on April 11, 2026. All amounts in USD.

| Date | Model | Provider | Input $/1M | Output $/1M | Event | Drop from Prior |
|---|---|---|---|---|---|---|
| 2023-03 | GPT-4 (8K) | OpenAI | $30.00 | $60.00 | Launch | |
| 2023-03 | GPT-3.5 Turbo | OpenAI | $2.00 | $2.00 | Launch | |
| 2023-03 | Claude 1 | Anthropic | $11.02 | $32.68 | Launch | |
| 2023-07 | Claude 2 | Anthropic | $8.00 | $24.00 | Launch | -27% |
| 2023-07 | Llama 2 70B | Meta | Free* | Free* | Open-source release | |
| 2023-11 | GPT-4 Turbo | OpenAI | $10.00 | $30.00 | Launch (3x cheaper than GPT-4) | -67% |
| 2023-11 | GPT-3.5 Turbo (new) | OpenAI | $1.00 | $2.00 | Price cut | -50% |
| 2024-02 | Gemini 1.0 Pro | Google | $0.50 | $1.50 | Launch | |
| 2024-03 | Claude 3 Opus | Anthropic | $15.00 | $75.00 | Launch (frontier) | |
| 2024-03 | Claude 3 Sonnet | Anthropic | $3.00 | $15.00 | Launch | |
| 2024-03 | Claude 3 Haiku | Anthropic | $0.25 | $1.25 | Launch (budget tier) | |
| 2024-04 | Llama 3 70B | Meta | Free* | Free* | Open-source release | |
| 2024-05 | GPT-4o | OpenAI | $5.00 | $15.00 | Launch (50% cheaper than Turbo) | -50% |
| 2024-05 | Gemini 1.5 Pro | Google | $3.50 | $10.50 | Launch | |
| 2024-05 | Gemini 1.5 Flash | Google | $0.35 | $1.05 | Launch | |
| 2024-07 | GPT-4o mini | OpenAI | $0.15 | $0.60 | Launch (replaces 3.5 Turbo) | -97% vs GPT-4o |
| 2024-07 | Llama 3.1 405B | Meta | Free* | Free* | Largest open model | |
| 2024-09 | o1-preview | OpenAI | $15.00 | $60.00 | Reasoning model launch | |
| 2024-10 | GPT-4o (reduced) | OpenAI | $2.50 | $10.00 | Price cut | -50% |
| 2024-10 | Claude 3.5 Sonnet | Anthropic | $3.00 | $15.00 | Launch (same price, better quality) | |
| 2024-11 | Gemini 1.5 Flash (reduced) | Google | $0.075 | $0.30 | Price cut | -79% |
| 2024-11 | Gemini 1.5 Pro (reduced) | Google | $1.25 | $5.00 | Price cut | -64% |
| 2024-12 | DeepSeek V3 | DeepSeek | $0.27 | $1.10 | Launch (frontier quality at flash price) | |
| 2025-01 | DeepSeek R1 | DeepSeek | $0.55 | $2.19 | Open reasoning model | |
| 2025-02 | GPT-4.5 | OpenAI | $75.00 | $150.00 | Launch (premium research tier) | |
| 2025-03 | Gemini 2.5 Pro | Google | $1.25 | $10.00 | Launch | |
| 2025-03 | Gemini 2.5 Flash | Google | $0.15 | $0.60 | Launch | |
| 2025-04 | o3 | OpenAI | $10.00 | $40.00 | Launch | -33% vs o1 |
| 2025-04 | o4-mini | OpenAI | $1.10 | $4.40 | Launch (budget reasoning) | |
| 2025-06 | Claude Opus 4 | Anthropic | $15.00 | $75.00 | Launch (frontier) | |
| 2025-06 | Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | Launch | |
| 2025-10 | Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | Launch | |
| 2025-10 | Mistral Small 3.1 | Mistral | $0.10 | $0.30 | Launch | |

*Open-weights release with no API list price; self-hosted serving costs are estimated as described in the Methodology section.
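The "Drop from Prior" percentages above can be reproduced directly from the raw prices. A minimal sketch, using a few input-price pairs taken from the table (these are the table's figures, not live quotes):

```python
# Reproduce the percentage drops shown in the pricing table.
# Each pair is (old price, new price) in USD per 1M input tokens.
price_changes = {
    "GPT-4 -> GPT-4 Turbo": (30.00, 10.00),
    "Gemini 1.5 Flash cut": (0.35, 0.075),
    "Gemini 1.5 Pro cut": (3.50, 1.25),
    "o1 -> o3": (15.00, 10.00),
}

def pct_drop(old: float, new: float) -> int:
    """Percentage drop from old to new, rounded to the nearest whole percent."""
    return round((old - new) / old * 100)

for label, (old, new) in price_changes.items():
    print(f"{label}: -{pct_drop(old, new)}%")
```

Running this yields -67%, -79%, -64%, and -33%, matching the table's rightmost column.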

Key Milestones

| Milestone | Date | Significance |
|---|---|---|
| GPT-4 launch | 2023-03 | Set the $30/1M input benchmark for frontier models |
| Llama 2 open-source release | 2023-07 | First competitive open-weights model; triggered the price war |
| GPT-3.5 Turbo price halved | 2023-11 | Signaled that the budget tier would race to zero |
| GPT-4o mini at $0.15 | 2024-07 | 97% cheaper than GPT-4o (and 99.5% cheaper than original GPT-4) with comparable quality |
| DeepSeek V3 at $0.27 | 2024-12 | Frontier-quality open model at flash prices |
| GPT-4.5 at $75 | 2025-02 | Premium research tier created a new price ceiling |
| Gemini 2.5 Flash at $0.15 | 2025-03 | Google matched the GPT-4o mini price with a 1M-token context window |

Frequently Asked Questions

How much have LLM API prices dropped since 2023?

LLM API prices have dropped on the order of 100x for GPT-4-class capability since early 2023. GPT-4 launched at $30/1M input tokens; GPT-4o mini, with comparable quality, now costs $0.15/1M (a 200x drop), and today's GPT-4o costs $2.50/1M (12x cheaper than GPT-4). At the budget tier, GPT-3.5 Turbo launched at $2.00/1M and equivalent-capability models now cost $0.075-0.15/1M (Gemini 1.5 Flash, GPT-4o mini). The biggest drops came from competition, hardware improvements, and distillation.

Why did LLM prices drop so dramatically?

Four main factors: (1) Competition from open-source models (Llama, Mistral) forced proprietary providers to lower prices; (2) Hardware improvements, including custom inference chips, reduced serving costs; (3) Architectural innovations like Mixture-of-Experts reduced compute per token; (4) Distillation and quantization let smaller models match larger models' quality at a fraction of the cost.
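A rough illustration of factor (3): in a Mixture-of-Experts model, only the routed experts run for each token, so compute scales with *active* rather than *total* parameters. The figures below are approximate public numbers for Mixtral 8x7B, used here as an assumption for illustration (they do not come from the pricing table):

```python
# Why MoE cuts compute per token: only the routed experts are active.
# Approximate public figures for Mixtral 8x7B (assumed for illustration).
total_params_b = 46.7   # total parameters, in billions
active_params_b = 12.9  # parameters active per token, in billions

# Forward-pass FLOPs per token scale roughly with active parameters,
# so versus a dense model of the same total size, the saving is:
saving = total_params_b / active_params_b
print(f"~{saving:.1f}x less compute per token than a dense {total_params_b}B model")
```

Under these assumptions the model does roughly 3.6x less work per token than a dense model of equal size, which translates fairly directly into lower per-token prices.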

Which LLM API is cheapest in 2026?

As of April 2026, the cheapest APIs are: Mistral Small 3.1 at $0.10/1M input, Gemini 2.5 Flash and GPT-4o mini at $0.15/1M, and DeepSeek V3 at $0.27/1M. Open-weights models served through hosted inference providers such as Groq or Together AI can bring Llama 3.3 70B under $0.50/1M tokens (note these are managed services, not self-hosting; see the Methodology section for self-hosted estimates). The cheapest option depends on quality requirements, latency needs, and volume.

Will LLM prices continue to drop?

Prices are expected to keep declining, but more slowly. The rapid 100x drops from 2023 to 2025 were driven by one-time factors: new entrants and architecture shifts. Future reductions will come from incremental hardware improvements, better quantization, and speculative decoding. Reasoning models may cost more per token because of extended inference compute, creating a two-tier pricing structure.

How do I estimate my LLM API costs?

Use KickLLM's free calculator at kickllm.com. Key inputs: average input tokens per request, average output tokens, requests per day, and your chosen model. Benchmarks: a typical chatbot message is 100-500 input tokens. A RAG query with context is 2,000-8,000 input tokens. Batch processing with long documents can be 50,000-128,000 input tokens per request.
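The estimate above can also be done by hand. A minimal sketch of the arithmetic, using the inputs listed above; the model and prices in the example are taken from the pricing table, not live quotes:

```python
# Back-of-envelope monthly cost estimator for an LLM API workload.
# Prices are USD per 1M tokens.
def monthly_cost(input_tokens: int, output_tokens: int, requests_per_day: int,
                 input_price: float, output_price: float) -> float:
    """Estimated USD per 30-day month for a given traffic profile."""
    per_request = (input_tokens * input_price
                   + output_tokens * output_price) / 1_000_000
    return per_request * requests_per_day * 30

# Example: a RAG app sending 4,000 input / 500 output tokens per request,
# 10,000 requests/day, on GPT-4o mini ($0.15 in / $0.60 out):
print(f"${monthly_cost(4_000, 500, 10_000, 0.15, 0.60):,.2f}/month")
# -> $270.00/month
```

Note how input tokens dominate the bill in RAG workloads: here the 4,000-token context costs twice as much per request as the 500-token answer, despite output's 4x higher per-token price.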