# What Is the Cheapest LLM API in 2026?
As of April 2026, the cheapest proprietary LLM APIs are Gemini 2.0 Flash at $0.10/$0.40 per 1M input/output tokens, Mistral Small 3.1 at $0.10/$0.30, and DeepSeek V3 at $0.27/$1.10. Among open models served via Groq, Llama 3 8B is cheapest at $0.05/$0.08.
## Cheapest Proprietary APIs
| Model | Input (per 1M) | Output (per 1M) | Provider |
|---|---|---|---|
| Gemini 2.0 Flash | $0.10 | $0.40 | Google |
| Mistral Small 3.1 | $0.10 | $0.30 | Mistral |
| DeepSeek V3 | $0.27 | $1.10 | DeepSeek |
| GPT-4o Mini | $0.15 | $0.60 | OpenAI |
| Claude Haiku 3.5 | $0.80 | $4.00 | Anthropic |
## Cheapest Open-Source Models via Inference APIs
| Model | Input (per 1M) | Output (per 1M) | Provider |
|---|---|---|---|
| Llama 3 8B | $0.05 | $0.08 | Groq |
| Llama 3 70B | $0.59 | $0.79 | Groq |
| Mixtral 8x7B | $0.24 | $0.24 | Groq |
| Llama 3.1 8B | $0.20 | $0.20 | Together.ai |
## Key Takeaways
- For the absolute cheapest API calls, Groq's Llama 3 8B at $0.05/$0.08 per 1M tokens is unbeatable
- For the best quality-per-dollar among proprietary models, Gemini 2.0 Flash is the clear winner
- Mistral Small 3.1 has the cheapest output pricing at $0.30/1M tokens among proprietary options
- For zero API cost, run models locally with Ollama on a Mac M-series or similar hardware; quality is comparable to small hosted models, though you trade off throughput and pay for your own hardware
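To make the price gaps above concrete, here is a minimal sketch that ranks the models in this article by cost for a sample monthly workload. The prices are hard-coded from the tables above and will drift over time, so treat this as an illustration, not a live reference; always check each provider's current pricing page.

```python
# $ per 1M input/output tokens, copied from the tables in this article.
PRICES = {
    "Gemini 2.0 Flash":  (0.10, 0.40),
    "Mistral Small 3.1": (0.10, 0.30),
    "DeepSeek V3":       (0.27, 1.10),
    "GPT-4o Mini":       (0.15, 0.60),
    "Claude Haiku 3.5":  (0.80, 4.00),
    "Llama 3 8B (Groq)": (0.05, 0.08),
}

def workload_cost(model: str, input_tokens: float, output_tokens: float) -> float:
    """Cost in dollars for the given token volumes on one model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example workload: 10M input tokens and 2M output tokens per month.
for model in sorted(PRICES, key=lambda m: workload_cost(m, 10e6, 2e6)):
    print(f"{model}: ${workload_cost(model, 10e6, 2e6):.2f}/month")
```

For this workload, Llama 3 8B on Groq comes out cheapest ($0.66/month), and Mistral Small 3.1 edges out Gemini 2.0 Flash among proprietary options because of its lower output price.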
## Calculate Your Costs
Use KickLLM's cost calculator to estimate your monthly spend based on your actual usage patterns — tokens per request, requests per day, and model choice.
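If you would rather estimate by hand, the arithmetic behind any such calculator is straightforward. A minimal sketch, using the Gemini 2.0 Flash prices quoted above as an example (the function itself is generic and takes prices in dollars per 1M tokens):

```python
def monthly_cost(input_tokens_per_request: float,
                 output_tokens_per_request: float,
                 requests_per_day: float,
                 input_price: float,
                 output_price: float,
                 days: int = 30) -> float:
    """Estimated monthly spend in dollars; prices are $ per 1M tokens."""
    per_request = (input_tokens_per_request * input_price
                   + output_tokens_per_request * output_price) / 1_000_000
    return per_request * requests_per_day * days

# Example: a chatbot averaging 1,500 input / 500 output tokens per request,
# 2,000 requests per day, on Gemini 2.0 Flash ($0.10 / $0.40 per 1M tokens).
print(f"${monthly_cost(1500, 500, 2000, 0.10, 0.40):.2f}/month")  # $21.00/month
```

Swapping in the prices from the tables above lets you compare the same workload across models before committing to one.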
Calculate your LLM API costs with KickLLM — free, no sign-up required.