Claude API Pricing Guide 2026

Compare Haiku, Sonnet, and Opus costs per million tokens. Calculate monthly spend for your specific workload.

Current Claude API Pricing

Anthropic offers three tiers of Claude models, each designed for different workload profiles. Pricing is based on input and output tokens separately, which means the cost of a request depends on both how much context you send and how long the response is. All prices below are per one million tokens.

| Model | Input / 1M tokens | Output / 1M tokens | Context Window |
| --- | --- | --- | --- |
| Claude 3 Haiku | $0.25 | $1.25 | 200K |
| Claude 3.5 Sonnet | $3.00 | $15.00 | 200K |
| Claude 3 Opus | $15.00 | $75.00 | 200K |

These prices apply to the standard real-time API. Anthropic also offers prompt caching, which reduces the cost of repeated prefixes in your prompts. Cached input tokens cost 90% less than standard input tokens, making it particularly effective for system prompts and few-shot examples that remain constant across requests.
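
The per-request arithmetic described above can be sketched as follows. This is a minimal illustration, not an official calculator: the tier labels (`haiku`, `sonnet`, `opus`) and the function name `request_cost` are hypothetical, the prices are copied from the table above, and the caching multiplier only models the 90% discount on cached input tokens mentioned in the text.

```python
# Prices in USD per one million tokens, copied from the table above.
# The keys are illustrative tier labels, not real API model identifiers.
PRICES = {
    "haiku": {"input": 0.25, "output": 1.25},
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 15.00, "output": 75.00},
}

# Assumption taken from the text: cached input tokens cost 90% less.
CACHED_INPUT_MULTIPLIER = 0.10


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> float:
    """Return the USD cost of one request.

    cached_input_tokens is the portion of input_tokens served from the cache.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    price = PRICES[model]
    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (
        fresh_input_tokens * price["input"]
        + cached_input_tokens * price["input"] * CACHED_INPUT_MULTIPLIER
        + output_tokens * price["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 1,500 input tokens and 500 output tokens on Sonnet, no caching.
    print(f"{request_cost('sonnet', 1500, 500):.4f}")
    # The same request with a 1,000-token cached prefix.
    print(f"{request_cost('sonnet', 1500, 500, cached_input_tokens=1000):.4f}")
```

Because input and output are priced separately, the same total token count can cost very different amounts depending on the split, which is why the function takes both counts rather than a single total.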

When to Use Each Claude Tier

Choosing the right Claude model is the single most impactful cost decision you will make. Running Opus when Haiku would suffice can inflate your bill by 60x. Here is a practical breakdown of when each tier makes sense.

Claude Haiku is ideal for high-volume, low-complexity tasks: content classification, entity extraction, simple question answering, data validation, and routing queries to more expensive models. At $0.25 per million input tokens, one dollar buys roughly 4 million input tokens, which is about 40,000 requests with 100-token prompts before output costs. Haiku responds in under 500 milliseconds for most requests, making it suitable for latency-sensitive applications.

Claude Sonnet is the workhorse for production applications. It handles multi-step reasoning, code generation, summarization, and conversational AI with quality that approaches Opus at one-fifth the cost. Most teams should default to Sonnet and only escalate to Opus for tasks where Sonnet measurably underperforms. At $3/$15 per million tokens, a chatbot handling 10,000 conversations per day at 2,000 tokens each processes about 600 million tokens per 30-day month. Assuming a split of 1,500 input and 500 output tokens per conversation, each conversation costs about $0.012, which comes to roughly $3,600/month.

Claude Opus is reserved for tasks requiring the highest level of reasoning: complex code architecture decisions, nuanced legal or medical text analysis, multi-document synthesis, and advanced mathematical proofs. Given its $15/$75 pricing, Opus is best used selectively, either for low-volume high-value tasks or as a quality benchmark during evaluation.

Cost Optimization Tips

There are several strategies to reduce your Claude API bill without sacrificing quality. First, implement a model routing layer that sends simple requests to Haiku and only escalates to Sonnet or Opus based on complexity signals. This alone can reduce costs by 50-80% for mixed workloads.
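
A routing layer of the kind described above might look like the sketch below. Everything in it is an assumption made for illustration: the tier labels, the keyword patterns, the 300-word threshold, and the score cut-offs are hypothetical and would need tuning against real traffic. It only picks a tier; calling the API is left to your client code.

```python
import re

# Illustrative complexity signals (assumptions, not part of any official API).
REASONING_HINTS = re.compile(
    r"\b(prove|derive|architecture|refactor|legal|diagnos\w*)\b", re.IGNORECASE
)
CODE_HINTS = re.compile(r"\bdef \w+\(|\bclass \w+|\bSELECT\b", re.IGNORECASE)

LONG_PROMPT_WORDS = 300  # hypothetical threshold for a "long" prompt


def complexity_score(prompt: str) -> int:
    """Add up simple signals that suggest a harder request."""
    score = 0
    if len(prompt.split()) > LONG_PROMPT_WORDS:
        score += 1
    if CODE_HINTS.search(prompt):
        score += 1
    if REASONING_HINTS.search(prompt):
        score += 2
    return score


def choose_tier(prompt: str) -> str:
    """Map a prompt to the cheapest tier expected to handle it."""
    score = complexity_score(prompt)
    if score >= 3:
        return "opus"
    if score >= 1:
        return "sonnet"
    return "haiku"


if __name__ == "__main__":
    print(choose_tier("Classify this review as positive or negative: great product"))
    print(choose_tier("Write a function: def parse(line): ..."))
```

Keyword heuristics are the simplest possible signal. A common alternative is to let Haiku itself classify the request and escalate only when it reports low confidence, at the price of one extra cheap call per request.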

Second, use Anthropic's Batch API for non-real-time workloads. The Batch API processes requests within a 24-hour window and offers a 50% discount across all models. This makes Sonnet batch processing cost just $1.50/$7.50 per million tokens, which is competitive with many open-source hosted alternatives.
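
To see how the batch discount affects a mixed workload, the sketch below applies the 50% figure from the text to the table prices. The function names and the `batch_share` parameter are hypothetical; the calculation assumes the discount applies equally to input and output tokens, as the Sonnet example above implies.

```python
# Standard prices in USD per one million tokens as (input, output), from the table above.
STANDARD_PRICES = {
    "haiku": (0.25, 1.25),
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}

BATCH_DISCOUNT = 0.50  # 50% off, as stated in the text


def batch_price(model: str) -> tuple[float, float]:
    """Return the (input, output) batch prices per one million tokens."""
    input_price, output_price = STANDARD_PRICES[model]
    factor = 1.0 - BATCH_DISCOUNT
    return input_price * factor, output_price * factor


def blended_cost(model: str, input_mtok: float, output_mtok: float,
                 batch_share: float) -> float:
    """USD cost when batch_share (0.0 to 1.0) of the traffic goes through batch.

    input_mtok and output_mtok are volumes in millions of tokens.
    """
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be between 0 and 1")
    std_in, std_out = STANDARD_PRICES[model]
    bat_in, bat_out = batch_price(model)
    realtime = (1.0 - batch_share) * (input_mtok * std_in + output_mtok * std_out)
    batched = batch_share * (input_mtok * bat_in + output_mtok * bat_out)
    return realtime + batched


if __name__ == "__main__":
    print(batch_price("sonnet"))
    # 100M input and 20M output tokens on Sonnet, half of it moved to batch.
    print(blended_cost("sonnet", 100, 20, batch_share=0.5))
```

For 100 million input and 20 million output tokens on Sonnet, moving half the traffic to batch lowers the bill from $600 to $450 under these assumptions.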

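Third, leverage prompt caching for repetitive system prompts. If your system prompt is 4,000 tokens and you send 100,000 requests per day, that prefix alone is 400 million input tokens per day, or $1,200 per day on Sonnet at the standard rate. With cached tokens billed at 90% less, caching saves up to approximately $1,080 per day, or about $32,400 over a 30-day month, assuming every request hits the cache.

Fourth, minimize output tokens by instructing the model to be concise and by using compact structured output formats like JSON, which can be more token-efficient than free-form natural language responses.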
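
The prompt caching example above (a 4,000-token system prompt, 100,000 requests per day, Sonnet input pricing) can be checked with a short calculation. This is a simplified upper bound: it assumes every request reads the prefix from the cache and ignores any extra charge for writing to the cache. The function name and parameters are hypothetical.

```python
def daily_cache_savings(prefix_tokens: int, requests_per_day: int,
                        input_price_per_mtok: float,
                        cached_discount: float = 0.90) -> float:
    """Upper-bound USD savings per day from caching a fixed prompt prefix.

    Assumes every request is a cache hit and ignores cache-write charges.
    """
    prefix_mtok_per_day = prefix_tokens * requests_per_day / 1_000_000
    uncached_cost = prefix_mtok_per_day * input_price_per_mtok
    return uncached_cost * cached_discount


if __name__ == "__main__":
    per_day = daily_cache_savings(4_000, 100_000, input_price_per_mtok=3.00)
    print(f"per day:  ${per_day:,.2f}")
    print(f"per month (30 days): ${per_day * 30:,.2f}")
```

With these inputs the prefix accounts for 400 million input tokens per day, so the saving works out to about $1,080 per day, or roughly $32,400 over 30 days.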

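Finally, consider the total cost of ownership. Haiku is the cheapest per token, but failed attempts and retries erode that advantage: every retry adds tokens, latency, and validation work, and at the prices above a task that needs around a dozen Haiku attempts costs as much in tokens as a single Sonnet call with the same token counts. Track your success rates per model tier and optimize for cost-per-successful-completion rather than raw token cost.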
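
Cost-per-successful-completion can be estimated with a simple model. The sketch below assumes that attempts are independent, that each succeeds with the same probability, and that you retry until one succeeds, so the expected number of attempts is 1 / success_rate. The function names and the optional `overhead_per_attempt` parameter (for validation or other per-attempt costs) are hypothetical.

```python
def cost_per_success(cost_per_attempt: float, success_rate: float,
                     overhead_per_attempt: float = 0.0) -> float:
    """Expected USD cost of one successful completion when retrying until success."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return (cost_per_attempt + overhead_per_attempt) / success_rate


def break_even_success_rate(cheap_cost: float, expensive_cost: float,
                            expensive_success_rate: float) -> float:
    """Success rate below which the cheaper model costs more per success."""
    return cheap_cost / expensive_cost * expensive_success_rate


if __name__ == "__main__":
    # Per-attempt costs for 1,500 input and 500 output tokens at the table prices.
    haiku_attempt = (1500 * 0.25 + 500 * 1.25) / 1_000_000
    sonnet_attempt = (1500 * 3.00 + 500 * 15.00) / 1_000_000
    print(cost_per_success(haiku_attempt, success_rate=0.25))
    print(cost_per_success(sonnet_attempt, success_rate=0.96))
    print(break_even_success_rate(haiku_attempt, sonnet_attempt, 0.96))
```

The model counts only token spend unless you pass a per-attempt overhead. In practice the overhead term is where latency budgets and output validation show up, and it can move the break-even point considerably.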

Frequently Asked Questions

How much does Claude API cost per month?

Monthly costs depend on your model choice and usage volume. At 1 million tokens per day, split evenly between input and output over a 30-day month, Claude Haiku costs roughly $22.50/month, Sonnet $270/month, and Opus $1,350/month. Use the KickLLM calculator for exact estimates based on your workload.
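
The monthly figures above can be reproduced with a few lines of arithmetic. The sketch assumes a 30-day month and an even split between input and output tokens; both are assumptions you should replace with your own numbers, and the function name is hypothetical.

```python
# Prices in USD per one million tokens as (input, output), from the pricing table.
PRICES = {
    "haiku": (0.25, 1.25),
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}


def monthly_cost(model: str, tokens_per_day: int,
                 input_share: float = 0.5, days: int = 30) -> float:
    """Estimated USD cost per month for a steady daily token volume."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    input_price, output_price = PRICES[model]
    monthly_mtok = tokens_per_day * days / 1_000_000
    input_cost = monthly_mtok * input_share * input_price
    output_cost = monthly_mtok * (1.0 - input_share) * output_price
    return input_cost + output_cost


if __name__ == "__main__":
    for model in PRICES:
        print(f"{model}: ${monthly_cost(model, 1_000_000):,.2f}")
```

Shifting the split matters: at 80% input and 20% output, the same Sonnet volume drops from $270 to $162 per month, because output tokens cost five times as much as input tokens.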

What is the difference between Claude Haiku, Sonnet, and Opus?

Haiku ($0.25/$1.25 per 1M tokens) is the fastest and cheapest, ideal for classification and simple tasks. Sonnet ($3/$15) balances quality and cost for most production workloads. Opus ($15/$75) is the most capable model for complex reasoning, coding, and analysis.

Does Anthropic offer batch API discounts for Claude?

Yes. Anthropic's Batch API provides a 50% discount on all Claude models. Batch requests are processed within 24 hours rather than in real-time, making them ideal for bulk processing, evaluations, and offline workloads.

How does Claude pricing compare to GPT-4?

Claude Sonnet ($3/$15) is cheaper on input tokens than GPT-4o at its launch pricing ($5/$15), with identical output pricing. Claude Haiku ($0.25/$1.25) is more expensive than GPT-4o mini ($0.15/$0.60) on both input and output tokens.

What is the Claude API context window and how does it affect cost?

Claude models support up to 200K tokens of context. Longer contexts increase input token costs proportionally. For cost efficiency, keep prompts concise and use retrieval-augmented generation (RAG) instead of stuffing entire documents into the context window.
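
As a rough illustration of how context length drives input cost, the sketch below compares sending a whole document on every request with sending only retrieved passages. The token counts (150,000 for the full document, 4,000 for retrieved passages) and the request volume are made-up example values, and the Sonnet input price comes from the pricing table.

```python
SONNET_INPUT_PRICE = 3.00  # USD per one million input tokens, from the pricing table
CONTEXT_LIMIT = 200_000    # maximum context size stated in the text


def input_cost(context_tokens: int, requests: int,
               price_per_mtok: float = SONNET_INPUT_PRICE) -> float:
    """USD input cost for sending context_tokens on each of `requests` calls."""
    if context_tokens > CONTEXT_LIMIT:
        raise ValueError("context exceeds the 200K-token window")
    return context_tokens * requests / 1_000_000 * price_per_mtok


if __name__ == "__main__":
    full_document = input_cost(150_000, requests=1_000)
    retrieved_only = input_cost(4_000, requests=1_000)
    print(f"full document: ${full_document:,.2f}")
    print(f"retrieved passages: ${retrieved_only:,.2f}")
```

Under these example values, 1,000 requests cost $450 in input tokens with the full document and $12 with retrieved passages. The comparison leaves out the cost of running the retrieval system itself.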

Related Guides

GPT-4 API Pricing Guide
LLM Model Comparison
LLM API Cost Calculator Guide

Built by Michael Lip. Pricing data updated regularly from official provider pages.