Anthropic Claude API Pricing

Anthropic's Claude lineup runs from a frontier tier (Claude Fable 5, Claude Opus 4.8) through a mid tier (Claude Sonnet 4.6) down to a budget tier (Claude Haiku 4.5). Prompt caching and batch discounts are available across the range, and every model except Haiku 4.5 carries a 1M-token context window.

Prices verified June 2026 · changes logged in the changelog
Heads up: Caching and batch behaviour is consistent across the lineup, but the budget tier (Claude Haiku 4.5) ships with a 200k-token context window instead of the 1M tokens on the frontier and mid tiers. Confirm that the window on the model you pick covers your longest prompt.
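| Model | Tier | $ input / 1M | $ output / 1M | $ cached / 1M | Batch | ≈ $/mo * |
|---|---|---|---|---|---|---|
| Claude Fable 5 | Frontier | $10 | $50 | $1 | −50% | $2,240 |
| Claude Opus 4.8 | Frontier | $5 | $25 | $0.50 | −50% | $1,120 |
| Claude Sonnet 4.6 | Mid | $3 | $15 | $0.30 | −50% | $672 |
| Claude Haiku 4.5 | Budget | $1 | $5 | $0.10 | −50% | $224 |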

* Example workload — chatbot, 100k requests/mo, 2,000 input / 300 output tokens per request, 70% of input cached. Computed by the same engine as the calculator. Batch: the −50% is Anthropic's verified Batch API discount; the ≈ $/mo column is computed without it.
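
The example-workload arithmetic can be sketched in a few lines of Python. This is an illustration under the assumptions stated in the footnote (100k requests, 2,000 input and 300 output tokens per request, 70% of input cached, no batch discount), not the calculator's actual engine; the `Rates` class and the `monthly_cost` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, copied from the pricing table on this page."""
    input: float
    output: float
    cached: float


RATES = {
    "Claude Fable 5": Rates(input=10.0, output=50.0, cached=1.0),
    "Claude Opus 4.8": Rates(input=5.0, output=25.0, cached=0.50),
    "Claude Sonnet 4.6": Rates(input=3.0, output=15.0, cached=0.30),
    "Claude Haiku 4.5": Rates(input=1.0, output=5.0, cached=0.10),
}


def monthly_cost(
    rates: Rates,
    requests: int = 100_000,
    input_tokens: int = 2_000,
    output_tokens: int = 300,
    cache_share: float = 0.70,
) -> float:
    """Monthly USD cost for the example chatbot workload, without batch."""
    total_input = requests * input_tokens
    cached = total_input * cache_share      # input tokens billed at the cached rate
    uncached = total_input - cached         # input tokens billed at the full rate
    total_output = requests * output_tokens
    dollars_times_1m = (
        uncached * rates.input
        + cached * rates.cached
        + total_output * rates.output
    )
    return dollars_times_1m / 1_000_000     # rates are quoted per 1M tokens


if __name__ == "__main__":
    for model, rates in RATES.items():
        print(f"{model}: ${monthly_cost(rates):,.0f}")
```

With the default arguments this reproduces the ≈ $/mo column. For Claude Fable 5, for instance, 60M uncached input tokens cost $600, 140M cached input tokens cost $140, and 30M output tokens cost $1,500, for $2,240 in total.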

Prompt caching

Cached input is billed at 10% of the input rate across all Claude models we track — a major lever for chatbots and agents where most of the prompt repeats. The calculator models this with your cache share.
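
To see how much the cache share moves the bill, it can help to compute a blended input rate. The sketch below assumes only what this section states (cached input billed at 10% of the input rate); `blended_input_rate` is a hypothetical helper, not part of the calculator or of any Anthropic SDK.

```python
CACHED_FRACTION_OF_INPUT = 0.10  # cached input is billed at 10% of the input rate


def blended_input_rate(input_rate_per_m: float, cache_share: float) -> float:
    """Average USD per 1M input tokens when `cache_share` of input is cached."""
    if not 0.0 <= cache_share <= 1.0:
        raise ValueError("cache_share must be between 0 and 1")
    cached_rate = input_rate_per_m * CACHED_FRACTION_OF_INPUT
    return input_rate_per_m * (1.0 - cache_share) + cached_rate * cache_share


if __name__ == "__main__":
    # Claude Sonnet 4.6 input rate from the table: $3 per 1M tokens.
    for share in (0.0, 0.5, 0.7, 0.9):
        print(f"cache share {share:.0%}: ${blended_input_rate(3.0, share):.2f} per 1M input")
```

At a 70% cache share the blended input rate for Claude Sonnet 4.6 works out to roughly $1.11 per 1M tokens, down from $3 with no caching.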

Batch / async

The Batch API runs asynchronous jobs at a verified −50% on both input and output across all models we track — flip the Batch toggle in the calculator to model it.
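
A rough sketch of the batch arithmetic, assuming the −50% applies to the standard input and output rates as stated above. How the discount combines with cached input is not specified on this page, so this example leaves caching out entirely; `batch_rate` and `batch_job_cost` are hypothetical names.

```python
BATCH_DISCOUNT = 0.50  # the -50% Batch API discount on input and output


def batch_rate(standard_rate_per_m: float) -> float:
    """Discounted USD per 1M tokens for a standard input or output rate."""
    return standard_rate_per_m * (1.0 - BATCH_DISCOUNT)


def batch_job_cost(
    input_rate_per_m: float,
    output_rate_per_m: float,
    input_tokens: int,
    output_tokens: int,
) -> float:
    """USD cost of a batch job with no cached input (caching is not modelled)."""
    dollars_times_1m = (
        input_tokens * batch_rate(input_rate_per_m)
        + output_tokens * batch_rate(output_rate_per_m)
    )
    return dollars_times_1m / 1_000_000


if __name__ == "__main__":
    # Claude Haiku 4.5 ($1 input, $5 output per 1M tokens),
    # 100k requests at 2,000 input and 300 output tokens each.
    cost = batch_job_cost(1.0, 5.0, 100_000 * 2_000, 100_000 * 300)
    print(f"${cost:,.2f}")
```

For that Claude Haiku 4.5 job the batch cost is $175, half of the $350 the same uncached traffic would cost at standard rates.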

Context window

Claude Fable 5, Claude Opus 4.8 and Claude Sonnet 4.6 run a verified 1M-token context window; Claude Haiku 4.5 is 200k tokens.
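
A minimal sketch of a pre-flight check against these windows. The window sizes come from the sentence above; the `reserved_output_tokens` parameter reflects an assumption that the response also has to fit inside the window, which this page does not confirm, so it defaults to zero. `models_that_fit` is a hypothetical helper.

```python
# Context windows in tokens, as listed on this page.
CONTEXT_WINDOW_TOKENS = {
    "Claude Fable 5": 1_000_000,
    "Claude Opus 4.8": 1_000_000,
    "Claude Sonnet 4.6": 1_000_000,
    "Claude Haiku 4.5": 200_000,
}


def models_that_fit(prompt_tokens: int, reserved_output_tokens: int = 0) -> list[str]:
    """Return the models whose window covers the prompt plus any reserved output.

    Counting output against the window is an assumption; pass 0 to ignore it.
    """
    needed = prompt_tokens + reserved_output_tokens
    return [
        model
        for model, window in CONTEXT_WINDOW_TOKENS.items()
        if needed <= window
    ]


if __name__ == "__main__":
    print(models_that_fit(150_000))   # all four models
    print(models_that_fit(500_000))   # frontier and mid tiers only
```

A 500k-token prompt rules out Claude Haiku 4.5, so the cheapest model that still fits is Claude Sonnet 4.6.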

When Claude is worth it

| Use case | Verdict |
|---|---|
| Long documents or large context within a single request | Most Claude tiers carry a large context window |
| Repeated system prompts or shared context across calls | Prompt caching applies across the lineup |
| You need the cheapest tier with maximum context | Check the budget tier's smaller window first |

Is Claude the right price for your workload?
The calculator puts these four models next to the other 24 we track — at your volume, token mix and cache share.

Frequently asked questions

Prompt caching and a batch (async) discount are both available on every Claude model in our data. The discounted cached-input rate for each tier is shown in the per-model table above.

The context window is not the same on every Claude model: the frontier and mid tiers carry the larger 1M-token window, while the budget tier is limited to 200k tokens. The exact window per model is listed in the context-window note on this page.