Anthropic's Claude lineup runs from a frontier Fable/Opus tier down to a budget Haiku tier, with prompt caching and Batch API discounts available across the range and a 1M-token context window on every model except Haiku (200k tokens).
| Model | Tier | Input ($/1M tokens) | Output ($/1M tokens) | Cached input ($/1M tokens) | Batch | ≈ $/mo * |
|---|---|---|---|---|---|---|
| Claude Fable 5 | Frontier | $10 | $50 | $1 | −50% | $2,240 |
| Claude Opus 4.8 | Frontier | $5 | $25 | $0.50 | −50% | $1,120 |
| Claude Sonnet 4.6 | Mid | $3 | $15 | $0.30 | −50% | $672 |
| Claude Haiku 4.5 | Budget | $1 | $5 | $0.10 | −50% | $224 |
* Example workload: a chatbot handling 100k requests/mo, with 2,000 input and 300 output tokens per request and 70% of input cached. Computed by the same engine as the calculator. The −50% in the Batch column is Anthropic's verified Batch API discount; the ≈ $/mo column is computed without it.
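The footnote's arithmetic can be reproduced with a short sketch. This is not the calculator's actual engine: the names `RATES` and `monthly_cost` are illustrative, and the sketch assumes every cached token is billed at the cached rate, with no other fees and no Batch discount.

```python
# Rates in USD per 1M tokens, copied from the table above.
RATES = {
    "Claude Fable 5": {"input": 10.00, "output": 50.00, "cached": 1.00},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00, "cached": 0.50},
    "Claude Sonnet 4.6": {"input": 3.00, "output": 15.00, "cached": 0.30},
    "Claude Haiku 4.5": {"input": 1.00, "output": 5.00, "cached": 0.10},
}


def monthly_cost(rates, requests, input_tokens, output_tokens, cache_share):
    """Estimate monthly spend in USD, without the Batch discount.

    Assumption: cache_share of all input tokens is billed at the cached
    rate and the remainder at the full input rate.
    """
    total_input_m = requests * input_tokens / 1_000_000    # millions of tokens
    total_output_m = requests * output_tokens / 1_000_000
    cached_m = total_input_m * cache_share
    uncached_m = total_input_m - cached_m
    return (
        uncached_m * rates["input"]
        + cached_m * rates["cached"]
        + total_output_m * rates["output"]
    )


# Footnote workload: 100k requests/mo, 2,000 in / 300 out, 70% cached.
for model, rates in RATES.items():
    cost = monthly_cost(rates, 100_000, 2_000, 300, 0.70)
    print(f"{model}: ${cost:,.0f}")
```

For this workload, output tokens account for roughly two thirds of each total (for example, $1,500 of Claude Fable 5's $2,240), so raising the cache share trims only the input portion of the bill.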
Cached input is billed at 10% of the input rate across all Claude models we track — a major lever for chatbots and agents where most of the prompt repeats. The calculator models this with your cache share.
The Batch API runs asynchronous jobs at a verified −50% on both input and output across all models we track — flip the Batch toggle in the calculator to model it.
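As a rough illustration of the Batch discount, the sketch below halves both the input and output charges for a job. The function name `batch_cost` is hypothetical, and caching is deliberately left out because the text above does not say how the two discounts combine.

```python
def batch_cost(input_rate, output_rate, requests, input_tokens, output_tokens, batch=True):
    """Estimate spend in USD for a workload, optionally via the Batch API.

    Rates are USD per 1M tokens. Assumption: no cached input is involved,
    and the -50% applies equally to input and output charges.
    """
    multiplier = 0.5 if batch else 1.0
    input_m = requests * input_tokens / 1_000_000    # millions of tokens
    output_m = requests * output_tokens / 1_000_000
    return multiplier * (input_m * input_rate + output_m * output_rate)


# Claude Sonnet 4.6 rates from the table: $3 input, $15 output per 1M tokens.
standard = batch_cost(3.00, 15.00, 100_000, 2_000, 300, batch=False)
batched = batch_cost(3.00, 15.00, 100_000, 2_000, 300, batch=True)
print(f"standard: ${standard:,.2f}  batch: ${batched:,.2f}")
```

Because batch jobs run asynchronously, this saving only applies to work that does not need an immediate response.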
Claude Fable 5, Claude Opus 4.8 and Claude Sonnet 4.6 run a verified 1M-token context window; Claude Haiku 4.5 is 200k tokens.
| Use case | Verdict |
|---|---|
| Long documents or large context within a single request | Claude Fable 5, Opus 4.8 and Sonnet 4.6 offer a 1M-token window; Haiku 4.5 offers 200k |
| Repeated system prompts or shared context across calls | Prompt caching applies across the lineup, billing cached input at 10% of the input rate |
| You need the cheapest tier with maximum context | Haiku 4.5 is cheapest but capped at 200k tokens; Sonnet 4.6 is the cheapest model with a 1M-token window |