DeepSeek offers open-weight models at some of the lowest per-token rates in this comparison. A 1M-token context window and cached-input rates at 2% of the input price or less make it a good fit for high-volume or cost-sensitive workloads.
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Cached input ($/1M tokens) | Batch | ≈ $/mo * |
|---|---|---|---|---|---|
| DeepSeek V4 Pro (MID) | $0.435 | $0.87 | $0.003625 | — | $52.71 |
| DeepSeek V4 Flash (BUDGET) | $0.14 | $0.28 | $0.0028 | — | $17.19 |
* Example workload: a chatbot handling 100k requests/mo at 2,000 input and 300 output tokens per request, with 70% of input cached. Computed by the same engine as the calculator. Batch: DeepSeek had published no verified batch discount at our last revision, and we only list discounts we've confirmed.
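The monthly figures in the table can be reproduced by hand. The sketch below is not the calculator's engine, which is not shown here. It assumes simple linear billing: uncached input, cached input, and output tokens are each charged at their per-million rate. The names `Rates` and `monthly_cost` are illustrative only.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """Per-million-token prices in USD, copied from the table above."""
    input_per_m: float
    output_per_m: float
    cached_per_m: float


def monthly_cost(rates: Rates, requests: int, input_tokens: int,
                 output_tokens: int, cache_share: float) -> float:
    """Estimate monthly spend, assuming linear per-token billing (an assumption)."""
    total_input = requests * input_tokens
    cached = total_input * cache_share      # input tokens billed at the cached rate
    uncached = total_input - cached         # input tokens billed at the full input rate
    output = requests * output_tokens
    dollars = (uncached * rates.input_per_m
               + cached * rates.cached_per_m
               + output * rates.output_per_m)
    return dollars / TOKENS_PER_MILLION


V4_PRO = Rates(input_per_m=0.435, output_per_m=0.87, cached_per_m=0.003625)
V4_FLASH = Rates(input_per_m=0.14, output_per_m=0.28, cached_per_m=0.0028)

# Example workload from the footnote: 100k requests, 2,000 in / 300 out, 70% cached.
for name, rates in (("DeepSeek V4 Pro", V4_PRO), ("DeepSeek V4 Flash", V4_FLASH)):
    cost = monthly_cost(rates, 100_000, 2_000, 300, 0.70)
    print(f"{name}: ${cost:.2f}/mo")
```

Under these assumptions the output matches the table: $52.71 for V4 Pro and $17.19 for V4 Flash. In both cases the 140M cached input tokens contribute well under a dollar, so the bill is dominated by uncached input and output.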
DeepSeek publishes the cheapest cache reads we track: DeepSeek V4 Flash at $0.0028/1M (2% of the input rate) and DeepSeek V4 Pro at $0.003625/1M (about 0.8% of the input rate). The calculator applies these rates to the cache share you set.
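To see how much the cache share moves the effective input price, the following sketch computes the cached-to-input ratio and a blended input rate. It assumes the cached rate applies to exactly the cached share of input tokens and the full rate to the rest. The function names are hypothetical and not part of the calculator.

```python
def cached_ratio(input_per_m: float, cached_per_m: float) -> float:
    """Cached-read price as a fraction of the regular input price."""
    return cached_per_m / input_per_m


def blended_input_rate(input_per_m: float, cached_per_m: float,
                       cache_share: float) -> float:
    """Effective $/1M input tokens when `cache_share` of the input is cached."""
    if not 0.0 <= cache_share <= 1.0:
        raise ValueError("cache_share must be between 0 and 1")
    return (1.0 - cache_share) * input_per_m + cache_share * cached_per_m


# (input $/1M, cached $/1M) taken from the pricing table.
PRICES = {
    "DeepSeek V4 Pro": (0.435, 0.003625),
    "DeepSeek V4 Flash": (0.14, 0.0028),
}

for name, (input_rate, cached_rate) in PRICES.items():
    ratio = cached_ratio(input_rate, cached_rate)
    blended = blended_input_rate(input_rate, cached_rate, cache_share=0.70)
    print(f"{name}: cached = {ratio:.1%} of input, "
          f"blended input at 70% cache = ${blended:.4f}/1M")
```

Because the cached rate is so close to zero, the blended input rate is roughly the full input rate multiplied by the uncached share. At a 70% cache share, that is about 30% of the listed input price.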
DeepSeek had published no verified batch discount at our last revision. We only list discounts we've confirmed, so the Batch toggle in the calculator leaves DeepSeek prices unchanged.
DeepSeek V4 Pro and DeepSeek V4 Flash both have a verified 1M-token context window.
| Use case | Verdict |
|---|---|
| Cost-sensitive, high-volume workloads | DeepSeek's rates are among the lowest here |
| Heavy reuse of cached context | The cached-input discount is very steep |
| You need a stable multi-year price guarantee | Account for the promotional rate possibly ending |