
xAI Grok API Pricing

xAI's Grok models offer a flagship tier with a 1M-token context window, output priced at 2x input on both listed models, and a flat $0.20/1M cached-input rate, a combination that can suit output-heavy workloads.

Prices verified June 2026 · changes logged in the changelog

Heads up: xAI publishes a Batch API but does not publish the exact discount multiplier, so we hold no batch figure and the estimate does not assume one — your real async cost may be lower than shown. The cached-input rate is also flat across tiers rather than scaling with the standard rate.

Model	$ input /1M	$ output /1M	$ cached /1M	Batch	≈ $/mo *
Grok 4.3 (Frontier)	$1.25	$2.50	$0.20	—	$178
Grok Build 0.1 (Mid)	$1.00	$2.00	$0.20	—	$148

* Example workload — chatbot, 100k requests/mo, 2,000 input / 300 output tokens per request, 70% of input cached. Computed by the same engine as the calculator. Batch: no verified batch discount published for Grok at our last revision — we only list discounts we've confirmed.
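The monthly figures in the table can be reproduced by hand from the example workload. The sketch below is a minimal illustration, not the calculator's actual engine: the names `ModelPrice` and `monthly_cost` are hypothetical, and it assumes cached tokens are billed at the cached rate, the remaining input at the standard rate, and no batch discount applies.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class ModelPrice:
    """Per-1M-token prices in USD, copied from the table above."""
    input_per_m: float
    output_per_m: float
    cached_per_m: float


# Prices from the table; the identifiers are illustrative only.
GROK_4_3 = ModelPrice(input_per_m=1.25, output_per_m=2.50, cached_per_m=0.20)
GROK_BUILD_0_1 = ModelPrice(input_per_m=1.00, output_per_m=2.00, cached_per_m=0.20)


def monthly_cost(price: ModelPrice, requests: int, input_tokens: int,
                 output_tokens: int, cached_share: float) -> float:
    """Estimate monthly USD cost with no batch discount applied."""
    total_input = requests * input_tokens
    cached = total_input * cached_share      # billed at the cached rate
    uncached = total_input - cached          # billed at the standard input rate
    output = requests * output_tokens
    dollars_times_million = (
        uncached * price.input_per_m
        + cached * price.cached_per_m
        + output * price.output_per_m
    )
    return dollars_times_million / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Example workload: 100k requests/mo, 2,000 in / 300 out, 70% cached.
    for name, price in [("Grok 4.3", GROK_4_3), ("Grok Build 0.1", GROK_BUILD_0_1)]:
        cost = monthly_cost(price, 100_000, 2_000, 300, 0.70)
        print(f"{name}: ${cost:.0f}/mo")
```

For Grok 4.3 this breaks down as 60M uncached input tokens ($75), 140M cached input tokens ($28) and 30M output tokens ($75), which sums to the $178 shown in the table.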

Prompt caching

Both models share the same cached-input rate of $0.20/1M, so what differs is the cached rate's share of the standard input price: 16% for Grok 4.3 and 20% for Grok Build 0.1. The calculator models this with your cache share.
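To see what a flat cached rate means in practice, the following sketch computes the cached rate as a fraction of the standard input rate, plus a blended input price for a given cache share. The function names are hypothetical, and the "blended rate" is simply a weighted average introduced here for illustration, not a figure xAI publishes.

```python
def cached_ratio(cached_per_m: float, input_per_m: float) -> float:
    """Cached-input rate as a fraction of the standard input rate."""
    return cached_per_m / input_per_m


def blended_input_rate(input_per_m: float, cached_per_m: float,
                       cached_share: float) -> float:
    """Weighted-average USD price per 1M input tokens at a given cache share."""
    if not 0.0 <= cached_share <= 1.0:
        raise ValueError("cached_share must be between 0 and 1")
    return (1.0 - cached_share) * input_per_m + cached_share * cached_per_m


if __name__ == "__main__":
    # (standard input, cached input) in USD per 1M tokens, from the table above.
    models = {"Grok 4.3": (1.25, 0.20), "Grok Build 0.1": (1.00, 0.20)}
    for name, (inp, cached) in models.items():
        ratio = cached_ratio(cached, inp)
        blended = blended_input_rate(inp, cached, 0.70)
        print(f"{name}: cached = {ratio:.0%} of input, "
              f"blended input at 70% cached = ${blended:.3f}/1M")
```

Because the cached rate is flat, the more expensive model gets the larger relative saving from caching: at a 70% cache share the blended input price works out to roughly $0.515/1M for Grok 4.3 and $0.44/1M for Grok Build 0.1.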

Batch / async

No verified batch discount published at our last revision — we only list discounts we've confirmed, so the Batch toggle in the calculator leaves Grok prices unchanged.

Context window

Grok 4.3 has a verified 1M-token context window; Grok Build 0.1 has a 256k-token window.

When Grok is worth it

Use case	Verdict
Output-heavy workloads	Good fit: output costs only 2x input on both models
Large context within a single request	Good fit: Grok 4.3 carries a 1M-token window
You need a confirmed batch discount up front	Poor fit: the exact batch multiplier is unpublished

Is Grok the right price for your workload?

The calculator puts these two models next to the other 26 we track — at your volume, token mix and cache share.


Frequently asked questions

Does xAI offer a batch discount?

xAI documents a Batch API, but the exact discount is not published in a form we can verify, so we hold no batch figure and the estimate does not apply one. Your async cost may be lower than the on-demand figure shown.

How does Grok's caching work?

Our data lists a flat cached-input rate across the Grok tiers rather than one that scales with each model's standard rate. The exact figure is in the per-model table above.
