OpenAI API Pricing

OpenAI offers a broad GPT-5 lineup spanning frontier, mid and budget tiers, all with prompt caching and batch discounts, plus a high-end Pro variant for the most demanding reasoning tasks.

Prices verified June 2026 · changes logged in the changelog

Heads up: The top-end GPT-5.5 Pro tier is listed without a cached-input rate, so don't assume prompt caching lowers its cost; price it on full input. GPT-5.5, GPT-5.5 Pro and GPT-5.4 also carry a separate, higher long-context rate for very large prompts, so check which rate applies to your prompt size before estimating.

Model	Tier	$ input /1M	$ output /1M	$ cached /1M	Batch	≈ $/mo *
GPT-5.5 Pro	Frontier	$30	$180	—	−50%	$11,400
GPT-5.5	Frontier	$5	$30	$0.50	−50%	$1,270
GPT-5.4	Frontier	$2.50	$15	$0.25	−50%	$635
GPT-5.4 mini	Mid	$0.75	$4.50	$0.075	−50%	$190.50
GPT-5.4 nano	Budget	$0.20	$1.25	$0.02	−50%	$52.30

* Example workload — chatbot, 100k requests/mo, 2,000 input / 300 output tokens per request, 70% of input cached. Computed by the same engine as the calculator. Batch: the −50% is OpenAI's verified Batch API discount; the ≈ $/mo column is computed without it.
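The ≈ $/mo figures can be reproduced by hand from the listed rates and the example workload. The sketch below is an illustration of that arithmetic, not the calculator's actual engine; the names `RATES` and `monthly_cost` are made up for this example, and the fallback to the full input rate for GPT-5.5 Pro follows the rule described on this page.

```python
# Rates in $ per 1M tokens, copied from the table above.
# cached=None means no cached-input rate is listed for that model.
RATES = {
    "GPT-5.5 Pro":  {"input": 30.00, "output": 180.00, "cached": None},
    "GPT-5.5":      {"input": 5.00,  "output": 30.00,  "cached": 0.50},
    "GPT-5.4":      {"input": 2.50,  "output": 15.00,  "cached": 0.25},
    "GPT-5.4 mini": {"input": 0.75,  "output": 4.50,   "cached": 0.075},
    "GPT-5.4 nano": {"input": 0.20,  "output": 1.25,   "cached": 0.02},
}


def monthly_cost(model, requests=100_000, input_tokens=2_000,
                 output_tokens=300, cache_share=0.70):
    """Estimated monthly cost in dollars at standard (non-batch) rates."""
    rates = RATES[model]
    total_in = requests * input_tokens / 1_000_000    # millions of input tokens
    total_out = requests * output_tokens / 1_000_000  # millions of output tokens

    # Without a listed cached rate, cached tokens are billed at the full input rate.
    cached_rate = rates["cached"] if rates["cached"] is not None else rates["input"]

    cached_in = total_in * cache_share
    fresh_in = total_in - cached_in
    return (fresh_in * rates["input"]
            + cached_in * cached_rate
            + total_out * rates["output"])


for name in RATES:
    print(f"{name}: ${monthly_cost(name):,.2f}")
```

For GPT-5.5, this works out as 60M fresh input tokens at $5, 140M cached tokens at $0.50 and 30M output tokens at $30, or $300 + $70 + $900 = $1,270.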

Prompt caching

Cached input is billed at 10% of the input rate on every GPT model with a published cache price. GPT-5.5 Pro doesn't list one, so the calculator charges it the full input rate. Caching is a major cost lever for chatbots and agents where most of the prompt repeats, and the calculator models it with your cache share.
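One way to see the effect of caching is to compute a blended input rate for a given cache share. The function below is a minimal sketch under that framing; `effective_input_rate` is a hypothetical helper, not part of any OpenAI SDK or of the calculator.

```python
def effective_input_rate(input_rate, cached_rate, cache_share):
    """Blended $ per 1M input tokens for a cached share between 0.0 and 1.0."""
    if not 0.0 <= cache_share <= 1.0:
        raise ValueError("cache_share must be between 0 and 1")
    if cached_rate is None:
        # No cached rate listed (the GPT-5.5 Pro case): everything bills at full rate.
        return input_rate
    return input_rate * (1 - cache_share) + cached_rate * cache_share


# GPT-5.5 at a 70% cache share: 5.00 * 0.30 + 0.50 * 0.70, roughly $1.85 per 1M.
gpt_55_blended = effective_input_rate(5.00, 0.50, 0.70)

# GPT-5.5 Pro has no listed cached rate, so the blended rate stays at $30.
gpt_55_pro_blended = effective_input_rate(30.00, None, 0.70)
```

At a 70% cache share the blended GPT-5.5 input rate falls from $5 to about $1.85 per 1M tokens, a 63% reduction on the input side; output tokens are unaffected.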

Batch / async

The Batch API runs asynchronous jobs at a verified −50% on both input and output across all models we track — flip the Batch toggle in the calculator to model it.
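As a rough illustration, the batch discount can be applied as a flat 50% reduction to the input and output charges. This sketch deliberately leaves prompt caching out, because this page does not say how the two discounts combine; `batch_cost` and `BATCH_DISCOUNT` are names invented for the example.

```python
BATCH_DISCOUNT = 0.50  # the -50% Batch API discount listed on this page


def batch_cost(input_rate, output_rate, input_mtok, output_mtok, batch=True):
    """Cost in dollars for token volumes given in millions, caching ignored."""
    standard = input_rate * input_mtok + output_rate * output_mtok
    return standard * (1 - BATCH_DISCOUNT) if batch else standard


# GPT-5.4 on the example workload volumes (200M input, 30M output), no caching.
standard_total = batch_cost(2.50, 15.00, 200, 30, batch=False)
batch_total = batch_cost(2.50, 15.00, 200, 30, batch=True)
```

With no caching, the GPT-5.4 example costs $950 at standard rates ($500 input plus $450 output) and $475 through the Batch API.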

Context window

GPT-5.5, GPT-5.5 Pro and GPT-5.4 run a verified 1M-token context window; GPT-5.4 mini and GPT-5.4 nano are 400k tokens. GPT-5.5, GPT-5.5 Pro and GPT-5.4 bill higher rates on long-context prompts — this table and the calculator use standard rates.
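A quick pre-check can flag prompts that won't fit a model before you price them. The sketch below assumes that "1M" and "400k" mean exactly 1,000,000 and 400,000 tokens and that reserved output tokens count against the same window; both are assumptions, and `fits_context` is a hypothetical helper.

```python
# Context windows in tokens, as listed above (exact values are an assumption).
CONTEXT_WINDOW = {
    "GPT-5.5 Pro": 1_000_000,
    "GPT-5.5": 1_000_000,
    "GPT-5.4": 1_000_000,
    "GPT-5.4 mini": 400_000,
    "GPT-5.4 nano": 400_000,
}


def fits_context(model, prompt_tokens, reserved_output_tokens=0):
    """True if the prompt plus any reserved output fits the model's window."""
    return prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOW[model]


# A 500k-token prompt fits GPT-5.4 but not GPT-5.4 mini.
fits_full = fits_context("GPT-5.4", 500_000)
fits_mini = fits_context("GPT-5.4 mini", 500_000)
```

A prompt that fits only the 1M-token models may also fall under their higher long-context rates, so a passing check here does not mean the standard rates in the table apply.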

When GPT is worth it

Use case	Verdict
You want one vendor covering frontier down to budget models with a consistent API	OpenAI's lineup fits
Workload is asynchronous and tolerant of delay	Batch tier lowers the per-token cost
Hard requirement for the lowest possible token price	Compare against budget-tier and open-weight providers in the table

Is GPT the right price for your workload?

The calculator puts these five models next to the other 23 we track — at your volume, token mix and cache share.

Frequently asked questions

Does OpenAI support prompt caching and batch processing?

Most GPT-5 tiers offer both a discounted cached-input rate and a batch (async) discount. The exception in our data is the top Pro tier, which is listed without a cached rate — see the per-model table above for which models include caching.

Why do some OpenAI models show a separate long-context price?

The flagship tiers apply a higher rate to very large prompts (above a token threshold) than to short prompts. The table shows the standard short-context rate; if your prompts are large, model the long-context rate noted on the page instead.
