
OpenAI API Pricing

OpenAI offers a broad GPT-5 lineup spanning frontier, mid and budget tiers, all with prompt caching and batch discounts, plus a high-end Pro variant for the most demanding reasoning tasks.

Prices verified June 2026 · changes logged in the changelog
Heads up: The top-end GPT-5.5 Pro tier is listed without a cached-input rate, so prompt caching can't be assumed to lower its cost — price it on full input. Both flagship tiers also carry a separate long-context rate for very large prompts; check which rate applies to your prompt size before estimating.
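
| Model | Tier | $ input /1M | $ output /1M | $ cached /1M | Batch | ≈ $/mo * |
|---|---|---|---|---|---|---|
| GPT-5.5 Pro | FRONTIER | $30 | $180 | not listed | −50% | $11,400 |
| GPT-5.5 | FRONTIER | $5 | $30 | $0.50 | −50% | $1,270 |
| GPT-5.4 | FRONTIER | $2.50 | $15 | $0.25 | −50% | $635 |
| GPT-5.4 mini | MID | $0.75 | $4.50 | $0.075 | −50% | $190.50 |
| GPT-5.4 nano | BUDGET | $0.20 | $1.25 | $0.02 | −50% | $52.30 |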

* Example workload — chatbot, 100k requests/mo, 2,000 input / 300 output tokens per request, 70% of input cached. Computed by the same engine as the calculator. Batch: the −50% is OpenAI's verified Batch API discount; the ≈ $/mo column is computed without it.
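
The arithmetic behind the ≈ $/mo column can be reproduced with a short sketch. This is not the calculator's actual engine; the `Price` class, the `PRICES` dictionary and the `monthly_cost` function are hypothetical names used for illustration, and the rates are copied from the table above. A model without a listed cached rate is charged the full input rate on cached tokens, as the page describes for GPT-5.5 Pro.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Price:
    """Per-1M-token rates in USD, copied from the pricing table."""
    input_per_m: float
    output_per_m: float
    cached_per_m: Optional[float]  # None when no cached-input rate is listed


# Hypothetical lookup table; values mirror the page, not a live API.
PRICES = {
    "GPT-5.5 Pro": Price(30.00, 180.00, None),
    "GPT-5.5": Price(5.00, 30.00, 0.50),
    "GPT-5.4": Price(2.50, 15.00, 0.25),
    "GPT-5.4 mini": Price(0.75, 4.50, 0.075),
    "GPT-5.4 nano": Price(0.20, 1.25, 0.02),
}


def monthly_cost(price: Price,
                 requests: int = 100_000,
                 input_tokens: int = 2_000,
                 output_tokens: int = 300,
                 cache_share: float = 0.70) -> float:
    """Monthly cost in USD at standard rates, without the batch discount."""
    input_m = requests * input_tokens / 1_000_000    # total input, millions of tokens
    output_m = requests * output_tokens / 1_000_000  # total output, millions of tokens
    cached_m = input_m * cache_share                 # input tokens served from cache
    fresh_m = input_m - cached_m                     # input tokens billed at the full rate
    # Fall back to the full input rate when no cached rate is listed.
    cached_rate = price.input_per_m if price.cached_per_m is None else price.cached_per_m
    return (fresh_m * price.input_per_m
            + cached_m * cached_rate
            + output_m * price.output_per_m)


if __name__ == "__main__":
    for name, price in PRICES.items():
        print(f"{name}: ${monthly_cost(price):,.2f}")
```

With the default arguments (the example workload in the footnote), the function returns the figures in the last column of the table, for example 60M fresh input tokens, 140M cached input tokens and 30M output tokens for GPT-5.5.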

Prompt caching

Cached input is billed at 10% of the input rate on every GPT model with a published cache price; GPT-5.5 Pro doesn't list one, so the calculator charges it the full input rate. Caching is a major lever for chatbots and agents where most of the prompt repeats, and the calculator models it with your cache share.
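
To see how much the cache share moves the input bill, the 10% rule can be turned into an effective (blended) input rate. The sketch below is illustrative only: `blended_input_rate` and its `cache_listed` flag are hypothetical names, and the 10% factor is the one stated in the paragraph above.

```python
CACHE_FACTOR = 0.10  # cached input billed at 10% of the input rate, per the text above


def blended_input_rate(input_rate: float,
                       cache_share: float,
                       cache_listed: bool = True) -> float:
    """Effective $ per 1M input tokens for a given share of cached input.

    cache_listed=False models a tier with no published cached rate,
    where cached tokens are charged the full input rate.
    """
    if not 0.0 <= cache_share <= 1.0:
        raise ValueError("cache_share must be between 0 and 1")
    cached_rate = input_rate * CACHE_FACTOR if cache_listed else input_rate
    return (1.0 - cache_share) * input_rate + cache_share * cached_rate


if __name__ == "__main__":
    # GPT-5.5 ($5 input) with 70% of input cached
    print(round(blended_input_rate(5.0, 0.70), 4))
    # GPT-5.5 Pro ($30 input), no cached rate listed
    print(round(blended_input_rate(30.0, 0.70, cache_listed=False), 4))
```

At a 70% cache share, GPT-5.5's effective input rate works out to $1.85 per 1M tokens instead of $5, while a tier without a listed cached rate stays at its full input rate regardless of cache share.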

Batch / async

The Batch API runs asynchronous jobs at a verified −50% on both input and output across all models we track — flip the Batch toggle in the calculator to model it.
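
A minimal sketch of what the Batch toggle does to an estimate, assuming the discount is a flat 50% on input and output as stated above. The function name `batch_cost` is hypothetical. The page does not say how the batch discount combines with cached input, so this sketch deliberately leaves caching out and prices all input at the standard rate.

```python
BATCH_DISCOUNT = 0.50  # Batch API discount on input and output, per the text above


def batch_cost(input_rate: float,
               output_rate: float,
               input_m: float,
               output_m: float,
               batch: bool = True) -> float:
    """Cost in USD for input_m / output_m million tokens.

    Rates are $ per 1M tokens. Caching is intentionally not modeled here.
    """
    standard = input_m * input_rate + output_m * output_rate
    return standard * (1.0 - BATCH_DISCOUNT) if batch else standard


if __name__ == "__main__":
    # GPT-5.5, 200M input and 30M output tokens per month, no caching
    print(batch_cost(5.0, 30.0, 200.0, 30.0, batch=False))
    print(batch_cost(5.0, 30.0, 200.0, 30.0, batch=True))
```

For GPT-5.5 at 200M input and 30M output tokens with no caching, this gives $1,900 at standard rates and $950 with the batch discount.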

Context window

GPT-5.5, GPT-5.5 Pro and GPT-5.4 run a verified 1M-token context window; GPT-5.4 mini and GPT-5.4 nano are limited to 400k tokens. The three 1M-token models bill higher rates on long-context prompts; this table and the calculator use standard rates.

When GPT is worth it

| Use case | Verdict |
|---|---|
| You want one vendor covering frontier down to budget models with a consistent API | OpenAI's lineup fits |
| Workload is asynchronous and tolerant of delay | Batch tier lowers the per-token cost |
| Hard requirement for the lowest possible token price | Compare against budget-tier and open-weight providers in the table |

Is GPT the right price for your workload?
The calculator puts these five models next to the other 23 we track — at your volume, token mix and cache share.

Frequently asked questions

Most GPT-5 tiers offer both a discounted cached-input rate and a batch (async) discount. The exception in our data is the top Pro tier, which is listed without a cached rate — see the per-model table above for which models include caching.
The flagship tiers apply a higher rate to very large prompts (above a token threshold) than to short prompts. The table shows the standard short-context rate; if your prompts are large, model the long-context rate noted on the page instead.