AI Model Pricing Comparison

Compare Claude, GPT, Gemini and DeepSeek prices — see your monthly cost at any volume.

100% client-side · prices verified 2026-06-11 · zero network calls

Your monthly volume: 10M tokens

Input : output ratio: 3 : 1

Model pricing comparison at your monthly volume
Model	Vendor · tier	Input ($/1M tokens)	Output ($/1M tokens)	Context window	Monthly cost
DeepSeek V4 Flash (cheapest)	deepseek · mid	$0.14	$0.28	1M	$1.75
GPT-5.4 nano	openai · small	$0.2	$1.25	400k	$4.63
DeepSeek V4 Pro	deepseek · frontier	$0.435	$0.87	1M	$5.44
Gemini 3.1 Flash-Lite	google · small	$0.25	$1.5	1M	$5.63
Gemini 3 Flash	google · mid	$0.5	$3	1M	$11.25
GPT-5.4 mini	openai · mid	$0.75	$4.5	400k	$16.88
Claude Haiku 4.5	anthropic · mid	$1	$5	200k	$20.00
Gemini 3.5 Flash	google · frontier	$1.5	$9	1M	$33.75
Gemini 2.5 Pro	google · frontier	$1.25	$10	1M	$34.38
Gemini 3.1 Pro	google · frontier	$2	$12	1M	$45.00
GPT-5.3 Codex	openai · mid	$1.75	$14	400k	$48.13
GPT-5.4	openai · frontier	$2.5	$15	400k	$56.25
Claude Sonnet 4.6	anthropic · frontier	$3	$15	1M	$60.00
Claude Sonnet 4.5	anthropic · frontier	$3	$15	200k	$60.00
Claude Opus 4.8	anthropic · frontier	$5	$25	1M	$100.00
GPT-5.5	openai · frontier	$5	$30	400k	$112.50
Claude Fable 5	anthropic · frontier	$10	$50	1M	$200.00
GPT-5.5 Pro	openai · frontier	$30	$180	400k	$675.00

Prices checked 2026-06-11 · source: vendor pricing pages · monthly cost = your volume split 3:1 input:output.

18 models priced, 4 vendors

2026-06-11: prices verified against vendor pages

90d: price staleness tripwire in CI

0 network requests per keystroke

How it works

List prices per million tokens are easy to find but hard to compare, because no workload buys exactly one million tokens with equal input and output. This table does the conversion for you: set your monthly volume and your input:output ratio, and every model is ranked by what it would actually cost you per month.
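The conversion described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the page's actual source: the `Model` class and `monthly_cost` function are hypothetical names, and the three sample rows are copied from the table. It assumes the volume is split strictly by the input:output ratio, which is how the table's monthly cost column is defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


def monthly_cost(model: Model, volume_m: float, ratio_in: float, ratio_out: float = 1.0) -> float:
    """Monthly USD cost for `volume_m` million tokens split ratio_in:ratio_out."""
    total = ratio_in + ratio_out
    input_m = volume_m * ratio_in / total    # millions of input tokens
    output_m = volume_m * ratio_out / total  # millions of output tokens
    return input_m * model.input_per_m + output_m * model.output_per_m


# Three rows copied from the table above (base rates only).
MODELS = [
    Model("Claude Sonnet 4.6", 3.00, 15.00),
    Model("DeepSeek V4 Flash", 0.14, 0.28),
    Model("Claude Haiku 4.5", 1.00, 5.00),
]

# Rank by cost at 10M tokens per month and a 3:1 input:output ratio.
ranked = sorted(MODELS, key=lambda m: monthly_cost(m, 10, 3))
for m in ranked:
    print(f"{m.name}: ${monthly_cost(m, 10, 3):.2f}")
```

At 10M tokens and 3:1, the split is 7.5M input and 2.5M output tokens, so Claude Haiku 4.5 comes to 7.5 × $1 + 2.5 × $5 = $20.00, matching its table row.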

The ratio slider matters more than most people expect. Agentic coding sessions routinely send 3-10× more input than output, because every turn re-sends accumulated context and file contents. At a 10:1 ratio, a model with cheap input and expensive output can beat one with the opposite shape, even when their blended list prices look similar. Sort by any column to explore — the cheapest model at your volume is highlighted.
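To see the ratio effect with numbers from the table, the sketch below compares Gemini 3.5 Flash ($1.50 in, $9 out) with Gemini 2.5 Pro ($1.25 in, $10 out) and solves for the ratio at which their costs cross. The function names are hypothetical, and the break-even formula simply sets the two blended costs equal; it is a worked illustration rather than part of the tool.

```python
def blended_per_m(input_per_m: float, output_per_m: float, ratio: float) -> float:
    """Blended USD per 1M tokens at an input:output ratio of `ratio`:1."""
    return (ratio * input_per_m + output_per_m) / (ratio + 1)


def break_even_ratio(a: tuple, b: tuple):
    """Ratio r (as r:1) where models a and b cost the same.

    Solves r*in_a + out_a == r*in_b + out_b. Returns None when the input
    prices are equal or the crossover is not at a positive ratio.
    """
    (in_a, out_a), (in_b, out_b) = a, b
    if in_a == in_b:
        return None
    r = (out_b - out_a) / (in_a - in_b)
    return r if r > 0 else None


flash_35 = (1.50, 9.00)   # Gemini 3.5 Flash: (input, output) USD per 1M tokens
pro_25 = (1.25, 10.00)    # Gemini 2.5 Pro

crossover = break_even_ratio(flash_35, pro_25)
cheaper_at_3 = "3.5 Flash" if blended_per_m(*flash_35, 3) < blended_per_m(*pro_25, 3) else "2.5 Pro"
cheaper_at_10 = "3.5 Flash" if blended_per_m(*flash_35, 10) < blended_per_m(*pro_25, 10) else "2.5 Pro"
print(crossover, cheaper_at_3, cheaper_at_10)
```

The two models cross at 4:1. Below that ratio Gemini 3.5 Flash is cheaper, as in the table at 3:1; above it, Gemini 2.5 Pro's lower input price wins despite its higher output price.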

Prices were last verified on 2026-06-11 against the public pricing pages of Anthropic, OpenAI, Google and DeepSeek. Base rates only: prompt caching and batch discounts are deliberately excluded here so the comparison stays apples-to-apples. Model them with the dedicated calculators linked below.

One honest caveat: price per token is not price per task. A cheaper model that needs three attempts can cost more than a frontier model that succeeds once, whenever its per-attempt cost is more than a third of the frontier model's. Pair this table with the Model Capability Picker, and if you want measured per-task costs from your real traffic instead of estimates, that is exactly what FORG tracks.
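A rough way to put numbers on this caveat is to multiply the cost of one attempt by the number of attempts needed. The sketch below is an assumption-laden illustration: the task size (30k input and 10k output tokens per attempt) and the attempt counts are hypothetical, every attempt is assumed to cost the same, and the function names are invented for this example. Prices are the GPT-5.4 and Claude Opus 4.8 rows from the table.

```python
def cost_per_attempt(input_tokens: int, output_tokens: int,
                     input_per_m: float, output_per_m: float) -> float:
    """USD cost of one attempt, given per-1M-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


def cost_per_solved_task(attempt_cost: float, attempts: int) -> float:
    """USD cost of a solved task, assuming every attempt costs the same."""
    return attempt_cost * attempts


# Hypothetical task size: 30k input and 10k output tokens per attempt.
cheaper = cost_per_attempt(30_000, 10_000, 2.50, 15.00)   # GPT-5.4 rates
pricier = cost_per_attempt(30_000, 10_000, 5.00, 25.00)   # Claude Opus 4.8 rates

# Hypothetical attempt counts: three for the cheaper model, one for the pricier one.
cheaper_total = cost_per_solved_task(cheaper, 3)
pricier_total = cost_per_solved_task(pricier, 1)
print(f"cheaper model x3: ${cheaper_total:.3f}, pricier model x1: ${pricier_total:.3f}")
```

Under these assumptions the cheaper model ends up costing more per solved task ($0.675 against $0.40). Real attempt counts vary by workload and have to be measured.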

If you are choosing a default model for a team, run two scenarios before deciding: your median developer's volume and your heaviest developer's volume. Power-law usage is the norm in agentic coding — the top user often spends eight to ten times the median — and the model that is affordable at the median can be painful at the tail. The share link captures your volume and ratio settings so both scenarios can live in the same discussion thread.

Frequently asked questions

Which AI model is cheapest in 2026?

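For raw per-token price, DeepSeek V4 Flash is the cheapest model in the table, with small-tier models like GPT-5.4 nano and Gemini 3.1 Flash-Lite close behind. Among frontier models the spread is wide: at the default 10M tokens and 3:1 ratio, monthly cost runs from $5.44 (DeepSeek V4 Pro) to $675.00 (GPT-5.5 Pro), which is why the table ranks by your monthly volume rather than list price alone. Below the top spot, the ranking shifts with your input:output ratio.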

Why are input and output tokens priced differently?

Output generation is sequential and compute-intensive, while input processing is parallel. Providers therefore charge more for output tokens: in this table the markup is 5-8× at Anthropic, OpenAI and Google, and 2× at DeepSeek. Workloads that read a lot and write a little (code review, RAG) have very different economics from chatty ones.

How often are these prices updated?

The dataset records its verification date — currently 2026-06-11 — and our build fails if it goes more than 90 days without re-verification. Each price is sourced from the vendor's public pricing page.

Do these prices include caching and batch discounts?

The table shows base rates. Prompt caching (≈90% off repeated input) and batch processing (≈50% off non-urgent jobs) can cut real bills dramatically — expand a row, or use our Prompt Caching ROI and Batch Savings calculators to model them.
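As a rough illustration of how much these discounts move a bill, the sketch below applies the approximate figures quoted above (90% off cached input, 50% off batch jobs) to the Claude Sonnet 4.6 row at the default 10M tokens and 3:1 ratio. It is a simplification under stated assumptions: real vendor rules differ, cache writes may carry their own charge, the two discounts are modelled separately rather than stacked, and the 80% cached fraction is hypothetical.

```python
CACHE_DISCOUNT = 0.90  # approximate figure from the text: ~90% off repeated input
BATCH_DISCOUNT = 0.50  # approximate figure from the text: ~50% off non-urgent jobs


def cost_with_caching(input_m: float, output_m: float,
                      input_per_m: float, output_per_m: float,
                      cached_fraction: float) -> float:
    """USD cost when `cached_fraction` of input tokens is billed at the cached rate."""
    cached_m = input_m * cached_fraction
    fresh_m = input_m - cached_m
    input_cost = fresh_m * input_per_m + cached_m * input_per_m * (1 - CACHE_DISCOUNT)
    return input_cost + output_m * output_per_m


def cost_with_batch(input_m: float, output_m: float,
                    input_per_m: float, output_per_m: float) -> float:
    """USD cost when the whole workload runs at the batch rate."""
    return (input_m * input_per_m + output_m * output_per_m) * (1 - BATCH_DISCOUNT)


# Claude Sonnet 4.6 base rates; 7.5M input and 2.5M output tokens per month.
base = 7.5 * 3.00 + 2.5 * 15.00
cached = cost_with_caching(7.5, 2.5, 3.00, 15.00, cached_fraction=0.8)  # hypothetical 80% hit rate
batched = cost_with_batch(7.5, 2.5, 3.00, 15.00)
print(f"base ${base:.2f}, with caching ${cached:.2f}, with batch ${batched:.2f}")
```

Under these assumptions the $60.00 base bill drops to about $43.80 with caching and $30.00 with batch processing. Caching only touches the input side, so its effect grows with the input:output ratio.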

Related tools

Token Cost Calculator

Agent Session Cost Estimator

Context Window Comparison

Model Downgrade Advisor