Skip to main content

Token Cost Calculator

Paste a prompt or set token counts and get the exact cost on every major model.

100% client-side⛁ prices verified 2026-06-11⌁ zero network calls
tokens
tokens
100
0%
$0.06

per call on Claude Sonnet 4.5$6.00/day, $182.64/month at 100 calls/day.

Input cost share
50%
Output cost share
50%
Monthly tokens
36.5M
Cost per 1k calls
$60.00

input ■ vs output cost split

18
models priced, 4 vendors
2026-06-11
prices verified against vendor pages
90d
price staleness tripwire in CI
0
network requests per keystroke

How it works

This calculator prices a single LLM call and projects it to daily and monthly spend. Set your input and output token counts, pick a model, and the cost updates instantly — everything runs in your browser with no signup.

The formula is simple: every provider bills per million tokens with separate input and output rates, so a call costs (input × rate_in + output × rate_out) ÷ 1M. What surprises most teams is the split: because output rates run 4-8× input rates, a chatty model that writes long answers can cost more than one reading a huge context. The input/output share bar below the result makes that visible at a glance.

The cache-hit slider models prompt caching, which Anthropic and OpenAI offer at roughly 90% off the input rate for repeated prefixes. Agentic coding tools like Claude Code resend the same system prompt and project context on every turn, so real-world cache hit rates of 60-90% are common — and they change the economics dramatically. If your monthly projection looks high, caching is usually the first lever to pull, followed by routing routine tasks to a cheaper model.

Monthly figures use a 30.44-day month (365.25 ÷ 12). Prices were last verified on 2026-06-11 against vendor pricing pages; the assumptions are shown rather than hidden, so you can sanity-check every number. For measured per-session costs from your actual agent traffic — including retries and cache misses this calculator can only estimate — that is what FORG does automatically.

A note on what this deliberately leaves out: rate-limit retries, refused requests that still bill partial input, and tier-based volume discounts all move real invoices away from clean estimates. Treat the numbers here as a floor, not a ceiling. The share link preserves your exact inputs, so you can hand a teammate the same scenario you are looking at — useful when arguing for a model switch or a caching project in a planning doc.

Frequently asked questions

How is token cost calculated?

Providers price per million tokens, with separate input and output rates. Cost = (input tokens × input rate + output tokens × output rate) ÷ 1,000,000. Output tokens typically cost 4-5× more than input tokens, so long responses dominate your bill.

Why do output tokens cost more than input tokens?

Generating tokens is sequential and compute-heavy — the model produces one token at a time, while input tokens are processed in parallel. Providers price that asymmetry in: on Claude Sonnet, output is 5× the input rate.

What is prompt caching and how does it change cost?

Prompt caching stores a repeated prompt prefix (like a system prompt) on the provider's side. Cached input tokens are billed at a tenth of the normal input rate on Anthropic models. The cache-hit slider models what fraction of your input is served from cache.

How accurate are these numbers?

Prices come from public vendor pricing pages (last verified 2026-06-11) and the math is exact for the rates shown. Your real bill also depends on retries, failed calls and tier discounts — FORG measures those from your actual sessions.

How many tokens is my prompt?

A rough rule: 1 token ≈ 4 characters or ≈ 0.75 English words. Code is denser, roughly 3 characters per token. Use our free Token Counter tool to measure exact counts for your text.

FORG tracks this automatically across every agent session — live cost attribution, budgets, and alerts.

Start tracking with FORG