Token Cost Calculator
Paste a prompt or set token counts and get the exact cost on every major model.
per call on Claude Sonnet 4.5 — $6.00/day, $182.64/month at 100 calls/day.
- Input cost share
- 50%
- Output cost share
- 50%
- Monthly tokens
- 36.5M
- Cost per 1k calls
- $60.00
input ■ vs output cost split
How it works
This calculator prices a single LLM call and projects it to daily and monthly spend. Set your input and output token counts, pick a model, and the cost updates instantly — everything runs in your browser with no signup.
The formula is simple: every provider bills per million tokens with separate input and output rates, so a call costs (input × rate_in + output × rate_out) ÷ 1M. What surprises most teams is the split: because output rates run 4-8× input rates, a chatty model that writes long answers can cost more than one reading a huge context. The input/output share bar below the result makes that visible at a glance.
The cache-hit slider models prompt caching, which Anthropic and OpenAI offer at roughly 90% off the input rate for repeated prefixes. Agentic coding tools like Claude Code resend the same system prompt and project context on every turn, so real-world cache hit rates of 60-90% are common — and they change the economics dramatically. If your monthly projection looks high, caching is usually the first lever to pull, followed by routing routine tasks to a cheaper model.
Monthly figures use a 30.44-day month (365.25 ÷ 12). Prices were last verified on 2026-06-11 against vendor pricing pages; the assumptions are shown rather than hidden, so you can sanity-check every number. For measured per-session costs from your actual agent traffic — including retries and cache misses this calculator can only estimate — that is what FORG does automatically.
A note on what this deliberately leaves out: rate-limit retries, refused requests that still bill partial input, and tier-based volume discounts all move real invoices away from clean estimates. Treat the numbers here as a floor, not a ceiling. The share link preserves your exact inputs, so you can hand a teammate the same scenario you are looking at — useful when arguing for a model switch or a caching project in a planning doc.
Frequently asked questions
How is token cost calculated?
Providers price per million tokens, with separate input and output rates. Cost = (input tokens × input rate + output tokens × output rate) ÷ 1,000,000. Output tokens typically cost 4-5× more than input tokens, so long responses dominate your bill.
Why do output tokens cost more than input tokens?
Generating tokens is sequential and compute-heavy — the model produces one token at a time, while input tokens are processed in parallel. Providers price that asymmetry in: on Claude Sonnet, output is 5× the input rate.
What is prompt caching and how does it change cost?
Prompt caching stores a repeated prompt prefix (like a system prompt) on the provider's side. Cached input tokens are billed at a tenth of the normal input rate on Anthropic models. The cache-hit slider models what fraction of your input is served from cache.
How accurate are these numbers?
Prices come from public vendor pricing pages (last verified 2026-06-11) and the math is exact for the rates shown. Your real bill also depends on retries, failed calls and tier discounts — FORG measures those from your actual sessions.
How many tokens is my prompt?
A rough rule: 1 token ≈ 4 characters or ≈ 0.75 English words. Code is denser, roughly 3 characters per token. Use our free Token Counter tool to measure exact counts for your text.
FORG tracks this automatically across every agent session — live cost attribution, budgets, and alerts.
Start tracking with FORG