Code Token Analyzer
The same function tokenized in Python, JavaScript, TypeScript, Go, Rust and Java, compared side by side.
The function — quicksort, idiomatic per language
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    mid = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + mid + quicksort(right)

Python shown; all six implementations sort identically and follow each language's normal formatting conventions (gofmt tabs, rustfmt, 4-space Python).
| Language | Tokens | Chars per token | Cost (USD) |
|---|---|---|---|
| Python | 89 | 3.00 | $0.0003 |
| JavaScript | 103 | 2.99 | $0.0003 |
| Go | 121 | 3.00 | $0.0004 |
| TypeScript | 122 | 3.00 | $0.0004 |
| Rust | 143 | 3.01 | $0.0004 |
| Java | 158 | 3.00 | $0.0005 |
Exact o200k_base counts, tokenized locally. The cost column prices each snippet as fresh input on Claude Sonnet 4.5. Click any column header to sort.
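The cost column is plain arithmetic: token count times the per-token input rate. The sketch below reproduces it under one loud assumption: an input rate of $3.00 per million tokens, which this page does not state and which should be checked against current pricing. The helper name `input_cost` is hypothetical.

```python
# ASSUMPTION: input price in USD per million tokens. The page does not
# state the rate it uses; check current pricing before relying on this.
ASSUMED_USD_PER_MILLION_INPUT_TOKENS = 3.00

# Token counts copied from the table above.
TOKEN_COUNTS = {
    "Python": 89,
    "JavaScript": 103,
    "Go": 121,
    "TypeScript": 122,
    "Rust": 143,
    "Java": 158,
}


def input_cost(tokens: int,
               usd_per_million: float = ASSUMED_USD_PER_MILLION_INPUT_TOKENS) -> float:
    """Return the USD cost of sending `tokens` as fresh (uncached) input."""
    return tokens * usd_per_million / 1_000_000


if __name__ == "__main__":
    for language, tokens in TOKEN_COUNTS.items():
        print(f"{language:<11} {tokens:>4} tokens  ${input_cost(tokens):.4f}")
```

Rounded to four decimals, the assumed rate gives the same figures as the table, which is why Go, TypeScript and Rust all display $0.0004 despite a 22-token gap between them.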
How it works
When an agent reads your codebase, it pays by the token — and languages are not priced equally. The same algorithm needs different amounts of syntax in Python, JavaScript, TypeScript, Go, Rust and Java, and every brace, annotation and keyword is billed at the same rate as the logic itself. This analyzer makes the spread concrete with a controlled experiment: one function, six idiomatic implementations, one exact tokenizer.
The function is quicksort — short enough to read at a glance, real enough to exercise each language's normal ceremony: comprehensions in Python, arrow-function filters in JavaScript, explicit type annotations in TypeScript, range-and-switch in Go, iterator chains in Rust, streams and generics in Java. Each implementation follows its standard formatter, because tokenizers see formatting: gofmt tabs and rustfmt line breaks are part of the measurement, exactly as they would be part of your prompt.
Every snippet is tokenized with the o200k_base encoding — the one current OpenAI models bill against — running locally via js-tiktoken; nothing is uploaded. The table reports tokens, characters per token (code density), and what each snippet costs as fresh input on your selected model at verified rates. Click any column header to re-sort.
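The density figure is simply characters divided by tokens. A minimal sketch of that measurement follows, written so any tokenizer can be plugged in. The names `chars_per_token` and `standin_encode` are hypothetical, and the stand-in tokenizer is a crude regex splitter, not o200k_base, so it will not reproduce the table's counts.

```python
import re
from typing import Callable, Sequence


def chars_per_token(text: str, encode: Callable[[str], Sequence]) -> float:
    """Characters divided by tokens: the 'code density' figure."""
    tokens = encode(text)
    if not tokens:
        raise ValueError("text produced no tokens")
    return len(text) / len(tokens)


def standin_encode(text: str) -> list[str]:
    """Hypothetical stand-in tokenizer: words, single symbols, whitespace runs.

    This is NOT o200k_base; it exists only so the sketch runs without
    third-party packages.
    """
    return re.findall(r"\w+|[^\w\s]|\s+", text)


if __name__ == "__main__":
    snippet = "pivot = arr[len(arr) // 2]"
    tokens = standin_encode(snippet)
    print(len(tokens), round(chars_per_token(snippet, standin_encode), 2))
```

To measure with the real encoding in Python, one option is the tiktoken package: pass `tiktoken.get_encoding("o200k_base").encode` as the `encode` argument. That is a Python counterpart to the js-tiktoken setup described above, not the page's own code.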
What the numbers teach: verbosity rankings are roughly what intuition predicts (Python lean, Java heavy), but the magnitude surprises people — the spread between leanest and heaviest routinely exceeds fifty percent for identical logic. Density hovers near three characters per token for all six, well below prose's four, which is why code-heavy prompts blow through budgets faster than document-heavy ones of the same character length.
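Both claims can be checked with two lines of arithmetic. The sketch below uses the leanest and heaviest counts from the table (89 and 158) and the rounded densities quoted above (three characters per token for code, four for prose); the function names are hypothetical.

```python
def relative_spread(leanest: int, heaviest: int) -> float:
    """How much larger the heaviest count is than the leanest, as a fraction."""
    return (heaviest - leanest) / leanest


def extra_tokens_vs_prose(code_cpt: float = 3.0, prose_cpt: float = 4.0) -> float:
    """Extra tokens code needs over prose of the same character length.

    For N characters, code uses N / code_cpt tokens and prose N / prose_cpt,
    so the ratio is prose_cpt / code_cpt regardless of N.
    """
    return prose_cpt / code_cpt - 1


if __name__ == "__main__":
    print(f"Java vs Python: {relative_spread(89, 158):.1%} more tokens")  # 77.5%
    print(f"Code vs prose:  {extra_tokens_vs_prose():.1%} more tokens")   # 33.3%
```

For this one function, Java costs roughly three quarters more than Python, and at these rounded densities any code prompt costs about a third more than prose of equal length.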
Honest limits: one small function is a sample, not a corpus study, and your codebase's ratio depends on its mix of comments, identifiers and boilerplate. For measuring your actual files, paste them into the Token Counter; for what those tokens cost across a whole agent session rather than a single read, the Agent Session Cost Estimator models the loop end to end.
Frequently asked questions
Why do programming languages differ in token count for the same logic?
Syntax ceremony. Braces, semicolons, type annotations, visibility keywords and import boilerplate all tokenize, and languages distribute them very differently. Python's significant whitespace and comprehensions express the algorithm in the fewest symbols; Java's stream pipelines and generic type parameters spend tokens on machinery the algorithm itself doesn't need.
Does this mean I should feed agents Python instead of Java?
No — you feed agents the language your codebase is written in. The practical implication is different: token budgets for code review, refactoring and codebase-question tasks scale with your language's verbosity. A Java monorepo consumes meaningfully more context window per file than the equivalent Go or Python, which affects how many files fit in a prompt and what each agent turn costs.
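To make the files-per-prompt effect concrete, here is a rough sketch built entirely on labeled assumptions: a hypothetical 200,000-token context budget, a hypothetical 1,500-token Python file as the baseline, and the table's token counts reused as verbosity ratios. A single quicksort is a thin basis for such ratios, so treat the output as illustrative only.

```python
import math

# Token counts for the same quicksort, copied from the table above.
TOKEN_COUNTS = {"Python": 89, "Go": 121, "Java": 158}

# ASSUMPTIONS, not measurements: pick values that match your own setup.
ASSUMED_CONTEXT_BUDGET = 200_000       # tokens available for file contents
ASSUMED_PYTHON_FILE_TOKENS = 1_500     # size of a typical file if written in Python


def files_that_fit(language: str,
                   budget: int = ASSUMED_CONTEXT_BUDGET,
                   baseline_file_tokens: int = ASSUMED_PYTHON_FILE_TOKENS) -> int:
    """Estimate how many equivalent files fit in the budget for `language`.

    Scales the Python baseline by the language's token ratio from the table.
    """
    ratio = TOKEN_COUNTS[language] / TOKEN_COUNTS["Python"]
    return math.floor(budget / (baseline_file_tokens * ratio))


if __name__ == "__main__":
    for language in TOKEN_COUNTS:
        print(f"{language:<7} {files_that_fit(language)} files")
```

Under these assumptions the same budget holds 133 Python files, 98 Go files, or 75 Java files of equivalent logic.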
Why is chars-per-token lower for code than for prose?
English prose averages around four characters per token because common words map to single vocabulary entries. Code fragments harder: operators, brackets, short identifiers and mixed casing produce many one-to-three character tokens. Most of the snippets here land near three characters per token, and heavily symbolic code can go lower still.
Are these counts exact, and what about Claude?
The counts are exact o200k_base token counts — the encoding current OpenAI models bill against — computed by js-tiktoken entirely in your browser. Anthropic has not published Claude's tokenizer, so Claude-priced rows apply Claude rates to the o200k counts. Cross-language rankings are stable across tokenizers even where absolute counts drift a few percent.
Do formatting conventions change the results?
Yes, measurably. Each snippet follows its language's standard formatter — gofmt's tabs, rustfmt, 4-space Python, 2-space JavaScript — because that is what real code looks like when an agent reads it. Tabs tokenize differently from runs of spaces, and minifying code would shrink counts at the price of being unrepresentative. We compare code as it actually ships.
Built by FORG — AI cost observability for agentic coding. Free tools, no signup, nothing leaves your browser.