Question 1

What is a token?

Accepted Answer

Tokens are the chunks a language model actually reads, usually word fragments. For example, 'tokenizer' might split into 'token' + 'izer'. In English prose, 1 token ≈ 4 characters, or about 0.75 words. Code needs more tokens for the same amount of text, around 3 characters per token, because of symbols and short identifiers.
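As a rough illustration of those ratios, here is a minimal Python sketch. The function and constant names are made up for this example, and the output is only a back-of-the-envelope estimate; real counts depend on the text and the tokenizer.

```python
# Rule-of-thumb ratios quoted in the answer above (approximations only).
CHARS_PER_TOKEN = {"prose": 4.0, "code": 3.0}
WORDS_PER_TOKEN = 0.75  # applies to English prose only


def estimate_tokens_from_chars(n_chars: int, kind: str = "prose") -> int:
    """Estimate tokens from a character count for 'prose' or 'code'."""
    return round(n_chars / CHARS_PER_TOKEN[kind])


def estimate_tokens_from_words(n_words: int) -> int:
    """Estimate tokens from an English word count."""
    return round(n_words / WORDS_PER_TOKEN)


# The same 1,200 characters cost more tokens as code than as prose.
print(estimate_tokens_from_chars(1200, "prose"))  # 300
print(estimate_tokens_from_chars(1200, "code"))   # 400
print(estimate_tokens_from_words(750))            # 1000
```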

Question 2

Is my text uploaded anywhere?

Accepted Answer

No. The tokenizer (js-tiktoken) runs entirely in your browser. There is no server call, no analytics on your content, and nothing stored. You can verify this in your browser's network tab: typing produces zero requests.

Question 3

Why is the Claude count an estimate?

Accepted Answer

Anthropic has not published Claude's tokenizer, so no offline tool can count Claude tokens exactly. We estimate the count as characters ÷ 3.6, which tracks observed Claude counts within a few percent for English text. The GPT count uses the real o200k_base encoding, so it is exact for models that use that encoding.
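The characters ÷ 3.6 heuristic can be sketched in a few lines of Python. This is an illustration, not the tool's actual source: the function name is hypothetical, rounding up is an assumption (the answer above does not say how fractions are handled), and Python's len() counts Unicode code points, which may differ from how a browser counts characters.

```python
import math

CLAUDE_CHARS_PER_TOKEN = 3.6  # divisor quoted in the answer above


def estimate_claude_tokens(text: str) -> int:
    """Heuristic estimate only; it does not run a real tokenizer."""
    # Assumption: round up, so any non-empty text counts as at least one token.
    return math.ceil(len(text) / CLAUDE_CHARS_PER_TOKEN)


sample = "Tokens are the chunks a language model actually reads."
print(len(sample), "characters ->", estimate_claude_tokens(sample), "estimated tokens")
```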

Question 4

Why do GPT and Claude token counts differ for the same text?

Accepted Answer

Each model family uses a different vocabulary, so the same text splits into a different number of tokens. For example, a text might be 1,000 tokens in GPT's o200k_base encoding and 1,080 in Claude's. The difference matters when comparing prices: a model with cheaper rates but a less efficient tokenizer can cost more per document.
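A short worked example makes the pricing point concrete. The token counts are the ones from the answer above; both per-million-token prices are invented placeholders, not real rates for any model, and the function name is hypothetical.

```python
def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of processing `tokens` tokens at a given per-million-token rate."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Same document, two tokenizers. Prices are placeholders for illustration.
cost_a = cost_usd(1_000, usd_per_million_tokens=2.50)  # efficient tokenizer, higher rate
cost_b = cost_usd(1_080, usd_per_million_tokens=2.40)  # cheaper rate, 8% more tokens

print(f"Model A: ${cost_a:.6f}")
print(f"Model B: ${cost_b:.6f}")
print("B costs more despite the lower rate:", cost_b > cost_a)

# Rate below which model B actually becomes cheaper for this document.
break_even_rate = 2.50 * 1_000 / 1_080
print(f"Break-even rate for B: ${break_even_rate:.4f} per million tokens")
```

With these placeholder numbers, model B only wins once its rate drops below roughly 92.6% of model A's (1,000 ÷ 1,080), which is why per-token prices should be compared alongside token counts.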

Question 5

How do I reduce my token count?

Accepted Answer

Strip markdown decoration, collapse whitespace, and prefer plain prose over heavily formatted tables (tables are token-expensive). Our Prompt Compressor tool shows exactly how many tokens each cleanup saves.
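The first two cleanups can be sketched with Python's standard re module. This is a simplified illustration, not the Prompt Compressor's implementation: the function names are hypothetical, the patterns only handle headings, asterisk emphasis, and backticks, and collapsing whitespace would damage indentation-sensitive code, so apply it to prose only.

```python
import re


def strip_markdown_decoration(text: str) -> str:
    """Remove heading markers, asterisk emphasis, and inline-code backticks."""
    text = re.sub(r"^#{1,6}[ \t]+", "", text, flags=re.MULTILINE)  # "## Title" -> "Title"
    text = re.sub(r"\*{1,3}([^*\n]+)\*{1,3}", r"\1", text)         # **bold** -> bold
    return text.replace("`", "")                                   # `code` -> code


def collapse_whitespace(text: str) -> str:
    """Squeeze runs of spaces/tabs and extra blank lines."""
    text = re.sub(r"[ \t]+", " ", text)     # runs of spaces or tabs -> one space
    text = re.sub(r" ?\n ?", "\n", text)    # drop spaces next to line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)  # 3+ newlines -> a single blank line
    return text.strip()


raw = "## Release   notes\n\n\n\nThis is **very**   important:\tuse `fast_mode`.\n"
cleaned = collapse_whitespace(strip_markdown_decoration(raw))

print(repr(cleaned))
print(f"{len(raw)} characters before, {len(cleaned)} after")
```

Character counts are only a proxy here; run both versions through a real tokenizer to see the actual token savings.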

Token Counter
