Question 1

Why are markdown tables so token-expensive?

Accepted Answer

Pipes, runs of dashes and padding spaces all cost tokens, and the separator row under the header is pure syntax that carries no information. A modest five-column table routinely costs three to four times the tokens of the same facts written as a compact list. If a doc is destined for an LLM context window, converting decorative tables to key-value lists is usually the single biggest saving the heatmap will reveal.
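To make the table-to-list conversion concrete, here is a minimal sketch in Python using only the standard library. The function name `table_to_list` and the sample rows are invented for illustration, and it compares character counts rather than real token counts, so treat the numbers as a rough signal only.

```python
def table_to_list(table_md: str) -> str:
    """Convert a simple pipe table into one 'key: value' bullet per row.

    Assumes a header row, a separator row, and no escaped pipes inside cells.
    """
    rows = []
    for line in table_md.strip().splitlines():
        # Drop the outer pipes, then split on the inner ones and trim padding.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        rows.append(cells)
    header, body = rows[0], rows[2:]  # rows[1] is the separator row
    out = []
    for row in body:
        pairs = [f"{key}: {value}" for key, value in zip(header, row)]
        out.append("- " + "; ".join(pairs))
    return "\n".join(out)


# Hypothetical sample data, not taken from any real runbook.
table = """
| Severity | Threshold   | Page? |
|----------|-------------|-------|
| P1       | 5% errors   | yes   |
| P2       | 1% errors   | no    |
"""
compact = table_to_list(table)
print(compact)
print(len(table.strip()), "chars as a table vs", len(compact), "chars as a list")
```

Repeating the keys on every row has its own cost, so for tables with many rows a "header once, values after" layout may save more than this per-row form.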

Question 2

How much do code blocks cost?

Accepted Answer

Code runs about 3 characters per token versus 4 for prose, so a code block costs roughly a third more per character than the paragraphs around it. Indentation adds to this, because each line's leading whitespace is tokenized too, and long command output pasted into fenced blocks is a classic hidden cost. Keep examples minimal: one representative invocation beats five variations.
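As a rough illustration of the indentation point, the sketch below removes common leading whitespace with `textwrap.dedent` and compares a character-based estimate before and after. The helper `rough_code_tokens` is hypothetical and simply applies the 3-characters-per-token figure from the answer above; a real tokenizer may merge runs of spaces, so the actual saving can be smaller.

```python
import math
import textwrap

CHARS_PER_TOKEN_CODE = 3  # rough figure quoted in the answer, not a measurement


def rough_code_tokens(code: str) -> int:
    """Character-based estimate only; a real tokenizer will differ."""
    return math.ceil(len(code) / CHARS_PER_TOKEN_CODE)


# A snippet copied out of a deeply nested function keeps its indentation.
snippet = """\
        if deploy_ok:
            notify("done")
"""
trimmed = textwrap.dedent(snippet)  # strips the whitespace common to all lines

print(trimmed)
print(rough_code_tokens(snippet), "->", rough_code_tokens(trimmed), "estimated tokens")
```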

Question 3

Do headings add meaningful overhead?

Accepted Answer

Individually no — a heading is a handful of tokens. But headings earn their cost: they give models structural anchors that measurably improve instruction-following and retrieval within long documents. The optimization rule is to cut decoration (bold, italics, horizontal rules) and keep structure (headings, lists). The heatmap shows headings as cool blocks for exactly this reason.
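One way to apply the "cut decoration, keep structure" rule is a small regex pass like the sketch below. It is an assumption-laden illustration, not how the heatmap works: `strip_decoration` is a made-up name, it only handles asterisk-style bold and italics, it does not skip fenced code blocks, and it would also remove a `---` used as a setext heading underline.

```python
import re

# A line made only of three or more '-', '*' or '_' (optionally spaced).
HORIZONTAL_RULE = re.compile(r"\s*([-*_])(\s*\1){2,}\s*")
BOLD = re.compile(r"\*\*(.+?)\*\*")
# Single-asterisk emphasis; the lookarounds leave '* ' list bullets alone.
ITALIC = re.compile(r"(?<![\w*])\*(?!\s)(.+?)(?<!\s)\*(?![\w*])")


def strip_decoration(md: str) -> str:
    """Remove bold, italics and horizontal rules; keep headings and lists."""
    kept = []
    for line in md.splitlines():
        if HORIZONTAL_RULE.fullmatch(line):
            continue  # drop the rule entirely
        line = BOLD.sub(r"\1", line)
        line = ITALIC.sub(r"\1", line)
        kept.append(line)
    return "\n".join(kept)


# Hypothetical sample document.
sample = """\
# Service Runbook

**Deploys** go through *CI* on merge.

---

- Restart the worker if the **queue** stalls
* second bullet
"""
print(strip_decoration(sample))
```

Headings and both bullet styles pass through untouched, which matches the advice to keep the structural anchors.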

Question 4

How should I optimize docs that get fed to LLMs?

Accepted Answer

Attack the heatmap in rank order. Convert wide tables to lists, trim code blocks to the minimum runnable example, delete sections the model never needs (changelogs, badges, contributor lists), and collapse repeated boilerplate. A CLAUDE.md or README that ships with every agent session is re-billed per call, so a one-time 40% trim compounds into real money — the cost strip above prices each send.
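The compounding effect is simple arithmetic, sketched below. Every number in the example call is hypothetical (document size, call volume and price per million input tokens are all assumptions), and the function ignores prompt-caching discounts that some providers offer.

```python
def monthly_savings_usd(
    doc_tokens: int,
    trim_fraction: float,
    calls_per_month: int,
    usd_per_million_input_tokens: float,
) -> float:
    """Estimate what a one-time trim saves when the doc is re-sent on every call."""
    saved_tokens_per_call = doc_tokens * trim_fraction
    saved_tokens_per_month = saved_tokens_per_call * calls_per_month
    return saved_tokens_per_month * usd_per_million_input_tokens / 1_000_000


# Hypothetical inputs: a 5,000-token CLAUDE.md, a 40% trim,
# 20,000 agent calls a month, and $3 per million input tokens.
savings = monthly_savings_usd(5_000, 0.40, 20_000, 3.0)
print(f"about ${savings:.0f} saved per month")
```

With these made-up inputs the trim removes 2,000 tokens per call, which comes to roughly $120 a month; substitute your own volume and pricing.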

Question 5

Are the token counts exact?

Accepted Answer

Counts use the o200k_base tokenizer running locally in your browser. They are exact for OpenAI models that use that encoding, such as GPT-4o, and a close proxy for Claude, whose tokenizer is not public. If the tokenizer fails to load, the tool falls back to a documented characters ÷ 3.6 estimate and says so. Your markdown is never uploaded; the entire analysis is client-side.
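The fallback behaviour described above could look something like the following sketch. It is not the tool's actual code: the name `count_tokens`, the `(count, is_exact)` return shape and the choice to round up are all assumptions. The `encode` argument is any callable that turns text into a token sequence; with the `tiktoken` package installed, `tiktoken.get_encoding("o200k_base").encode` would be one such callable.

```python
import math
from typing import Callable, Optional, Sequence, Tuple

FALLBACK_CHARS_PER_TOKEN = 3.6  # the documented fallback ratio


def count_tokens(
    text: str,
    encode: Optional[Callable[[str], Sequence[int]]] = None,
) -> Tuple[int, bool]:
    """Return (token_count, is_exact).

    Uses the supplied tokenizer when it works; otherwise falls back to
    characters / 3.6, rounded up (the rounding is an assumption).
    """
    if encode is not None:
        try:
            return len(encode(text)), True
        except Exception:
            pass  # tokenizer failed; fall through to the estimate
    return math.ceil(len(text) / FALLBACK_CHARS_PER_TOKEN), False


def broken_tokenizer(text: str) -> Sequence[int]:
    """Stand-in for a tokenizer that failed to load."""
    raise RuntimeError("tokenizer unavailable")


count, exact = count_tokens("abcdefghij", broken_tokenizer)
print(count, "tokens", "(exact)" if exact else "(estimated)")
```

Returning the flag alongside the count lets the caller tell the user when a number is an estimate, which is the "says so" part of the answer.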

Block type	Preview	Tokens	Share
Table	| Severity | Threshold | Page? | Own…	89	37%
List	- Stripe webhook timeouts: check the…	51	21%
Code block	```bash git tag release-$(date +%Y%m…	30	12%
Paragraph	Deploys go through CI on merge to ma…	21	9%
Paragraph	Quick reference for the payments ser…	19	8%
Paragraph	Contact #payments-oncall for anythin…	15	6%
Heading	# Service Runbook…	5	2%
Heading	## Common failures…	5	2%

Markdown Token Heatmap
