Markdown Token Heatmap
Render your markdown with a heatmap showing exactly which blocks burn the most tokens.
242 total tokens across 10 blocks. As input on Claude Sonnet 4.5: $0.0007 per send.
Heaviest blocks
| Block | Tokens | Share |
|---|---|---|
| Table \| Severity \| Threshold \| Page? \| Own… | 89 | 37% |
| List - Stripe webhook timeouts: check the… | 51 | 21% |
| Code block ```bash git tag release-$(date +%Y%m… | 30 | 12% |
| Paragraph Deploys go through CI on merge to ma… | 21 | 9% |
| Paragraph Quick reference for the payments ser… | 19 | 8% |
| Paragraph Contact #payments-oncall for anythin… | 15 | 6% |
| Heading # Service Runbook… | 5 | 2% |
| Heading ## Common failures… | 5 | 2% |
Heat overlay — stronger background = more tokens
# Service Runbook
Heading · 5 tok · 2.1%
Quick reference for the payments service. Keep this under two pages.
Paragraph · 19 tok · 7.9%
## Deploy
Heading · 3 tok · 1.2%
Deploys go through CI on merge to main. Manual deploys are emergencies only.
Paragraph · 21 tok · 8.7%
```bash
git tag release-$(date +%Y%m%d)
git push origin --tags
./scripts/deploy.sh --env prod --confirm
```
Code block · 30 tok · 12.4%
## Error budget
Heading · 4 tok · 1.7%
| Severity | Threshold | Page? | Owner    | Escalation        |
|----------|-----------|-------|----------|-------------------|
| SEV1     | any       | yes   | on-call  | EM within 15 min  |
| SEV2     | > 5/hour  | yes   | on-call  | EM within 1 hour  |
| SEV3     | > 50/day  | no    | triage   | weekly review     |
Table · 89 tok · 36.8%
## Common failures
Heading · 5 tok · 2.1%
- Stripe webhook timeouts: check the dead-letter queue first
- DB connection pool exhaustion: usually a leaked transaction
- Cache stampede after deploy: warm the cache before cutover
List · 51 tok · 21.1%
Contact #payments-oncall for anything not covered here.
Paragraph · 15 tok · 6.2%
How it works
Documentation that feeds language models has a cost structure invisible in any editor: two screens of markdown can differ by thousands of tokens depending on how much of it is tables, code fences and decoration. This tool splits your markdown into blocks — headings, paragraphs, lists, tables, code — tokenizes each one locally in your browser, and renders the document back with a heat overlay plus a ranking table showing exactly which blocks carry the weight.
The mechanics: a lightweight block parser walks your text line by line, grouping fenced code, contiguous table rows, list runs and paragraphs. Each block is counted with the real o200k_base encoding (lazily loaded, with a stated characters ÷ 3.6 fallback if it cannot load), and its share of the document total drives the overlay intensity. The cost strip prices one full send of the document as model input at current rates — the number that matters for anything re-sent per call, like a CLAUDE.md or system preamble.
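The mechanics above can be sketched in a few dozen lines. This is a minimal illustration, not the tool's actual parser: the names `split_blocks`, `rank` and `estimate_tokens` are hypothetical, the grouping rules are simplified (no nested lists, no indented code), and the default counter is the characters ÷ 3.6 fallback with an assumed round-up. A real tokenizer can be passed in as `count`.

```python
import math
import re
from typing import Callable, List, Optional, Tuple

Block = Tuple[str, str]  # (kind, text)

FENCE = "`" * 3  # opening/closing marker of a fenced code block
HEADING = re.compile(r"^#{1,6}\s")
LIST_ITEM = re.compile(r"^\s*([-*+]|\d+[.)])\s+")


def estimate_tokens(text: str) -> int:
    """Fallback estimate: characters / 3.6 (rounding up is an assumption)."""
    return math.ceil(len(text) / 3.6)


def classify(line: str) -> str:
    """Classify a single non-blank line that is outside a code fence."""
    if HEADING.match(line):
        return "heading"
    if line.lstrip().startswith("|"):
        return "table"
    if LIST_ITEM.match(line):
        return "list"
    return "paragraph"


def split_blocks(markdown: str) -> List[Block]:
    """Walk the text line by line, grouping lines into typed blocks."""
    blocks: List[Block] = []
    kind: Optional[str] = None
    buf: List[str] = []
    in_fence = False

    def flush() -> None:
        nonlocal kind, buf
        if buf:
            blocks.append((kind or "paragraph", "\n".join(buf)))
        kind, buf = None, []

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            if in_fence:              # closing fence ends the code block
                buf.append(line)
                flush()
                in_fence = False
            else:                     # opening fence starts a new block
                flush()
                kind, buf, in_fence = "code", [line], True
            continue
        if in_fence:                  # keep everything inside the fence
            buf.append(line)
            continue
        if not line.strip():          # blank line ends the current block
            flush()
            continue
        line_kind = classify(line)
        if line_kind == "heading":    # each heading is its own block
            flush()
            blocks.append(("heading", line))
            continue
        if line_kind != kind:         # a change of type starts a new block
            flush()
            kind = line_kind
        buf.append(line)
    flush()
    return blocks


def rank(markdown: str,
         count: Callable[[str], int] = estimate_tokens
         ) -> List[Tuple[str, int, float]]:
    """Return (kind, tokens, share of total) per block, heaviest first."""
    blocks = split_blocks(markdown)
    counts = [count(text) for _, text in blocks]
    total = sum(counts) or 1  # avoid dividing by zero on empty input
    rows = [(k, n, n / total) for (k, _), n in zip(blocks, counts)]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

The share in each row is the value that would drive the overlay intensity; calling `rank(text)` on the sample runbook yields one row per block, sorted the same way as the ranking table.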
The pattern the heatmap exposes again and again: tables dominate. Pipes, alignment dashes and padding spaces are all tokens, so a reference table can hold forty percent of a document's tokens while carrying ten percent of its information. Code blocks run second — denser per character than prose, and frequently padded with output dumps nobody trimmed. Headings and short paragraphs are nearly free, and they are also the structure models navigate by, so cutting them is false economy.
Use the ranking as a work order: rewrite the top block, re-paste, watch the total drop. The Prompt Compressor automates the mechanical cleanups (whitespace, decoration, duplicate lines) and prices the saving at your call volume. And since every agent session re-reads these files, the savings recur on every call — FORG shows you the live token flow per session, so you can verify the trim shows up in the bill, not just in this tool. Documents drift back toward bloat, so re-run the heatmap after each significant edit cycle.
Frequently asked questions
Why are markdown tables so token-expensive?
Every pipe, run of dashes and stretch of padding in a table costs tokens, and the separator row under the header is pure syntax with zero information. A modest five-column table routinely costs three to four times the tokens of the same facts written as a compact list. If a doc is destined for an LLM context window, converting decorative tables to key-value lists is usually the single biggest saving the heatmap will reveal.
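As a rough illustration of the table-to-list conversion, the sketch below rewrites a pipe table as one key-value bullet per row. The function names and the output format are hypothetical choices, and the parser assumes a simple table: header on the first line, separator on the second, no escaped pipes inside cells.

```python
from typing import List

# The error-budget table from the sample runbook.
TABLE = """\
| Severity | Threshold | Page? | Owner    | Escalation        |
|----------|-----------|-------|----------|-------------------|
| SEV1     | any       | yes   | on-call  | EM within 15 min  |
| SEV2     | > 5/hour  | yes   | on-call  | EM within 1 hour  |
| SEV3     | > 50/day  | no    | triage   | weekly review     |
"""


def parse_row(line: str) -> List[str]:
    """Split one table row into trimmed cell values."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def table_to_list(table_md: str) -> str:
    """Rewrite a simple pipe table as one key-value bullet per row."""
    lines = [ln for ln in table_md.splitlines() if ln.strip()]
    header = parse_row(lines[0])
    items = []
    for line in lines[2:]:  # lines[1] is the separator row, pure syntax
        cells = parse_row(line)
        pairs = "; ".join(f"{h} {c}" for h, c in zip(header[1:], cells[1:]))
        items.append(f"- {cells[0]}: {pairs}")
    return "\n".join(items)


print(table_to_list(TABLE).splitlines()[0])
# - SEV1: Threshold any; Page? yes; Owner on-call; Escalation EM within 15 min
```

The list drops the separator row and all alignment padding but repeats the column names on every row, so the saving depends on how wide and how padded the original table was. Re-paste the result into the heatmap to measure it rather than assuming a fixed ratio.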
How much do code blocks cost?
Code runs about 3 characters per token versus 4 for prose, so a code block costs roughly a third more per character than the paragraphs around it. Indentation multiplies this — each line's leading whitespace tokenizes — and long command output pasted into fenced blocks is a classic hidden cost. Keep examples minimal: one representative invocation beats five variations.
Do headings add meaningful overhead?
Individually no — a heading is a handful of tokens. But headings earn their cost: they give models structural anchors that measurably improve instruction-following and retrieval within long documents. The optimization rule is to cut decoration (bold, italics, horizontal rules) and keep structure (headings, lists). The heatmap shows headings as cool blocks for exactly this reason.
How should I optimize docs that get fed to LLMs?
Attack the heatmap in rank order. Convert wide tables to lists, trim code blocks to the minimum runnable example, delete sections the model never needs (changelogs, badges, contributor lists), and collapse repeated boilerplate. A CLAUDE.md or README that ships with every agent session is re-billed per call, so a one-time 40% trim compounds into real money — the cost strip above prices each send.
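The arithmetic behind the cost strip and the compounding claim can be sketched as follows. The price of $3.00 per million input tokens is an assumption (it matches the $0.0007 shown for the 242-token sample, but check current rates), and the document size and send volume in the usage lines are hypothetical.

```python
def send_cost(tokens: float, usd_per_million: float = 3.00) -> float:
    """Cost in USD of sending `tokens` once as model input.

    The default price is an assumed rate, not a quoted one.
    """
    return tokens * usd_per_million / 1_000_000


def monthly_saving(tokens_before: int,
                   trim_fraction: float,
                   sends_per_day: int,
                   usd_per_million: float = 3.00,
                   days: int = 30) -> float:
    """USD saved per month when a document re-sent on every call is trimmed."""
    saved_tokens = tokens_before * trim_fraction
    return send_cost(saved_tokens, usd_per_million) * sends_per_day * days


# The sample runbook: 242 tokens per send.
print(f"${send_cost(242):.4f} per send")  # $0.0007 per send

# Hypothetical: a 10,000-token CLAUDE.md, trimmed by 40%, sent 2,000 times a day.
print(f"${monthly_saving(10_000, 0.40, 2_000):.2f} per month")  # $720.00 per month
```

The per-send figure looks negligible in isolation; the monthly figure is what a one-time trim is worth once the file rides along with every call.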
Are the token counts exact?
Counts use the o200k_base tokenizer running locally in your browser — exact for current OpenAI models and a close proxy for Claude, whose tokenizer is not public. If the tokenizer fails to load, the tool falls back to a documented characters ÷ 3.6 estimate and says so. Your markdown is never uploaded; the entire analysis is client-side.
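To reproduce the counts outside the browser, a sketch along these lines should give comparable numbers. It uses the Python `tiktoken` package as a stand-in for whatever the tool loads client-side (an assumption; the tool's own loader is not shown here), and the helper names and the round-up in the fallback are illustrative choices.

```python
import math
from typing import Any, Optional, Tuple


def load_o200k() -> Optional[Any]:
    """Try to load the o200k_base encoding; return None if unavailable.

    Loading is deferred until this function is called, since the
    encoding data may need to be fetched the first time it is used.
    """
    try:
        import tiktoken
        return tiktoken.get_encoding("o200k_base")
    except Exception:  # missing package, no network, corrupt cache, ...
        return None


def count_tokens(text: str, encoder: Optional[Any] = None) -> Tuple[int, str]:
    """Return (token count, method used) so an estimate is never silent."""
    if encoder is not None:
        return len(encoder.encode(text)), "o200k_base"
    return math.ceil(len(text) / 3.6), "estimate (chars / 3.6)"


# Usage: load once, then count each block and report the method alongside.
# encoder = load_o200k()
# tokens, method = count_tokens("## Common failures", encoder)
```

Returning the method with the count mirrors the tool's behaviour of saying so when it falls back, which matters because the estimate can drift noticeably on tables and code.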
Built by FORG — AI cost observability for agentic coding. Free tools, no signup, nothing leaves your browser.