Question 1

What actually fills a context window?

Accepted Answer

Everything the model reads on each call: the system prompt, tool and function definitions, the full conversation history, file contents pulled into context, and retrieved documents. In agentic coding sessions the biggest line items are usually file contents and accumulated history — tool results from earlier turns get re-sent every single turn unless something compacts them away.
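As a rough illustration of how those line items add up, the sketch below tallies one call in a hypothetical agentic coding session. The component names and every token count are invented for the example; they are assumptions, not measurements.

```python
# Hypothetical token counts for a single call in an agentic coding session.
# Every number here is an illustrative assumption, not a measurement.
components = {
    "system_prompt": 2_000,
    "tool_definitions": 6_000,
    "conversation_history": 40_000,
    "file_contents": 45_000,
    "retrieved_documents": 7_000,
}


def breakdown(parts: dict) -> list:
    """Return (name, tokens, share_of_total) rows, largest component first."""
    total = sum(parts.values())
    rows = [(name, tokens, tokens / total) for name, tokens in parts.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, tokens, share in breakdown(components):
    print(f"{name:<22}{tokens:>8,}  {share:6.1%}")
```

With these made-up numbers, file contents and history together account for 85% of the call, which is the pattern the answer describes.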

Question 2

What happens when I overflow the window?

Accepted Answer

It depends on the client. Raw API calls fail with a validation error when input exceeds the window. Agent harnesses like Claude Code instead compact: they summarize or drop older history to make room, which silently loses detail. Either way, performance degrades well before the hard limit — retrieval accuracy drops as windows fill, which is exactly what our Context Rot Simulator charts.
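One way to avoid hitting the hard limit by surprise is a pre-flight check before each call. The sketch below is a minimal example under stated assumptions: `check_budget`, the `reserved_output` allowance for the reply, and the `warn_fraction` threshold are all hypothetical names and values, not part of any particular API or harness.

```python
def check_budget(input_tokens: int, window: int,
                 reserved_output: int = 4_096,
                 warn_fraction: float = 0.75) -> str:
    """Classify a planned call as "ok", "warn", or "overflow".

    reserved_output and warn_fraction are illustrative assumptions:
    the first leaves room for the reply, the second flags calls that
    are getting full even though they would still be accepted.
    """
    usable = window - reserved_output
    if input_tokens > usable:
        return "overflow"
    if input_tokens > warn_fraction * usable:
        return "warn"
    return "ok"


print(check_budget(100_000, 200_000))
print(check_budget(150_000, 200_000))
print(check_budget(196_000, 200_000))
```

The "warn" band exists because, as noted above, quality tends to degrade before the hard limit; where to set the threshold is a judgment call for your own workload.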

Question 3

How do I find the token size of each component?

Accepted Answer

Paste each piece (system prompt, a representative file, your tool JSON) into our Token Counter, which runs the real o200k tokenizer locally in your browser. As rules of thumb, English prose runs about 4 characters per token and source code about 3. JSON tool schemas are token-heavy for their size because braces, quotes, and repeated keys all consume tokens while carrying little content.
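When a real tokenizer is not at hand, the rules of thumb can be turned into a quick estimator. This is only a sketch: `estimate_tokens` and the `CHARS_PER_TOKEN` table are hypothetical names, the ratios are the approximate figures quoted above, and JSON is left out because no ratio is given for it.

```python
import math

# Approximate characters per token, taken from the rules of thumb above.
CHARS_PER_TOKEN = {"prose": 4.0, "code": 3.0}


def estimate_tokens(text: str, kind: str = "prose") -> int:
    """Rough token estimate from character count; not a real tokenizer."""
    return math.ceil(len(text) / CHARS_PER_TOKEN[kind])


print(estimate_tokens("The quick brown fox jumps over the lazy dog."))
print(estimate_tokens("for i in range(10):\n    print(i)\n", kind="code"))
```

Treat the result as a planning figure only; an actual tokenizer count can differ noticeably, especially for code with long identifiers or for non-English text.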

Question 4

What are the main compaction strategies?

Accepted Answer

Four cover most cases: summarize old turns into a short digest, truncate oversized tool results before they enter history, move reference material out of context into retrieval so only relevant chunks are loaded, and reset the session at natural task boundaries. Each trades recall for headroom; the right mix depends on whether your sessions die from history growth or from file bloat.
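Two of these strategies, truncating tool results and digesting old turns, can be sketched in a few lines. This is a simplified illustration, not how any particular harness implements compaction: the `Turn` type, both function names, and the default limits are hypothetical, and the digest is a placeholder where a real system would ask a model to write a summary.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str      # e.g. "user", "assistant", or "tool"
    content: str


def truncate_tool_result(text: str, max_chars: int = 2_000) -> str:
    """Keep the head of an oversized tool result and note how much was cut."""
    if len(text) <= max_chars:
        return text
    cut = len(text) - max_chars
    return text[:max_chars] + f"\n[truncated {cut} characters]"


def compact_history(turns: list, keep_recent: int = 6) -> list:
    """Replace all but the most recent turns with a single digest turn."""
    if len(turns) <= keep_recent:
        return turns
    split = len(turns) - keep_recent
    old, recent = turns[:split], turns[split:]
    # Placeholder digest: a real harness would summarize `old` with a model.
    digest = f"[summary of {len(old)} earlier turns]"
    return [Turn("user", digest)] + recent


history = [Turn("user", f"message {i}") for i in range(10)]
print([t.content for t in compact_history(history, keep_recent=4)])
```

Truncation acts before content enters history, while the digest acts on history that already exists, so the two can be combined; both discard detail that cannot be recovered later.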

Question 5

Does a bigger window solve this?

Accepted Answer

Partially, and at a price. A 1M-token window delays overflow, but every input token is billed on every turn, so the cost of each call grows in proportion to how full the window is. Long-context accuracy also degrades before the limit. Budgeting the window deliberately is usually cheaper and more reliable than buying a bigger one and filling it.
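The billing effect is easy to see with some arithmetic. The sketch below assumes the whole context is re-sent each turn and grows by a fixed amount per turn; the function names, the session shape, and the price of 3.00 USD per million input tokens are all hypothetical, and any prompt-caching discount is ignored.

```python
from typing import Optional


def session_input_tokens(base: int, growth_per_turn: int, turns: int,
                         cap: Optional[int] = None) -> int:
    """Total input tokens billed across a session.

    Turn k (0-based) sends base + k * growth_per_turn tokens, optionally
    capped to model a budget that compaction keeps the context under.
    """
    total = 0
    for k in range(turns):
        size = base + k * growth_per_turn
        if cap is not None:
            size = min(size, cap)
        total += size
    return total


def session_cost(total_tokens: int, usd_per_million: float) -> float:
    """Cost in USD at a flat per-million-token input price."""
    return total_tokens / 1_000_000 * usd_per_million


# Hypothetical session: 20k tokens at the start, 5k added per turn, 40 turns.
unbounded = session_input_tokens(20_000, 5_000, 40)
budgeted = session_input_tokens(20_000, 5_000, 40, cap=60_000)
print(unbounded, session_cost(unbounded, 3.0))
print(budgeted, session_cost(budgeted, 3.0))
```

In this made-up session the uncapped context bills 4.7M input tokens against 2.22M with a 60k budget, so the larger window roughly doubles the bill without the session doing any more work.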

Context Window Visualizer

Related tools

Context Window Comparison

Context Rot Simulator

Token Counter

Agent Session Cost Estimator