Question 1

What is context rot?

Accepted Answer

Context rot is the consistent degradation of a model's ability to find and use information as its context window fills. A fact that a model retrieves perfectly from a 5k-token context can be missed, garbled, or half-remembered when the same fact sits inside 150k tokens of other material. It is not a bug in any one model: published long-context evaluations consistently show the effect to some degree, varying mainly in how late the decline starts and how steep it is.

Question 2

What does 'lost in the middle' mean?

Accepted Answer

Liu et al. (2023), in "Lost in the Middle," showed that retrieval accuracy depends on where information sits: models recall facts placed at the very beginning or very end of the context far better than facts buried in the middle. The resulting U-shaped curve has been replicated broadly. Practically, this is why agent harnesses pin important instructions at the start and keep recent conversation at the end, and why the needle-position selector in this tool changes the answer so much.
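To make the position effect concrete, here is a minimal sketch of how a needle-in-a-haystack probe can be built with the needle at a chosen relative depth. The function name `place_needle`, the filler sentences, and the needle text are all illustrative assumptions, not part of this tool or of any published benchmark harness.

```python
def place_needle(filler: list[str], needle: str, depth: float) -> str:
    """Return the filler text with `needle` inserted at a relative depth.

    depth=0.0 puts the needle first, 1.0 puts it last, 0.5 in the middle.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler))
    return " ".join(filler[:index] + [needle] + filler[index:])


# Hypothetical filler and needle; a real probe would use far more text.
filler = [f"Filler sentence number {i}." for i in range(10)]
needle = "The vault code is 4417."

# Same content at three positions; only the needle's location changes.
probes = {depth: place_needle(filler, needle, depth) for depth in (0.0, 0.5, 1.0)}
```

Sending each probe to a model with the same question ("What is the vault code?") and comparing the answers across depths is the basic shape of the experiment; the middle probe is the one most likely to fail as the filler grows.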

Question 3

How do I mitigate context rot?

Accepted Answer

Keep contexts lean rather than relying on the advertised window. The standard toolkit: compaction (summarize old history instead of carrying it verbatim), retrieval (store reference material outside the context and load only relevant chunks per query), positional discipline (critical instructions first, fresh data last), and session resets at task boundaries. The Context Window Visualizer helps budget what stays in.
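As a rough illustration of compaction combined with positional discipline, the sketch below assembles a context under a token budget. Everything in it is an assumption made for the example: `assemble_context`, `count_tokens`, and `naive_summary` are hypothetical names, token counts are approximated by whitespace word counts, and the summarizer is a placeholder where a real harness would call a model.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    # Assumption: whitespace word count as a crude stand-in for a tokenizer.
    return len(text.split())


def assemble_context(
    instructions: str,
    history: list[str],
    fresh_data: str,
    budget: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Pin instructions first, put fresh data last, compact old history."""
    remaining = budget - count_tokens(instructions) - count_tokens(fresh_data)
    kept: list[str] = []
    # Walk history newest-first, keeping turns verbatim while they fit.
    for turn in reversed(history):
        cost = count_tokens(turn)
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    kept.reverse()
    older = history[: len(history) - len(kept)]
    parts = [instructions]
    if older:
        # Compaction: turns that did not fit are replaced by one summary.
        # Note: the summary's own size is not counted against the budget here.
        parts.append(summarize(older))
    parts.extend(kept)
    parts.append(fresh_data)
    return parts


def naive_summary(turns: list[str]) -> str:
    # Placeholder summarizer; a real harness would call a model here.
    return f"[summary of {len(turns)} earlier turns]"


context = assemble_context(
    "Answer in French.",
    ["turn one text", "turn two text", "turn three text"],
    "User: latest question",
    budget=12,
    summarize=naive_summary,
)
```

With a budget of 12 word-tokens, the two most recent turns survive verbatim and the oldest is folded into the summary, while the instructions and the latest question keep their places at the start and end.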

Question 4

Do all models rot at the same rate?

Accepted Answer

No, and the differences are big enough to matter for routing decisions. Frontier models tend to hold accuracy deep into their windows; smaller and cheaper tiers degrade earlier and more steeply. Models engineered specifically for long context can outperform their general capability tier on retrieval tasks. The class overlay in this tool shows the spread at any fill level, but always check current published benchmarks for the specific models you run.

Question 5

How literally should I take these numbers?

Accepted Answer

As shapes, not measurements. The curves are smoothed class-level composites derived from published needle-in-a-haystack-style research (Lost in the Middle, RULER, community NIAH runs) and are explicitly labeled illustrative in the tool. Real accuracy depends on your task: single-fact retrieval is the easy case, and multi-hop reasoning over long context degrades sooner and more sharply than any of these curves show.

Context Rot Simulator

Related tools

Context Window Visualizer

Context Window Comparison

Agent Session Cost Estimator

Model Capability Picker