Question 1

What does a runaway agent loop look like on a bill?

Accepted Answer

A sharp single-day spike, often from one developer's machine. An agent gets stuck retrying the same failing approach (edit, test, fail, edit), and because every turn re-sends the accumulated context, the late turns of a stuck session cost many times the early ones. One overnight loop on a frontier model can burn a four-figure sum. Hard turn caps and same-action loop detection are the fix.
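A minimal sketch of those two guards, assuming the agent's actions can be summarized as strings. The class name `LoopGuard` and its parameters (`max_turns`, `window`, `max_repeats`) are hypothetical and chosen for illustration; real thresholds depend on the workload.

```python
from collections import deque


class LoopGuard:
    """Stops an agent session on a hard turn cap or a repeated action."""

    def __init__(self, max_turns=50, window=6, max_repeats=3):
        self.max_turns = max_turns      # hard cap on turns per session
        self.max_repeats = max_repeats  # identical actions tolerated in the window
        self.recent = deque(maxlen=window)  # sliding window of recent actions
        self.turns = 0

    def check(self, action):
        """Record one action; return a stop reason, or None to continue."""
        self.turns += 1
        if self.turns > self.max_turns:
            return "turn cap reached"
        self.recent.append(action)
        if self.recent.count(action) >= self.max_repeats:
            return "repeated action: " + action
        return None


if __name__ == "__main__":
    guard = LoopGuard(max_turns=10, window=6, max_repeats=3)
    for step in ["edit foo.py", "run tests", "edit foo.py", "run tests", "edit foo.py"]:
        reason = guard.check(step)
        if reason:
            print("stopping session:", reason)
            break
```

Calling `check` once per turn, before the next model request is sent, means a stuck session stops before it pays for another long-context call.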

Question 2

How do cache misses inflate a bill without anything breaking?

Accepted Answer

Prompt caching fails silently: if the prefix is not byte-identical between calls, or calls arrive further apart than the 5-minute TTL, every call pays full input rates instead of the ~90%-discounted cache-read rate. Nothing errors, latency barely changes, and the bill is simply 3-5× what it should be for agentic workloads. The tell is a high bill with normal usage patterns.
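The multiplier can be checked with a little arithmetic. The sketch below assumes cache reads are billed at 10% of the normal input rate (the ~90% discount above) and ignores output tokens and any cache-write surcharge. The function names are hypothetical.

```python
def input_cost_multiplier(hit_rate, cache_read_discount=0.90):
    """Input cost as a fraction of what a fully uncached workload would pay.

    hit_rate is the share of input tokens served from cache (0.0 to 1.0).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    cached_share = hit_rate * (1.0 - cache_read_discount)  # discounted tokens
    uncached_share = 1.0 - hit_rate                        # full-price tokens
    return cached_share + uncached_share


def miss_penalty(expected_hit_rate, cache_read_discount=0.90):
    """How many times larger the input bill gets if every call misses."""
    return 1.0 / input_cost_multiplier(expected_hit_rate, cache_read_discount)


if __name__ == "__main__":
    for rate in (0.8, 0.9):
        print(f"expected hit rate {rate:.0%}: {miss_penalty(rate):.2f}x on full miss")
```

Under these assumptions, a workload that should see 80% to 90% of its input tokens served from cache pays roughly 3.6× to 5.3× as much for input when the cache silently stops hitting, which is in line with the range quoted above.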

Question 3

What is a retry storm and why is it expensive?

Accepted Answer

A retry storm starts with an integration that retries failures without backoff or caps. During a provider incident, every client retries simultaneously, gets rate-limited, and retries again, multiplying traffic exactly when the provider is degraded. The expensive subtlety: requests that fail after generating partial output still bill those tokens. Bounded retries (at most 3 retries, exponential backoff with jitter) cap the damage at 4× a single call: one original attempt plus three retries.
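A minimal sketch of bounded retries, assuming the 4× ceiling means one initial call plus three retries. The helper `call_with_retries` and its parameters are hypothetical. It retries on any exception to keep the example short; a real integration would retry only on retryable errors such as rate limits or server errors.

```python
import random
import time


def call_with_retries(fn, max_retries=3, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(); on failure, retry up to max_retries times with full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # budget exhausted: surface the error instead of looping
            # Exponential backoff: the ceiling doubles each attempt, capped.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random fraction of the ceiling so clients
            # that failed together do not all retry at the same instant.
            sleep(rng() * ceiling)
```

The `sleep` and `rng` parameters exist only so the delays can be replaced in tests. With the defaults, the worst case is four calls in total, with waits of up to 1, 2, and 4 seconds between them.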

Question 4

How would alerting have caught this before the invoice?

Accepted Answer

Every cause this quiz diagnoses leaves a real-time signature: a session exceeding 3× the median cost, a cache hit rate dropping below baseline, a burst of 429-then-retry patterns, a developer's daily spend doubling. FORG watches those signals per session and per developer and alerts when they fire — so you find out about a runaway loop in minutes, not when finance forwards the invoice.
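As an illustration only, and not a description of how FORG implements it, three of those signals can be expressed as simple threshold checks. The function names, the input shapes (a mapping of session id to cost, plain numbers for spend and hit rate), and the `tolerance` default are all assumptions.

```python
from statistics import median


def expensive_sessions(session_costs, multiple=3.0):
    """Return ids of sessions costing more than `multiple` x the median cost."""
    if not session_costs:
        return []
    threshold = multiple * median(session_costs.values())
    return sorted(sid for sid, cost in session_costs.items() if cost > threshold)


def daily_spend_doubled(today, yesterday):
    """True when a developer's spend is at least double the previous day's."""
    return yesterday > 0 and today >= 2 * yesterday


def cache_hit_rate_dropped(current, baseline, tolerance=0.10):
    """True when the hit rate falls more than `tolerance` below baseline."""
    return current < baseline - tolerance


if __name__ == "__main__":
    costs = {"s1": 1.0, "s2": 1.2, "s3": 0.9, "s4": 9.0}
    print(expensive_sessions(costs))
    print(daily_spend_doubled(today=40.0, yesterday=18.0))
    print(cache_hit_rate_dropped(current=0.35, baseline=0.85))
```

Running checks like these on a short interval against live usage data is what turns an invoice surprise into an alert within minutes.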

AI Bill Diagnostic
