Question 1

Why do I need jitter at all?

Accepted Answer

Without jitter, every client that failed at the same moment also retries at the same moment, because the delay (base delay times the multiplier raised to the attempt number) is deterministic. A transient outage then turns into synchronized retry waves that keep knocking the service over: the classic thundering herd. Full jitter picks a uniform random delay between zero and the computed backoff, which spreads the herd across the whole window. AWS's published analysis found that full jitter dramatically reduces total work and contention compared to no jitter.
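The effect can be seen in a small simulation. This is a minimal sketch, assuming 1,000 hypothetical clients that all fail at t=0, a 1-second base delay and a multiplier of 2; those values and all function names are illustrative, not taken from the tool.

```python
import random

BASE_DELAY = 1.0   # seconds; illustrative value
MULTIPLIER = 2.0   # illustrative value


def backoff(attempt: int) -> float:
    """Deterministic exponential backoff for a 0-indexed retry attempt."""
    return BASE_DELAY * MULTIPLIER ** attempt


def retry_times(clients: int, attempt: int, jitter: bool, rng: random.Random) -> list[float]:
    """Delay each client waits before the given retry, all having failed at t=0."""
    window = backoff(attempt)
    if jitter:
        # Full jitter: uniform random delay between zero and the computed backoff.
        return [rng.uniform(0.0, window) for _ in range(clients)]
    return [window] * clients


def peak_per_bucket(times: list[float], bucket: float = 0.1) -> int:
    """Largest number of retries landing in any single time bucket."""
    counts: dict[int, int] = {}
    for t in times:
        key = int(t / bucket)
        counts[key] = counts.get(key, 0) + 1
    return max(counts.values())


rng = random.Random(42)
no_jitter = retry_times(1000, attempt=2, jitter=False, rng=rng)
full_jitter = retry_times(1000, attempt=2, jitter=True, rng=rng)

print(peak_per_bucket(no_jitter))    # every client lands in the same bucket
print(peak_per_bucket(full_jitter))  # retries are spread over the 4 s window
```

Without jitter the peak equals the number of clients; with full jitter the same retries are spread across the backoff window, so the service sees a trickle rather than a wave.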

Question 2

What is the difference between full and equal jitter?

Accepted Answer

Full jitter randomizes the entire delay: sleep a uniform random amount between 0 and the exponential backoff value, which gives maximum spread but means some retries fire almost immediately. Equal jitter keeps half the backoff as a guaranteed floor and randomizes the other half — sleep backoff/2 plus a random amount up to backoff/2 — trading a little spread for a guaranteed minimum pause. Full jitter is the usual default; equal jitter suits services that genuinely need breathing room after every failure.
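The two modes differ only in where the random range starts. A minimal sketch of both, assuming a 1-second base, a multiplier of 2 and a 30-second cap (illustrative defaults; the function names are hypothetical):

```python
import random


def exponential_backoff(attempt: int, base: float = 1.0,
                        multiplier: float = 2.0, cap: float = 30.0) -> float:
    """Capped exponential backoff for a 0-indexed retry attempt."""
    return min(cap, base * multiplier ** attempt)


def full_jitter(backoff: float, rng: random.Random) -> float:
    """Uniform random delay between 0 and the backoff value."""
    return rng.uniform(0.0, backoff)


def equal_jitter(backoff: float, rng: random.Random) -> float:
    """Half the backoff as a fixed floor, plus a random amount up to the other half."""
    half = backoff / 2.0
    return half + rng.uniform(0.0, half)


rng = random.Random(7)
window = exponential_backoff(attempt=3)  # 8.0 seconds with the defaults above
print(full_jitter(window, rng))   # somewhere in [0, 8]
print(equal_jitter(window, rng))  # somewhere in [4, 8]
```

For the same backoff value, full jitter can return a delay close to zero, while equal jitter never returns less than half the backoff.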

Question 3

Should I honor the Retry-After header?

Accepted Answer

Yes, whenever the provider sends one. A 429 or 529 with Retry-After is the server telling you when it expects to accept requests again; retrying earlier will almost certainly be rejected, wastes spend, and may extend your rate limiting, whereas your computed backoff is only a guess. The correct policy is max(your backoff, Retry-After). The designer's toggle reflects this: when honored, the header overrides shorter computed delays.
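The max(your backoff, Retry-After) policy can be sketched as follows. This is an illustration, not the tool's exported code: the function names are hypothetical, and it assumes the header arrives in one of the two forms HTTP defines for Retry-After, a whole number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Return the Retry-After delay in seconds, or None if absent or unparseable."""
    if value is None:
        return None
    value = value.strip()
    try:
        # Form 1: delay in whole seconds, e.g. "120".
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    try:
        # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # assumption: treat naive dates as UTC
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def effective_delay(computed_backoff: float, retry_after: Optional[str],
                    honor: bool = True) -> float:
    """Apply max(backoff, Retry-After) when the header is present and honored."""
    server_delay = parse_retry_after(retry_after) if honor else None
    if server_delay is None:
        return computed_backoff
    return max(computed_backoff, server_delay)
```

With a computed backoff of 2 seconds and a header value of "10", effective_delay returns 10.0; with a computed backoff of 20 seconds it returns 20.0, because the header only overrides shorter delays.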

Question 4

How many retries should an LLM call get?

Accepted Answer

Fewer than you think. Each retried call re-sends the full prompt, so retries on a 50k-token agent context are expensive; the cost panel in this tool makes that concrete. Three to five attempts with a max delay cap of around 30-60 seconds covers virtually all transient 429/529/timeout blips; failures beyond that usually indicate a real outage where retrying burns money without succeeding. Once the attempts are exhausted, fail fast and surface the error to something that can make a smarter decision.
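To make the trade-off concrete, here is a rough worst-case calculation. It is a sketch under stated assumptions: every attempt re-sends the full prompt, delays follow a 1-second base with a multiplier of 2 and a 30-second cap, and jitter is ignored (so the wait is an upper bound under full jitter). The function name and parameter values are illustrative.

```python
def worst_case_budget(prompt_tokens: int, attempts: int, base_delay: float = 1.0,
                      multiplier: float = 2.0, max_delay: float = 30.0) -> tuple[int, float]:
    """Return (input tokens sent, seconds spent waiting) if every attempt fails."""
    tokens_sent = prompt_tokens * attempts
    # One wait between each pair of consecutive attempts.
    waits = [min(max_delay, base_delay * multiplier ** i) for i in range(attempts - 1)]
    return tokens_sent, sum(waits)


for attempts in (3, 5, 10):
    tokens, wait = worst_case_budget(50_000, attempts)
    print(f"{attempts} attempts: up to {tokens:,} input tokens, up to {wait:.0f} s waiting")
```

Under these assumptions, five attempts on a 50k-token prompt send up to 250,000 input tokens and wait up to 15 seconds, while ten attempts double the tokens and wait roughly ten times as long.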

Question 5

What do the exported snippets contain?

Accepted Answer

A self-contained retry function in TypeScript or Python implementing exactly the policy you configured: base delay, multiplier, max retries, cap, your chosen jitter mode and Retry-After handling. There are no dependencies beyond the standard library — no axios-retry or tenacity required — so you can paste it into any codebase and adapt the error-classification predicate to your SDK's exception types.
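For orientation, a function of this kind might look like the sketch below. It is not the tool's actual export: TransientError, is_retryable and retry_call are hypothetical names, the defaults are illustrative, and the sleep parameter is added here only so the function can be exercised without real waiting.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for an SDK's retryable exception (429/529/timeout)."""

    def __init__(self, message: str = "", retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after  # seconds, if the server sent Retry-After


def is_retryable(exc: Exception) -> bool:
    """Error-classification predicate; adapt to your SDK's exception types."""
    return isinstance(exc, TransientError)


def retry_call(
    fn: Callable[[], T],
    *,
    base_delay: float = 1.0,
    multiplier: float = 2.0,
    max_retries: int = 4,
    max_delay: float = 30.0,
    jitter: str = "full",  # "none", "full", or "equal"
    honor_retry_after: bool = True,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable failures with capped exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on non-retryable errors or once retries are exhausted.
            if not is_retryable(exc) or attempt == max_retries:
                raise
            backoff = min(max_delay, base_delay * multiplier ** attempt)
            if jitter == "full":
                delay = random.uniform(0.0, backoff)
            elif jitter == "equal":
                delay = backoff / 2 + random.uniform(0.0, backoff / 2)
            else:
                delay = backoff
            retry_after = getattr(exc, "retry_after", None)
            if honor_retry_after and retry_after is not None:
                delay = max(delay, retry_after)  # header overrides shorter delays
            sleep(delay)
    raise AssertionError("unreachable")
```

A call such as retry_call(lambda: client.send(prompt), max_retries=3) would wrap a single request, where client.send is a placeholder for whatever your SDK exposes; the predicate is the part most likely to need adapting.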

Retry Backoff Designer
