HTTP 429 Too Many Requests
You exceeded a rate limit — requests per minute, tokens per minute, or concurrent connections.
In AI APIs specifically
This is the most common AI-API error at scale. Anthropic returns the number of seconds to wait in the Retry-After header and enforces separate limits for requests per minute (RPM), input tokens per minute (ITPM), and output tokens per minute (OTPM). OpenAI returns Retry-After plus x-ratelimit-remaining-* headers for both requests and tokens. Google returns RESOURCE_EXHAUSTED quota errors. Always honor Retry-After rather than guessing.
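HTTP allows Retry-After to carry either a number of seconds or an HTTP-date, so a client that only handles the numeric form can silently ignore a valid header. The sketch below shows one way to normalize both forms to milliseconds. The helper name parseRetryAfterMs and its null-on-failure convention are assumptions made for illustration, not part of any provider's SDK.

```typescript
// Hypothetical helper: converts a Retry-After header value into milliseconds.
// Returns null when the header is absent or cannot be parsed.
function parseRetryAfterMs(value: string | null, now: number = Date.now()): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  if (trimmed === "") return null;

  // Form 1: delay in whole seconds, for example "30".
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;

  // Form 2: HTTP-date, for example "Wed, 21 Oct 2015 07:28:00 GMT".
  const when = Date.parse(trimmed);
  if (Number.isNaN(when)) return null;

  // A date in the past means the client may retry immediately.
  return Math.max(0, when - now);
}
```

A caller would typically pass res.headers.get("retry-after") and fall back to its own backoff when the result is null.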
Fix checklist
- Honor the Retry-After header exactly — don't invent your own delay when one is given.
- Implement exponential backoff with jitter for the no-header case.
- Spread bursty workloads with a client-side token bucket.
- Request a tier upgrade if you hit limits at steady state.
- Cache repeated prompt prefixes to cut token throughput.
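The client-side token bucket mentioned in the checklist can be sketched as follows. This is a minimal illustration, not a production limiter: the class name TokenBucket, its method names, and the capacity and refill values are all hypothetical, and the right numbers depend on the limits your provider actually assigns you.

```typescript
// Hypothetical client-side token bucket for smoothing bursty request traffic.
class TokenBucket {
  private readonly capacity: number;     // maximum burst size
  private readonly refillPerSec: number; // sustained rate, in tokens per second
  private tokens: number;                // tokens currently available
  private last: number;                  // timestamp (ms) of the last refill

  constructor(capacity: number, refillPerSec: number, now: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.last = now;
  }

  // Adds the tokens accrued since the last refill, capped at capacity.
  private refill(now: number): void {
    if (now <= this.last) return;
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = now;
  }

  // Takes `cost` tokens if available; otherwise takes none and returns false.
  tryTake(cost: number = 1, now: number = Date.now()): boolean {
    this.refill(now);
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }

  // Milliseconds until `cost` tokens will be available (0 if available now).
  msUntilAvailable(cost: number = 1, now: number = Date.now()): number {
    this.refill(now);
    const deficit = cost - this.tokens;
    return deficit <= 0 ? 0 : Math.ceil((deficit / this.refillPerSec) * 1000);
  }
}
```

Before each request, a caller could sleep for msUntilAvailable() and then call tryTake(). The same structure works for token-per-minute limits if cost is set to the estimated token count of the request.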
Retry handler (TypeScript)
async function fetchWithRetry(url: string, init: RequestInit, maxRetries = 5) {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    // 429 is retryable: back off and try again until retries run out.
    if (res.status !== 429 || attempt >= maxRetries) return res;
    // A missing or non-numeric header falls through to the backoff branch.
    const retryAfter = Number(res.headers.get("retry-after"));
    const delay = Number.isFinite(retryAfter) && retryAfter > 0
      ? retryAfter * 1000
      : Math.min(60_000, 1000 * 2 ** attempt) * (0.5 + Math.random()); // expo backoff + jitter
    await new Promise((r) => setTimeout(r, delay));
  }
}
Spec: RFC reference
Related status codes
400 Bad Request: The server cannot process the request due to a client error, such as malformed JSON, invalid parameters, or schema violations.
401 Unauthorized: Authentication is missing or invalid for this request.
402 Payment Required: The request requires payment; billing is not set up or a balance is exhausted.
403 Forbidden: The server understood the request but refuses to authorize it; the credentials are valid but lack permission.