HTTP 429 Too Many Requests

You exceeded a rate limit — requests per minute, tokens per minute, or concurrent connections.

4xx · Client error · ✓ retryable with backoff

In AI APIs specifically

This is the most common AI-API error at scale. Anthropic returns the number of seconds to wait in the Retry-After header and enforces separate limits for requests per minute (RPM), input tokens per minute (ITPM), and output tokens per minute (OTPM). OpenAI returns Retry-After plus x-ratelimit-remaining-* headers for both requests and tokens. Google returns RESOURCE_EXHAUSTED quota errors. Always honor Retry-After rather than guessing.
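
As a rough illustration of reading these headers, the sketch below pulls the wait time and remaining quota out of a response. The names readRateLimitInfo, HeaderSource, and RateLimitInfo are hypothetical, and the full header names x-ratelimit-remaining-requests and x-ratelimit-remaining-tokens are an assumed expansion of the x-ratelimit-remaining-* pattern above; check your provider's documentation for the exact names it sends.

```typescript
// Anything with a Headers-like get() works, including fetch's res.headers.
type HeaderSource = { get(name: string): string | null };

interface RateLimitInfo {
  retryAfterMs: number | null;      // how long the server asked us to wait
  remainingRequests: number | null; // assumed header: x-ratelimit-remaining-requests
  remainingTokens: number | null;   // assumed header: x-ratelimit-remaining-tokens
}

// Parse a numeric header value; missing, empty, or non-numeric values become null.
function parseNumber(value: string | null): number | null {
  if (value === null || value.trim() === "") return null;
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

function readRateLimitInfo(headers: HeaderSource, nowMs: number = Date.now()): RateLimitInfo {
  const raw = headers.get("retry-after");
  let retryAfterMs: number | null = null;
  const seconds = parseNumber(raw);
  if (seconds !== null) {
    // Retry-After given as a number of seconds.
    retryAfterMs = Math.max(0, seconds * 1000);
  } else if (raw !== null) {
    // Retry-After may also be an HTTP-date; convert it to a delay from now.
    const date = Date.parse(raw);
    if (!Number.isNaN(date)) retryAfterMs = Math.max(0, date - nowMs);
  }
  return {
    retryAfterMs,
    remainingRequests: parseNumber(headers.get("x-ratelimit-remaining-requests")),
    remainingTokens: parseNumber(headers.get("x-ratelimit-remaining-tokens")),
  };
}
```

Logging these values on successful responses, not only on 429s, can show how close a workload runs to its limits before requests start failing.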

Fix checklist

Honor the Retry-After header exactly — don't invent your own delay when one is given.
Implement exponential backoff with jitter for the no-header case.
Spread bursty workloads with a client-side token bucket.
Request a tier upgrade if you hit limits at steady state.
Cache repeated prompt prefixes to cut token throughput.
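
The client-side token bucket from the checklist could look roughly like the sketch below. It is one possible shape, not a provider-recommended implementation: the TokenBucket class, its method names, and the capacity and refill values in the usage note are all illustrative assumptions. The clock is injectable so the refill logic can be tested without waiting.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;        // maximum burst size
  private readonly refillPerSecond: number; // steady-state rate
  private readonly now: () => number;       // clock in milliseconds

  constructor(capacity: number, refillPerSecond: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = now();
  }

  // Credit tokens for the time elapsed since the last refill, capped at capacity.
  private refill(): void {
    const t = this.now();
    const elapsedSec = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefill = t;
  }

  // Take `cost` tokens if they are available right now; never waits.
  tryTake(cost = 1): boolean {
    this.refill();
    if (cost > this.tokens) return false;
    this.tokens -= cost;
    return true;
  }

  // Milliseconds until `cost` tokens will be available (0 if available now).
  msUntilAvailable(cost = 1): number {
    this.refill();
    const deficit = cost - this.tokens;
    return deficit <= 0 ? 0 : Math.ceil((deficit / this.refillPerSecond) * 1000);
  }

  // Wait until `cost` tokens are available, then take them.
  async take(cost = 1): Promise<void> {
    if (cost > this.capacity) throw new RangeError("cost exceeds bucket capacity");
    while (!this.tryTake(cost)) {
      const waitMs = this.msUntilAvailable(cost);
      await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
    }
  }
}
```

For example, `new TokenBucket(10, 50 / 60)` would allow bursts of 10 requests while holding the long-run rate near 50 requests per minute; calling `await bucket.take()` before each request smooths the burst on the client instead of letting the server reject it. A second bucket with the estimated token count as `cost` could cover tokens-per-minute limits in the same way.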

Retry handler (TypeScript)

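async function fetchWithRetry(url: string, init: RequestInit, maxRetries = 5): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    // Anything other than 429, or the final attempt, goes back to the caller.
    if (res.status !== 429 || attempt >= maxRetries) return res;
    // Retry-After is either a number of seconds or an HTTP-date.
    const header = res.headers.get("retry-after");
    const seconds = Number(header);
    const untilDate = header ? Date.parse(header) - Date.now() : NaN;
    const delay = header && Number.isFinite(seconds) && seconds > 0
      ? seconds * 1000 // server-specified delay in seconds
      : untilDate > 0
        ? untilDate // server-specified HTTP-date, converted to a delay
        : Math.min(60_000, 1000 * 2 ** attempt) * (0.5 + Math.random()); // expo backoff + jitter
    await res.body?.cancel(); // discard the 429 body before retrying
    await new Promise((r) => setTimeout(r, delay));
  }
}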

Spec: RFC 6585, Section 4

Related status codes

400 Bad Request

The server cannot process the request due to a client error: malformed JSON, invalid parameters, or schema violations.

401 Unauthorized

Authentication is missing or invalid for this request.

402 Payment Required

The request requires payment: billing is not set up or a balance is exhausted.

403 Forbidden

The server understood the request but refuses to authorize it: the credentials are valid but lack the required permission.
