HTTP Status Code Reference

Every HTTP status code explained, with AI-API-specific notes on 429s, 529s and timeouts.

1xx Informational

2xx Success

3xx Redirection

4xx Client error

400 Bad Request

The server cannot process the request due to a client error — malformed JSON, invalid parameters, or schema violations.

AI APIs: The most common causes are max_tokens exceeding the model limit, invalid model IDs, malformed tool schemas, and context-window overflow (Anthropic returns 400 with an invalid_request_error when the prompt is too long).

401 Unauthorized

Authentication is missing or invalid for this request.

AI APIs: Anthropic expects the x-api-key header; OpenAI expects Authorization: Bearer. Mixing the two conventions is the classic cause. Revoked or rotated keys also land here.
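
The two header conventions can be sketched side by side. This is a minimal illustration, not either provider's official client: `Provider` and `authHeaders` are hypothetical names, and the `anthropic-version` value is an assumption that should be checked against the current Anthropic documentation.

```typescript
// Hypothetical helper: build request headers in each provider's auth convention.
type Provider = "anthropic" | "openai";

function authHeaders(provider: Provider, apiKey: string): Record<string, string> {
  if (provider === "anthropic") {
    return {
      // Anthropic reads the key from x-api-key, not from Authorization.
      "x-api-key": apiKey,
      // Assumed version string; confirm the current value in the provider docs.
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    };
  }
  return {
    // OpenAI reads the key from a standard Bearer token.
    Authorization: `Bearer ${apiKey}`,
    "content-type": "application/json",
  };
}
```

Sending the output of `authHeaders("openai", key)` to an Anthropic endpoint, or the reverse, is the mix-up described above and would be expected to produce a 401.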

402 Payment Required

The request requires payment — billing is not set up or a balance is exhausted.

AI APIs: OpenAI typically reports exhausted quota as a 429 with an insufficient_quota error code rather than a literal 402; Anthropic surfaces billing problems as 400/403 with billing messages. Check the billing console first.

403 Forbidden

The server understood the request but refuses to authorize it — valid credentials, insufficient permission.

AI APIs: Common causes: requesting a model your org has no access to, geo-blocked regions, or organization policies restricting an endpoint.

404 Not Found

The requested resource does not exist at this URL.

AI APIs: Usually a typo'd endpoint path, a deprecated/retired model ID, or a v1 path used against a different API version.

405 Method Not Allowed

The URL exists but does not support this HTTP method.

AI APIs: Typically a GET against a POST-only inference endpoint, often caused by a redirect that downgraded the method.

408 Request Timeout (retryable)

The server timed out waiting for the complete request.

AI APIs: Large request bodies (long contexts, base64 images) over slow links can trip server read timeouts.

409 Conflict

The request conflicts with the current state of the resource — e.g. concurrent modification.

AI APIs: Seen when cancelling an already-completed batch job or double-submitting with the same idempotency key but different bodies.

410 Gone

The resource existed but has been permanently removed and no forwarding address is known.

AI APIs: Deprecated API versions and expired file/batch resources return 410 — unlike 404, the provider is telling you it will never come back.

413 Content Too Large

The request body exceeds the server's size limit.

AI APIs: Hit by oversized image payloads, giant tool results, or batch files over the per-file cap. Distinct from context-window overflow, which is usually a 400.

414 URI Too Long

The request URL exceeds the server's length limit.

AI APIs: Usually caused by stuffing data into query parameters that belongs in a POST body.

415 Unsupported Media Type

The request body format is not supported — usually a missing or wrong Content-Type header.

AI APIs: Sending JSON without Content-Type: application/json, or uploading files with the wrong multipart encoding.

422 Unprocessable Content

The request was well-formed syntactically but semantically invalid.

AI APIs: Some providers use 422 instead of 400 for parameter validation failures — the error body carries the field-level details.

429 Too Many Requests (retryable)

You exceeded a rate limit — requests per minute, tokens per minute, or concurrent connections.

AI APIs: The most common AI-API error at scale. Anthropic sends the wait time in seconds in the Retry-After header and enforces separate RPM, ITPM and OTPM limits (requests, input tokens and output tokens per minute). OpenAI returns Retry-After plus x-ratelimit-remaining-* headers for both requests and tokens. Google returns RESOURCE_EXHAUSTED quota errors. Always honor Retry-After rather than guessing.
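
Honoring Retry-After starts with parsing it. The sketch below assumes the two forms HTTP allows for the header, a number of seconds or an HTTP-date; `parseRetryAfterMs` is a hypothetical helper name, and accepting fractional seconds is a tolerance added here rather than something the standard requires.

```typescript
// Convert a Retry-After header value into milliseconds to wait.
// Returns null when the header is absent or cannot be parsed.
function parseRetryAfterMs(value: string | null, nowMs: number = Date.now()): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  if (trimmed === "") return null;

  // delay-seconds form, e.g. "30" (fractional values tolerated as an assumption)
  if (/^\d+(\.\d+)?$/.test(trimmed)) {
    return Math.round(Number(trimmed) * 1000);
  }

  // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(trimmed);
  if (isNaN(dateMs)) return null;

  // A date in the past means "retry now", never a negative wait.
  return Math.max(0, dateMs - nowMs);
}
```

A caller would typically pass `response.headers.get("retry-after")` and fall back to its own backoff when the result is `null`.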

431 Request Header Fields Too Large

Headers exceed the server's size limits — often a runaway cookie or oversized auth token.

451 Unavailable For Legal Reasons

Access denied for legal reasons — censorship, sanctions or court orders.

AI APIs: Some AI providers return 451 for geo blocks in unsupported regions (others use 403); a VPN exit node in a sanctioned country can trigger this too.

499 Client Closed Request (retryable)

Non-standard (nginx): the client disconnected before the server finished responding.

AI APIs: Appears in gateway logs when an agent's HTTP timeout is shorter than model generation time — the client gave up mid-stream, but you may still be billed for generated tokens.

5xx Server error

500 Internal Server Error (retryable)

The server hit an unexpected condition. The fault is on the provider's side.

AI APIs: All providers occasionally 500 under load or on edge-case inputs. Anthropic returns api_error; OpenAI returns server_error. A reproducible 500 on a specific prompt is worth reporting.

501 Not Implemented

The server does not support the functionality required to fulfil the request.

502 Bad Gateway (retryable)

A gateway or proxy received an invalid response from the upstream server.

AI APIs: Often transient infrastructure churn at the provider's edge (deploys, scaling events). With self-hosted gateways/proxies in front of AI APIs, check your own proxy first.

503 Service Unavailable (retryable)

The server is temporarily unable to handle the request — overload or maintenance. May include Retry-After.

AI APIs: Providers return 503 during capacity crunches; Anthropic distinguishes its overload state with the dedicated 529 code instead.

504 Gateway Timeout (retryable)

A gateway did not receive a timely response from the upstream server.

AI APIs: Long non-streaming generations are the usual trigger — the proxy's read timeout expires before the model finishes. Streaming avoids this because bytes flow continuously.

505 HTTP Version Not Supported

The server does not support the HTTP protocol version used in the request.

507 Insufficient Storage (retryable)

The server cannot store the representation needed to complete the request.

520 Web Server Returned an Unknown Error (retryable)

Cloudflare-specific: the origin returned an empty, malformed or unexpected response.

AI APIs: Several AI providers sit behind Cloudflare; 520s indicate origin-side trouble that Cloudflare can't classify.

522 Connection Timed Out (retryable)

Cloudflare-specific: the TCP connection to the origin server timed out.

AI APIs: Indicates the provider's origin is unreachable from Cloudflare's edge — a genuine provider-side outage signal.

524 A Timeout Occurred (retryable)

Cloudflare-specific: the origin accepted the connection but did not reply within Cloudflare's (typically 100s) proxy timeout.

AI APIs: The classic killer of long non-streaming generations: a request to a slow model exceeds Cloudflare's 100-second window even though the model is still working. Streaming keeps bytes moving and resets the clock, so switch any request that can take more than 90 seconds to streaming.

529 Overloaded (retryable)

Non-standard, Anthropic-specific: the API is temporarily overloaded and shedding load.

AI APIs: Anthropic's dedicated overload signal (overloaded_error). Distinct from 429 — this is not about YOUR rate limit but about aggregate platform load. Typically resolves within seconds to minutes. The SDKs retry it automatically a limited number of times.

How it works

This is an HTTP status reference written for people who ship against AI APIs. Generic status-code lists tell you that 429 means "too many requests"; this one tells you that Anthropic sends Retry-After seconds and splits limits across input and output tokens, that OpenAI exposes remaining quota in x-ratelimit headers, and that Google wraps the same condition in a RESOURCE_EXHAUSTED error. The grid above is searchable by number or name and grouped by class, with the AI-specific notes inline.

Each code links to a dedicated page with the full treatment: what the code means, the AI-API-specific failure modes it shows up in, a ranked fix checklist, and — for retryable codes — a TypeScript retry helper with exponential backoff and jitter that honors Retry-After. The pages also link the relevant RFC section, because the primary source settles arguments faster than any blog post.

A few codes deserve special attention in AI workloads. 429 dominates at scale and is really three different limits wearing one number: requests per minute, tokens per minute, and concurrency. 529 is Anthropic's platform-overload signal, distinct from your personal rate limit and usually short-lived. 524 is Cloudflare's 100-second proxy timeout, the classic killer of long non-streaming generations — the model keeps working, the proxy gives up, and you may still be billed. And 499 in your gateway logs means your own client hung up first, which usually points at a timeout set lower than worst-case generation time.

The retry rule that falls out of all this: honor explicit Retry-After headers, retry 5xx and the 408/429 special cases with exponential backoff plus jitter, and never blind-retry validation failures. A 400 retried ten times is ten identical failures that cost you rate-limit headroom. When errors persist beyond a few backoff cycles, check the provider status page before debugging your own code; our Provider Status tool keeps the official status-page links and recent incident history for each provider one click away.
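
The retry rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the helper shipped on the detail pages: `MinimalResponse`, `backoffDelayMs`, `shouldRetry` and `sendWithRetry` are hypothetical names, the 500 ms base and 30 s cap are arbitrary defaults, and the jitter scheme used is "full jitter" (a uniformly random delay below the exponential ceiling).

```typescript
// Minimal response shape so the sketch does not depend on a specific HTTP client.
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Full-jitter exponential backoff: a random delay below min(capMs, baseMs * 2^attempt).
// An explicit Retry-After value (in seconds) takes precedence over the computed delay.
function backoffDelayMs(
  attempt: number,
  retryAfterSeconds: number | null,
  random: () => number = Math.random,
  baseMs: number = 500,
  capMs: number = 30000,
): number {
  if (retryAfterSeconds !== null && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Coarse rule from the text: 5xx plus the 408/429 special cases.
// A stricter version might exclude 501 and 505, which retrying cannot fix.
function shouldRetry(status: number): boolean {
  return status >= 500 || status === 408 || status === 429;
}

// Call `send` up to maxAttempts times, sleeping between retryable failures.
async function sendWithRetry<R extends MinimalResponse>(
  send: () => Promise<R>,
  maxAttempts: number = 5,
): Promise<R> {
  let response = await send();
  for (let attempt = 0; attempt < maxAttempts - 1 && shouldRetry(response.status); attempt++) {
    const header = response.headers.get("retry-after");
    // Only the delay-seconds form is handled here; an HTTP-date falls back to backoff.
    const seconds = header !== null && /^\d+$/.test(header.trim()) ? Number(header) : null;
    const delay = backoffDelayMs(attempt, seconds);
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
    response = await send();
  }
  return response;
}
```

With a fetch-style client, usage would look like `sendWithRetry(() => fetch(url, init))`. Injecting `random` into `backoffDelayMs` keeps the delay calculation deterministic in unit tests.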

Frequently asked questions

Which HTTP status codes are safe to retry?

Retry the transient ones: 408, 429 (after honoring Retry-After), 500, 502, 503, 504, the Cloudflare 52x family and Anthropic's 529. Never blind-retry other 4xx errors: a 400 or 401 will fail identically every time and just burn rate limit. The grid marks each code retryable or not, and every detail page for a retryable code includes a TypeScript backoff snippet you can lift directly.
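
As a lookup, the list in this answer might be encoded like this. It is a sketch: `RETRYABLE_STATUSES` and `isRetryable` are hypothetical names, and the Cloudflare entries are limited to the three 52x codes this reference actually lists (520, 522, 524).

```typescript
// Status codes this answer names as transient and worth retrying.
const RETRYABLE_STATUSES = new Set<number>([
  408, // Request Timeout
  429, // Too Many Requests (honor Retry-After first)
  500, // Internal Server Error
  502, // Bad Gateway
  503, // Service Unavailable
  504, // Gateway Timeout
  520, // Cloudflare: unknown origin error
  522, // Cloudflare: connection timed out
  524, // Cloudflare: proxy timeout
  529, // Anthropic: overloaded
]);

// True when a response with this status is worth retrying with backoff.
function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}
```

An explicit set errs on the side of not retrying: any code missing from it, including every validation-style 4xx, is surfaced to the caller immediately.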

What does Anthropic's 529 Overloaded error mean?

529 is Anthropic's non-standard signal that the platform as a whole is shedding load — it is about aggregate capacity, not your personal rate limit, which is what distinguishes it from 429. It typically clears within seconds to minutes. The right response is exponential backoff with jitter and reduced concurrency; the official SDKs already retry it a limited number of times before surfacing it to your code.

Why do my long AI requests fail with 524?

524 is Cloudflare's proxy timeout: the origin accepted the connection but produced no response bytes within roughly 100 seconds. A slow non-streaming generation trips it even though the model is still working — and you may be billed for tokens you never received. The fix is streaming: once bytes flow continuously, the idle timer never fires. Any request that can take more than about 90 seconds should stream.

How should I handle 429 rate limits across providers?

First, honor the Retry-After header exactly when present — Anthropic and OpenAI both send it, and guessing your own delay when the server told you the answer is self-sabotage. Second, distinguish request-per-minute from token-per-minute limits; OpenAI's x-ratelimit-remaining headers report both. Third, smooth bursts with a client-side token bucket instead of letting retries synchronize into thundering herds.
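
The third point, a client-side token bucket, could look something like the sketch below. `TokenBucket` and its method names are hypothetical, and the clock is injected so the behavior can be checked without real waiting; the `cost` of a call can stand for one request or for an estimated token count, depending on which limit is being smoothed.

```typescript
// Client-side token bucket: allows bursts up to `capacity`,
// then refills continuously at `refillPerSecond`.
class TokenBucket {
  private capacity: number;
  private refillPerSecond: number;
  private now: () => number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so the first burst is allowed
    this.lastRefillMs = now();
  }

  // Credit tokens for the time elapsed since the last refill, capped at capacity.
  private refill(): void {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }

  // Try to spend `cost` tokens; returns false when the caller should wait instead.
  tryTake(cost: number = 1): boolean {
    this.refill();
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }

  // Milliseconds until `cost` tokens will be available (0 if available now).
  msUntilAvailable(cost: number = 1): number {
    this.refill();
    if (this.tokens >= cost) return 0;
    return Math.ceil(((cost - this.tokens) / this.refillPerSecond) * 1000);
  }
}
```

A caller would check `tryTake()` before each request and sleep for `msUntilAvailable()` when it returns false, which spreads a burst out over time instead of letting every worker retry at the same instant.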

Is a 4xx error my fault or the provider's?

By definition 4xx means the client — your request — needs to change: bad auth (401), insufficient permissions (403), invalid parameters (400/422), too large (413) or too fast (429). 5xx means the server failed and your request may be fine as-is. The practical consequence: 5xx codes deserve retries, 4xx codes deserve code changes — except 408 and 429, which are 4xx-but-retryable special cases.

Built by FORG — AI cost observability for agentic coding. Free tools, no signup, nothing leaves your browser.
