
Token Budget Policy Builder

Define per-team and per-developer budget rules and export them as enforceable config.

100% client-side · exportable output · zero network calls

# Token Budget Policy

**Org monthly AI budget:** $12,000.00
**Enforcement action at 100%:** Throttle — reduced rate until next period
**Per-developer daily cap:** $60.00 (runaway-agent circuit breaker)

## Team allocations

| Team | Share | Monthly budget |
|---|---|---|
| Platform | 40% | $4,800.00/mo |
| Product | 35% | $4,200.00/mo |
| Data | 25% | $3,000.00/mo |

## Alert thresholds

- **50%** — informational pace check; no action expected.
- **80%** — working alert: team lead reviews what changed and decides whether to slow down or request more budget.
- **100%** — enforcement: throttle — reduced rate until next period.

## Rules

1. Each team budget is owned by its engineering manager; overage requests go to the org budget owner.
2. The per-developer daily cap of $60.00 applies regardless of team budget remaining — it bounds the blast radius of any single runaway session to one day of one developer's allowance.
3. Threshold alerts go to the team budget owner; the 100% event is also recorded for the monthly spend report.
4. This policy is reviewed quarterly; budgets follow measured usage, not last year's guess.

---
*A policy document enforces nothing by itself — pair it with tooling that meters usage in real time and applies the action above automatically.*
*Generated with the FORG Token Budget Policy Builder (forg.pro/tools/token-budget-policy).*
Markdown export, no lock-in · 100% generated locally · 0 signup walls · 0 network requests per keystroke

How it works

This builder turns budget intentions into two artifacts that stay in sync: a human-readable policy document and a machine-readable JSON config, both generated live from the same inputs. Set the org monthly budget, split it across teams (the builder validates the shares sum to 100%), set a per-developer daily cap, tune the alert thresholds, and choose what happens when a budget is hit — alert, throttle or block. Switch between the two output tabs and copy whichever you need. Everything runs in your browser.
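A minimal sketch of the input-to-config step described above, assuming a hypothetical schema: the key names (`org_monthly_budget_usd`, `teams`, and so on) and the `build_config` function are illustrative and not the builder's actual export format. It validates that shares sum to 100% and derives each team's monthly budget from the org budget.

```python
import json


def build_config(org_monthly_usd, shares_pct, daily_cap_usd,
                 thresholds_pct=(50, 80, 100), action="throttle"):
    """Validate the inputs and return a plain config dict (hypothetical schema)."""
    total = sum(shares_pct.values())
    if total != 100:
        raise ValueError(f"shares sum to {total}%, expected 100%")
    if action not in ("alert", "throttle", "block"):
        raise ValueError(f"unknown action: {action}")
    return {
        "org_monthly_budget_usd": org_monthly_usd,
        "per_developer_daily_cap_usd": daily_cap_usd,
        "alert_thresholds_pct": list(thresholds_pct),
        "action_at_100_pct": action,
        "teams": {
            name: {
                "share_pct": pct,
                # Team budget is simply the org budget times the team's share.
                "monthly_budget_usd": round(org_monthly_usd * pct / 100, 2),
            }
            for name, pct in shares_pct.items()
        },
    }


# Inputs taken from the example policy shown on this page.
config = build_config(12_000, {"Platform": 40, "Product": 35, "Data": 25}, 60)
print(json.dumps(config, indent=2))
```

With the example inputs, the derived team budgets match the allocation table in the policy: $4,800, $4,200 and $3,000 per month.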

The two-output design addresses the way budget governance usually fails. Most orgs have either a policy nobody enforces or enforcement nobody documented; the drift between the two is where incidents and arguments live. Generating both from one source means the document a manager approves and the config a system loads describe the same rules, by construction. The JSON is deliberately plain — budgets, shares, caps, thresholds, action — so it maps onto whatever enforcement layer you run.
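To illustrate the "one source, two outputs" idea, the sketch below renders the team allocations table of the markdown policy from a config dict. The dict shape and the `render_team_table` helper are assumptions made for this example; the builder's real JSON keys may differ.

```python
def render_team_table(config):
    """Render the 'Team allocations' markdown table from a config dict."""
    lines = ["| Team | Share | Monthly budget |", "|---|---|---|"]
    for name, team in config["teams"].items():
        lines.append(
            f"| {name} | {team['share_pct']}% "
            f"| ${team['monthly_budget_usd']:,.2f}/mo |"
        )
    return "\n".join(lines)


# Hypothetical config shape, populated with the values from the example policy.
config = {
    "teams": {
        "Platform": {"share_pct": 40, "monthly_budget_usd": 4800.0},
        "Product": {"share_pct": 35, "monthly_budget_usd": 4200.0},
        "Data": {"share_pct": 25, "monthly_budget_usd": 3000.0},
    }
}
print(render_team_table(config))
```

Because the table is a pure function of the config, editing a share changes both the document and the machine-readable file in the same step.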

The defaults encode patterns that survive contact with real teams. Three escalating alert thresholds (50/80/100) give each message a distinct expected response, from pace check to enforcement, instead of one alert that gets muted. The per-developer daily cap is the piece monthly budgets cannot replace: it bounds the blast radius of a runaway agent loop to a single day of a single developer's allowance. And the enforcement action is a spectrum, not a moral stance: alert while baselines are guesses, throttle once they are measured, and block only where an overage is genuinely worse than stopped work.

The honest caveat is the same one every policy tool owes you: a document and a JSON file enforce nothing by themselves. The config needs a system that meters usage per developer in real time, evaluates thresholds and applies the chosen action — which is precisely the layer FORG provides. Use this builder to decide and document the rules with your team; use enforcement tooling to make the cap something that happens rather than something that was written down.
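For a sense of what "metering per developer in real time" involves, here is a deliberately small in-memory sketch. The `DailyCapMeter` class is hypothetical and is not FORG's implementation; a real enforcement layer would need persistent storage, concurrency control and actual cost data from the model provider.

```python
from collections import defaultdict
from datetime import date


class DailyCapMeter:
    """In-memory per-developer daily spend tracker (illustrative only)."""

    def __init__(self, daily_cap_usd):
        self.daily_cap_usd = daily_cap_usd
        self._spent = defaultdict(float)  # (developer, day) -> USD spent

    def try_spend(self, developer, cost_usd, day=None):
        """Record the cost and return True if it fits under that day's cap."""
        key = (developer, day or date.today())
        if self._spent[key] + cost_usd > self.daily_cap_usd:
            return False  # over the cap: the caller applies the configured action
        self._spent[key] += cost_usd
        return True


meter = DailyCapMeter(daily_cap_usd=60.0)
today = date(2025, 1, 1)  # fixed date so the example is reproducible
print(meter.try_spend("alice", 45.0, today))  # fits under the cap
print(meter.try_spend("alice", 20.0, today))  # would exceed the cap
```

Keying the counter by developer and day is what keeps one person's runaway session from consuming anyone else's allowance, or their own allowance for the next day.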

Frequently asked questions

Why do I need both a human-readable policy and a machine config?

Because they serve different audiences and fail differently. The markdown policy is for people — managers approving it, developers understanding what happens at the cap, auditors checking governance exists. The JSON config is for systems — the thing your enforcement tooling actually reads. Keeping them generated from the same inputs guarantees they never drift apart, which is the classic failure mode: a policy document promising caps that no system enforces, or enforcement rules nobody documented.

How should I split the budget across teams?

Start proportional to headcount, then adjust for workload reality: a platform team running agents in CI burns multiples of what a team doing occasional completions does. The builder validates that shares sum to 100% so you cannot accidentally over-allocate. Resist the temptation to leave slack unallocated as a hidden buffer — make the buffer an explicit line instead, owned by whoever arbitrates mid-month overage requests, so the negotiation has a named owner.
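As a starting point for the headcount-proportional split with an explicit buffer line, the following sketch may help. The `headcount_shares` helper, the headcounts and the 10% buffer are all illustrative assumptions; it uses largest-remainder rounding so that the integer shares always sum to exactly 100.

```python
def headcount_shares(headcount, buffer_pct=0):
    """Split (100 - buffer_pct) across teams by headcount as whole percentages."""
    total = sum(headcount.values())
    if total <= 0:
        raise ValueError("total headcount must be positive")
    pool = 100 - buffer_pct
    exact = {team: pool * n / total for team, n in headcount.items()}
    shares = {team: int(value) for team, value in exact.items()}
    # Hand the leftover points to the teams with the largest fractional parts.
    leftover = pool - sum(shares.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - shares[t], reverse=True)
    for team in by_remainder[:leftover]:
        shares[team] += 1
    if buffer_pct:
        # The buffer is an explicit, named line rather than hidden slack.
        shares["Buffer (org budget owner)"] = buffer_pct
    return shares


shares = headcount_shares({"Platform": 8, "Product": 7, "Data": 5}, buffer_pct=10)
print(shares, sum(shares.values()))
```

The result is only a baseline: adjust it by hand for workload, for example raising the share of a team that runs agents in CI.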

What are sensible alert thresholds and why three of them?

The default 50/80/100 pattern maps to three different responses. Hitting 50% around mid-month is informational: a pace check, no action expected. Hitting 80% is the working alert: the team lead looks at what changed and decides whether to slow down or request more budget. Hitting 100% triggers the enforcement action you chose. Three thresholds work because a single undifferentiated alert is noise people unsubscribe from, and continuous alerts are worse; the escalating pattern means each message carries a different expected response.
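One way to make each threshold fire exactly once is to compare spend before and after every update and report only the thresholds crossed in between. The sketch below assumes a hypothetical `newly_crossed` helper and a `RESPONSES` mapping that paraphrases the three responses described above.

```python
def newly_crossed(prev_spend, new_spend, budget, thresholds_pct=(50, 80, 100)):
    """Return the thresholds crossed when spend moves from prev_spend to new_spend."""
    return [
        pct
        for pct in sorted(thresholds_pct)
        if prev_spend < budget * pct / 100 <= new_spend
    ]


# Expected response per threshold, paraphrased from the policy text.
RESPONSES = {
    50: "informational pace check; no action expected",
    80: "team lead reviews what changed",
    100: "enforcement action applies",
}

team_budget = 4800.0  # Platform's monthly budget in the example policy
for pct in newly_crossed(2000.0, 2500.0, team_budget):
    print(f"{pct}%: {RESPONSES[pct]}")
```

Because only crossings are reported, a team sitting at 55% for a week receives one message rather than one per request.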

Should the enforcement action be alert, throttle or block?

Match it to the blast radius of being wrong in each direction. Alert-only is right while you are building trust and your usage baselines are still guesses — getting blocked by a miscalibrated cap teaches developers to route around the system. Throttle is the strong default once baselines are real: work continues at reduced pace and runaway loops get contained. Hard block is for environments where an overage is genuinely worse than stopped work — rare in practice, common in procurement imaginations.
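The three actions can be viewed as one decision function with different outcomes once the budget is reached. The sketch below is an assumption-laden illustration: the `decide` function, the `Action` enum and the request rates (60 and 10 requests per minute) are hypothetical and would be tuned to your own enforcement layer.

```python
from enum import Enum


class Action(Enum):
    ALERT = "alert"
    THROTTLE = "throttle"
    BLOCK = "block"


def decide(spent, budget, action, normal_rpm=60, throttled_rpm=10):
    """Return (allowed, requests_per_minute, notify) for the next request."""
    if spent < budget:
        return True, normal_rpm, False      # under budget: business as usual
    if action is Action.ALERT:
        return True, normal_rpm, True       # notify only, work continues
    if action is Action.THROTTLE:
        return True, throttled_rpm, True    # work continues at a reduced rate
    return False, 0, True                   # block: requests are refused


print(decide(4800.0, 4800.0, Action.THROTTLE))
```

Moving from alert to throttle to block is then a one-field config change rather than a rewrite of the enforcement logic.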

What good is a per-developer daily cap on top of team budgets?

It is your runaway-agent circuit breaker. Monthly team budgets catch slow drift but are far too coarse for the failure mode that actually hurts: an agent loop or retry storm that burns through hundreds of dollars in an afternoon. A daily per-developer cap bounds the worst case of any single incident to one day of one person's allowance, which turns a potential month-killer into a footnote. Set it at roughly three times a heavy user's normal day so legitimate spikes clear it.
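The "three times a heavy user's normal day" rule of thumb can be sketched as follows, taking the median of a heavy user's recent daily spend as the "normal day". The `suggested_daily_cap` helper and the spend history are illustrative assumptions, not measured data.

```python
from statistics import median


def suggested_daily_cap(daily_spend_usd, multiplier=3.0):
    """Cap = multiplier x the median day in a heavy user's recent history."""
    if not daily_spend_usd:
        raise ValueError("need at least one day of spend history")
    return round(multiplier * median(daily_spend_usd), 2)


# Made-up daily spend (USD) for one heavy user over a working week.
heavy_user_days = [18.0, 22.0, 20.0, 19.5, 21.0]
cap = suggested_daily_cap(heavy_user_days)
print(cap)
```

Using the median rather than the mean keeps a past runaway day in the history from inflating the cap that is meant to contain the next one.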

Turn this analysis into a live rule with the FORG rule engine — route models and enforce limits automatically.
