Savings you can audit.
Estimates you can trust.
FORG separates savings into five tiers (measured, reconciled, estimated, forecast, and suppressed) and never blends them. Actual savings count only what we can prove. Opportunities stay labeled as estimates until they are measured.
No silent automation. No "guaranteed" claims. No reversing of a real call just to inflate a number. Every entry is auditable end to end.
Five tiers, never blended
Every number FORG surfaces carries a label. Only measured and reconciled count toward actual totals.
Measured
Counted from real telemetry. Two independent checks must agree: the original token cost is provable, and the avoided cost is provable. Reversals net to zero.
Examples: cache reuse, compaction on a real call, a prevented runaway with provider-request-id proof
Reconciled
Carried forward from a measured prior period with no reversal in the current window. Conservative rolling credit.
Examples: cache hit on a follow-up session that references the original request
Estimated
Projected from observed patterns. Never counts toward actual savings totals. Marked with safe_to_auto_apply = false until measured.
Examples: right-size suggestion, repeat-work fingerprint, off-hours tier recommendation
Forecast
Hypothetical savings if a proposed policy were enabled. Always dry-run. Always read-only.
Examples: policy simulator output, threshold-change blast radius
Suppressed
Dropped from totals: abuse-guard rejected, sensitive-task carry, or duplicate of an existing entry.
Examples: fingerprint collision with prior window, security-flagged reversal pair
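As an illustration of how the five tiers can stay unblended, here is a minimal Python sketch. The names `Tier`, `SavingsEntry`, `totals_by_tier`, and `actual_total` are hypothetical and are not FORG's API; the only rule taken from the text is that measured and reconciled entries alone count toward actual totals.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    MEASURED = "measured"
    RECONCILED = "reconciled"
    ESTIMATED = "estimated"
    FORECAST = "forecast"
    SUPPRESSED = "suppressed"


# Only these two tiers count toward the actual total.
ACTUAL_TIERS = frozenset({Tier.MEASURED, Tier.RECONCILED})


@dataclass(frozen=True)
class SavingsEntry:
    tier: Tier
    amount_usd: float  # hypothetical field; negative for a reversal


def totals_by_tier(entries):
    """Sum each tier separately so tiers are never blended."""
    totals = {tier: 0.0 for tier in Tier}
    for entry in entries:
        totals[entry.tier] += entry.amount_usd
    return totals


def actual_total(entries):
    """Actual savings: measured plus reconciled, nothing else."""
    return sum(e.amount_usd for e in entries if e.tier in ACTUAL_TIERS)


entries = [
    SavingsEntry(Tier.MEASURED, 1.25),
    SavingsEntry(Tier.MEASURED, -0.25),  # reversal of part of an earlier entry
    SavingsEntry(Tier.RECONCILED, 0.5),
    SavingsEntry(Tier.ESTIMATED, 4.0),   # reported, never counted as actual
    SavingsEntry(Tier.FORECAST, 10.0),   # dry-run only
]
print(actual_total(entries))  # 1.5
```

Keeping the per-tier totals in a separate function makes it easy to show estimated and forecast figures next to the actual total without ever adding them together.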
Four modules, each on its own
Each module is opt-in, dry-run by default, and never auto-applies.
Prompt-cache optimizer
Identifies repeated prompt prefixes that benefit from prompt caching. Reports cache-read vs cache-write token deltas.
- Counts cache hits against the model's published cache rate, not the full input rate.
- Marks each entry with the provider and the request id used to verify the cache.
- A reversal pair is created automatically if a later call sends the same prefix uncached.
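One way to read the cache-rate rule above is that each cached token saves the difference between the full input rate and the cache-read rate. The sketch below works under that assumption. `CacheHit`, `cache_savings_usd`, and the example rates are hypothetical placeholders, not FORG's API or any provider's real prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheHit:
    provider: str
    request_id: str     # id used to verify the cache hit
    cached_tokens: int  # tokens served from the prompt cache


def cache_savings_usd(hit, input_rate_per_mtok, cache_read_rate_per_mtok):
    """Savings = cached tokens x (full input rate - cache-read rate).

    Rates are USD per million tokens and are illustrative placeholders.
    """
    if not hit.request_id:
        return 0.0  # no verifiable request id, so nothing is counted
    delta = input_rate_per_mtok - cache_read_rate_per_mtok
    return hit.cached_tokens * delta / 1_000_000


hit = CacheHit(provider="example-provider", request_id="req_123",
               cached_tokens=2_000_000)
print(cache_savings_usd(hit, input_rate_per_mtok=4.0,
                        cache_read_rate_per_mtok=0.5))  # 7.0
```

Crediting only the rate difference, rather than the full input rate, keeps the figure conservative: the cached tokens were still billed, just at the lower rate.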
Context-diff compaction
Measures tokens saved when an old context is replaced with a diff before being sent to the model.
- Compares the bytes-of-context sent to the model against the bytes-of-source the agent edited.
- Net savings = (source bytes − context bytes), converted to tokens, × per-token rate, counted only when context bytes are lower.
- A reversal pair is recorded if the agent re-fetches the full source after compaction.
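The compaction formula above can be sketched as follows. Because it multiplies a byte difference by a per-token rate, this sketch assumes a bytes-to-tokens conversion (`bytes_per_token`, set to 4.0 purely for illustration). The function names are hypothetical, and the reversal is modeled as a second ledger row that cancels the first.

```python
def compaction_savings_usd(source_bytes, context_bytes, rate_per_token,
                           bytes_per_token=4.0):
    """Net savings for one compacted call; zero unless the context is smaller.

    bytes_per_token is an assumed conversion factor, not a FORG constant.
    """
    if context_bytes >= source_bytes:
        return 0.0
    saved_tokens = (source_bytes - context_bytes) / bytes_per_token
    return saved_tokens * rate_per_token


def compaction_ledger_rows(source_bytes, context_bytes, rate_per_token,
                           refetched_full_source=False):
    """Return ledger amounts; a re-fetch adds a reversal that nets to zero."""
    saving = compaction_savings_usd(source_bytes, context_bytes, rate_per_token)
    rows = [saving]
    if refetched_full_source:
        rows.append(-saving)  # reversal partner for the original row
    return rows


kept = compaction_ledger_rows(40_000, 8_000, rate_per_token=0.000002)
reversed_rows = compaction_ledger_rows(40_000, 8_000, rate_per_token=0.000002,
                                       refetched_full_source=True)
print(sum(reversed_rows))  # 0.0
```

Recording the reversal as its own row, instead of deleting the original, keeps both halves of the pair visible in an audit.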
Budget broker
Advisory per-session and per-day cost ceilings. Surfaces a soft warning before spend crosses a hard cap.
- Hard gates require explicit opt-in and an audit note — never on by default.
- Sensitive tasks (security, compliance, production incident, unknown) are always excluded.
- Reports the projected false-positive rate before you enable a gate.
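The broker rules above might look like this in code. This is a sketch under stated assumptions: `BudgetPolicy`, `evaluate_spend`, the 80% soft-warning ratio, and the task-type strings are all hypothetical. The behavior taken from the text is that ceilings are advisory by default, a hard gate needs both an explicit opt-in and an audit note, and sensitive tasks are always excluded.

```python
from dataclasses import dataclass

SENSITIVE_TASKS = frozenset(
    {"security", "compliance", "production_incident", "unknown"}
)


@dataclass(frozen=True)
class BudgetPolicy:
    hard_cap_usd: float
    soft_warning_ratio: float = 0.8  # assumed: warn at 80% of the cap
    hard_gate_enabled: bool = False  # never on by default
    audit_note: str = ""             # required before a hard gate can block


def evaluate_spend(policy, spent_usd, task_type):
    """Return 'excluded', 'ok', 'warn', or 'block' for a session's spend."""
    if task_type in SENSITIVE_TASKS:
        return "excluded"  # sensitive tasks are never gated
    if spent_usd >= policy.hard_cap_usd:
        gate_armed = policy.hard_gate_enabled and bool(policy.audit_note.strip())
        return "block" if gate_armed else "warn"
    if spent_usd >= policy.hard_cap_usd * policy.soft_warning_ratio:
        return "warn"
    return "ok"


advisory = BudgetPolicy(hard_cap_usd=10.0)
gated = BudgetPolicy(hard_cap_usd=10.0, hard_gate_enabled=True,
                     audit_note="approved by platform team")
print(evaluate_spend(advisory, 12.0, "refactor"))  # warn
print(evaluate_spend(gated, 12.0, "refactor"))     # block
print(evaluate_spend(gated, 12.0, "security"))     # excluded
```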
Policy simulator
Dry-run preview of a policy change before it goes live. Always read-only, always labeled forecast.
- Returns a blast-radius object: orgs, sessions, expected false positives, sensitive sessions affected.
- Excludes sensitive sessions from the eligible pool by default regardless of policy flags.
- mutation_allowed: false and read_only: true are structural: the API cannot write through the simulator.
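To show what "structural" read-only flags could mean in practice, the sketch below uses a frozen dataclass whose `mutation_allowed` and `read_only` fields cannot be passed to the constructor or reassigned later. Only those two flag names and the four blast-radius fields come from the text. `BlastRadius`, `simulate`, the session dictionary keys, and `assumed_fp_rate` are hypothetical.

```python
from dataclasses import dataclass, field

SENSITIVE_TASKS = frozenset(
    {"security", "compliance", "production_incident", "unknown"}
)


@dataclass(frozen=True)
class BlastRadius:
    orgs: int
    sessions: int
    expected_false_positives: int
    sensitive_sessions_affected: int
    # init=False: callers cannot pass these; frozen=True blocks reassignment.
    tier: str = field(default="forecast", init=False)
    mutation_allowed: bool = field(default=False, init=False)
    read_only: bool = field(default=True, init=False)


def simulate(sessions, would_trigger, assumed_fp_rate=0.05):
    """Dry-run a policy predicate over session metadata; nothing is written."""
    # Sensitive sessions leave the eligible pool before the policy is applied.
    eligible = [s for s in sessions if s["task_type"] not in SENSITIVE_TASKS]
    hit = [s for s in eligible if would_trigger(s)]
    return BlastRadius(
        orgs=len({s["org_id"] for s in hit}),
        sessions=len(hit),
        expected_false_positives=round(len(hit) * assumed_fp_rate),
        # Zero by construction here, because sensitive sessions were excluded.
        sensitive_sessions_affected=sum(
            s["task_type"] in SENSITIVE_TASKS for s in hit
        ),
    )


sessions = [
    {"org_id": "a", "task_type": "refactor", "cost_usd": 3.0},
    {"org_id": "b", "task_type": "docs", "cost_usd": 5.0},
    {"org_id": "a", "task_type": "security", "cost_usd": 9.0},
    {"org_id": "c", "task_type": "docs", "cost_usd": 0.5},
]
result = simulate(sessions, lambda s: s["cost_usd"] > 1.0)
print(result.sessions, result.orgs, result.read_only)  # 2 2 True
```

Attempting `result.read_only = False` raises `dataclasses.FrozenInstanceError`, and passing `read_only=False` to the constructor raises `TypeError`, so the flags cannot be flipped by a caller.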
Ten checks before promotion to verified
An opportunity must pass every gate to count toward the measured total. Any failure drops it to estimated or suppressed; a failed check never inflates a total.
Provider request id present
Links the saved call to a verifiable provider billable record.
Exact model price at call time
Pulls from model_pricing_history; rejects entries priced with a non-canonical rate.
No duplicate ledger row
Rows are bucketed by evidence_hash, so a duplicate is caught and an abuse pair nets to zero.
No abuse-guard flag on the row
Drops anything the abuse guard classified as runaway or fingerprint-replay.
Not in a sensitive-task window
Entries that overlap a security, compliance, production-incident, or unknown window are never counted.
Reversal partner is balanced
If a reversal exists, the pair must net to zero before promotion to verified.
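The checks listed above can be sketched as a single promotion function. `Candidate` and `promote` are hypothetical names. Sending duplicates, abuse flags, and sensitive windows to suppressed follows the suppressed-tier description; sending the remaining failures to estimated is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    provider_request_id: str
    priced_at_canonical_rate: bool     # rate matched model_pricing_history
    evidence_hash: str
    abuse_flag: bool
    in_sensitive_window: bool
    reversal_net_usd: Optional[float]  # None when no reversal exists


def promote(candidate, seen_hashes):
    """Return 'measured' only if every gate passes; otherwise demote."""
    # Failures that remove the row from totals entirely.
    if candidate.evidence_hash in seen_hashes:
        return "suppressed"  # duplicate ledger row
    if candidate.abuse_flag or candidate.in_sensitive_window:
        return "suppressed"
    # Failures that leave the row as an unproven estimate (assumed split).
    if not candidate.provider_request_id:
        return "estimated"
    if not candidate.priced_at_canonical_rate:
        return "estimated"
    if candidate.reversal_net_usd is not None and candidate.reversal_net_usd != 0:
        return "estimated"  # reversal pair is not balanced yet
    seen_hashes.add(candidate.evidence_hash)
    return "measured"


seen = set()
good = Candidate("req_1", True, "hash_a", False, False, None)
print(promote(good, seen))  # measured
print(promote(good, seen))  # suppressed (same evidence_hash seen before)
print(promote(Candidate("", True, "hash_b", False, False, None), seen))  # estimated
```

Every branch other than the last returns a lower tier, so a failed check can only remove value from the measured total.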
What we do not claim
- FORG does not guarantee any specific savings outcome for your team.
- FORG does not silently enable budget gates, model downgrades, or compaction — every action requires an explicit, audited enable.
- FORG does not collect prompt text or completion content — savings come from metadata, which limits what we can measure.
- FORG does not count estimated opportunities toward your actual savings total. The two columns are always separate.
See your measured savings on day one
Install FORG, connect one adapter, and watch the measured vs. estimated split in real time.