
Cut AI spend automatically.

Prompt caching, model downgrades, and on-device nudges trim the bill while your team keeps shipping. Every saving is measured against a real baseline — never a guess.

$ forg savings --summary
cache hit   61%  reused prompts
downgrades  142 calls → cheaper model
duplicates  38  served from cache
────────────────────────────
baseline    $112.93  this month
saved       $28.41  vs baseline
On-device enforcement, no proxy
Metadata-only capture, no payloads
6 control surfaces
Ed25519 signed releases
14 tools supported
8.5 MB on-device agent
how it saves

Three levers, all measured

Cache reuse

Reuse identical prompt prefixes automatically. Skip the tokens you already paid for last time.
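
As a rough illustration of prefix reuse, the sketch below counts how many leading tokens a new prompt shares with the previous one and treats only the remainder as newly billable. The function names and the whitespace tokenisation are assumptions made for this example, not FORG's implementation, and real providers typically bill cached tokens at a reduced rate rather than zero.

```python
from typing import Sequence


def shared_prefix_length(previous: Sequence[str], current: Sequence[str]) -> int:
    """Count the leading tokens that are identical in both prompts."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        count += 1
    return count


def uncached_tokens(previous: Sequence[str], current: Sequence[str]) -> int:
    """Tokens in `current` that are not covered by the shared prefix."""
    return len(current) - shared_prefix_length(previous, current)


# Hypothetical prompts: the same system preamble followed by a different request.
first = "You are a code reviewer . Review file A".split()
second = "You are a code reviewer . Review file B".split()

reused = shared_prefix_length(first, second)
fresh = uncached_tokens(first, second)
```

Only an identical leading run counts. A change near the start of the prompt invalidates everything after it, which is why stable instructions are usually placed first.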

Model downgrade nudge

Route simple calls to a cheaper model when the result is equivalent. Measured, not guessed.

Duplicate detection

Spot repeated requests within a session and serve the cached result before billing the model.
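
A minimal sketch of session-level duplicate detection, assuming a request is identified by its model name and prompt text. `SessionDedupCache` and the `call_model` callback are hypothetical names introduced for this example; how FORG keys or expires entries is not described here.

```python
import hashlib
import json
from typing import Callable, Dict


class SessionDedupCache:
    """Hypothetical per-session cache keyed by a hash of the request."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model        # the function that actually bills
        self._results: Dict[str, str] = {}   # request hash -> cached result
        self.billed_calls = 0
        self.duplicate_hits = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Serialise deterministically so identical requests hash identically.
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._results:
            # Repeated request: answer from the cache, no model call.
            self.duplicate_hits += 1
            return self._results[key]
        self.billed_calls += 1
        result = self._call_model(model, prompt)
        self._results[key] = result
        return result
```

Keying on a hash rather than the raw prompt means the lookup table never has to hold prompt text as its key, though this sketch still stores results in memory for the life of the session.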

Baseline reconciliation

Every saving measured, not estimated

FORG compares every optimised call against what it would have cost at full price. The delta is your saving — reconciled daily against the same baseline so finance can trust the number.
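
The arithmetic behind that comparison can be sketched as follows, assuming each call is recorded with a full-price baseline cost and the cost actually billed. `CallRecord`, `SavingsSummary`, and `summarize` are illustrative names, not part of FORG. The totals match this page's figures, but the per-call split is invented for the example.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable


@dataclass(frozen=True)
class CallRecord:
    baseline_cost: Decimal  # what the call would have cost at full price
    actual_cost: Decimal    # what was actually billed after optimisation


@dataclass(frozen=True)
class SavingsSummary:
    baseline: Decimal
    actual: Decimal
    saved: Decimal
    percent: Decimal


def summarize(calls: Iterable[CallRecord]) -> SavingsSummary:
    """Sum both cost columns and report the delta against the baseline."""
    records = list(calls)  # materialise once so generators are safe to pass
    baseline = sum((r.baseline_cost for r in records), Decimal("0"))
    actual = sum((r.actual_cost for r in records), Decimal("0"))
    saved = baseline - actual
    percent = (saved / baseline * 100) if baseline else Decimal("0")
    return SavingsSummary(baseline, actual, saved, percent)


# Illustrative split of the monthly totals into two made-up records.
month = summarize([
    CallRecord(Decimal("100.00"), Decimal("80.00")),
    CallRecord(Decimal("12.93"), Decimal("4.52")),
])
```

Using `Decimal` instead of floats keeps currency sums exact, which matters when the result is reconciled against an invoice.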

See full observability →

Savings · this month

Baseline (no optimisation): $112.93
Actual (with FORG): $84.52
Total saved: $28.41 · ↓25%

Pay for the tokens you actually need

Start your 14-day trial and watch the baseline drop.