Optimize

Find the Waste. Cut It Automatically.

FORG's statistical ML engine analyses every session, prompt, and model call — then surfaces ranked recommendations with projected ROI. No LLM required. Pure signal.

$47K
avg annual savings
31%
avg cost reduction
< 24h
to first insight
Zero
manual analysis
Start saving now
See how it works

Four ways FORG cuts your bill

Automatically detected. Ranked by savings potential. Ready to act on.

Model right-sizing

Saves up to 85% per call

FORG detects when you're using GPT-4o for simple lookups, tasks where GPT-4o-mini produces identical output at a fraction of the cost. It surfaces every mismatched model/task pair.
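One way to picture right-sizing is as a lookup from task type to the model assumed to fit it best. The sketch below is illustrative only: the task labels, the `BEST_FIT` table, the call-site names and the `find_mismatches` helper are all assumptions made for the example, not FORG's actual classifier or API.

```python
from dataclasses import dataclass

# Hypothetical mapping from task type to the model assumed to be the best fit.
BEST_FIT = {
    "simple_lookup": "gpt-4o-mini",
    "autocomplete": "gpt-4o-mini",
    "complex_reasoning": "gpt-4o",
}

@dataclass(frozen=True)
class Call:
    call_site: str   # where the model call originates (made-up names below)
    task_type: str   # label a classifier would assign
    model: str       # model actually used

def find_mismatches(calls):
    """Return (call, suggested_model) pairs where the model in use differs from the assumed best fit."""
    mismatches = []
    for call in calls:
        suggested = BEST_FIT.get(call.task_type)
        if suggested is not None and suggested != call.model:
            mismatches.append((call, suggested))
    return mismatches

calls = [
    Call("search/vars.py", "simple_lookup", "gpt-4o"),
    Call("planner/agent.py", "complex_reasoning", "gpt-4o"),
    Call("editor/complete.py", "autocomplete", "gpt-4o-mini"),
]

for call, suggested in find_mismatches(calls):
    print(f"{call.call_site}: {call.model} -> {suggested}")
```

Only the first call is flagged here, because it is the only one whose model differs from the table entry for its task type.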

Token efficiency

2,400 wasted tokens/session avg

Bloated system prompts, redundant context injection, and over-provisioned windows all cost real money. FORG measures actual token utilisation and shows exactly where to trim.

Redundancy detection

47× same query this week

The same query pattern re-run 47 times in one week is a caching candidate. FORG fingerprints every session, clusters near-duplicates, and flags opportunities for deterministic cache hits.
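As a rough illustration of fingerprinting, the sketch below normalises each query and hashes it, then counts repeats. It is a simplification under stated assumptions: it only catches copies that match after case and whitespace normalisation, whereas real near-duplicate clustering would need a similarity measure. The function names and sample queries are hypothetical.

```python
import hashlib
import re
from collections import Counter

def fingerprint(query: str) -> str:
    """Normalise case and whitespace, then hash, so trivially different copies collide."""
    normalised = re.sub(r"\s+", " ", query.strip().lower())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:16]

def caching_candidates(queries, min_repeats=2):
    """Return {fingerprint: count} for query patterns seen at least min_repeats times."""
    counts = Counter(fingerprint(q) for q in queries)
    return {fp: n for fp, n in counts.items() if n >= min_repeats}

# Made-up sample: 47 copies of one query (one with different case and spacing) plus one unrelated query.
queries = (
    ["Summarise the onboarding doc"] * 46
    + ["  summarise the ONBOARDING doc ", "Explain this stack trace"]
)

candidates = caching_candidates(queries)
for fp, count in candidates.items():
    print(f"{fp}: seen {count} times, candidate for caching")
```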

Off-peak routing

23% of usage is off-hours

Nearly a quarter of your AI traffic runs overnight when cheaper model tiers are available and latency doesn't matter. FORG routes off-hours batch work to lower-cost endpoints automatically.
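The routing rule described above can be sketched as a simple time check. The business-hours window and the endpoint names here are placeholders chosen for the example, not FORG settings.

```python
from datetime import time

# Assumed business hours; a real deployment would make these configurable.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)

def choose_endpoint(now: time, is_batch: bool) -> str:
    """Send batch work to a cheaper tier outside business hours; everything else stays put."""
    off_hours = not (BUSINESS_START <= now < BUSINESS_END)
    if is_batch and off_hours:
        return "low-cost-tier"   # hypothetical endpoint name
    return "standard-tier"       # hypothetical endpoint name

print(choose_endpoint(time(2, 30), is_batch=True))
print(choose_endpoint(time(10, 0), is_batch=True))
```

Interactive traffic is never rerouted in this sketch, since the cheaper tier is only worthwhile when latency doesn't matter.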

Where your AI budget goes

Token spend by model — current month

Model            Spend    Share of total
GPT-4o           $381     45%
Claude Opus      $237     28%
Claude Sonnet    $127     15%
Others           $102     12%
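The shares above follow directly from the dollar figures. A quick check, using only the numbers shown in the breakdown:

```python
# Monthly spend per model in USD, as listed in the breakdown.
spend = {"GPT-4o": 381, "Claude Opus": 237, "Claude Sonnet": 127, "Others": 102}

total = sum(spend.values())
# Share of total, rounded to the nearest whole percent.
shares = {model: round(100 * amount / total) for model, amount in spend.items()}

print(f"Total: ${total}")
for model, pct in shares.items():
    print(f"{model}: {pct}% of total")
```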

This week's recommendations

Ranked by projected savings. Updated daily.

5 active recommendations
HIGH IMPACT
Switch GPT-4o → GPT-4o-mini for code lookups
47 sessions this week used GPT-4o for simple variable lookups and autocomplete. GPT-4o-mini produces identical results at 85% lower cost.
Team: eng-platform
$312/mo
HIGH IMPACT
Cache repeated document summary calls
14 engineers summarised the same onboarding doc 47 times this week. One cached response eliminates 46 of those 47 calls (about 98%).
Team: onboarding-squad
$230/mo
QUICK WIN
Trim bloated system prompts
Average system prompt is 2,400 tokens — 1,800 are boilerplate repeated verbatim every session. Extract to a shared context block.
Team: product-ai
$87/mo
QUICK WIN
Off-hours script rate-limiting
3 automated scripts ran overnight with no rate limit, burning 1.2M tokens outside business hours. Schedule restrictions fix this today.
Team: devops
$63/mo
REVIEW
Reduce context window over-provisioning
Average context utilisation is 23%. Engineers send full repo context when only the active file is needed, so 77% of tokens are wasted.
Team: all-engineers
$44/mo

Before vs. after FORG

Monthly spend simulation based on current usage patterns

                          Current state    Change    After FORG optimisations
GPT-4o (production)       $412             −$132     $280
Claude Opus (research)    $238             −$119     $119
Claude Sonnet (drafts)    $127             −$38      $89
Automated scripts         $70              −$49      $21
Monthly total             $847                       $509

save $338/mo
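The simulation totals can be reproduced from the per-line figures. The snippet below uses only the numbers in the comparison; the variable names are chosen for the example.

```python
# Current monthly spend and projected reduction per line item, in USD.
current = {
    "GPT-4o (production)": 412,
    "Claude Opus (research)": 238,
    "Claude Sonnet (drafts)": 127,
    "Automated scripts": 70,
}
reduction = {
    "GPT-4o (production)": 132,
    "Claude Opus (research)": 119,
    "Claude Sonnet (drafts)": 38,
    "Automated scripts": 49,
}

after = {item: current[item] - reduction[item] for item in current}

total_before = sum(current.values())
total_after = sum(after.values())
monthly_saving = total_before - total_after
percent_saved = round(100 * monthly_saving / total_before)

print(f"Before: ${total_before}, after: ${total_after}")
print(f"Saving: ${monthly_saving}/mo ({percent_saved}% of current spend)")
```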

How FORG finds savings

Pure statistical ML. No LLM in the analysis path — just math.

01

Classify

ML engine categorises every session by task type, complexity tier, and output quality. Builds a ground-truth map of what each model is actually being used for.

02

Benchmark

Compares your usage patterns against optimal model/token ratios derived from aggregate anonymised data. Identifies every gap between what you're spending and what's necessary.

03

Recommend

Surfaces ranked, actionable changes with projected monthly savings and implementation effort. No vague advice — specific models, specific call sites, specific ROI.
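Ranking by projected savings amounts to a sort on the monthly figure. The sketch below applies that to the five recommendations listed earlier on this page; the tuple layout and the `rank_by_savings` helper are assumptions for illustration, and implementation effort is left out because the page gives no figures for it.

```python
# (title, team, projected monthly savings in USD), taken from this week's list.
recommendations = [
    ("Reduce context window over-provisioning", "all-engineers", 44),
    ("Cache repeated document summary calls", "onboarding-squad", 230),
    ("Trim bloated system prompts", "product-ai", 87),
    ("Switch GPT-4o to GPT-4o-mini for code lookups", "eng-platform", 312),
    ("Off-hours script rate-limiting", "devops", 63),
]

def rank_by_savings(recs):
    """Return recommendations sorted from largest to smallest projected monthly savings."""
    return sorted(recs, key=lambda rec: rec[2], reverse=True)

for title, team, monthly in rank_by_savings(recommendations):
    print(f"${monthly}/mo  {title}  ({team})")
```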

Real numbers from real teams

Anonymised at customer request. Savings independently verified.

Series B fintech
44%
cost reduction in 60 days

Switched 3 high-volume pipelines from GPT-4o to Claude Haiku after FORG classified them as simple extraction tasks.

50-person startup
$8,200
saved in first month

Caching alone accounted for $5,400 of first-month savings. FORG detected 12 repeated summarisation loops on day 1.

Enterprise AI team
11 days
to full ROI

License paid for itself before the second billing cycle. Ongoing savings now 31× the annual subscription cost.

First report

Get your first optimisation report

Install FORG, connect your team, and receive a full savings analysis within 24 hours — before you spend a cent on a subscription.