FORG — AI Control Plane

The Invisible Budget Line

Ask most engineering managers what their team spends on AI tools and you'll get one of three answers: a vague number that's almost certainly wrong, a statement that "it's just part of the tooling budget," or an honest admission that they have no idea.

This isn't incompetence. It's a structural problem. AI tool costs are diffuse by nature: Claude Code here, Cursor there, the Anthropic API called directly from a few scripts, GitHub Copilot on a few machines, the OpenAI API key in three microservices. None of these show up together in a single line item. The finance team sees the aggregate; the engineering team has no signal at all.

The Anatomy of AI Waste

We analyzed usage patterns across 50 engineering teams using FORG over a 6-month period. Here's what we found.

Idle sessions: 18% waste

Sessions opened for a quick question and left running. Claude Code maintains context as long as the session is active, and depending on how pricing works at any given point in time, that can mean ongoing costs. Across 50 teams, we found an average of 340 idle sessions per month per team — sessions that were started and never explicitly terminated after 30+ minutes of inactivity.

Wrong model tier: 13% waste

Using Opus or GPT-4o for tasks that don't require it. We measured median output token count by model across all teams: calls using top-tier models had a median output of 94 tokens — nearly identical to calls using mid-tier models. But cost 4–8× more per token. Model selection was essentially random for most teams because there was no policy and no visibility.

No prompt caching: 9% waste

Prompt caching (cache_control on large, repeated system prompts) can save 60–80% on input token costs for workloads with stable system prompts. Only 12% of teams we analyzed had caching configured correctly. The other 88% were paying full price for input tokens on every call.

Duplicate calls: 5% waste

Developers re-running the same or very similar prompts in the same session. Some of this is intentional (iterating on output), but most isn't. FORG identified that ~23% of API calls had a semantically similar call in the same session within the preceding 5 minutes.

The Cost of Not Knowing

Beyond the direct waste, there's a second-order cost: without visibility, you can't allocate AI costs to projects or teams. This means:

No project-level cost accounting.When a customer asks "what did it cost to build feature X?", you can't include AI tooling costs in the answer.
No accountability for high spenders.Some developers naturally use AI more intensively than others. Without visibility, you can't distinguish productive intensive use from waste.
No input to make-vs-buy decisions. Should you run your own LLM for certain tasks? You need cost data to answer that question.

The ROI Math

For a 20-person team spending $3,200/month on AI tools:

45% waste estimate = $1,440/month in unnecessary spend
Conservative 50% recovery with rules = $720/month saved
FORG Team plan cost = $199/month
Net savings: $521/month. ROI: 2.6× in month 1.

In practice, we see 30–50% cost reduction in the first 30 days for teams that implement budget rules and model policies. The savings grow over time as behavioral patterns shift and caching is properly configured.

The only cost is a few hours of setup and rule configuration. The FORG documentation walks through it step by step.

The Hidden Cost of AI Tools in Engineering Teams