The Hidden Cost of AI Tools in Engineering Teams
The average 20-person engineering team is spending $3,200/month on AI tools with no visibility into where it's going. We analyzed usage patterns across 50 teams to find the waste patterns — and quantify the ROI of fixing them.
Avg monthly AI spend, 20-dev team
$3,200
+28% YoY
Estimated untracked waste
45%
ROI of observability investment
11×
The Invisible Budget Line
Ask most engineering managers what their team spends on AI tools and you'll get one of three answers: a vague number that's almost certainly wrong, a statement that "it's just part of the tooling budget," or an honest admission that they have no idea.
This isn't incompetence. It's a structural problem. AI tool costs are diffuse by nature: Claude Code here, Cursor there, the Anthropic API called directly from a few scripts, GitHub Copilot on a few machines, the OpenAI API key in three microservices. None of these show up together in a single line item. The finance team sees the aggregate; the engineering team has no signal at all.
Monthly AI Spend Growth — Average 20-Person Engineering Team
Estimated composite spend including Claude Code, Cursor, direct API usage. Based on FORG customer cohort data.
The Anatomy of AI Waste
We analyzed usage patterns across 50 engineering teams using FORG over a 6-month period. Here's what we found.
Where the Money Goes (avg. team)
Idle sessions: 18% waste
Sessions opened for a quick question and left running. Claude Code maintains context as long as the session is active, and depending on how pricing works at any given point in time, that can mean ongoing costs. Across 50 teams, we found an average of 340 idle sessions per month per team — sessions that were started and never explicitly terminated after 30+ minutes of inactivity.
Wrong model tier: 13% waste
Using Opus or GPT-4o for tasks that don't require it. We measured median output token count by model across all teams: calls using top-tier models had a median output of 94 tokens — nearly identical to calls using mid-tier models. But cost 4–8× more per token. Model selection was essentially random for most teams because there was no policy and no visibility.
No prompt caching: 9% waste
Prompt caching (cache_control on large, repeated system prompts) can save 60–80% on input token costs for workloads with stable system prompts. Only 12% of teams we analyzed had caching configured correctly. The other 88% were paying full price for input tokens on every call.
Duplicate calls: 5% waste
Developers re-running the same or very similar prompts in the same session. Some of this is intentional (iterating on output), but most isn't. FORG identified that ~23% of API calls had a semantically similar call in the same session within the preceding 5 minutes.
The Cost of Not Knowing
Beyond the direct waste, there's a second-order cost: without visibility, you can't allocate AI costs to projects or teams. This means:
- No project-level cost accounting.When a customer asks "what did it cost to build feature X?", you can't include AI tooling costs in the answer.
- No accountability for high spenders.Some developers naturally use AI more intensively than others. Without visibility, you can't distinguish productive intensive use from waste.
- No input to make-vs-buy decisions. Should you run your own LLM for certain tasks? You need cost data to answer that question.
The ROI Math
For a 20-person team spending $3,200/month on AI tools:
- 45% waste estimate = $1,440/month in unnecessary spend
- Conservative 50% recovery with rules = $720/month saved
- FORG Team plan cost = $199/month
- Net savings: $521/month. ROI: 2.6× in month 1.
In practice, we see 30–50% cost reduction in the first 30 days for teams that implement budget rules and model policies. The savings grow over time as behavioral patterns shift and caching is properly configured.
The only cost is a few hours of setup and rule configuration. The FORG documentation walks through it step by step.