Skip to main content

AI Readiness Assessment

A 12-question maturity quiz scoring your team's readiness for agentic development.

100% client-side⎘ exportable output⌁ zero network calls
Question 1 of 12 · ToolingTooling
How standardized is your AI tool setup?
Scaling

16/24Repeatable processes are spreading. Focus on measurement and enforcement.

Per-dimension score

Tooling4/6
Process4/6
Security4/6
Culture4/6

No dimension below 4/6 — maintain the loop and re-verify quarterly.

markdown export, no lock-in
100%
generated locally
0
signup walls
0
network requests per keystroke

How it works

This assessment scores your team's readiness for agentic development with twelve questions across four dimensions: tooling, process, security and culture. Each question has three answers worth zero, one or two points — pick the one closest to your reality, not your aspiration. The result maps your total to one of four maturity levels (Exploring, Adopting, Scaling, Optimizing), breaks the score down per dimension with bars, and names concrete first moves for every dimension that lags. Your answers are encoded in the share link and never leave your browser.

The four-dimension structure exists because readiness is a weakest-link property. Averages flatter: a team that aces tooling and fails security averages to "adopting" while actually being one paste event away from an incident. Tooling asks whether sanctioned, configured, documented setups exist. Process asks how AI-assisted code is reviewed, how knowledge spreads, and whether anyone measures outcomes. Security asks about data rules, secret scanning and budget controls. Culture asks whether people can learn openly, whether champions exist, and how leadership treats the whole topic.

Score honestly and the gaps become a roadmap. The recommendations under the bars are deliberately small and specific — standardize one setup, write the review rule, turn on secret scanning, give champions time — because maturity moves through completed quarters, not announced initiatives. Re-run the assessment quarterly: a dimension that has not moved in two quarters despite attention is telling you the fix you chose is not the fix it needs.

One pattern worth anticipating: most teams score weakest on the measurement questions inside process and security — who tracks usage, who owns budgets, what happens at the cap. That is unsurprising, because measurement is the part that needs infrastructure rather than intent. It is also the part FORG handles directly: per-developer usage tracking and enforceable budget caps are two of the twelve answers here, available as tooling rather than aspiration. The other ten you still have to earn the slow way.

Frequently asked questions

What do the four maturity levels actually mean?

Exploring means AI usage is individual and informal — no shared setup, no policy, no measurement. Adopting means the basics exist: sanctioned tools, some documentation, early guardrails, but unevenly applied. Scaling means the org has repeatable processes — onboarding, budget controls, review standards — and is extending them to every team. Optimizing means the loop is closed: usage, quality and cost are measured continuously and the data drives quarterly decisions about tools, training and spend.

Why are the questions split across tooling, process, security and culture?

Because readiness fails at its weakest dimension, not its average. A team with excellent tooling but no review process ships unreviewed AI code; a team with strong security rules but a hostile culture drives usage underground where the rules cannot see it. Scoring each dimension separately turns a vague 'are we ready?' into a specific 'security is two levels behind everything else' — and a specific gap has an owner and a fix.

What should we do with a low score in one dimension?

Treat it as the bottleneck and fix it before scaling anything else, because rollout pace is set by the weakest dimension. The results panel names concrete first moves per gap: for tooling, standardize one sanctioned setup and document it; for process, define review standards for AI-assisted code; for security, get secret scanning and data rules in place; for culture, recruit champions and make learning time legitimate. One quarter of focused work typically moves a dimension a full level.

Who should take this assessment — one person or the whole team?

Several people independently, then compare. The most useful signal is often the disagreement: when engineering managers score process two points higher than the engineers who live in it, that gap is itself a finding. Have each person complete the twelve questions without conferring, share results with the link button, and discuss the dimensions where answers diverge. Consensus scores hide exactly the friction you need to surface.

How often should we re-assess?

Quarterly, aligned with whatever planning cadence you already run. The assessment takes two minutes, so the cost is negligible, and the value compounds: a flat score across two quarters in a dimension you claimed to be fixing is an honest signal that the fix is not landing. Keep the share links from each run — the URL encodes your answers — and you have a free longitudinal record of the rollout without standing up any tracking infrastructure.

Budgeting AI spend for a team? FORG is $15/dev with hard budget caps and per-seat attribution.

See FORG pricing