The standard

How the CAI is computed.

The Codebase Assurance Index rolls ten lens scores into one 0–100 number under a published, versioned rubric. This page is the authoritative specification of that fold — what contributes, what re-normalises, what caps the headline, and what can never move the number.

The roll-up

A weighted roll-up — not an opaque average.

Every lens score is itself a fold of scored dimensions; the headline is a weighted roll-up of the lenses under the frozen rubric. You can see inside every step.

Weighted, not averaged

The CAI is a weighted roll-up of the lens scores under the rubric in force, with every weight visible, rather than an opaque average. The weights are published with each rubric version.

Core always counts

The five always-on lenses — Code Health, Architecture, Maturity, Readiness, Security & Compliance — always contribute to the headline, on every codebase.

Conditional lenses re-normalise

The five conditional lenses (Domain Modelling, Event-Driven, Event Sourcing, Accessibility, Performance) contribute only when the code calls for them. When a lens does not apply, the weights of the remaining lenses re-normalise, so a repo is never penalised for a lens that doesn't apply.
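The re-normalisation described above can be sketched in a few lines. This is a minimal illustration, not the published fold: the weight values and the `roll_up` function are hypothetical, and the real weights are whatever the rubric version in force publishes.

```python
# The five always-on lenses named in this specification.
CORE = ("Code Health", "Architecture", "Maturity", "Readiness",
        "Security & Compliance")

# HYPOTHETICAL weights, for illustration only. The real values are
# published with each rubric version.
WEIGHTS = {
    "Code Health": 0.12, "Architecture": 0.12, "Maturity": 0.12,
    "Readiness": 0.12, "Security & Compliance": 0.12,
    "Domain Modelling": 0.08, "Event-Driven": 0.08, "Event Sourcing": 0.08,
    "Accessibility": 0.08, "Performance": 0.08,
}


def roll_up(lens_scores: dict[str, float],
            weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted roll-up over the lenses that apply to this codebase.

    `lens_scores` holds only the applicable lenses. Dividing by the sum of
    their weights re-normalises, so an absent conditional lens neither
    helps nor hurts the result.
    """
    missing = [lens for lens in CORE if lens not in lens_scores]
    if missing:
        raise ValueError(f"always-on lenses must be scored: {missing}")
    total_weight = sum(weights[lens] for lens in lens_scores)
    weighted_sum = sum(weights[lens] * score
                       for lens, score in lens_scores.items())
    return weighted_sum / total_weight
```

Under these assumed weights, a repo scored on the five core lenses alone is divided by 0.60 rather than 1.00, which is why the lenses that don't apply cost it nothing.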

The presentation bands

Five bands, cut at 25 / 50 / 70 / 90.

Every CAI renders on the same fixed worst→best scale — Critical / Weak / Adequate / Strong / Exemplary — with a Score pin at the exact spot. Position on the fixed scale *is* the reading; banding is presentation only and never moves a number.

Critical | Weak | Adequate | Strong | Exemplary
The pin above marks 62 — Adequate, upper third. The same five hues and cutlines render on every card, report and registry surface.
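As a sketch of how the cutlines map a number to a band: the function below uses the 25 / 50 / 70 / 90 cuts stated above. One detail is an assumption, since this page does not say which band a score sitting exactly on a cutline belongs to; here it is placed in the higher band.

```python
from bisect import bisect_right

CUTLINES = (25, 50, 70, 90)
BANDS = ("Critical", "Weak", "Adequate", "Strong", "Exemplary")


def band(score: float) -> str:
    """Return the presentation band for a 0-100 score.

    ASSUMPTION: a score exactly on a cutline reads as the higher band.
    Banding is presentation only, so this never changes the score itself.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be within 0-100")
    # bisect_right counts how many cutlines are <= score, which is the
    # index of the band the score falls in.
    return BANDS[bisect_right(CUTLINES, score)]


print(band(62))  # Adequate
```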
Bounds on the headline

The one number can't hide a serious failure.

The cap

A critical lens caps the headline.

The roll-up can't read Strong while a lens reads Critical: a single critical-band lens caps the CAI, so strong scores elsewhere can never paper over a serious failure in one dimension of the system.
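A minimal sketch of the cap, under a stated assumption: this page says only that the headline cannot read Strong while any lens reads Critical, so the exact cap value used here (one point below the Strong cutline) is hypothetical, as is the function name.

```python
CRITICAL_CUTLINE = 25  # below this, a lens reads Critical
STRONG_CUTLINE = 70    # at or above this, a score reads Strong


def capped_headline(raw_cai: float,
                    lens_scores: dict[str, float],
                    cap: float = STRONG_CUTLINE - 1) -> float:
    """Apply the critical-lens cap to a raw roll-up.

    ASSUMPTION: the cap sits just below the Strong cutline. The
    specification states the effect (no Strong headline alongside a
    Critical lens), not the numeric cap.
    """
    any_critical = any(score < CRITICAL_CUTLINE
                       for score in lens_scores.values())
    return min(raw_cai, cap) if any_critical else raw_cai
```

The cap only ever lowers a headline: a raw roll-up that already sits below the cap is returned unchanged.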

The floor

The contract floor is decomposable.

A contract floor of CAI ≥ 80 is not an opaque threshold: it means every always-on lens reads Strong or better and no lens reads Critical. Either party to the contract can check what the floor is made of.
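The two conditions behind the floor can be checked one by one, which is what makes it decomposable. The sketch below is illustrative: the function names are hypothetical, and scores exactly on a cutline are assumed to read as the higher band.

```python
CORE = ("Code Health", "Architecture", "Maturity", "Readiness",
        "Security & Compliance")
CRITICAL_CUTLINE = 25
STRONG_CUTLINE = 70


def floor_checks(lens_scores: dict[str, float]) -> dict[str, bool]:
    """Evaluate each condition the floor is made of, separately."""
    return {
        "every always-on lens reads Strong or better": all(
            lens_scores[lens] >= STRONG_CUTLINE for lens in CORE),
        "no lens reads Critical": all(
            score >= CRITICAL_CUTLINE for score in lens_scores.values()),
    }


def meets_floor(lens_scores: dict[str, float]) -> bool:
    """True only when every individual condition holds."""
    return all(floor_checks(lens_scores).values())
```

Returning the individual checks rather than a single boolean lets either party see which condition failed, not just that the floor was missed.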

The firewall

Nothing moves the number but the code.

The deterministic score sits on one side of a firewall; everything advisory sits on the other, and nothing advisory can ever cross it.

Deterministic vs advisory

The AI only ever advises.

Most dimensions are measured by deterministic tools reading the code. A few carry an advisory, tolerance-banded LLM read that explains the finding in plain English and can never, by construction, move the headline number. The measurement stays pure.

Scores sacrosanct

Inputs never score.

Compliance declarations, suppressed findings and contract profiles change what an artifact says, never the CAI. Neither party to a contract can tilt the number; only the code changing moves it (or a disclosed security-advisory refresh, such as a newly published CVE).
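One way to picture "by construction" is a data shape in which the score is fixed at creation and later inputs can only add text. This is a conceptual sketch with hypothetical names (`Artifact`, `annotate`), not a description of how the firewall is actually implemented.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Artifact:
    cai: float                    # deterministic headline, set once by the fold
    notes: tuple[str, ...] = ()   # advisory reads, declarations, suppressions


def annotate(artifact: Artifact, note: str) -> Artifact:
    """Attach an advisory note or declared input.

    The function has no parameter through which a score could arrive, so
    the returned artifact always carries the original `cai`.
    """
    return replace(artifact, notes=artifact.notes + (note,))


scored = Artifact(cai=62.0)
annotated = annotate(scored, "Declared: SOC 2 controls in scope")
assert annotated.cai == scored.cai  # the number is untouched
```

Because the dataclass is frozen, assigning to `annotated.cai` directly raises `FrozenInstanceError`.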

The behavioural dimensions

Four dimensions are read from git history.

The standard's vocabulary includes behavioural dimensions mined from the repository's history — part of the same deterministic fold.

Hotspots

Where change concentrates — churn × complexity, the files most likely to hurt next.

Bus factor

Key-person risk — which modules depend on one contributor's knowledge.

Knowledge freshness

Code whose knowledgeable contributors have all gone quiet.

Change coupling

Files that change together without a declared dependency — hidden structure the code doesn't admit to.
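Two of these dimensions lend themselves to a short sketch. The code below assumes the history has already been reduced to one set of file paths per commit and that a per-file complexity figure is available. Both inputs and both function names are hypothetical, and the standard's actual measures may weight or window the history differently.

```python
from collections import Counter
from itertools import combinations


def hotspots(commits: list[set[str]],
             complexity: dict[str, float]) -> list[tuple[str, float]]:
    """Rank files by churn x complexity, highest first.

    Churn here is simply the number of commits that touched the file.
    """
    churn = Counter(path for files in commits for path in files)
    scores = {path: churn[path] * complexity.get(path, 0) for path in churn}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def change_coupling(commits: list[set[str]],
                    declared: frozenset = frozenset()) -> dict:
    """Count how often each pair of files changes in the same commit,
    skipping pairs that already have a declared dependency."""
    pairs = Counter()
    for files in commits:
        # Sorting gives each pair one canonical (a, b) ordering.
        for pair in combinations(sorted(files), 2):
            pairs[pair] += 1
    return {pair: n for pair, n in pairs.items() if pair not in declared}


# Hypothetical history: three commits over three files.
history = [{"a.py", "b.py"}, {"a.py", "b.py"}, {"a.py", "c.py"}]
print(hotspots(history, {"a.py": 10, "b.py": 2, "c.py": 5}))
print(change_coupling(history, declared=frozenset({("a.py", "c.py")})))
```

In this toy history `a.py` ranks first because it is both the most frequently changed and the most complex file, and `a.py` / `b.py` surface as coupled because no dependency between them was declared.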

This page is the authority. Watchdog's methodology page summarises this specification for its own audience and links here. Where a summary and this specification disagree, the specification wins.

Read the vocabulary, then check a number against it.

Rubric versioning and contestability → /rubric