A confidence number without a per-dimension breakdown and a recommended action isn't auditable. It's a vibe with a decimal point. This is the scoring engine that started inside BoardPath, extracted into a zero-dependency npm package because the most reusable part of a product is the part that doesn't know which product it's in.
BoardPath had a scoring engine before it had paying customers. Every answer the system gave a board came back with a 0–100 confidence score and a per-dimension breakdown — authority, grounding, retrieval quality, and the rest. It was the part of the product that made a skeptic willing to act on an AI answer.
Then the same logic kept getting reached for in other projects. A forensic document tool wanted it. A side experiment wanted it. Each time, a slightly different copy of the same scorer got pasted in — and each copy drifted a little further from the others. A bug fixed in one didn't reach the others. A new dimension added in one made the rest quietly out of date.
The logic wasn't specific to any of the three products it lived inside — which is exactly why it shouldn't have lived inside any of them. That's the moment a feature is telling you it wants to be a library.
The scoring engine never knew what a CC&R was. It didn't retrieve anything. It didn't call a model. It took signals a RAG pipeline already produces — retrieval scores, document metadata, citation overlap, how much the retrieved chunks agreed with each other — and composed them into one auditable number with a reason attached to every point.
The governance lived in which documents got ranked how. The scoring was
domain-agnostic the whole time — the boundary just hadn't been drawn. Drawing it meant
pulling the scorer out and publishing it as transparent-confidence.
Every scorecard is built from eight independently weighted dimensions, each with its own reason string so the final number can always be taken apart and defended:
It ships as dual ESM/CJS, fully typed, with 412 tests and versioned algorithm schemas so a score computed today can be reproduced tomorrow. The whole library is small enough that there is nothing to audit but the package itself.
Extracting the scorer forced an honesty that was easy to skip while it was buried in a product. Three decisions defined what it became.
If the job is scoring and policy — not retrieval, not inference — it should run with what the caller already has. No ML stack, no server, no model calls. For a library whose entire purpose is trust, a dependency tree is a liability. Zero-dependency is a hard architectural constraint here, not a preference.
This is the line that separates it from RAGAs, TruLens, and DeepEval. Those are evaluation
frameworks — they run offline, after the fact, and call an LLM to judge answer quality.
transparent-confidence runs inline, the moment the system answers,
using signals already in hand. No extra round-trip. It sits next to the eval tools, not on
top of them.
Every scorecard comes back with a recommendation — answer,
review, or abstain — and a reason string. A score you have
to interpret is a score you'll ignore under load. The point is to gate on it: drop below
your threshold and the question routes to a human before the user ever sees the response.
The most important thing one of these systems can say is the corpus does not address
this — and the package is built so it can say that out loud.
Extraction made the dimension set cleaner and the weights documented. It also gave the library a calibration story it didn't have before: the score is not a probability of correctness until you calibrate it against your own labeled outcomes — and the README now says exactly that, because a confidence number that overpromises is worse than no number at all.
BoardPath still uses it. So does everything else here that touches retrieval. The copy-paste drift is gone, because there's one source now — v0.3, Apache-2.0, published on npm with its algorithm docs and dimensions open for inspection. It wasn't about scoring. It was about boundaries.
Repo: github.com/emtcmca/transparent-confidence · Companion write-up: The Confidence Layer Didn't Belong to BoardPath