Most organisations deploying AI in regulated sectors underestimate what governance actually demands. They think in terms of policies — a document, an approval step, a sign-off from legal. What auditors, regulators, and CISOs are increasingly asking for is something more rigorous: a complete, verifiable record of what AI is running, who authorised it, what it was trained on, and what happens when it produces an output that matters.

This article covers what that record looks like in practice, where most organisations fall short, and what a defensible governance posture actually requires.

The three things regulators want to see

Across financial services, healthcare, and professional services, regulatory interest in AI has converged on three core questions:

  1. Can you explain a decision? For AI-assisted decisions that affect people or organisations — credit, risk classification, clinical triage, legal research — regulators want a documented reasoning chain. Not just "the model said so."
  2. Can you prove the model is appropriate for the use case? This means documented evaluation, evidence that the model was tested against domain-specific failure modes, and records of who reviewed those tests.
  3. Can you demonstrate that controls are in place? Who can invoke the model? What data does it see? What outputs can it produce? What happens if it produces something wrong?

These questions are not hypothetical. They are the basis of supervisory review under frameworks including the EU AI Act (for high-risk systems), FCA guidance on model risk in financial services, and NHS digital governance standards in the UK.

What "documentation" actually means here

The word documentation gets used loosely. In a governance context, it has a specific meaning:

  • Model cards — a structured summary of what the model does, its intended use, known limitations, and evaluation results. Originally a Google research convention, now expected by risk teams and increasingly by regulators.
  • Data lineage records — where training or retrieval data came from, when it was last updated, and what preprocessing was applied. For retrieval-augmented systems, this includes the document corpus.
  • Change logs — a version-controlled history of model changes, prompt changes, and configuration changes. When a model was updated matters as much as what it was updated to.
  • Risk assessments — a documented evaluation of what can go wrong, at what frequency, with what consequence. For high-risk systems under the EU AI Act, this is mandatory. For everything else, it is increasingly expected.
  • Access and audit logs — who used the system, when, with what inputs, and with what outputs. These need to be retained and queryable.

The question is not whether your AI works. The question is whether you can demonstrate, after the fact, that it worked as intended and that you had controls in place to catch the cases where it did not.
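
As a rough illustration, the five artefact types listed above can be treated as a checklist per system. The sketch below is one possible shape, not a standard: the `Artefact` enum and the `missing_artefacts` helper are names invented for this example.

```python
from enum import Enum


class Artefact(Enum):
    """The five documentation artefacts described above (names are illustrative)."""
    MODEL_CARD = "model card"
    DATA_LINEAGE = "data lineage record"
    CHANGE_LOG = "change log"
    RISK_ASSESSMENT = "risk assessment"
    AUDIT_LOG = "access and audit log"


def missing_artefacts(present: set[Artefact]) -> list[str]:
    """Return the artefacts not yet on file, in the order they are defined."""
    return [a.value for a in Artefact if a not in present]


# Hypothetical system with only two artefacts on file.
on_file = {Artefact.MODEL_CARD, Artefact.CHANGE_LOG}
print(missing_artefacts(on_file))
# ['data lineage record', 'risk assessment', 'access and audit log']
```

A check like this says nothing about the quality of each artefact, only whether it exists. That is still a useful first pass when documentation is scattered across teams.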

Where most organisations fall short

The most common failure is not bad governance but invisible governance. AI tools get deployed through procurement, through individual teams spinning up API access, or through SaaS products that add "AI features" in a minor update. None of these routes naturally produces governance documentation. By the time an organisation tries to conduct an inventory, it often finds that it cannot answer the most basic question: what AI is actually running?

The second failure is treating governance as a one-time exercise. Models change. Prompts drift. Data pipelines update. An accurate governance record from six months ago may not reflect the current state of the system. Governance needs to be continuous, not episodic.

The third failure is confusing policy with process. A policy that says "all AI deployments must be reviewed by the risk committee" is not governance. Governance is the actual review record, the approval artefact, and the monitoring process that catches drift after approval.

What a defensible posture looks like

A defensible AI governance posture in a regulated environment has four characteristics:

1. A complete inventory

Every AI system in use — including third-party tools with AI features — is catalogued. The catalogue includes the vendor, the model, the version, the data it accesses, the decisions it informs, and the business owner responsible for it.
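
A catalogue entry with those fields might be sketched as follows. This is a minimal example under stated assumptions: the `InventoryEntry` class, its field names, and the sample values are all hypothetical, and a real register would likely live in a database or GRC tool rather than in code.

```python
from dataclasses import dataclass, fields


@dataclass
class InventoryEntry:
    """One row in the AI inventory; field names are illustrative."""
    system_name: str
    vendor: str
    model: str
    version: str
    data_accessed: list[str]
    decisions_informed: list[str]
    business_owner: str


def incomplete_fields(entry: InventoryEntry) -> list[str]:
    """List the fields left empty, so gaps in the catalogue are visible."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name)]


# Hypothetical third-party tool discovered during an inventory exercise.
entry = InventoryEntry(
    system_name="email-grammar-assistant",
    vendor="ExampleVendor",
    model="example-model",
    version="",                      # vendor has not disclosed the version
    data_accessed=["outbound email drafts"],
    decisions_informed=[],           # not yet assessed
    business_owner="",               # nobody has claimed ownership
)
print(incomplete_fields(entry))
# ['version', 'decisions_informed', 'business_owner']
```

Surfacing empty fields matters as much as filling them: an entry with no business owner is itself a governance finding.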

2. Risk-tiered controls

Not every AI system warrants the same governance overhead. A grammar correction tool in an email client is different from a model that assists with credit underwriting. A tiered framework applies proportionate controls based on two questions: how consequential is an error, and how reversible is it?
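
One way to make the tiering repeatable is to score each system on those two questions and map the result to a tier. The sketch below is illustrative only: the three-level scale, the multiplication, and the thresholds are assumptions chosen for the example, not a regulatory formula.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def risk_tier(consequence: Level, irreversibility: Level) -> str:
    """Map consequence and irreversibility of an error to a governance tier.

    The thresholds are illustrative assumptions; each organisation
    would calibrate its own.
    """
    score = consequence * irreversibility  # ranges from 1 to 9
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


# Grammar correction in an email client: minor error, easily undone.
print(risk_tier(Level.LOW, Level.LOW))       # low
# Credit underwriting support: serious error, hard to fully reverse.
print(risk_tier(Level.HIGH, Level.MEDIUM))   # high
```

The value of writing the rule down is less the arithmetic than the fact that two reviewers assessing the same system should arrive at the same tier.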

3. An audit trail that survives a question

If a regulator, a senior stakeholder, or an aggrieved customer asks "what did the AI do and why," the answer should be producible within hours, not weeks. This means structured logging, version-controlled prompts, and records that tie outputs to the model configuration that produced them.
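
Tying an output to its configuration can be as simple as logging a fingerprint of that configuration alongside every call. The following is a minimal sketch using only the Python standard library. The function names, record fields, and sample values are assumptions for illustration, and a production system would also need retention, access control, and handling of personal data in the logged inputs.

```python
import hashlib
import json
from datetime import datetime, timezone


def config_fingerprint(model: str, version: str, prompt_template: str, params: dict) -> str:
    """Hash the full configuration so that any change yields a different fingerprint."""
    canonical = json.dumps(
        {"model": model, "version": version, "prompt": prompt_template, "params": params},
        sort_keys=True,  # stable key order, so equal configs hash equally
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_record(user: str, input_text: str, output_text: str, fingerprint: str) -> str:
    """Build one JSON line linking an output to the configuration that produced it."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "input": input_text,
        "output": output_text,
        "config_fingerprint": fingerprint,
    })


# Hypothetical configuration and call.
fp = config_fingerprint("example-model", "2.1", "Summarise: {document}", {"temperature": 0.0})
print(audit_record("analyst-17", "Summarise: ...", "Summary text", fp))
```

If the fingerprint is also stored in the change log next to each approved configuration, a question about a specific output can be traced to the exact prompt and model version in force at the time.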

4. A review cadence

Governance reviews happen on a schedule, not just at deployment. Quarterly is typical for high-risk systems. The review should cover: has anything changed, have there been incidents, and is the risk assessment still current?
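
A schedule is easier to keep when overdue reviews are flagged automatically. In the sketch below, the 90-day interval for high-risk systems reflects the quarterly cadence mentioned above. The intervals for the other tiers, the tier names, and the `review_overdue` function are assumptions made for this example.

```python
from datetime import date, timedelta

# Quarterly for high-risk systems; the other intervals are illustrative assumptions.
REVIEW_INTERVAL_DAYS = {"high": 90, "medium": 180, "low": 365}


def review_overdue(tier: str, last_review: date, today: date) -> bool:
    """Return True if more time has passed since the last review than the tier allows."""
    return today - last_review > timedelta(days=REVIEW_INTERVAL_DAYS[tier])


# Hypothetical high-risk system last reviewed on 1 January 2024.
print(review_overdue("high", date(2024, 1, 1), date(2024, 5, 1)))  # True
print(review_overdue("high", date(2024, 1, 1), date(2024, 3, 1)))  # False
```

A calendar check only triggers the review. The review itself still has to answer the three questions above.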

The practical starting point

For most organisations, the starting point is not a governance framework — it is an inventory. You cannot govern what you have not catalogued. A readiness assessment that maps every AI system in use, identifies the ones that carry regulatory exposure, and documents their current governance state is the foundation everything else is built on.

That is the exercise we run with clients. It typically takes two to four weeks and produces a register that is immediately useful to risk, legal, and technology teams — and that forms the basis of a governance programme that can scale.