Model risk has joined the short list of risk categories that boards now have to engage with directly, regardless of industry. The traditional governance categories (financial risk, operational risk, regulatory risk, cyber risk) all have well-established review patterns. Model risk is newer and has not yet settled into a comparable rhythm. Boards are asking about it more than they were a year ago. They are still mostly asking the wrong questions.
This is a working version of the three questions about model risk that a board should ask management every quarter, what good answers look like, and which answers should be challenged as unacceptable.
Question one: what models are in production, and who signed off
The first question is the simplest and the one most often missing. The board should ask, every quarter, for an inventory of the AI and machine learning models in production at the organization. The inventory should include what each model does, what data it relies on, who owns it, who approved its deployment, and when it was last reviewed.
The question is simple. The answer is often not. Many organizations cannot produce this list with confidence. Models are deployed by individual teams without central tracking. Vendor-provided models are integrated without the organization fully understanding what they do. Models that were proofs of concept a year ago are quietly serving production traffic now.
A good answer to this question includes a complete list, with named owners, with dates, and with a clear sign-off trail. The list distinguishes between high-impact models (those affecting customer-facing decisions, regulatory positions, or material business outcomes) and low-impact ones. The high-impact models have explicit governance attached.
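As a concrete illustration, the inventory the board should be able to request is little more than a structured record per model. The sketch below is a hypothetical schema, not a standard; the field names and the example entry are assumptions, and the point is only that each entry carries a named owner, an approver, dates, and an impact tier.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class ImpactTier(Enum):
    HIGH = "high"  # affects customer-facing decisions, regulatory positions, or material outcomes
    LOW = "low"


@dataclass
class ModelInventoryEntry:
    """One row in the production model inventory a board should be able to request."""
    model_name: str
    purpose: str                  # what the model does, in business terms
    data_dependencies: list[str]  # the data sources the model relies on
    owner: str                    # named individual accountable for the model
    approved_by: str              # who signed off on deployment
    deployed_on: date
    last_reviewed_on: date
    impact_tier: ImpactTier


# Purely illustrative entry; names, dates, and systems are placeholders.
example_entry = ModelInventoryEntry(
    model_name="credit-line-increase-v3",
    purpose="Recommends credit line increases for existing customers",
    data_dependencies=["core_banking.accounts", "bureau_feed.scores"],
    owner="Head of Consumer Credit Analytics",
    approved_by="Model Risk Committee",
    deployed_on=date(2024, 9, 12),
    last_reviewed_on=date(2025, 3, 1),
    impact_tier=ImpactTier.HIGH,
)
```

A list of records like this, kept current and reviewed, is what separates "we have an inventory" from "we have a sense of the major models."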
An unacceptable answer is "we are still building the inventory" or "we have a sense of the major models." Either answer means the organization does not actually know what is in production. That is a model risk in itself, and the board should be uncomfortable with it.
The board's pressure on this question, repeated quarter after quarter, is what produces the inventory. Without the pressure, the inventory remains aspirational and the actual list of production models grows in the background.
Question two: how do we know each model is still doing what it was supposed to do
The second question is about ongoing model behavior. A model that worked acceptably at deployment can drift over time as the data it sees changes, as the model is retrained, or as the upstream systems it depends on evolve. The board should ask how the organization is monitoring each high-impact model for drift, and what the criteria are for taking corrective action.
A good answer includes specifics. Each high-impact model has metrics that are tracked over time. Each metric has thresholds that trigger investigation when crossed. There is a documented process for investigating a flagged model, including who is responsible, what the timeline is, and what the corrective options are. There is a record of past flagged events and how they were resolved.
The metrics depend on the model. For a credit decisioning model, the metrics include approval rates by segment, default rates by segment, and the fairness metrics required by the relevant regulators. For a fraud detection model, the metrics include false positive rates, false negative rates, and customer complaint volume. For a content recommendation model, the metrics include engagement, complaint rates, and content category distributions.
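Whatever the metrics are, the monitoring discipline reduces to the same mechanical shape: track each metric against an explicit threshold relative to a baseline, and route any breach to a named owner with a deadline. The sketch below is a minimal illustration of that shape; the metric names, thresholds, and alerting hook are hypothetical assumptions, not a prescribed framework.

```python
# Minimal sketch of threshold-based drift monitoring for one high-impact model.
# Metric names and thresholds are illustrative placeholders.

THRESHOLDS = {
    "approval_rate": 0.05,     # max allowed absolute change vs. deployment baseline
    "false_positive_rate": 0.03,
    "complaint_rate": 0.01,
}


def check_model_drift(current: dict[str, float], baseline: dict[str, float]) -> list[str]:
    """Return the metrics whose drift from baseline crossed the threshold.

    Assumes both dicts contain a value for every tracked metric.
    """
    flagged = []
    for metric, threshold in THRESHOLDS.items():
        drift = abs(current[metric] - baseline[metric])
        if drift > threshold:
            flagged.append(metric)
    return flagged


def handle_flags(model_name: str, flagged: list[str]) -> None:
    """Each flagged metric opens an investigation owned by a named person."""
    for metric in flagged:
        # In practice this would open a tracked investigation assigned to the model
        # owner, with a documented timeline and corrective options
        # (retrain, rollback, override).
        print(f"[{model_name}] {metric} crossed its drift threshold -> open investigation")
```

The value is not in the code but in what it forces the organization to write down: which metrics, which thresholds, and who is on the hook when one is crossed.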
An unacceptable answer is "we trust the vendor" or "we run the standard reporting." Vendor monitoring is fine for some risk categories and not others. Standard reporting tells the team what the model is doing on average, not whether the model is failing in ways the team should care about.
The follow-up the board should ask is how the organization would notice if one of its high-impact models started behaving badly. The answer should point to a specific, named monitoring discipline, not a general claim that "we would see it."
Question three: what is our exposure if the model is wrong, and how is it bounded
The third question is about consequences. Every high-impact model has the potential to be wrong. The question for the board is what the organization's exposure looks like in the cases where the model is wrong, and what is in place to bound the exposure.
A good answer treats this question seriously and produces specifics. For each high-impact model, the answer includes the worst plausible outcome of the model being wrong (regulatory enforcement action, customer harm, financial loss, reputational damage), the steps that bound the outcome (human review on edge cases, override paths, rollback capabilities, audit trails sufficient to demonstrate due care), and the steps that recover from the outcome (incident response, customer remediation, regulatory disclosure).
The bounds matter as much as the model itself. A model that is occasionally wrong, but whose errors are caught and corrected before they affect customers, presents a different risk profile from the same model with no override path. Boards should look for explicit bounds, not implicit ones.
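One common form an explicit bound takes is routing low-confidence or edge-case decisions to human review rather than letting the model act unilaterally. The sketch below illustrates that pattern under assumed names and thresholds; it is not a description of any particular organization's controls.

```python
# Sketch of one explicit bound: the model's decision takes effect automatically only
# when confidence is high and the case is not flagged as an edge case; everything else
# is routed to a person. The threshold and field names are illustrative assumptions.

CONFIDENCE_FLOOR = 0.90


def apply_decision(prediction: str, confidence: float, is_edge_case: bool) -> str:
    if confidence >= CONFIDENCE_FLOOR and not is_edge_case:
        return f"auto:{prediction}"   # automated path, recorded in the audit trail
    return "route_to_human_review"    # bounded path: a person decides, the model only advises
```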
An unacceptable answer is "we have not seen any major problems yet" or "we have insurance." The first is a statement about luck. The second is a statement about external recovery, not internal control. Both are signs that the organization has not actually thought about the question.
The right pattern for the board is to ask this question about a specific high-impact model, in detail, each quarter, rotating to a different model each time. Depth of answer matters more than breadth. A board that asks for comprehensive coverage of every model usually receives shallow, comprehensive answers. A board that asks about one specific model in depth tends to surface real issues, which then improve the answer for the next model.
What these three questions accomplish
A board that asks these three questions consistently and presses for substantive answers tends to produce a few outcomes over time.
The model inventory becomes real. The list exists, is current, and is reviewed by people who can actually evaluate the entries.
The monitoring becomes substantive. Each high-impact model has specific metrics, specific thresholds, and specific people accountable for the result.
The risk thinking becomes explicit. The conversation about each model is not "is this AI cool" or "is this AI scary." It is "what is the model doing, what could go wrong, what is in place to catch the error, and what happens if the catch fails."
These outcomes are the substance of model risk governance. They are not produced by hiring an AI ethics officer or by writing a policy document. They are produced by the board, repeatedly, asking the questions and not accepting the easy answers.
The relationship to the rest of the governance work
Model risk does not exist in isolation. It interacts with the organization's broader risk and governance posture in ways that the board should keep visible.
Information security policies need to address how AI systems handle and expose data. The model inventory should map to the data inventory. The monitoring should integrate with the existing security operations.
Regulatory engagement needs to anticipate AI-specific regulations including the EU AI Act, evolving SEC and federal banking guidance, and state-level developments. The compliance team's familiarity with the model inventory determines whether the organization can respond quickly to new requirements.
Financial reporting needs to address the materiality of AI-related risks. The board's audit committee should be in the loop on model risk, not just the technology committee.
Vendor management needs to extend to AI vendors. Most organizations are using AI primarily through third parties. The organization's exposure to a vendor's model failure is a vendor management question, and a model risk question, and the two need to be coordinated.
A board that integrates model risk with the rest of its governance work produces a posture that holds up under pressure. A board that treats model risk as a separate technology topic produces a posture that has model risk policies in one place and operational decisions about AI elsewhere, with the gap between them being where the actual risk lives.
The work to integrate this is not heavy. The discipline to do it consistently is the harder part. The boards that do it well are usually the ones whose committees have explicit calendar time for the topic, whose questions have been refined over multiple quarters, and whose management teams have come to expect substantive engagement. That expectation, more than any policy document, is what produces good model risk governance.