AI Governance. From Risk Framework to Regulatory Compliance.
AI governance is a layered discipline spanning voluntary risk frameworks, generative AI profiles, security control overlays, measurement methodologies, and binding regulation. NIST AI RMF establishes the foundation. NIST AI 600-1 addresses generative AI risks. COSAiS maps AI concerns to NIST 800-53 controls. The EU AI Act imposes legal obligations by risk tier. NIST IR 8596 defines how to measure AI trustworthiness. This guide connects them into a single governance lifecycle.
AI Governance
A governance lifecycle that connects risk management, security controls, measurement, and regulation.
AI systems introduce risks that traditional IT governance was not designed to address: emergent behavior, opaque decision-making, training data contamination, hallucination, and adversarial manipulation. The AI governance landscape has matured rapidly since 2023, producing a layered stack of frameworks, profiles, overlays, and regulations that build on each other. This guide walks through the complete governance lifecycle from foundational risk management through regulatory compliance, with each layer adding specificity and enforceability to the one before it.
AI governance exists because AI systems behave differently from traditional software. A misconfigured firewall fails in predictable ways. A misconfigured AI model can produce outputs that are biased, harmful, legally problematic, or factually wrong in ways that are difficult to detect, reproduce, or attribute to a specific technical failure. Traditional IT governance assumes deterministic behavior: define a policy, implement a control, verify the control operates as designed. AI systems are probabilistic. They produce different outputs for similar inputs. They learn from data that may contain biases invisible to the teams that curated it. They can be manipulated through adversarial inputs that exploit statistical vulnerabilities in model architectures. The governance challenge is not that existing frameworks are wrong. The challenge is that they are incomplete. Access control, encryption, logging, and patch management remain necessary. They are not sufficient. AI governance adds layers that address risks unique to machine learning systems: fairness, explainability, robustness, data provenance, and emergent behavior that was never explicitly programmed.
The governance landscape has produced five distinct layers since 2023, each building on the one before it. The NIST AI Risk Management Framework (AI RMF, AI 100-1) establishes the foundational vocabulary and lifecycle for managing AI risk. The NIST AI 600-1 Generative AI Profile extends that foundation with risks specific to large language models, image generators, and other generative systems. The COSAiS (Control Overlays for Securing AI Systems) initiative maps AI-specific risks to concrete NIST 800-53 security controls, connecting AI governance to the same control catalog used by CMMC, FedRAMP, and RMF. NIST IR 8596 defines measurement methodologies for quantifying AI trustworthiness characteristics. The EU AI Act introduces binding legal obligations with enforcement mechanisms, fines, and compliance deadlines organized by risk tier. These layers are not competing alternatives. They are complementary components of a single governance stack, each addressing a different question: what risks exist, what controls mitigate them, how do you measure effectiveness, and what does the law require.
Organizations that deploy AI systems face a choice between governing proactively and governing reactively. Proactive governance establishes the risk management structure, selects and implements appropriate controls, measures trustworthiness characteristics, and maps to regulatory requirements before an AI system enters production. Reactive governance discovers these obligations when an incident occurs, a regulator inquires, a contract requires attestation, or a board member asks what governance is in place. The proactive path is harder to start but cheaper to maintain. The reactive path appears easier until the cost of remediation, regulatory response, or reputational damage makes the alternative obvious. The governance lifecycle described in this guide follows the proactive path: start with risk management, layer in generative AI specifics, connect to enforceable security controls, establish measurement, and satisfy regulatory obligations as a consequence of genuine governance rather than a documentation exercise performed under deadline pressure.
Most organizations deploying AI systems have no governance structure in place. They have IT security policies that cover access control, encryption, and incident response. They may have data governance policies that address classification and retention. But AI-specific governance (covering model risk, training data provenance, output validation, bias detection, adversarial robustness, and human oversight requirements) either does not exist or exists as a policy document disconnected from operational reality. Teams deploy models into production because the business case is compelling. The model performs well in testing. The deployment timeline is aggressive. Governance is deferred to "phase two" or assigned to a committee that meets quarterly. The gap between deployment velocity and governance maturity widens with every new model, every new use case, and every new dataset integrated into the organization's AI portfolio. By the time governance becomes urgent, the organization has dozens of models in production with no inventory, no risk classification, no control mapping, and no measurement baseline.
Regulatory uncertainty compounds the governance gap. The EU AI Act established risk tiers and compliance deadlines, but many organizations outside the European Union assumed it did not apply to them. It applies to any organization that deploys AI systems whose outputs affect individuals in the EU, regardless of where the organization is headquartered. Federal agencies in the United States face Executive Order requirements for AI governance that reference NIST frameworks but leave implementation details to individual agencies. Defense contractors encounter AI governance requirements embedded in contract language that references standards still under development. Financial services organizations face supervisory expectations from regulators who expect AI model risk management without prescribing specific frameworks. Healthcare organizations deploying clinical decision support systems face FDA guidance that intersects with HIPAA obligations in ways that neither framework fully addresses independently. The result is a patchwork of obligations that varies by industry, geography, and use case. Organizations that wait for a single definitive standard will wait indefinitely. The governance landscape is converging, not consolidating, and the convergence point is a layered stack of complementary frameworks, not a single prescriptive checklist.
The absence of a standard AI control catalog is the structural root of the governance gap. NIST 800-53 Rev 5 provides more than 1,000 security and privacy controls and control enhancements, but it was published before generative AI became operationally significant. The AI RMF provides a risk management lifecycle but does not prescribe specific security controls. AI 600-1 identifies generative AI risks but maps them to AI RMF subcategories, not to implementable controls. Organizations that attempt to govern AI systems using only traditional control catalogs discover the gaps immediately: no control addresses training data poisoning, no control addresses hallucination detection, no control addresses prompt injection, no control addresses model drift monitoring. Organizations that attempt to govern using only AI-specific guidance discover a different gap: risk categories without controls, controls without evidence requirements, and evidence requirements without measurement methodologies. The governance lifecycle in this guide resolves these gaps by connecting each layer to the next: AI RMF risk categories flow into AI 600-1 generative AI specifics, which flow into COSAiS control mappings, which connect to NIST IR 8596 measurement, which satisfies EU AI Act obligations. Each layer alone is insufficient. Connected, they form a complete governance stack.
The NIST AI Risk Management Framework (AI 100-1) is the foundational layer of the AI governance lifecycle. Published in January 2023, it establishes a voluntary framework for managing risks throughout the AI system lifecycle. The framework is organized into four core functions: Govern, Map, Measure, and Manage. Govern establishes the organizational structures, policies, and accountability mechanisms for AI risk management. It is the only function that applies across all AI activities, not just individual systems. Map identifies and classifies AI risks in context: what the system does, who it affects, what data it uses, and what harms could result from malfunction or misuse. Measure quantifies identified risks using metrics, testing, and evaluation methodologies appropriate to the system's risk profile. Manage implements controls, monitors effectiveness, and responds to risks that materialize or evolve. These four functions operate as a continuous cycle. Governance policies inform risk mapping. Risk mapping determines what to measure. Measurement results drive management actions. Management outcomes feed back into governance decisions.
The AI RMF introduces seven trustworthiness characteristics that define what "responsible AI" means in operational terms: valid and reliable, safe, secure and resilient, accountable and transparent, explainable and interpretable, privacy-enhanced, and fair with harmful bias managed. These characteristics are not binary. They exist on a spectrum, and the appropriate level for each depends on the system's intended use, deployment context, and potential impact. A recommendation engine for product suggestions requires different trustworthiness thresholds than a clinical decision support system or a benefits eligibility determination tool. The AI RMF does not prescribe minimum thresholds. It provides the structure for organizations to determine appropriate thresholds based on their specific context and risk tolerance. This flexibility is intentional but creates an implementation challenge: without external guidance on what "sufficient" looks like for a given use case, organizations must make judgment calls that they may not have the expertise to make confidently. The subsequent layers of the governance stack (AI 600-1, COSAiS, NIST IR 8596, and the EU AI Act) progressively narrow this ambiguity.
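As a concrete illustration of context-dependent thresholds, the sketch below shows how an organization might record different trustworthiness targets for the three use cases mentioned above. The use cases, metric names, and numbers are hypothetical placeholders, not values prescribed by the AI RMF; each organization sets its own based on risk tolerance and deployment context.

```python
# Illustrative sketch: context-dependent trustworthiness thresholds.
# The AI RMF does not prescribe these values; every name and number
# below is a hypothetical placeholder an organization would replace
# with its own risk-tolerance decisions.

TRUSTWORTHINESS_THRESHOLDS = {
    "product_recommendations": {
        "min_accuracy": 0.80,           # tolerable error: a bad suggestion
        "max_demographic_parity_gap": 0.10,
        "human_review_required": False,
    },
    "clinical_decision_support": {
        "min_accuracy": 0.95,           # errors can cause patient harm
        "max_demographic_parity_gap": 0.02,
        "human_review_required": True,  # a clinician confirms every output
    },
    "benefits_eligibility": {
        "min_accuracy": 0.97,           # errors deny people essential services
        "max_demographic_parity_gap": 0.01,
        "human_review_required": True,
    },
}

def thresholds_for(use_case: str) -> dict:
    """Look up the governance thresholds defined for a deployment context."""
    return TRUSTWORTHINESS_THRESHOLDS[use_case]
```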
Implementing the AI RMF requires organizational commitment beyond a single team or project. The Govern function demands executive sponsorship, defined roles and responsibilities for AI risk management, integration with existing enterprise risk management processes, and policies that address AI-specific concerns like acceptable use, procurement criteria for third-party models, and incident response procedures for AI failures. The Map function requires an inventory of AI systems, classification of each system's risk profile, identification of affected stakeholders, and documentation of intended uses and known limitations. The Measure function requires testing and evaluation methodologies, benchmark datasets, performance metrics, and bias detection procedures appropriate to each system's risk classification. The Manage function requires control implementation, monitoring infrastructure, incident response capabilities, and feedback mechanisms that connect operational experience back to governance decisions. Organizations that treat the AI RMF as a documentation exercise produce policies that no one follows, inventories that no one maintains, and risk classifications that no one updates. The framework only delivers value when it is operationalized: when the four functions are embedded in the workflows that build, deploy, and operate AI systems.
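A minimal sketch of a single Map-function inventory entry appears below. The schema and field names are assumptions for illustration; the AI RMF requires this kind of information to be captured and maintained, not any particular data structure.

```python
from dataclasses import dataclass, field
from datetime import date

# Illustrative sketch of an AI system inventory record supporting the Map
# function. Field names are assumptions, not an AI RMF-mandated schema.

@dataclass
class AISystemRecord:
    name: str
    owner: str                      # accountable role, per the Govern function
    intended_use: str
    risk_classification: str        # e.g. "high", "limited", "minimal"
    affected_stakeholders: list[str] = field(default_factory=list)
    training_data_sources: list[str] = field(default_factory=list)
    known_limitations: list[str] = field(default_factory=list)
    last_reviewed: date | None = None

# Hypothetical example entry.
inventory = [
    AISystemRecord(
        name="resume-screening-assistant",
        owner="HR Technology",
        intended_use="Rank inbound applications for recruiter review",
        risk_classification="high",   # employment decisions affect individuals
        affected_stakeholders=["job applicants", "recruiters"],
        training_data_sources=["historical hiring outcomes 2018-2024"],
        known_limitations=["trained only on English-language resumes"],
        last_reviewed=date(2025, 1, 15),
    ),
]
```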
NIST AI 600-1, the Generative AI Profile, extends the AI RMF to address risks that are unique to or significantly amplified by generative AI systems. Published in July 2024, it identifies 12 risk categories specific to generative AI: CBRN information or capabilities; confabulation (hallucination); dangerous, violent, or hateful content; data privacy; environmental impacts; harmful bias and homogenization; human-AI configuration; information integrity; information security; intellectual property; obscene, degrading, and/or abusive content; and value chain and component integration. These risks exist because generative AI systems operate differently from traditional machine learning. A classification model produces a label from a fixed set. A generative model produces novel content: text, images, code, audio. The novelty is the value proposition and the governance challenge simultaneously. The same capability that enables a model to generate a helpful response also enables it to generate a convincing fabrication, a privacy-violating disclosure, or content that infringes intellectual property. The risk is not that the model malfunctions. The risk is that the model functions exactly as designed in ways that produce harmful outcomes.
Each of the 12 risk categories maps back to AI RMF subcategories across the Govern, Map, Measure, and Manage functions, creating a traceable chain from generative AI risk to governance action. Confabulation maps to validity and reliability concerns in the Measure function: the organization must establish methods for detecting when a model generates plausible but false outputs. Information security maps to the Secure and Resilient trustworthiness characteristic: the organization must address prompt injection, training data extraction, model theft, and adversarial manipulation. Data privacy maps to the Privacy-Enhanced characteristic: the organization must govern how training data is sourced, whether personal information can be extracted from model outputs, and how data subject rights apply to information embedded in model weights. The profile also addresses risks that have no direct precedent in traditional IT security. Homogenization risk arises when multiple organizations rely on the same foundation model, creating correlated failure modes across an entire sector. Environmental impact risk addresses the computational cost of training and inference at scale. These categories require governance mechanisms that existing security frameworks do not provide, which is why the AI 600-1 profile exists as a distinct layer in the governance stack.
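The sketch below records the three example mappings described above in a machine-readable form. It is illustrative only: AI 600-1 itself contains the authoritative and far larger set of mappings to AI RMF subcategories, and the key names and action wording here are assumptions.

```python
# Illustrative traceability sketch: generative AI risk -> trustworthiness
# characteristic -> governance action, using only the three examples from
# this guide. The real mappings live in AI 600-1.

GENAI_RISK_MAPPINGS = {
    "confabulation": {
        "trustworthiness_characteristic": "valid and reliable",
        "rmf_function": "Measure",
        "governance_action": "detect plausible but false model outputs",
    },
    "information_security": {
        "trustworthiness_characteristic": "secure and resilient",
        "governance_action": "address prompt injection, training data extraction, model theft",
    },
    "data_privacy": {
        "trustworthiness_characteristic": "privacy-enhanced",
        "governance_action": "govern training data sourcing and output disclosure risk",
    },
}
```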
Operationalizing AI 600-1 requires organizations to evaluate each risk category against their specific generative AI deployments. Not every risk category applies to every use case. An organization using a generative model for internal code review faces different risks than one using a model for customer-facing content generation or clinical documentation. The profile provides suggested actions for each risk category, organized by AI RMF function (Govern, Map, Measure, Manage). These actions are recommendations, not requirements. The organization must determine which actions are appropriate given their risk tolerance, deployment context, and regulatory obligations. The practical challenge is that AI 600-1 operates at the risk identification layer. It tells you what can go wrong with generative AI. It does not tell you which NIST 800-53 controls to implement, how to configure them for AI workloads, or how to collect evidence that the controls are operating effectively. That gap is where COSAiS and the subsequent layers of the governance stack become essential. AI 600-1 identifies the risks. COSAiS maps those risks to implementable, assessable security controls.
The Control Overlays for Securing AI Systems (COSAiS) initiative bridges the gap between AI risk frameworks and enforceable security controls. While the AI RMF and AI 600-1 identify risks and recommend governance actions, COSAiS maps those risks directly to NIST 800-53 Rev 5 controls with AI-specific parameter settings, supplemental guidance, and assessment procedures. This is the layer where AI governance connects to the same control infrastructure used by CMMC, FedRAMP, RMF, and every other framework that derives from NIST 800-53. The COSAiS overlays operate as extensions to existing 800-53 baselines: they do not replace existing controls but add AI-specific requirements to control families that apply to AI workloads. An existing AC-2 (Account Management) control gains additional parameters for managing access to training data, model weights, and inference endpoints. An existing AU-2 (Event Logging) control gains additional auditable events for model input/output logging, training pipeline execution, and prompt injection attempts. The result is an AI governance posture that is structurally compatible with the organization's existing compliance portfolio.
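A hypothetical sketch of that overlay pattern appears below, using AC-2 and AU-2 as examples. The data structure and parameter wording are assumptions for illustration only; they are not the published COSAiS overlay format.

```python
# Hypothetical sketch of how an AI overlay might extend existing 800-53
# controls with AI-specific parameters. Structure and wording are
# assumptions, not the COSAiS publication format.

AI_OVERLAY = {
    "AC-2": {  # Account Management: base control already in the baseline
        "ai_supplemental_parameters": [
            "manage accounts with access to training datasets",
            "manage accounts with access to model weights and artifacts",
            "manage service accounts for inference endpoints",
        ],
    },
    "AU-2": {  # Event Logging: add AI-specific auditable events
        "ai_supplemental_parameters": [
            "log model inputs and outputs for high-risk use cases",
            "log training pipeline executions and dataset versions",
            "log detected prompt injection attempts",
        ],
    },
}

def effective_requirements(control_id: str, baseline_reqs: list[str]) -> list[str]:
    """Combine baseline requirements with AI overlay parameters for one control."""
    overlay = AI_OVERLAY.get(control_id, {})
    return baseline_reqs + overlay.get("ai_supplemental_parameters", [])
```

The design point the sketch illustrates is additive extension: the baseline control requirements stay intact, and the overlay appends AI-specific requirements to them, which is what keeps existing assessment methodologies usable.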
COSAiS addresses a critical gap in the governance stack: the absence of AI-specific control language within NIST 800-53 itself. Revision 5 of the 800-53 catalog was published in September 2020, before generative AI deployment became widespread. While many existing controls apply to AI systems (access control protects model endpoints the same way it protects any endpoint), others require AI-specific interpretation that the original control text does not provide. How does CM-3 (Configuration Change Control) apply to model retraining? What constitutes a "baseline configuration" under CM-2 for a model whose weights change with every training run? What does SI-4 (System Monitoring) mean for a system whose outputs are probabilistic and non-deterministic? COSAiS answers these questions by providing overlay parameters that extend existing controls with AI-specific implementation guidance. The overlay structure is deliberate: it preserves compatibility with existing assessment methodologies while adding the specificity that assessors need to evaluate AI system controls. Organizations already managing NIST 800-53 baselines for FedRAMP or RMF can activate the COSAiS overlays without restructuring their entire compliance program.
The practical impact of COSAiS is that AI governance becomes assessable using the same methodologies the organization already uses for its non-AI systems. A FedRAMP assessor evaluating an AI workload can assess the same 800-53 controls with AI-specific parameters rather than inventing a parallel assessment methodology. A CMMC assessor encountering AI components within a CUI-handling system can reference COSAiS overlay guidance to determine whether AI-specific risks are adequately controlled. An RMF authorizing official can include AI risk in the authorization decision using evidence collected through the same control assessment process used for every other system component. This structural compatibility is what makes COSAiS the connective tissue of the AI governance stack. The AI RMF provides the risk lifecycle. AI 600-1 provides generative AI specificity. COSAiS provides the control mappings that make risk management decisions enforceable, assessable, and auditable through existing compliance infrastructure.
The EU AI Act is the first comprehensive AI regulation with binding legal force. It entered into force in August 2024 with a phased compliance timeline extending through August 2027. The regulation classifies AI systems into four risk tiers: unacceptable risk (prohibited), high risk (subject to mandatory requirements), limited risk (transparency obligations), and minimal risk (no specific obligations). Prohibited practices include social scoring by governments, real-time biometric identification in public spaces (with narrow exceptions), and manipulation techniques that exploit vulnerabilities. High-risk systems include AI used in critical infrastructure, education, employment, essential services, law enforcement, and border management. The classification is based on the system's intended purpose and deployment context, not its underlying technology. A large language model used for internal document summarization faces different obligations than the same model used for employment screening or credit decisions. Obligations for high-risk systems include risk management, data governance, technical documentation, record-keeping, transparency, human oversight, accuracy, robustness, and cybersecurity.
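The sketch below shows a first-pass triage of risk tier from intended purpose and deployment context. It is a deliberate simplification: the Act's actual classification rules are set out in Articles 5 and 6 and Annex III, the keyword lists here are illustrative placeholders, and legal classification requires counsel review rather than a lookup table.

```python
# Simplified, illustrative triage of EU AI Act risk tiers. Not legal advice;
# the category lists below are placeholders, not the Act's definitions.

PROHIBITED_PURPOSES = {"government social scoring", "exploitative manipulation"}
HIGH_RISK_CONTEXTS = {
    "critical infrastructure", "education", "employment",
    "essential services", "law enforcement", "border management",
}

def triage_risk_tier(intended_purpose: str, deployment_context: str,
                     interacts_with_people: bool) -> str:
    """First-pass triage only; formal classification follows the Act's Annexes."""
    if intended_purpose in PROHIBITED_PURPOSES:
        return "unacceptable (prohibited)"
    if deployment_context in HIGH_RISK_CONTEXTS:
        return "high risk (mandatory requirements)"
    if interacts_with_people:
        return "limited risk (transparency obligations)"
    return "minimal risk"

print(triage_risk_tier("resume ranking", "employment", True))
# -> "high risk (mandatory requirements)"
```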
The extraterritorial scope of the EU AI Act means it applies beyond European borders. Any organization that places an AI system on the EU market, deploys an AI system within the EU, or deploys an AI system whose output is used within the EU is subject to the regulation. A US defense contractor deploying an AI system that processes data about EU citizens in a NATO context may face obligations. A financial services organization using an AI credit scoring model that affects EU residents is subject to the Act regardless of where the model runs. Penalties for non-compliance are substantial: up to 35 million euros or 7% of global annual turnover for prohibited AI practices, up to 15 million euros or 3% of turnover for high-risk system violations, and up to 7.5 million euros or 1% of turnover for providing incorrect information to authorities. These are not theoretical maximums. The regulation establishes national competent authorities in each EU member state with investigation and enforcement powers. Organizations that assume the EU AI Act does not apply to them because they are not headquartered in Europe should examine their AI systems' outputs, data flows, and affected populations before reaching that conclusion.
The EU AI Act's requirements for high-risk systems map structurally to the preceding layers of the governance stack. Article 9 (Risk Management System) aligns with the AI RMF's four-function lifecycle. Article 10 (Data and Data Governance) addresses training data quality, representativeness, and bias concerns identified in AI 600-1. Article 15 (Accuracy, Robustness, and Cybersecurity) connects directly to NIST 800-53 controls extended by COSAiS overlays. Article 14 (Human Oversight) maps to AI RMF trustworthiness characteristics around accountability and transparency. Organizations that have implemented the preceding three layers of the governance stack (AI RMF, AI 600-1, COSAiS) have already satisfied a substantial portion of the EU AI Act's technical requirements for high-risk systems. The regulation adds legal formality, documentation requirements (conformity assessments, EU declarations of conformity, CE marking for certain systems), and ongoing obligations (post-market monitoring, serious incident reporting) that require additional governance mechanisms. But the substantive technical work (identifying risks, implementing controls, measuring effectiveness) is already addressed by the earlier layers. The EU AI Act does not invent new technical requirements. It makes existing best practices legally mandatory.
NIST Internal Report (IR) 8596 addresses the measurement gap in AI governance. The AI RMF defines seven trustworthiness characteristics. AI 600-1 identifies generative AI risks. COSAiS maps risks to controls. But none of these layers fully answers the question: how do you quantify whether your AI system is trustworthy enough? IR 8596 provides the measurement methodology. Published as a companion to the AI RMF, it defines approaches for measuring validity, reliability, safety, security, resilience, accountability, transparency, explainability, interpretability, privacy, and fairness. Each trustworthiness characteristic is decomposed into measurable properties with corresponding metrics, measurement methods, and evaluation criteria. The methodology acknowledges that AI measurement is context-dependent: the appropriate metrics for a computer vision system differ from those for a natural language processing system, and both differ from a reinforcement learning agent. IR 8596 provides the framework for selecting appropriate measurements rather than prescribing universal thresholds.
Measurement serves three purposes in the AI governance lifecycle. First, it establishes a baseline. Before an AI system enters production, measurement quantifies its trustworthiness characteristics against the thresholds defined during risk mapping. If fairness metrics show disparate impact across protected classes, that finding informs management decisions before deployment, not after an incident. Second, measurement enables continuous monitoring. AI systems degrade over time as the data they encounter in production diverges from the data they were trained on. Model drift is not a possibility; it is an inevitability for any model operating in a changing environment. Measurement detects drift by comparing current performance metrics against baseline values and alerting when degradation crosses defined thresholds. Third, measurement provides evidence for regulatory compliance. The EU AI Act requires that high-risk AI systems maintain documented levels of accuracy, robustness, and cybersecurity throughout their lifecycle. COSAiS controls require evidence that AI-specific security measures are operating effectively. IR 8596 measurements produce the quantitative evidence that satisfies both obligations.
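A minimal sketch of the baseline-and-monitoring pattern appears below. The metric names, baseline values, and tolerances are illustrative assumptions; an IR 8596-aligned program would select metrics and thresholds appropriate to the specific system and its risk classification.

```python
# Minimal sketch of baseline-vs-production drift monitoring. All metric
# names, baseline values, and tolerances are illustrative assumptions.

BASELINE = {"accuracy": 0.94, "demographic_parity_gap": 0.03}
TOLERANCE = {"accuracy": 0.02, "demographic_parity_gap": 0.02}  # allowed shift

def detect_drift(current: dict[str, float]) -> list[str]:
    """Return alerts for any metric that has shifted past its tolerance band."""
    alerts = []
    for metric, baseline_value in BASELINE.items():
        delta = abs(current[metric] - baseline_value)
        if delta > TOLERANCE[metric]:
            alerts.append(
                f"{metric} drifted from {baseline_value:.2f} to "
                f"{current[metric]:.2f} (tolerance {TOLERANCE[metric]:.2f})"
            )
    return alerts

# Example: a monthly production measurement shows accuracy degradation.
print(detect_drift({"accuracy": 0.89, "demographic_parity_gap": 0.04}))
```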
The relationship between IR 8596 and the rest of the governance stack is structural. AI RMF Measure function activities reference the need for metrics and evaluation methodologies. IR 8596 provides those methodologies. AI 600-1 risk categories (confabulation, bias, information security) require quantification to determine severity and track remediation. IR 8596 defines how to quantify them. COSAiS controls require assessment evidence that AI-specific security measures operate effectively. IR 8596 measurements produce that evidence. EU AI Act conformity assessments require documented accuracy and robustness levels. IR 8596 provides the measurement methodology that produces those documentation artifacts. Without measurement, AI governance remains qualitative: risk categories are identified but not quantified, controls are implemented but effectiveness is asserted rather than demonstrated, and regulatory compliance is claimed without the metrics to support the claim. IR 8596 closes this loop by making trustworthiness measurable, comparable, and auditable.
AI governance does not exist in isolation. Organizations deploying AI systems are simultaneously subject to CMMC, FedRAMP, SOC 2, ISO 27001, HIPAA, or other compliance obligations depending on their industry, customers, and regulatory environment. The COSAiS overlay structure ensures that AI governance connects to these existing frameworks through the shared NIST 800-53 control catalog. A defense contractor pursuing CMMC Level 2 certification who also deploys AI systems within CUI-handling environments faces both CMMC and AI governance obligations. Because COSAiS extends existing 800-53 controls with AI-specific parameters, the contractor addresses both obligations through the same control infrastructure. The AI-specific overlay adds parameters to controls that CMMC already requires: AC-2 for access management, AU-2 for audit logging, CM-3 for configuration management, SI-4 for monitoring. The marginal effort to add AI governance to an existing CMMC compliance program is substantially less than building AI governance from scratch, because the control framework is already in place.
The integration extends beyond control-level overlap. Organizations pursuing FedRAMP authorization for cloud services that include AI capabilities face the FedRAMP Moderate or High baseline plus AI-specific requirements that emerge from the authorization process. The authorizing official (an individual agency or, historically, the Joint Authorization Board) may require AI-specific risk documentation as part of the System Security Plan. COSAiS overlay parameters slot directly into the existing SSP structure because they extend the same controls that FedRAMP already requires. SOC 2 Trust Service Criteria map to NIST 800-53 control families through published cross-walks. When those control families are extended by COSAiS overlays, the SOC 2 examination naturally encompasses AI-specific controls as part of the existing assessment scope. ISO 27001:2022 Annex A controls map to 800-53 through the NIST Cybersecurity Framework. The same structural compatibility applies. The key insight is that AI governance, when built on the COSAiS overlay model, does not require a parallel compliance program. It extends the compliance program the organization already operates.
Redoubt Forge manages the complete AI governance lifecycle through the same platform capabilities that serve traditional compliance. Rampart maintains the AI RMF assessment alongside CMMC, FedRAMP, SOC 2, and every other framework in the catalog. COSAiS overlay parameters are activated on the relevant 800-53 controls, extending existing assessments rather than creating parallel workstreams. Sentinel monitors AI-specific infrastructure: model endpoints, training pipelines, inference services, data stores containing training data, and access patterns to model weights. When Sentinel detects configuration drift on an AI workload (a model endpoint's access policy changes, a training data pipeline loses encryption, an inference service disables logging), it maps the drift to affected controls across every active framework simultaneously. Rampart re-evaluates posture scores for CMMC, FedRAMP, and AI governance in a single pass because the underlying control relationships are pre-computed through the derivation chain. Vanguard scans AI workloads for security vulnerabilities specific to machine learning systems: exposed model endpoints, training data access patterns, prompt injection vectors, and insecure model serialization. Citadel surfaces the highest-priority findings across all frameworks in a unified action queue. One governance posture. Every framework, including AI governance, computed from the same evidence.
Something is being forged.
The full platform is under active development. Reach out to learn more or get early access.