11 MAR 2026 / EXPERT INSIGHTS
According to recent MIT research, approximately 95 percent of generative AI pilots fail to deliver measurable business value. In many cases, the technology works, but the governance does not. If your organization is investing heavily in AI initiatives, what is your plan to ensure you are among the five percent that succeed?

Consider a scenario that may sound familiar. An AI system produces an incorrect answer. The immediate reaction is often to blame the technology. But here's the twist: the real culprit is usually much less mysterious. It is typically weak governance, missing controls, and unexamined assumptions.

Think about it this way. From an internal control perspective, AI systems are just automated processes that turn inputs into outputs. Those outputs drive decisions. When the process lacks proper governance, the reliability of those outputs becomes uncertain, and the consequences can be significant. Without guardrails, even the most sophisticated system can steer an organization off course.
AI is not fundamentally different from the automated systems organizations have relied on for decades. Financial systems calculate balances. Inventory systems track quantities. Payroll systems crunch compensation numbers. Each depends on controls to ensure accuracy and reliability. AI systems need that same discipline. The technology might be newer, but the governance principles are timeless, and seasoned auditors already know the playbook.
Auditors evaluate systems by asking a familiar set of questions. Who owns this process? What controls are in place? How do we approve changes? How do we validate outputs? How do we catch and fix errors? These questions work just as well for AI. The challenge is that many organizations have not applied them consistently in this context. They are treating AI as something exceptional when it should be treated as something accountable, a system that answers to the same governance standards as everything else.
When AI systems fail, investigations usually reveal the same old governance gaps we have seen in traditional systems. Oversight might be fuzzy. Controls might be missing or untested. Model assumptions may never have been documented. Outputs might get accepted without validation. None of these are purely technical problems. They are governance problems. And governance problems let technical issues fester undetected, sometimes for a very long time, quietly compounding risk in the background.
The good news is that the profession already has the tools to tackle these risks. What is needed now is the willingness to use them. These gaps map onto established frameworks like COSO's internal control and ERM frameworks, as well as emerging standards such as ISO 42001, the NIST AI RMF, the EU AI Act, and DORA. The frameworks already exist. We just need to apply them.
One of the most common governance failures? Weak oversight. Many organizations treat AI initiatives like technical experiments rather than operational systems. Responsibility often lands with technical teams without clear executive accountability. That gap creates risk, often in ways that slip by unnoticed until something goes wrong.
Here is the reality. Oversight requires more than technical know-how. It needs:

- Clear ownership, with a named executive accountable for the system
- Defined risk limits the system must operate within
- A formal process for validating outputs before they drive decisions
- Regular review of performance against expectations
Without these, AI systems can run for extended periods without meaningful supervision, quietly producing outputs that nobody is formally responsible for validating. In any traditional control environment, that would raise eyebrows.
Boards and senior management often assume technical teams can handle AI risks independently. It is an understandable assumption. But it is misplaced. Technical expertise is essential. But it does not replace governance. Technical teams build and maintain systems. Management ensures those systems operate within acceptable risk limits. Different responsibilities. Both necessary. Though this distinction sometimes gets lost when technology takes center stage.
Weak oversight lets problems linger. A model might produce inaccurate results for months before anyone reviews its performance. By the time the issue surfaces, the downstream impact could already be significant:

- Misstated estimates flowing into financial reports
- Operational decisions made on unreliable outputs
- Compliance exposure that accumulates quietly
Nobody wants to discover that kind of surprise after the fact.
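To make the review step concrete, here is a minimal sketch of a periodic performance check in Python. It is illustrative only: the function name, the mean-absolute-error metric, and the ten percent drift tolerance are all assumptions for the example, not prescriptions from any framework.

```python
# Minimal sketch of a periodic model-performance review.
# The metric choice and the 10% tolerance are illustrative assumptions.

def review_model_performance(predictions, actuals, baseline_error, tolerance=0.10):
    """Compare current prediction error against an approved baseline.

    Flags the model for escalation when error drift exceeds tolerance,
    so degradation surfaces in weeks, not months.
    """
    if len(predictions) != len(actuals) or not predictions:
        raise ValueError("predictions and actuals must be non-empty and aligned")

    # Mean absolute error over the review period.
    current_error = sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(predictions)
    drift = current_error - baseline_error

    return {
        "current_error": current_error,
        "baseline_error": baseline_error,
        "drift": drift,
        "escalate": drift > tolerance * baseline_error,  # review trigger
    }

# Example: a monthly review against the error rate approved at go-live.
result = review_model_performance(
    predictions=[102.0, 98.5, 110.0], actuals=[100.0, 99.0, 105.0],
    baseline_error=1.5,
)
if result["escalate"]:
    print("Escalate to the model owner for formal review.")
```

The point of the sketch is the discipline, not the math: someone compares current output quality against an approved baseline on a schedule, and a breach triggers escalation to a named owner.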
From an internal audit perspective, weak oversight looks like any other entity-level control deficiency. Significant processes need clear ownership and regular review. AI systems are significant processes in many organizations today. They should not be treated as exceptions to principles that apply everywhere else. No matter how advanced the technology appears.
Another common gap? Missing or ineffective controls. Many AI systems go live without the controls that would normally surround any automated process, often because the people deploying them don't think of them in those terms, even though the risks follow a familiar pattern.
The examples aren't obscure:

- Model changes deployed without approval or testing
- Data inputs relied on without validation
- Outputs accepted without reconciliation against expectations
- No ongoing monitoring once the system is live
Individually, each of these practices would raise immediate concerns in a traditional systems environment; they are exactly the kind of red flag auditors are trained to spot quickly.
Changes to accounting systems typically require approval and testing. Data inputs get validated before anyone relies on them. Outputs get reconciled against expectations. These aren't bureaucratic formalities; they're the mechanisms that give organizations confidence in what their systems produce. AI systems frequently bypass these expectations, not because the risks are smaller, but because the technology gets perceived as "innovative" rather than "operational."
AI systems process large volumes of data quickly, which means errors don't stay contained. A flawed assumption or corrupted data source can affect thousands of transactions before anyone notices something's wrong. By then, cleanup becomes far more complicated.
Consider an AI model estimating expenses or accruals. If it relies on incorrect assumptions and nobody performs reconciliation, those errors flow straight into financial statements. The problem isn't the model itself; it's the absence of controls that would have caught the issue before it mattered. That's the kind of safeguard every control framework expects to see.
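As a rough illustration of what such a reconciliation control could look like, here is a short Python sketch. The account names, figures, and five percent tolerance are invented for the example; a real control would use the organization's own materiality thresholds and an independent data source.

```python
# Illustrative reconciliation control for AI-estimated accruals.
# Account names, figures, and the 5% tolerance are assumed for the example.

TOLERANCE = 0.05  # maximum acceptable relative variance before investigation

def reconcile_accruals(model_estimates, independent_figures):
    """Compare model-estimated accruals to an independent source.

    Returns the accounts whose variance exceeds tolerance, so exceptions
    are investigated before the estimates reach the financial statements.
    """
    exceptions = []
    for account, estimate in model_estimates.items():
        reference = independent_figures.get(account)
        if reference is None:
            exceptions.append((account, "no independent figure to reconcile against"))
            continue
        variance = abs(estimate - reference) / reference
        if variance > TOLERANCE:
            exceptions.append((account, f"variance {variance:.1%} exceeds {TOLERANCE:.0%}"))
    return exceptions

# Example: the utilities estimate is materially off and gets flagged.
flagged = reconcile_accruals(
    model_estimates={"rent": 10_100, "utilities": 7_400},
    independent_figures={"rent": 10_000, "utilities": 6_000},
)
for account, reason in flagged:
    print(f"{account}: {reason}")  # utilities: variance 23.3% exceeds 5%
```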
From a governance standpoint, AI systems should face the same control expectations as any other automated process. Change management, input validation, output reconciliation, and ongoing monitoring aren't optional features; they're baseline requirements. The sophistication of the underlying technology doesn't change that. (The EU's Digital Operational Resilience Act (DORA) actually makes change management binding for financial entities' ICT systems, including AI.)
Every AI model relies on assumptions: about data selection, variable relationships, and optimization goals. These assumptions determine how the model behaves and what results it produces. In many ways, they form the foundation of the entire system, even if that foundation stays invisible to most users.
When assumptions go undocumented, they become difficult to evaluate. Users might not understand the model's limitations or the conditions where it performs poorly. Over time, institutional knowledge erodes as staff change roles or leave. What was once understood by a small group eventually becomes understood by no one, creating confusion and risk.
Undocumented assumptions create hidden risks. A model can appear to function correctly even when its underlying logic is flawed or outdated. Without documentation, identifying error sources becomes a guessing game. Determining whether the model remains appropriate for its intended purpose becomes nearly impossible.
Documentation provides transparency and accountability. It lets reviewers understand how a model works and why developers made certain decisions. It also lets organizations assess whether assumptions remain valid as business conditions, regulations, and data environments change, something every well-governed system must periodically revisit.
From an audit perspective, documentation isn't optional. Processes that can't be documented can't be audited effectively. AI systems are no exception. (ISO 42001 and the NIST AI RMF's Map function specifically require documented assumptions, design choices, and limitations.)
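For illustration, a documented-assumptions register does not need to be elaborate. The sketch below, in Python, shows one possible shape for such a record; the field names and the one-year revalidation window are assumptions for the example rather than requirements taken from ISO 42001 or NIST.

```python
# Minimal sketch of an assumptions register in the spirit of ISO 42001 /
# the NIST AI RMF "Map" function. Every field below is an illustrative
# assumption about what such a record might contain, not a prescribed schema.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ModelAssumption:
    description: str      # e.g. "vendor invoice timing follows prior years"
    rationale: str        # why the assumption was considered reasonable
    owner: str            # who is accountable for revalidating it
    last_validated: date  # when it was last tested against reality
    valid_conditions: str # conditions under which it is expected to hold

@dataclass
class ModelRecord:
    name: str
    purpose: str
    assumptions: list[ModelAssumption] = field(default_factory=list)

    def overdue(self, as_of: date, max_age_days: int = 365) -> list[ModelAssumption]:
        """Assumptions not revalidated within the review window."""
        return [a for a in self.assumptions
                if (as_of - a.last_validated).days > max_age_days]

record = ModelRecord(
    name="accrual-estimator",
    purpose="estimate month-end expense accruals",
    assumptions=[ModelAssumption(
        description="vendor invoice timing follows the prior two years",
        rationale="stable supplier base at design time",
        owner="finance-systems",
        last_validated=date(2024, 6, 30),
        valid_conditions="no major supplier or contract-term changes",
    )],
)
print([a.description for a in record.overdue(as_of=date(2026, 3, 11))])
```

A register like this gives reviewers something to audit and, just as important, something to re-test when business conditions shift.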
AI systems often produce results that look precise and authoritative. That appearance of precision can be misleading, and it creates a governance risk organizations frequently underestimate, especially when outputs look polished and confident.
Overreliance on AI outputs is one of the less visible failures in this space. Users might accept results without question, even when those results conflict with other available information. Errors persist not because the system can't be challenged, but because nobody thinks of challenging it, and that quiet acceptance lets problems linger.
Professional skepticism is essential to governance. Outputs need evaluation and validation before they drive decisions. This applies whether those outputs come from humans or machines. In fact, AI's automated nature makes skepticism more important, not less, because errors can scale quickly and quietly.
Overreliance on automation isn't new. Auditors have long recognized risks associated with automated processes. AI amplifies those risks by making systems more complex and less transparent. You can usually trace the logic behind a spreadsheet calculation. The logic embedded in a machine learning model? Often not easily explained.
When users rely on AI outputs without validation, the failure isn't technological; it's procedural. Effective governance requires that automated outputs be subject to review and challenge. That discipline doesn't happen automatically. It must be designed into the process from the start. (The EU AI Act requires human oversight for high-risk systems, with obligations largely applying from August 2026; NIST's Measure and Manage functions demand performance assessment and risk treatment.)
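One way to design that discipline in is a simple gate that routes questionable outputs to a human before anyone acts on them. The sketch below is a minimal illustration; the confidence score, thresholds, and plausible range are assumed inputs, and a real system would draw them from the model and the business context.

```python
# Sketch of a review-and-challenge gate for automated outputs: low-confidence
# or out-of-range results are routed to a human before any action is taken.
# The confidence field, threshold, and plausible range are assumptions here.

def gate_output(answer, confidence, plausible_range, min_confidence=0.9):
    """Decide whether an AI output may be acted on directly.

    Returns ("accept", answer) only when the output clears both checks;
    otherwise returns ("human_review", answer), so skepticism is built in
    by design rather than left to individual users.
    """
    low, high = plausible_range
    in_range = low <= answer <= high          # independent sanity check
    confident = confidence >= min_confidence  # model's own uncertainty signal

    if in_range and confident:
        return ("accept", answer)
    return ("human_review", answer)

# Example: a confident but implausible figure is still held for review.
status, value = gate_output(answer=9_500_000, confidence=0.97,
                            plausible_range=(50_000, 2_000_000))
print(status)  # human_review
```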
Remember when Air Canada's chatbot told a customer they could claim a bereavement fare discount retroactively? The airline later denied the request, citing incorrect information. The dispute landed before British Columbia's Civil Resolution Tribunal, which held the airline responsible and ordered compensation.
The lesson? The system was deployed without validating responses or monitoring for errors. Automated communications carry the same obligations as human communications, and organizations remain accountable for accuracy, even when the message comes from software.
Attorneys using an AI tool for legal research included citations to nonexistent cases in a court filing. When challenged, they submitted additional AI-generated material containing false information. The court sanctioned them.
The case set a precedent, and courts have continued to sanction unverified AI use in legal and professional filings through 2025–2026, including recent Fifth Circuit cases.
People often call this "AI hallucination," but the core issue was skipped verification. Professional responsibility requires validating sources regardless of AI involvement. Auditors and risk professionals can apply the same principle: verify AI outputs before relying on them.
A healthcare algorithm identified patients for extra care using historical spending as a proxy for medical need. Populations with historically lower access had lower spending. The algorithm under-flagged these patients.
The algorithm worked exactly as designed. The failure? The assumption that spending reflected need. Undocumented assumptions went untested and unchallenged, producing harm that technical reviews alone couldn't catch.
AI doesn't reduce the need for governance; it increases it. Systems that operate quickly and at scale need strong controls because errors can propagate across large data volumes before anyone notices something's wrong.
Auditors and risk professionals are well-positioned to assess AI governance. The underlying principles aren't new. Oversight, documentation, control activities, and monitoring are established disciplines. They don't become less relevant because technology gets more complex or systems become harder to explain.
When evaluating AI systems, focus on these key questions:
| Question | Why It Matters |
| --- | --- |
| Who's responsible for this system? | Establishes clear ownership |
| What decisions rely on its outputs? | Defines the risk exposure |
| What assumptions are embedded, and who validated them? | Surfaces hidden risks |
| How are changes controlled and approved? | Ensures process integrity |
| How are outputs verified before action? | Prevents overreliance |
| How is ongoing performance monitored? | Catches drift and degradation |
These questions mirror those in traditional audits and should be applied with the same consistency and rigor.
Organizations treating AI as a governance issue tend to manage it better. They establish clear ownership, implement appropriate controls, and monitor performance continuously. Organizations treating AI as purely technical often overlook these elements entirely and typically discover the cost only after significant problems materialize.
AI failures often get described as technical issues, but that view is incomplete. In many cases, the technology works exactly as designed. Failures happen because governance was insufficient or nobody was clearly responsible for it.
Weak oversight, missing controls, undocumented assumptions, and uncritical reliance on outputs let errors persist and spread. These aren't new problems; they're the same deficiencies auditors have documented in traditional systems for decades. Technology has changed, but human accountability hasn't.
AI doesn't require new governance principles. It requires the consistent, honest application of existing ones. Complexity shouldn't deter basic questions:
Who owns this?
Who verifies it?
What happens if it fails?
These questions are exactly what's needed.
AI errors are symptoms. Governance gaps are the root cause. Addressing them isn't a technical task; it's a leadership responsibility. For auditors and risk professionals, this is work you're already equipped to perform. Ready to assess your organization's AI governance? Start with the questions above. The answers might surprise you, and that's exactly the point.
Until next time…