
How Much Trust Is Enough for AI Governance?

Expert Talks

June 8, 2026

Steve Coplan

Head of Marketing


When organizations rush agentic AI into production, they keep running into the same uncomfortable question: how much trust is enough? In a recent webinar, "How Much Trust Is Enough? Mastering Agentic AI Governance," AI security luminary Ken Huang and Vijil founder and CEO Vin Sharma worked through the question from first principles and arrived at the same answer from different perspectives.

The conversation yields a practical foundation for aligning the agentic threat model (what can go wrong) with the agentic trust model (how we ensure that the agent is secure, safe, and reliable for its purpose) into a comprehensive governance story.

Both speakers emphasized that risk and trust are not static. Managing them is an ongoing exercise in systematically mapping threats, reducing uncertainty, and proving that an intelligent system can act with discipline, restraint, and accountability.

To watch the webinar in full, visit Vijil’s YouTube channel.

Trust Is Not a Vibe

Sharma opened by pushing back on how the industry currently thinks about trust. Too much of it, he argued, is "vibe"—vibe coding, vibe testing, vibe deployments—an attitude or gut feeling rather than something measurable. The more useful framing borrows from behavioral economics: trust exists when the expected reward of cooperation outweighs the risk of betrayal. Crucially, trust is not a one-time snapshot. It is earned through repeated interactions between a principal and an agent, and the scale tips up or down with every transaction.
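One way to make this framing concrete is a small expected-value check over a history of interactions. The sketch below is an illustration only, not something presented in the webinar: the names `InteractionHistory` and `should_trust`, the Laplace-smoothed estimate of cooperation, and the reward and loss figures are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class InteractionHistory:
    """Running tally of how past transactions with an agent turned out."""
    cooperations: int = 0
    betrayals: int = 0

    def record(self, cooperated: bool) -> None:
        if cooperated:
            self.cooperations += 1
        else:
            self.betrayals += 1

    def p_cooperate(self) -> float:
        # Laplace smoothing (an assumption): with no history the estimate is 0.5.
        total = self.cooperations + self.betrayals
        return (self.cooperations + 1) / (total + 2)


def should_trust(history: InteractionHistory, reward: float, loss: float) -> bool:
    """Trust when the expected reward of cooperation exceeds the expected loss from betrayal."""
    p = history.p_cooperate()
    return p * reward > (1 - p) * loss


history = InteractionHistory()
for outcome in [True, True, True, False, True]:
    history.record(outcome)

# Same track record, different stakes (hypothetical values).
low_stakes = should_trust(history, reward=10.0, loss=5.0)
high_stakes = should_trust(history, reward=10.0, loss=100.0)
```

With the same track record, the low-stakes task clears the bar and the high-stakes one does not, which is the sense in which the scale tips with every transaction and with what is at risk.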

That framing yields three measurable dimensions. Reliability asks whether the agent consistently does what it is supposed to do, even under varying conditions. Security asks whether it can protect itself from attack. Safety asks whether it might cause harm to others. If you can put numbers to these three things, Sharma said, you can systematically improve them over time—not just before deployment, but continuously after it. That ambition is the heart of what Vijil calls "trust infrastructure": moving trust from vibe to something engineered, measurable and quantifiable on a continuous basis.

A New Threat Model

Huang approached the same problem from the bottom up through his MAESTRO framework, an OSI-style, seven-layer reference architecture for the agentic AI stack. It spans the foundation model, the data layer, the agentic framework, the deployment infrastructure, evaluation and observability, the security agents that govern other agents, and finally the broader agent ecosystem where agents talk to one another over protocols like MCP and A2A.

Each layer carries its own risk profile. Huang invoked Simon Willison's "lethal trifecta"—an agent with access to private data, exposure to untrusted external content, and the ability to take actions—and warned that removing just one leg isn't a reliable fix. Even an agent confined to internal data with the power to act can delete a database or leak information.
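The point about removing one leg can be sketched as a simple capability check. This is a hypothetical illustration: `AgentCapabilities`, `assess`, and the wording of the findings are assumptions, not part of any framework named in the webinar.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AgentCapabilities:
    reads_private_data: bool
    sees_untrusted_content: bool
    can_take_actions: bool


def assess(caps: AgentCapabilities) -> List[str]:
    """Return human-readable risk findings for an agent's capability set."""
    findings = []
    if caps.reads_private_data and caps.sees_untrusted_content and caps.can_take_actions:
        findings.append("lethal trifecta: all three legs present")
    # Removing the untrusted-content leg is not a reliable fix: private data
    # plus the power to act is still enough to delete or leak data.
    if caps.reads_private_data and caps.can_take_actions:
        findings.append("private data + actions: destructive or leaking actions possible")
    return findings


full = assess(AgentCapabilities(True, True, True))
internal_only = assess(AgentCapabilities(True, False, True))
```

Here `internal_only` still returns a finding even though the agent never sees untrusted content, mirroring the example of an internal agent that can delete a database.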

A recurring theme was that the lower layers are familiar territory. Network security, container orchestration, and data governance are problems many teams already know how to handle. The challenge concentrates in the upper layers—observability, agent-governing-agent security, and multi-agent ecosystems—where no single organizational role yet has full coverage and direct responsibility.

The New Failure Modes

Sharma illustrated why those upper layers matter with a striking example: collusion between agents. Drawing on research from Cambridge University, he described how an orchestrated multi-agent system, left to its own devices, can develop an emergent incentive for a saboteur agent to preferentially route its work to a like-minded reviewer. Malicious code slips through review, and the two quietly leave a backdoor open. You can't catch that at the network level. This isn't a prompt injection that your guardrails will detect. There's no single point of failure that a scanner will flag.

As he put it, the challenge is no longer protecting your grandfather's database—it's protecting your grandfather. These risks are invisible at the network or host level.

So, How Much Is Enough?

Even a disciplined approach—threat-modeling each layer, then governing against a framework like the NIST AI RMF—reduces risk rather than removing it, and Huang was blunt about what that means for the central question.

"This approach can reduce your attack surface and improve trust—it can't eliminate the risk. So go back to our original topic: what is enough? There's never really 'enough.' It's always a process, a roadmap—you improve your risk posture, and you increase the trust." — Ken Huang

Both speakers converged on the same conclusion: there is no fixed "enough." Compliance as a checkbox before launch is not sufficient. As Huang noted, current GRC programs built for shadow IT don't translate cleanly to agentic AI, and frameworks like the NIST AI RMF, OWASP guidance, and the EU AI Act help only as a starting roadmap. Trust is a continuous posture, not a milestone.

This is why both favored measurable, living scores. Huang pointed to the emerging OWASP AI Vulnerability Scoring System. Sharma described Vijil's agent trust score, modeled loosely on a FICO score—a calculated value that updates with every transaction, capturing not just the score but the trust delta.
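A score that updates with every transaction and exposes its own change can be sketched with an exponential moving average. This is an assumption chosen for illustration, not Vijil's actual scoring method: the `TrustScore` class, the starting value of 0.5, and the weight `alpha` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrustScore:
    value: float = 0.5   # hypothetical starting score in [0, 1]
    alpha: float = 0.2   # hypothetical weight given to the latest transaction
    deltas: List[float] = field(default_factory=list)

    def update(self, outcome: float) -> float:
        """Fold one transaction outcome (0.0 = failed, 1.0 = passed) into the score.

        Returns the trust delta: how far this transaction moved the score.
        """
        if not 0.0 <= outcome <= 1.0:
            raise ValueError("outcome must be in [0, 1]")
        new_value = (1 - self.alpha) * self.value + self.alpha * outcome
        delta = new_value - self.value
        self.value = new_value
        self.deltas.append(delta)
        return delta


score = TrustScore()
first = score.update(1.0)   # a clean transaction raises the score
second = score.update(0.0)  # a failed transaction lowers it
```

Keeping the history of deltas alongside the current value lets a reviewer see the direction and speed of change, not only where the score currently sits.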

Vijil tracks three P's: the agent's purpose, the organization's policies, and the personas that might interact with it, from a well-meaning grandmother to a sophisticated state actor. Both speakers also stressed treating agent onboarding like a new employee's: start with minimal access and earn the keys gradually, anchored to a strong cryptographic identity such as SPIFFE/SPIRE.
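The "earn the keys gradually" idea can be sketched as tiered access that widens with a clean track record. Everything here is hypothetical: the scope names, the thresholds, and the rule that any violation drops the agent back to minimal access are assumptions for illustration, and identity verification (for example via SPIFFE/SPIRE) is assumed to happen before this check.

```python
from typing import List, Set, Tuple

# Hypothetical tiers: (clean transactions required, scopes granted at that tier).
TIERS: List[Tuple[int, Set[str]]] = [
    (0, {"read:public_docs"}),
    (50, {"read:internal_docs"}),
    (500, {"write:tickets"}),
]


def allowed_scopes(clean_transactions: int, violations: int) -> Set[str]:
    """Return the scopes an already-authenticated agent has earned so far."""
    if violations > 0:
        # Assumed policy: any violation resets the agent to minimal access.
        return set(TIERS[0][1])
    scopes: Set[str] = set()
    for threshold, granted in TIERS:
        if clean_transactions >= threshold:
            scopes |= granted
    return scopes


new_agent = allowed_scopes(clean_transactions=0, violations=0)
seasoned_agent = allowed_scopes(clean_transactions=600, violations=0)
flagged_agent = allowed_scopes(clean_transactions=600, violations=1)
```

A brand-new agent gets only the minimal scope, a seasoned one accumulates all three, and a flagged one falls back to the starting tier regardless of its history.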

Huang's framework gives us the threat model; Sharma's trust score gives us the trust model. Together, they turn agentic AI from an impressive capability into a governed, credible, and responsible system.

The takeaway both speakers landed on is that a truly trustworthy agent metabolizes chronic stress into resilience—absorbing novel attacks and reducing failure modes faster over time. Enough trust, in other words, is never a destination. It's a discipline.

