
Embedding Trust: A New Model for Detecting LLM Hallucinations


June 16, 2026

Vijil


In domains like medicine, law, policy, or science, organizations are beginning to rely on large language models (LLMs) to summarize documents, answer complex questions, and support decision-making workflows.

But there’s a problem: LLM systems are prone to hallucinations, that is, producing non-factual responses. Most existing approaches to long-form factuality rely on claim-by-claim fact-checking: decompose the response into atomic statements, retrieve evidence for each, assess each statement's factuality individually, and aggregate the results. This costs many LLM calls per response, compounds errors over long responses, and is brittle at scale.

Also, a single paragraph can contain multiple statements, cross-references, and claims without natural truth values, making it difficult to assign a single accuracy rating using a one-by-one evaluation method.

Tim G. J. Rudner, Vijil’s Chief Scientist and fundamental research team lead, along with collaborators Dhrupad Bhardwaj and Julia Kempe, published a paper at the International Conference on Machine Learning (ICML) 2026 that proposes a new way to evaluate the factuality of long-form responses more efficiently and more accurately.

In “Embedding Trust: Semantic Isotropy Predicts Nonfactuality in Long-Form Text Generation”, the researchers introduce ‘semantic isotropy,’ a geometric measure of dispersion, and use it to build a lightweight tool for assessing whether a long-form LLM response is trustworthy and factual.

Semantic isotropy captures how tightly the embeddings of several responses to the same prompt cluster, or fail to cluster. By comparing even a few responses in this way, the system can efficiently and reliably evaluate the factuality of long-form responses. The proposed method is model-agnostic, label-free, hyperparameter-free, and cheap, yet it still tracks factuality.

ICML ranks among the top AI venues globally and is known for debuting many of the breakthroughs that have shaped modern AI. Each submission undergoes a months-long, double-blind review and rebuttal process, making acceptance highly selective - and pointing to the impact of accepted papers.

The Challenge: Evaluating Long-Form LLM Responses for Factuality

Hallucination detection has become one of the defining challenges in enterprise AI deployment. While significant effort has been made over the past several years to lower hallucination rates—the frequency at which these models generate incorrect information—much of the focus on hallucination detection has remained limited to short-form responses.

Most hallucination detection methods today check factual claims one by one, typically with an LLM-as-a-judge methodology, and commonly used benchmarks focus on short-form or multiple-choice questions that assess a model's general capabilities. Although these benchmarks are helpful for gauging basic performance, they are not particularly effective for determining the reliability of the complex, conversational responses users typically receive when interacting with an LLM. Those responses are long-form and may contain dozens of claims spread across multiple paragraphs, making claim-by-claim verification slow, brittle, and computationally expensive.

Introducing Semantic Isotropy

The novelty of the semantic isotropy approach is that it avoids evaluating the text directly and instead compares "embeddings", mathematical vector representations of the model's output.

Embedding is a core process for LLMs. It transforms raw text into high-dimensional vectors that are not readily interpretable by humans but carry the meaning of the text as the model represents it.

Semantic isotropy scoring works in three steps: multiple long-form responses to the same prompt are generated, their text embeddings are extracted, and a semantic isotropy score is estimated by measuring the angular dispersion of those embeddings on the unit sphere. In the paper, the authors show across datasets, long-form generator LLMs, and embedding models that greater dispersion across the samples correlates strongly with a higher likelihood of hallucinations or factual inconsistencies within the sampled responses, which makes any one of those responses less trustworthy.
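To make "angular dispersion on the unit sphere" concrete, here is a minimal Python sketch. It is an illustrative proxy only: it assumes dispersion can be summarized as the mean pairwise angle between normalized embeddings, which is not necessarily the estimator defined in the paper. The function name `angular_dispersion` and the toy vectors are hypothetical.

```python
import numpy as np


def angular_dispersion(embeddings: np.ndarray) -> float:
    """Mean pairwise angle (in radians) between unit-normalized embeddings.

    Assumption: this is a simple stand-in for a dispersion score,
    not the paper's semantic isotropy estimator.
    """
    x = np.asarray(embeddings, dtype=float)
    if x.ndim != 2 or x.shape[0] < 2:
        raise ValueError("need at least two embedding vectors")
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("zero vectors cannot be placed on the unit sphere")
    # Project every embedding onto the unit sphere.
    x = x / norms
    # Cosine similarity of every pair; clip guards against rounding error.
    cos = np.clip(x @ x.T, -1.0, 1.0)
    # Average the angle over each unordered pair (upper triangle only).
    i, j = np.triu_indices(x.shape[0], k=1)
    return float(np.arccos(cos[i, j]).mean())


# Toy 2-D "embeddings": one tightly clustered set, one widely spread set.
tight = np.array([[1.0, 0.05], [1.0, 0.0], [1.0, -0.05]])
spread = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
tight_score = angular_dispersion(tight)
spread_score = angular_dispersion(spread)
```

Under this proxy, the clustered set yields a smaller score than the spread set, mirroring the article's claim that greater dispersion signals a less trustworthy set of responses.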

Computing the semantic isotropy score for a given sample of long-form responses takes under two seconds on an A100 GPU.

The process requires generating a few different samples (as few as two or three) of a long-form response for the same prompt to exploit the model's inherent stochasticity. By using open-weight embedding models, this method can be applied to any LLM used to generate long-form responses, including closed state-of-the-art models like Claude Opus, where internal representations are not directly accessible.
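The sampling workflow described above can be sketched end to end. This is a hedged illustration, not the authors' implementation: `score_prompt`, `generate`, `embed`, `toy_embed`, and the canned responses are all hypothetical names and data. In practice `generate` would call the LLM under test with sampling enabled and `embed` would wrap an open-weight embedding model; here both are replaced by toy stand-ins so the example runs on its own, and the score is again a mean-pairwise-angle proxy rather than the paper's estimator.

```python
from itertools import cycle
from typing import Callable, Sequence

import numpy as np


def score_prompt(
    prompt: str,
    generate: Callable[[str], str],               # hypothetical: samples one response
    embed: Callable[[Sequence[str]], np.ndarray],  # hypothetical: texts -> (n, d) array
    n_samples: int = 3,
) -> float:
    """Sample several responses, embed them, and return their mean pairwise angle."""
    if n_samples < 2:
        raise ValueError("need at least two samples to measure dispersion")
    responses = [generate(prompt) for _ in range(n_samples)]
    vectors = np.asarray(embed(responses), dtype=float)
    vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    cos = np.clip(vectors @ vectors.T, -1.0, 1.0)
    i, j = np.triu_indices(n_samples, k=1)
    return float(np.arccos(cos[i, j]).mean())


def toy_embed(texts: Sequence[str]) -> np.ndarray:
    """Bag-of-words counts; a toy stand-in for a real embedding model."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    return np.array(
        [[t.lower().split().count(w) for w in vocab] for t in texts], dtype=float
    )


# Toy generators: one always agrees with itself, one changes its story.
def consistent(prompt: str) -> str:
    return "Paris is the capital of France"


_drift = cycle([
    "Paris is the capital of France",
    "Lyon was the capital until 1850",
    "The capital moved to Marseille recently",
])


def drifting(prompt: str) -> str:
    return next(_drift)


consistent_score = score_prompt("What is the capital of France?", consistent, toy_embed)
drifting_score = score_prompt("What is the capital of France?", drifting, toy_embed)
```

Passing the generator and embedder in as callables keeps the scoring step independent of any particular LLM or embedding model, which is the property that lets the approach work with closed models whose internals are not accessible.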

The method is well suited for high-stakes, safety-critical domains where users need feedback on the trustworthiness of a specific response without needing specialized training or a "golden data set" for every new application.

A Step Toward More Trustworthy AI

The broader significance of this research is not just the technique itself. It reflects a growing shift in how the AI industry thinks about trustworthiness.

Rather than treating hallucination detection as purely a fact-checking problem, researchers are increasingly exploring whether an LLM’s uncertainty can be inferred from its own internal behavior.

Semantic isotropy is a successful example of that trend.

The participation of Tim G. J. Rudner, Vijil’s Chief Scientist, in this research exemplifies Vijil’s investment in pushing the frontier of what’s possible for AI agent safety and security. Vijil has already expanded its techniques for evaluating response reliability beyond using LLMs as judges, including dedicated, specialized small language models (SLMs), with the potential to incorporate semantic isotropy as a formal evaluation technique.

The fundamental research team’s direct line of sight into scientific advances gives Vijil a deeper understanding of the technical challenges and helps frame how to adopt new capabilities systematically and safely.

