VeriTrail: Detecting hallucination and tracing provenance in multi-step AI workflows

Many applications of language models (LMs), such as question answering, summarization, and document drafting, involve generating content based on source material. A critical challenge for these applications is that LMs may produce content that is not supported by the source text – a phenomenon known as “closed-domain hallucination.”1
Existing methods for detecting closed-domain hallucination typically compare a given LM output