VeriTrail: Detecting hallucination and tracing provenance in multi-step AI workflows

Many applications of language models (LMs) involve generating content based on source material, such as answering questions, summarizing information, and drafting documents. A critical challenge for these applications is that LMs may produce content that is not supported by the source text – a phenomenon known as “closed-domain hallucination.” Existing methods for detecting closed-domain hallucination typically compare a given LM output […]

Navigating medical education in the era of generative AI

[THEME MUSIC FADES] The book passage I read at the top is from Chapter 4, “Trust but Verify.” In it, we explore how AI systems like GPT-4 should be evaluated for performance, safety, and reliability and compare this to how humans are both trained and assessed for readiness to deliver healthcare. In previous conversations with guests, we’ve spoken a lot about AI in the clinic as well as in labs and companies developing AI-driven tools. We’ve also talked about AI […]

Xinxing Xu bridges AI research and real-world impact at Microsoft Research Asia – Singapore

AI has made remarkable progress in recent years, but turning experimental models into tools that work in the real world is still a major challenge. Bridging this gap between innovation and application has shaped the career of Xinxing Xu, principal researcher at Microsoft Research Asia – Singapore, and underpins the mission of the lab’s newly established presence in the region. “Innovative algorithms can only demonstrate their true value when tested with […]

AI Testing and Evaluation: Reflections

AMANDA CRAIG DECKARD: Thank you so much. SULLIVAN: In our intro episode, you really helped set the stage for this series. And it’s been great, because since then, we’ve had the pleasure of speaking with governance experts about genome editing, pharma, medical devices, cybersecurity, and we’ve also gotten to spend some time with our own Microsoft responsible AI leaders and hear reflections from them. And here’s what stuck with me, and I’d love to hear from you on this, as […]

AI Testing and Evaluation: Learnings from cybersecurity

CIARAN MARTIN: Well, thanks so much for inviting me. It’s great to be here. SULLIVAN: Ciaran, before we get into some regulatory specifics, it’d be great to hear a little bit more about your origin story, and just take us to that day—who tapped you on the shoulder and said, “Ciaran, we need you to run a national cyber center! Do you fancy building one?” MARTIN: You could argue that I owe my job to Edward Snowden. Not an obvious […]

How AI will accelerate biomedical research and discovery

Daphne Koller is the CEO and founder of Insitro, a machine learning-driven drug discovery and development company that recently made news for its identification of a novel drug target for ALS and its collaboration with Eli Lilly to license Lilly’s biochemical delivery systems. Prior to founding Insitro, Daphne was the co-founder, co-CEO, and president of the online education platform Coursera. Noubar Afeyan is the founder and CEO of Flagship Pioneering, which creates biotechnology companies focused on transforming human health and […]

AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices

DANIEL CARPENTER: Thanks for having me. SULLIVAN: Dan, before we dissect policy, let’s rewind the tape to your origin story. Can you take us to the moment that you first became fascinated with regulators rather than, say, politicians? Was there a spark that pulled you toward the FDA story? CARPENTER: At one point during graduate school, I was studying a combination of American politics and political theory, and I did a summer interning at the Department of Housing and Urban Development. And I began to think, why […]
