PadChest-GR: A bilingual grounded radiology reporting benchmark for chest X-rays
In our ever-evolving journey to enhance healthcare through
Deep Learning, NLP, NMT, AI, ML
[MUSIC ENDS] For our introductory episode, I’m pleased to welcome Amanda Craig Deckard from Microsoft to discuss the company’s efforts to learn about testing in other sectors. Amanda is senior director of public policy in the Office of Responsible AI, where she leads a team that works closely with engineers, researchers, and policy experts to help ensure AI is being developed and used responsibly. Their insights shape Microsoft’s contribution to public policy discussions on laws, norms, and standards for AI. […]
As generative AI becomes more capable and widely deployed, familiar questions from the governance of other transformative technologies have resurfaced. Which opportunities, capabilities, risks, and impacts should be evaluated? Who should conduct evaluations, and at what stages of the technology lifecycle? What tests or measurements should be used? And how can we know if the results are reliable? Recent research and reports from Microsoft, the
We are excited to share our first big milestone in solving a grand challenge that has hampered the predictive power of computational chemistry, biochemistry, and materials science for decades. By using a scalable deep-learning approach and generating an unprecedented quantity of diverse,
Artificial intelligence is advancing across a wide range of fields, with one of the most important developments being its growing capacity for reasoning. This capability
The book passage I read at the top is from “Chapter 10: The Big Black Bag.” In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant […]
One of the key use cases for generative AI involves answering questions over private datasets, with retrieval-augmented generation (RAG) as the go-to framework. As new RAG techniques emerge, there’s a growing need to benchmark their performance across diverse datasets and metrics. To meet this need, we’re introducing BenchmarkQED, a new suite of tools that automates RAG benchmarking at scale, available on
The book passage I read at the top is from “Chapter 4: Trust but Verify,” which was written by Zak. You know, it’s no secret that in the US and elsewhere shortages in medical staff and the rise of clinician burnout are affecting the quality of patient care for the worse. In our book, we predicted that generative AI would be something that might help address these issues. So in this episode, we’ll delve into how individual performance gains that […]
ALEX LU: Yeah, I’m really excited to be joining you today. HUIZINGA: So let’s start with a little background of your work. In just a few sentences, tell us about your study and more importantly, why it matters. LU: Absolutely. And before I dive in, I want to give a shout out to the MSR research intern who actually did this work. This was led by Kasia Kedzierska, who interned with us two summers ago in 2023, and she’s the […]
This is such exciting work about environmental forecasting, so we’re happy to have the two of you join us today. Megan and Wessel, welcome. MEGAN STANLEY: Thank you. Thanks. Great to be here. WESSEL BRUINSMA: Thanks. TINGLE: Let’s jump right in. Wessel, share a bit about the problem your research addresses and why this work is so important. BRUINSMA: I think we’re all very much aware of the revolution that’s happening in the space of large language models, which have […]