Learning from other domains to advance AI evaluation and testing

Image: Illustrated headshots of the guests from the limited podcast series, AI Testing and Evaluation: Learnings from Science and Industry.

As generative AI becomes more capable and widely deployed, familiar questions from the governance of other transformative technologies have resurfaced. Which opportunities, capabilities, risks, and impacts should be evaluated? Who should conduct evaluations, and at what stages of the technology lifecycle? What tests or measurements should be used? And how can we know if the results are reliable?  

Recent research and reports from Microsoft, the …
