CyberSecEval 2 – A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models
Given how quickly the generative AI space is moving, we believe an open approach is an important way to bring the ecosystem together and mitigate the potential risks of Large Language Models (LLMs). Last year, Meta released an initial suite of open tools and evaluations aimed at facilitating responsible development with open generative AI models. As LLMs are increasingly used as coding assistants, they introduce novel cybersecurity vulnerabilities that must be addressed. Tackling this challenge requires comprehensive benchmarks for evaluating the cybersecurity safety of LLMs. This is where CyberSecEval 2, which assesses an LLM’s