AI Red Teaming - Why and How to Jailbreak LLM Agents

Explore the critical security vulnerabilities of AI agents through this 11-minute conference talk that demonstrates how adaptive, multi-turn attacks can compromise LLM systems and reveals essential defense strategies. Learn why traditional static testing methods are insufficient against evolving threats as Alex Combessie from Giskard explains how attackers exploit conversational consistency and short-term context to manipulate AI models. Discover the foundations of AI Red Teaming and understand how malicious actors leverage LLMs' reliance on context and conversational flow to execute sophisticated jailbreak attacks. Gain practical insights into implementing continuous, automated red teaming approaches combined with human-in-the-loop monitoring to identify and neutralize emerging security risks before they impact production systems. Master strategies for integrating robust oversight mechanisms that ensure comprehensive AI security in an era where AI agents are becoming increasingly capable and consequently more vulnerable to sophisticated attack vectors.