As AI systems become integral to production software, ensuring their reliability and maintainability is no longer optional—especially when these systems interact with users, tools, or external APIs. Yet, testing AI-driven applications often feels like uncharted territory for developers used to more traditional systems.
In this talk, we’ll demystify the testing landscape for AI agents and workflows. We’ll walk through practical strategies for writing unit and functional tests tailored to AI behaviors, then dive deeper into robust integration testing, focusing on techniques like interaction recording and mocking that keep validation secure, deterministic, and efficient. We’ll also revisit traditional AI testing methods, such as evaluation datasets and metric-based validation, to show how they complement system-level testing.
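To make the mocking idea concrete, here is a minimal sketch of deterministic testing by replacing the LLM client with a mock. The `summarize` function and the client’s `complete` method are hypothetical stand-ins, not a real library API:

```python
# A minimal sketch: mocking an LLM client so tests are deterministic
# and make no network calls. `summarize` and `client.complete` are
# hypothetical, illustrative names.
from unittest.mock import Mock

def summarize(client, text: str) -> str:
    """Ask the (hypothetical) LLM client for a one-line summary."""
    response = client.complete(prompt=f"Summarize in one line: {text}")
    return response.strip()

# In a test, swap in a Mock with a canned, deterministic response.
mock_client = Mock()
mock_client.complete.return_value = "  A short summary.  "

result = summarize(mock_client, "Some long document...")
assert result == "A short summary."
mock_client.complete.assert_called_once()
```

Because the mock pins the model’s output, the test validates the surrounding code (prompt construction, post-processing) rather than the model itself, which is exactly the separation integration tests need.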
Attendees will leave with concrete techniques to build test suites that make evolving AI systems safer and more sustainable—without slowing down iteration speed.
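The evaluation-dataset approach mentioned above can be sketched in a few lines. The `predict` function below is a hypothetical stand-in for a real model call, and exact-match accuracy is just one possible metric:

```python
# A minimal sketch of metric-based validation over a small evaluation
# dataset. `predict` is a hypothetical stand-in for a real model call;
# exact-match accuracy is used as an illustrative metric.
def predict(question: str) -> str:
    # Canned answers simulate model behavior for this sketch.
    canned = {"capital of France?": "Paris", "2 + 2?": "4"}
    return canned.get(question, "unknown")

dataset = [
    ("capital of France?", "Paris"),
    ("2 + 2?", "4"),
    ("capital of Japan?", "Tokyo"),
]

correct = sum(predict(q) == expected for q, expected in dataset)
accuracy = correct / len(dataset)
print(f"accuracy = {accuracy:.2f}")
assert accuracy >= 0.5  # a metric threshold can gate CI on regressions
```

Running such an evaluation in CI alongside unit and integration tests turns model quality into a tracked, thresholded signal rather than an anecdote.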