From evals to experiments: How to ship successful AI initiatives by failing cheaply

Discover practical ways to blend qualitative evals with quantitative experiments to reduce risk, improve learning speed, and deliver impactful AI projects.

Speakers: Ryan Lucht

November 14, 2025

How to use both evals and experiments in your AI development lifecycle to avoid costly mistakes and ship more successful projects.

AI initiatives are anything but a slam dunk: projects can be expensive, hard to measure, and prone to failure along the way. The key to success lies in building the capability to learn quickly and fail cheaply. By going beyond upstream qualitative evals and incorporating downstream quantitative experiments, teams can shorten feedback loops and course-correct rapidly. In this talk, Datadog Senior Technical Advocate Ryan Lucht shows teams how to leverage both approaches and join model performance with business metrics to ship more successful AI initiatives.
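The pairing described here can be thought of as a two-stage gate: a cheap upstream eval filters candidates before an online A/B experiment measures impact on a business metric. The Python below is a minimal sketch of that idea under assumed inputs; the function names, threshold, and data shapes are hypothetical and do not come from the talk.

```python
import math

def offline_eval(generate, eval_cases):
    """Upstream check: fraction of labelled cases where the candidate's output
    contains the expected answer. Cheap to run, so failures are caught early."""
    hits = sum(1 for case in eval_cases if case["expected"] in generate(case["input"]))
    return hits / len(eval_cases)

def two_proportion_z(control_conversions, control_n, candidate_conversions, candidate_n):
    """Downstream check: z-statistic for the difference in a business metric
    (here a conversion rate) between the control and candidate experiment arms."""
    p_control = control_conversions / control_n
    p_candidate = candidate_conversions / candidate_n
    p_pool = (control_conversions + candidate_conversions) / (control_n + candidate_n)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / candidate_n))
    return (p_candidate - p_control) / se

def decide(candidate, eval_cases, control_stats, candidate_stats,
           eval_threshold=0.8, z_critical=1.96):
    """Fail cheaply: reject upstream if the eval bar isn't met; otherwise ship
    only if the experiment shows a significant lift on the business metric."""
    if offline_eval(candidate, eval_cases) < eval_threshold:
        return "rejected upstream (cheap failure)"
    z = two_proportion_z(*control_stats, *candidate_stats)
    return "ship candidate" if z > z_critical else "keep control"
```

Here `control_stats` and `candidate_stats` are `(conversions, sessions)` tuples from the live experiment; a real rollout would also plan sample sizes and handle multiple variants.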

Key takeaways

  • Running experiments at scale with basic and advanced A/B testing approaches for AI development
  • How to handle the low success rate of AI initiatives by failing cheaply
  • The different purposes of evals and experiments – and why you need both
  • Using error analysis to define and measure evaluators (a minimal sketch follows this list)
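
To illustrate the error-analysis takeaway, one common pattern is to tag failing traces with the failure mode a reviewer observed, then promote the most frequent modes into dedicated evaluators. This is a generic sketch, not the speaker's specific method; the data and helper names are invented for illustration.

```python
from collections import Counter

# Hypothetical output of a manual error-analysis pass: each failing trace is
# tagged with the failure mode a reviewer observed.
failure_log = [
    {"trace_id": "t-001", "mode": "hallucinated citation"},
    {"trace_id": "t-002", "mode": "ignored user constraint"},
    {"trace_id": "t-003", "mode": "hallucinated citation"},
]

# Count how often each failure mode shows up; frequency suggests which
# evaluators are worth building and tracking first.
mode_counts = Counter(entry["mode"] for entry in failure_log)

def make_evaluator(mode):
    """Stub factory: each frequent failure mode becomes its own pass/fail check
    (a regex, a reference lookup, or a targeted LLM-as-judge prompt in practice)."""
    def evaluator(trace):
        raise NotImplementedError(f"implement the check for: {mode}")
    return evaluator

# Promote the top failure modes into named evaluators that can run on every new
# candidate, so the same mistakes are caught upstream next time.
evaluators = {mode: make_evaluator(mode) for mode, _ in mode_counts.most_common(2)}
```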
