Shipping secure, reliable and high-performance AI agents

A hands-on engineering guide to building AI agents that stay secure, reliable, and fast in real, high-stakes production systems.

June 02, 2026

We dive into the technical details of three areas in the context of shipping AI agents: security, reliability and performance.

This is a talk written by engineers, for engineers. At Gradient Labs, we build AI agents for financial services companies. In this talk, we share techniques and best practices that we’ve developed while shipping AI agents in high-stakes environments. The three areas are outlined below.

Security

We talk about how we apply the principle of least privilege to AI agents, along with practical cases where we’ve had to think through security, such as identity verification with voice agents.
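
As a rough illustration of least privilege applied to agent tooling, here is a minimal sketch in which each task receives only the tools it has been explicitly granted. The names `ScopedTools`, `ToolNotPermitted`, `read_balance`, and `send_payment` are hypothetical and chosen for this example; the talk itself does not specify an implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


class ToolNotPermitted(Exception):
    """Raised when an agent requests a tool outside its granted scope."""


@dataclass(frozen=True)
class ScopedTools:
    """Exposes only the tools explicitly granted for one task (assumed design)."""

    tools: Dict[str, Callable[..., str]]
    granted: FrozenSet[str]

    def call(self, name: str, **kwargs) -> str:
        # Deny by default: anything not explicitly granted is rejected,
        # even if the tool exists in the registry.
        if name not in self.granted:
            raise ToolNotPermitted(name)
        return self.tools[name](**kwargs)


# Hypothetical tools, for illustration only.
def read_balance(account_id: str) -> str:
    return f"balance for {account_id}"


def send_payment(account_id: str, amount: int) -> str:
    return f"sent {amount} from {account_id}"


ALL_TOOLS = {"read_balance": read_balance, "send_payment": send_payment}

# A read-only support task gets the read tool and nothing else.
support_tools = ScopedTools(tools=ALL_TOOLS, granted=frozenset({"read_balance"}))
```

With this setup, `support_tools.call("read_balance", account_id="acc-1")` succeeds, while a call to `send_payment` raises `ToolNotPermitted` regardless of what the model asks for. The check lives outside the model, so a prompt injection cannot widen the scope.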

Reliability

We talk about how we handle failures, rate limits, and latency spikes with durable execution and retries, as well as how we protect our agent from external factors like spikes in load.
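
To make the retry idea concrete, here is a minimal sketch of retrying a transient failure with capped exponential backoff and full jitter. It is an in-process illustration only, not durable execution (which persists progress across process restarts), and the names `TransientError` and `call_with_retries` and the default delays are assumptions made for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (rate limit, timeout)."""


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientError with capped, jittered backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the failure to the caller.
            # The cap doubles on each attempt, up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, cap))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the function can be tested without waiting. Only errors marked as transient are retried; anything else propagates immediately, which avoids repeating a high-stakes action after a non-retryable failure.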

Performance

We talk about testing non-deterministic systems using production conversations as test fixtures, LLM-as-judge evaluation, and the metrics that actually predict customer satisfaction.
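
One way to picture conversations as test fixtures with a judge is the sketch below: recorded conversation histories are replayed through the agent, and a judge scores each new reply against the reply that was originally given. The `Fixture` type, the `pass_rate` function, and the 0.8 threshold are assumptions for illustration; in practice the judge would wrap an LLM call, which is left as a plain callable here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Fixture:
    """A recorded conversation and the reply it originally received."""

    history: List[str]
    reference_reply: str


# An agent maps a conversation history to a reply.
Agent = Callable[[List[str]], str]
# A judge scores (history, candidate reply, reference reply) in [0, 1].
Judge = Callable[[List[str], str, str], float]


def pass_rate(
    agent: Agent,
    judge: Judge,
    fixtures: Sequence[Fixture],
    threshold: float = 0.8,
) -> float:
    """Return the fraction of fixtures whose judged score meets the threshold."""
    if not fixtures:
        raise ValueError("at least one fixture is required")
    passed = 0
    for fx in fixtures:
        candidate = agent(fx.history)
        score = judge(fx.history, candidate, fx.reference_reply)
        if score >= threshold:
            passed += 1
    return passed / len(fixtures)
```

Because agent output is non-deterministic, the result is a rate rather than a single pass or fail. Comparing that rate between a baseline and a candidate build is one way to flag a regression before release.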

Key takeaways

  • How to manage the risk of AI agents reading sensitive data and performing high-stakes actions
  • How to ensure reliable execution in the presence of failures when working with AI agents
  • How to test AI agents when traditional unit tests don’t work: synthetic evaluation pipelines, LLM-as-judge, and metrics that catch regressions before customers do