
Scaling AI for high-stakes, real-time payments

What does it take to run real-time AI in Stripe’s core payment flow? This is the story of scaling ML in a latency-critical system used by billions and built by multiple engineering orgs.

Speakers: Prasad Wangikar

November 04, 2025

As a Principal Engineer in Stripe’s Payment Intelligence org, I’m responsible for scaling real-time AI systems that detect and prevent fraud—without adding latency to Stripe’s most critical code path, known as the “charge path”.

At the heart of Stripe’s infrastructure, the “charge path” is a shared Ruby monolith that processes almost every transaction across the platform. In 2024 alone, it handled over $1.4 trillion in volume. It’s one of the most latency-sensitive and business-critical systems at Stripe, actively maintained by several engineering orgs and touched by hundreds of developers. Making changes here requires precision, coordination, and a deep understanding of systemic constraints.

Our org had a strong roadmap of AI innovations to improve fraud prevention and authorization rates. But executing that vision meant working through legacy architectural decisions and Ruby’s lack of native concurrency—all without exceeding strict latency constraints. To move forward, I brought together tech leads and domain experts from across Stripe for a virtual onsite. We ran an event storming workshop to surface shared constraints, uncover tribal knowledge, and identify architectural leverage points for introducing safe, meaningful parallelism. I proposed a series of deep but targeted architectural improvements to unlock more ML execution within the charge path while minimizing risk. I then collaborated across teams to build a multi-quarter roadmap that aligned efforts, minimized conflicts, avoided throwaway work, and accelerated the delivery of impact.
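To make the concurrency constraint concrete: MRI Ruby’s Global VM Lock prevents Ruby code from executing in parallel, but the lock is released during blocking I/O, so independent network calls—such as requests to model-serving backends—can overlap on threads. The sketch below is purely illustrative (the scorer names, timings, and timeout budget are assumptions, not Stripe’s actual design); it shows the general pattern of fanning out I/O-bound ML scoring calls and degrading gracefully when one exceeds its latency budget.

```ruby
require "timeout"

# Illustrative stand-ins for remote model-serving calls.
# `sleep` simulates blocking network I/O, during which MRI
# releases the Global VM Lock, letting the threads overlap.
SCORERS = {
  fraud_risk:  ->(charge) { sleep 0.05; 0.12 },
  auth_uplift: ->(charge) { sleep 0.05; 0.91 },
}

# Run all scorers concurrently under a per-call latency budget.
# A scorer that blows its budget returns nil instead of
# delaying the charge path.
def score_in_parallel(charge, budget_seconds: 0.5)
  threads = SCORERS.map do |name, scorer|
    Thread.new do
      Timeout.timeout(budget_seconds) { [name, scorer.call(charge)] }
    rescue Timeout::Error
      [name, nil]
    end
  end
  threads.map(&:value).to_h  # joins each thread and collects results
end

scores = score_in_parallel({ amount: 1_000 })
```

Because both simulated calls sleep concurrently, total wall time tracks the slowest call rather than the sum—the property that makes fanning out ML scorers viable on a latency-critical path.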

This talk is about what it takes to deliver AI innovation inside a high-stakes, real-time system—navigating legacy architecture, aligning across teams, and scaling machine learning under strict performance constraints.

Key takeaways

  • How to scale real-time AI in a latency-sensitive system
  • How to introduce parallelism in a legacy Ruby monolith
  • How to uncover system constraints through structured collaboration
  • How to align multiple teams around architectural change