Detect malicious attacks at 84M RPS with ML under 500us

A deep dive into the high-performance, distributed architecture and machine learning optimizations Cloudflare uses to detect malicious attacks at a global scale with sub-millisecond latencies.

Speakers: Denzil Correa

November 14, 2025

As internet traffic keeps growing, how do you protect against malicious attacks while handling more than 84 million requests per second? This talk pulls back the curtain on Cloudflare’s approach to high-throughput, low-latency threat detection. We’ll explore our distributed architecture, which spans data centers worldwide, and how we use Rust for efficiency and resource optimization where every CPU cycle and byte of memory counts. We’ll also cover the processing techniques that are critical to that performance, including caching strategies and hardware tuning.

A significant portion of the talk focuses on the machine learning models at the core of our Web Application Firewall (WAF) and the optimizations we’ve implemented to keep them fast, from SIMD and TensorFlow Lite upgrades to LRU caching. We’ll also touch on the data generation and sampling strategies that are key to training accurate and resilient models.
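
To make the LRU caching idea concrete, here is a minimal sketch of a bounded least-recently-used cache placed in front of an expensive scoring call. It is written in Python for brevity and is not Cloudflare's implementation: the class `LRUScoreCache`, the function `toy_model`, and the token list are all hypothetical names invented for this illustration.

```python
from collections import OrderedDict
from typing import Callable


class LRUScoreCache:
    """Bounded cache mapping a request payload to a previously computed score."""

    def __init__(self, score_fn: Callable[[str], float], capacity: int = 1024) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._score_fn = score_fn
        self._capacity = capacity
        # Insertion order doubles as recency order: oldest entry first.
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def score(self, payload: str) -> float:
        if payload in self._entries:
            # Cache hit: mark the entry as most recently used and skip the model.
            self._entries.move_to_end(payload)
            self.hits += 1
            return self._entries[payload]
        # Cache miss: run the (assumed expensive) scoring function once.
        self.misses += 1
        value = self._score_fn(payload)
        self._entries[payload] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry to keep memory bounded.
            self._entries.popitem(last=False)
        return value


def toy_model(payload: str) -> float:
    """Hypothetical stand-in for an expensive model inference call."""
    suspicious = ("<script", "union select", "../")
    lowered = payload.lower()
    return sum(token in lowered for token in suspicious) / len(suspicious)
```

Wrapping the model as `LRUScoreCache(toy_model, capacity=2)` and calling `score` repeatedly with the same payload runs the model only once for that payload. The trade-off is memory for latency, which pays off when identical payloads recur often enough to keep the hit rate high.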

Key takeaways

  • Architecting for Scale: Learn how to design and build distributed systems capable of handling massive volumes of traffic with low latency.
  • Performance Optimization: Discover practical techniques for optimizing software and hardware, including the benefits of using Rust and advanced caching.
  • ML in Production: Understand the challenges and solutions for deploying and optimizing machine learning models in a high-stakes, real-time environment.
  • Actionable Scaling Strategies: Gain insights into how to leverage serverless and distributed architectures to scale your own applications and ML algorithms.
