The gap between an AI demo and an AI product is full of hard engineering work. I'll share 10 years of experience with scaling, orchestration and cost optimisation.
How much work does it take to go from a shiny AI demo to a deployed application? Everyone's excited about new AI capabilities, but few talk about the engineering required to run ML models in production. Getting this hidden part right is what separates successful AI projects from failed ones. In my talk I'll show what it takes to turn ML experiments into robust, scalable products.
At Chattermill we’ve spent 10 years bringing ML to production. I’ll share hard-won lessons from orchestrating a dozen self-hosted ML services. We’ve had to tame the unpredictability of LLMs at scale, bridge the velocity gap between engineering and data science, and drive down both processing time and infrastructure costs. Have you ever considered that innocent retries could cost you hundreds of dollars?
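To make the retry-cost point concrete, here is a minimal back-of-the-envelope sketch. All numbers (price per token, tokens per request, batch volume) are illustrative assumptions, not Chattermill's actual figures; the shape of the arithmetic is the point:

```python
# Illustrative sketch: why "innocent" retries get expensive with paid LLM calls.
# Every constant below is a made-up assumption for the sake of the arithmetic.

COST_PER_1K_TOKENS = 0.03    # assumed per-1K-token price
TOKENS_PER_REQUEST = 2_000   # assumed prompt + completion size per document
DOCS_PER_DAY = 500_000       # assumed daily batch volume

def daily_cost(avg_attempts_per_doc: float) -> float:
    """Cost of the daily batch given the average number of attempts per doc."""
    tokens = DOCS_PER_DAY * TOKENS_PER_REQUEST * avg_attempts_per_doc
    return tokens / 1_000 * COST_PER_1K_TOKENS

base = daily_cost(1.0)       # every call succeeds on the first attempt
retried = daily_cost(1.3)    # 30% of calls hit a timeout and retry once
print(f"extra spend from retries: ${retried - base:,.0f}/day")
```

Even a modest 30% retry rate adds a cost proportional to the retried fraction of the whole batch, which is why blind retry policies on metered APIs deserve budgets and caps.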
You’ll walk away with methods to iterate faster and deliver more reliably in a heterogeneous environment of data scientists and software engineers. Whether you are a tech lead, architect or manager, this talk will prepare you to ship real-world ML applications, not just experiments.
Key takeaways
- Awareness of the hidden effort required to bring ML to production, and how ML projects differ from regular software projects
- Strategies for dealing with pace differences between data science and engineering
- Methods to iterate faster with ML products