Breaking down our understanding of system resilience

How confident are you in your prod servers staying up without your help?

Speakers: Will Gallego

July 21, 2020

How confident are you in your prod servers staying up without your help? Too often in tech we mistakenly interchange three important concepts when describing our socio-technical systems: how resilient they are, how reliable they are in day-to-day work, and how robust they are under duress. Though interrelated, they are not equivalent.

How can we successfully gain insights in post-incident reviews, execute chaos engineering experiments, and build scalable infrastructure if we’re misinterpreting our approaches? By separating out these core concepts, we can identify better approaches for adapting to unforeseen circumstances. We’ll look at common misconceptions when describing our systems as resilient and focus on proven methods to help us improve our understanding of our systems.