For years there has been something of a “hero” culture in the ops and incident world: there is always a person who has been at an org for years, knows how to do everything, and saves the day when we need them. Until the next incident.
While we always appreciate our heroes, in order to get out of the cycle of incidents and start learning from them, we have to start treating our post-incident processes as a team sport. Learning from incidents can make a difference in our teams and companies by helping us turn our outages into opportunities to improve the way we do our work, understand pitfalls, and collaborate better as a team. This cannot be done by one single person, though. The magic happens when you have a group of folks who are bought into it, do the work together, and want to learn because they know it makes a difference.
In this talk, I will walk you through the different stages of a Post-Incident Process (based on the Howie guide released last year) and explain how to get folks from different parts of your org involved. You will learn how to gather insights from different points of view (not just from the person who fixed it), let folks share their narratives, and, most importantly, share the learnings to instill progress.
When we involve others, we can move our post-incident processes away from a template-filling post-mortem meeting that everyone dreads attending and toward collaborative reviews, living documents, and conversations that lead to change!