Berlin

November 4 & 5, 2024

New York

September 4 & 5, 2024

Nick Stenning

Nick Stenning is a Site Reliability Engineer at Microsoft, working on Azure. He previously worked at the UK’s Government Digital Service and the startup Travis CI. He’s been talking his colleagues’ ears off on the topic of post-incident review for close to a decade.

Learning from incidents: from 'what went wrong?' to 'what went right?'

When things go wrong, we tend to focus on mistakes, miscalculations, and deficiencies in design. By limiting our investigations to the details of what went wrong, we ignore a far richer and more interesting source of learning: how things went right.
