This talk explores the technical, organizational, cultural, and psychological factors that matter when we choose between full rewrites and incremental change.
This talk will cover how we organized the human, technical, and organizational work needed to prevent outages while we strove to keep ahead of explosive, pandemic-driven product growth, and how we'll apply those lessons to future long-running, large-scale incidents.