Every engineer is familiar with the concept of technical debt. But as engineering leaders, how do we address the debt that evolves in our organizational processes?
Technical debt has a less recognized cousin: organizational debt. Just as we build software within the constraints of the moment, so we build our organizations, processes, and structures. What worked for the team yesterday doesn’t always work for the team today or tomorrow.
How organizational debt accrues
There’s a pearl of wisdom in the startup world that you need to be comfortable doing things that don’t scale. That can involve throwing people at a problem rather than taking the time to build a system, or accepting the same single point of failure across multiple systems. Sometimes this means accruing technical debt by building software quickly, without much thought for maintainability or future flexibility, and sometimes it means creating processes that only work under a narrow set of circumstances. As Forbes notes, “Organizational debt is all the people/culture compromises made to ‘just get it done’ in the early stages of a startup.” This kind of debt also tends to be less visible than its technical counterpart.
A very common source of organizational debt is a lack of formal structure, which can manifest as flat management layers, missing engineering levels, or ad hoc incident response. These gaps often develop in a company’s early stages, when individuals push forward without foreseeing or knowing what the organization will eventually require in the years ahead. This is a worthwhile tradeoff, but one that will carry costs down the road as the organization grows.
Release management, which oversees all the stages involved in shipping software, from development and testing to deployment, is a common area of organizational debt. Most companies begin with a basic process: an engineer logs into the production server and runs a git pull. This process may last for a week, or it may last for years, but it’s not sustainable. It does, however, get code out into the world. And so, this “shortcut” sows the seeds of organizational debt.
Inflection points
At some point, these shortcut processes begin to show their cracks. Not everyone will agree, but spidey senses will start tingling at various places in the organization. Even if folks can’t pinpoint the issue, there will be a general feeling that things aren’t working as they should.
This accrual of debt and the sense that things aren’t working often come as a result of growth. The systems that support an engineering team of five don’t necessarily support an engineering team of 50.
To continue with our release process example, imagine it’s now a few years down the line. No one is releasing code by manually running git pull, but there isn’t much consistency in the process across engineering teams. Different teams are running different tools for CI/CD, some teams have settled into a weekly release process, and some teams are still just flying by the seat of their pants. Eventually, some big error gets shipped, which makes everyone sit up and ask how we got to this point. That’s when a declaration gets made that things need to change.
Avoiding overcorrection and finding the right balance
Once that declaration happens, the most common reaction is to swing the pendulum hard in the other direction. In our example, the release error makes it everyone’s biggest priority to create an airtight, company-wide release process. When people are reacting to a problem, the goal is always to prevent the bad outcome from recurring. But with such a strong overcorrection, it’s easy to suddenly end up in a place where code needs to get signed off by both a platform reviewer and a team reviewer, scheduled for release during a specific window, and pushed to production by someone with distinct training and responsibilities.
With too many hoops to jump through, the process can become burdensome for engineers. This reflex also overlooks how much it affects the outcomes adjacent to that one bad experience: it slows deployments, causes context switching for more people, and in general makes releasing software harder.
As leaders, we need to be just as comfortable cutting back (or avoiding!) processes as we are adding new or necessary ones. The “right” processes will look different from company to company. In some businesses, it might be that the two reviewers really do prevent more errors with minimal additional development time, but that the release windows add an average of two days to the time it takes to get to production. All of this needs to be considered to avoid burdening the engineering teams with an ever-increasing list of processes.
Final thoughts
Dealing with organizational debt is a never-ending struggle. There will never be a time when everyone in the company looks around and agrees that the amount and form of process and structure is exactly right. As engineering leaders, we need to be constantly righting the ship, adding processes where things have gone too wild, and stripping away systems that have become more burdensome than useful.