Platform teams are important for establishing the foundational layers of a company’s product. How can yours be more effective?
Over the last couple of years, the concept of having platform teams in organizations has become very common. To define the scope of a platform team, we must first delineate a stream-aligned team’s responsibilities.
As per the Team Topologies book, a stream-aligned team is “aligned to a flow of work from (usually) a segment of the business domain.” Because these teams focus on a single stream of work, they can deliver value as quickly, safely, and independently as possible. By nature, they are very close to the customer.
The Team Topologies book defines platform teams as, “A grouping of other team types that provide a compelling internal product to accelerate delivery by stream-aligned teams.” Their main goal is to enable stream-aligned teams to deliver work with considerable autonomy. They contribute to a better end-user experience, making it cohesive across different segments or products.
The following seven core responsibilities of platform teams, therefore, entail collaboration with stream-aligned teams, the incremental building of the platform, usability, and working on support and maintenance.
1. Prioritization, objectives, and key results
Having control and clarity over what the team’s top priorities are is important. Usually, platform teams run in Kanban mode, where work is prioritized and picked up as requests from other teams filter in. This approach may be helpful (to some degree) in informing how certain platforms run, but ruthless prioritization clearly lets teams know what the targets are for either sprints or quarters.
There are usually three sources of backlog for platform teams: product teams, stakeholders (engineering managers and other members of engineering leadership), and the platform team itself. How the backlog is prioritized would be specific to the individual team; it could be based on urgency, return on investment (ROI), or the impact of the feature. For example, a developer experience team’s prioritization would depend on how much time is saved when working on something, while for an infrastructure security team, priority would be determined by the criticality of associated risk.
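As a rough illustration of team-specific prioritization, the sketch below ranks the same backlog with two different scoring functions. The field names (`hours_saved_per_week`, `risk_criticality`), the 0 to 5 risk scale, and the sample items are assumptions made for this example, not a prescribed model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class BacklogItem:
    title: str
    source: str  # "product_team", "stakeholder", or "platform_team"
    hours_saved_per_week: float = 0.0  # hypothetical estimate
    risk_criticality: int = 0  # hypothetical scale: 0 (none) to 5 (critical)


def by_time_saved(item: BacklogItem) -> float:
    """Scoring a developer experience team might use."""
    return item.hours_saved_per_week


def by_risk(item: BacklogItem) -> float:
    """Scoring an infrastructure security team might use."""
    return float(item.risk_criticality)


def prioritize(
    backlog: List[BacklogItem], score: Callable[[BacklogItem], float]
) -> List[BacklogItem]:
    """Return the backlog ordered from highest to lowest score."""
    return sorted(backlog, key=score, reverse=True)


backlog = [
    BacklogItem("Faster CI cache", "product_team", 12.0, 1),
    BacklogItem("Rotate leaked token", "stakeholder", 0.0, 5),
    BacklogItem("Upgrade linter", "platform_team", 3.0, 0),
]

for item in prioritize(backlog, by_risk):
    print(item.title, item.source)
```

Keeping the scoring function separate from the backlog makes the team's prioritization rule explicit and easy to share with stakeholders.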
Regular prioritization is therefore important and helps keep stakeholders informed on expectations; in turn, it allows the team to keep track of their progress. Defining objectives and key results (OKRs) would be most effective to help monitor progress in this way, as having clearly documented reference points will ultimately be instrumental in keeping on top of all tasks.
This doesn’t mean you should disregard important ad-hoc requests when they arise. Use your already-listed OKRs to inform how they should be dealt with and where they fall on the prioritization scale. If your team cannot accommodate the work, consider how it can be swapped in for something else.
2. Thinking consumer first
The main consumer market for platform teams tends to be developers. This means that, usually, platform teams knowingly or unknowingly expect their core audience to automatically understand certain concepts – and if they don't, to independently explore them further. Here, platform teams usually miss out on providing easily usable documentation, examples, or starting kits that can be very useful to the wider audience (i.e., how APIs or libraries should be used, or references to possible responses and error scenarios).
At a high level, if the following three aspects of the consumer-first approach are taken care of, overall adoption of the platform improves tremendously:
- Building the right level of consumer-friendly abstractions – This is an essential part of the tech design of platform teams as it helps to reduce end-user friction, leading to higher adoption without enforcement. Furthermore, it reduces the need for platform teams to spend their time, either synchronously or asynchronously, debugging and getting things working for others.
- Taking backward compatibility into consideration – Since platform teams usually start parallel to, or a little later than, product teams, their artifacts and processes are developed concurrently with the actual product. This makes backward compatibility a very important consideration. When it isn’t possible to maintain, platform teams should provide an easy migration guide.
- Code auto-generation – Platform teams should focus on creating code templates for others, so they can get started with their libraries and features from the get-go. These templates should give product teams a working version (locally if possible) by making basic default assumptions if required. Configuration-driven overrides can then be used for customization. This reduces the barrier to entry and immediately gives a very positive impression to consumers.
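To make the idea of default assumptions plus configuration-driven overrides concrete, here is a minimal sketch. The `DEFAULTS` keys and the `merge_config` helper are hypothetical names chosen for illustration; a real template would define its own settings.

```python
from copy import deepcopy
from typing import Any, Dict

# Hypothetical defaults a generated template might ship with,
# chosen so that a service runs locally without any extra setup.
DEFAULTS: Dict[str, Any] = {
    "service": {"port": 8080, "log_level": "INFO"},
    "auth": {"enabled": True, "provider": "local"},
}


def merge_config(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config where overrides win and nested dicts are merged."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# A product team only states what differs from the defaults.
config = merge_config(DEFAULTS, {"service": {"port": 9090}})
print(config)
```

Because the merge returns a copy, the defaults themselves stay untouched, so every consumer starts from the same baseline.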
3. Covering breadth and depth
One of a platform team’s main tasks is to identify the surface areas it touches in the software development lifecycle (SDLC). Most platform teams usually focus on either the runtime or the build.
It is equally important to evaluate the system design’s integration capabilities at the planning, compile-time, provisioning, and deployment phases, as this reduces the chances for errors. Additionally, incorporating platform teams in all parts of the SDLC leads to a more natural integration in the end-to-end lifecycle of stream-aligned teams’ work processes.
To be able to identify the breadth and depth of the platform, the platform team should spend time with stream-aligned teams via demos, going through business requirements, or meeting with them on a regular basis. By doing this, platform teams will gain a clearer, overall picture of the product team’s scope, and the product team will also gain insight into which options are the most feasible for the platform team.
4. Building confidence via automation tests
Since platform teams work on independent libraries and modules, it is important to test changes, in an integrated fashion, before releasing. Hence, end-to-end automation tests become a very important aspect of platform building.
These automation tests – at the minimum – should cover positive test cases for the variations supported by the libraries, APIs, and modules that are released by the team. Different versions of these tests can exist. For example, you may have smoke tests running on every change made to team-owned components and regression tests running before the release of these components (libraries, APIs, and modules).
Deployed versions should always be immutable. To ensure that you’re shipping the most fine-tuned product, perform as many checks as possible before releasing the library – bug fixes can still be applied later if needed, but only as a new version.
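One way to picture immutability is a registry that refuses to overwrite a version once it has been published. The `ReleaseRegistry` class below is a hypothetical, in-memory sketch of that rule, not the behavior of any particular package registry.

```python
from typing import Dict, Tuple


class VersionAlreadyPublished(Exception):
    """Raised when someone tries to overwrite a released version."""


class ReleaseRegistry:
    """Hypothetical in-memory registry that treats releases as immutable."""

    def __init__(self) -> None:
        self._releases: Dict[Tuple[str, str], str] = {}

    def publish(self, name: str, version: str, checksum: str) -> None:
        key = (name, version)
        if key in self._releases:
            raise VersionAlreadyPublished(f"{name} {version} is already released")
        self._releases[key] = checksum

    def checksum(self, name: str, version: str) -> str:
        return self._releases[(name, version)]


registry = ReleaseRegistry()
registry.publish("auth-lib", "1.4.0", "sha256:aaa")
# A bug fix goes out under a new version instead of replacing 1.4.0.
registry.publish("auth-lib", "1.4.1", "sha256:bbb")
```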
5. Release management
Usually, when we think about release management, the focus lies in deploying versions and providing high-level descriptions of changes. But it’s equally important to maintain a detailed changelog or documentation that supports the release. When a released version includes breaking changes, the release documentation should also contain a migration plan detailing how to move from the current version to the new one.
Ahead of the launch, it can help to add previews, gifs, sample videos, and pre-release code snippets over communication channels like Slack or email to build excitement. This is a big plus and can be used to give engineers a sneak peek of what’s coming. It’s also a great idea to have a roadmap publicly available so that consumers can submit feature requests (or upvote existing ones), check if a feature is prioritized, and when it’s scheduled to be released.
6. Establishing a strong support workflow
Providing support to stream-aligned teams is a crucial aspect of platform teams. It’s good practice to track support and sprint work separately, as support often takes the form of ad-hoc tasks and sprints require some more extensive planning. Keeping these two sectors separate could be achieved by having two “work in progress” boards, created and maintained independently of each other.
Tools like PagerDuty or ZenDuty should also be adopted. These on-call support systems can be really helpful when trying to triage problems or inquiries. They should be integrated with alerts across all components, since the instability of one component has a cascading impact on all teams using that specific version. Consider auto-routing alarms to the platform team in question for any issues that pertain to it but can only be detected in consuming services.
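The routing idea can be sketched independently of any on-call tool. In the example below, the ownership map, team names, and `Alert` fields are all assumptions for illustration; a real setup would express the same rule in the alerting tool's own routing configuration.

```python
from dataclasses import dataclass

# Hypothetical map from platform component to the owning platform team.
COMPONENT_OWNERS = {
    "auth-lib": "platform-identity",
    "build-cache": "platform-devex",
}


@dataclass(frozen=True)
class Alert:
    service: str    # consuming service where the alert fired
    component: str  # component implicated by the alert
    message: str


def route(alert: Alert, default_team: str) -> str:
    """Send alerts about platform components to their owners,
    and everything else to the consuming service's own team."""
    return COMPONENT_OWNERS.get(alert.component, default_team)


alert = Alert("checkout", "auth-lib", "token validation failing")
print(route(alert, default_team="checkout-team"))
```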
7. Building metrics
Alerts and monitoring give a real-time overview of the health of the system. Building metrics to help the platform team check usage and analyze the performance of components over time is, therefore, a must. There are two types of metrics:
- Technical: these could be things like how many product teams are using the platform, how many are on the latest version, how many releases of the platform team components were done in a specific timeframe, the number of bug fixes that had to be done, etc.
- Business: if we take the example of identity management, these metrics could be the number of unique logins, failures, causes for failures, suspicious activities, etc., in a given duration.
Both these elements will enable the team to identify the health of the platform and highlight any issues.
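As a small example of a technical metric, the sketch below computes how many teams use the platform and what share of them is on the latest version. The team names and version strings are made up, and `adoption_metrics` is a hypothetical helper.

```python
from typing import Dict


def adoption_metrics(team_versions: Dict[str, str], latest: str) -> Dict[str, float]:
    """Summarize platform usage from a mapping of team name to version in use."""
    total = len(team_versions)
    on_latest = sum(1 for version in team_versions.values() if version == latest)
    return {
        "teams_using_platform": total,
        "teams_on_latest": on_latest,
        "share_on_latest": on_latest / total if total else 0.0,
    }


# Illustrative data only.
usage = {"checkout": "2.1.0", "search": "2.0.3", "billing": "2.1.0", "growth": "1.9.0"}
metrics = adoption_metrics(usage, latest="2.1.0")
print(metrics)
```

Tracking the same numbers over time shows whether migrations to new releases are actually happening.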
Final thoughts
Like product teams, the overall adoption and success of tools and features built by platform teams depend on the exposed interface, the thought put into their design, and the support system put in place. Attending to these seven responsibilities will help you to create an effective platform team for your organization.