Key takeaways:
- AI agents are now collaborators, not just tools, and that changes how teams work.
- Hybrid human-agent engineering teams must be intentionally designed like distributed systems.
- Effective integration requires clearly defining four layers: work, coordination, shared context, and governance.
If you are a modern engineering leader, it’s likely that some of your team’s most consistent contributors aren’t human. Designing human-agent engineering teams is now a key requirement.
On my team, AI agents are writing pull requests, summarizing incidents, triaging tickets, and surfacing patterns in production data.
By agents, I mean systems that act independently to complete tasks, not autocomplete or code suggestions. We use Claude Code to write pull requests and run cloud sessions for background tasks, Linear to triage tickets, and a custom workflow that takes a product requirements document from a bug created there and automates the fix. These tools work asynchronously and sometimes without anyone prompting them.
However, the way we collaborate through standups, retrospectives, and design reviews is still based on human behaviors: limited attention, context created through conversation, and a predictable pace of contribution.
When non-human contributors become part of our rituals, friction occurs. There is more data, more output, and less clarity as to what to do with it all.
A common first instinct is to view this as a tooling problem. It isn’t. The real challenge is how your team collaborates. Fortunately, engineering leaders already possess the necessary skills to solve this problem, because building a team that contains human and non-human contributors is very similar to designing a distributed system.
Tools begin to act like teammates
Most teams don’t have a specific moment when tools cease to be tools and become teammates. Instead, it occurs over time.
An agent begins to summarize incident timelines prior to your team’s retrospectives. Another creates automated test cases during your team’s pull request reviews. A third categorizes customer feedback prior to your team’s sprint planning. While each of these is helpful individually, collectively, they alter the nature of how your team collaborates.
The first indicator of trouble usually appears in standups. An agent-created update contains more information than any human update, but the rest of the team struggles to separate what matters and should be acted upon from what is simply noise.
During design reviews, automated analysis may identify a dozen edge cases, but it is unclear which ones represent a conscious tradeoff by your team, versus which ones your team has never considered.
Eventually, your team relies on these outputs without understanding where they came from. Productivity falls and cognitive load rises, even though the team is completing more tasks overall. That is the warning signal. The issue is not the agent’s capabilities; it is that no one has established intentional parameters for how the agent participates in collaborative work.
The question for leaders at this point is no longer whether to use AI agents, but how to incorporate them so that human judgment remains at the center of decision-making and the team can still operate effectively.
Human-agent engineering collaboration as a system
Engineering leaders frequently describe teams as systems, but they typically use this as a metaphor. Most teams are structured around reporting lines and recurring ceremonies (standups, planning meetings, design reviews), not around how information flows through the team or how responsibilities are allocated.
Agents highlight the differences between these two approaches. They do not establish context through casual conversations. Agents do not forget decisions unless the information is specifically deleted from their inputs. They do not honor organizational boundaries unless they are specifically defined.
To develop a more practical method for addressing this challenge, consider adopting a perspective from how you would truly design a distributed system. Distributed systems exhibit the following properties:
- Each contributor has a clearly defined scope of responsibility.
- The methods for coordinating efforts among the contributors are clearly defined.
- The shared context is maintained intentionally.
- The governance structure ensures long-term alignment.
In most human teams, these properties are implicit. In human-agent engineering teams, they cannot be. Consider this as four layers:
The work layer
At the bottom of the hierarchy is the work layer, made up of individual engineers, sub-teams, and agents working in a defined area. Without a clear definition of responsibilities within this layer, you get duplicated effort or gaps in ownership.
For example, if both an engineer and an agent are generating test cases for the same service without either knowing about the other, you end up with redundant coverage in some areas and blind spots in others.
The coordination layer
Above the work layer is the coordination layer, which includes your standups, planning meetings, and design reviews. The primary objective of the coordination layer is to ensure the correct information is delivered to the correct individuals. When agents participate without clear guidelines, the layer becomes overwhelmed with information.
Consider an agent that posts a detailed summary to your team’s Slack channel every morning, covering every ticket update, alert, and deployment from the previous day. Without guidelines on what to surface and what to filter, engineers start ignoring the summary entirely, and the coordination value is lost.
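A simple remedy is to filter the summary before it posts: surface only the items that need a human decision, and collapse everything else into a one-line digest. A minimal sketch, assuming a hypothetical event format and an illustrative set of "actionable" event kinds:

```python
# Illustrative coordination-layer filter for an agent's daily summary:
# only items needing a human decision are surfaced; the rest are counted
# into a single digest line instead of flooding the channel.

ACTIONABLE = {"alert_unresolved", "deploy_failed", "ticket_blocked"}

def summarize(events: list[dict]) -> list[str]:
    """Turn raw overnight events into a short, decision-oriented summary."""
    surfaced = [e for e in events if e["kind"] in ACTIONABLE]
    routine = len(events) - len(surfaced)
    lines = [f"NEEDS ATTENTION: {e['kind']} -- {e['detail']}" for e in surfaced]
    lines.append(f"(plus {routine} routine updates, available on request)")
    return lines
```

The specific event kinds are assumptions; the point is that the filtering rule lives in one reviewable place rather than in each engineer's habit of skimming or ignoring the post.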
The shared context layer
Above the coordination layer is the shared context layer, made up of documentation, decision records, runbooks, and system histories. Agents interact extensively with this layer, both consuming and generating information. Without intentional maintenance, agents will amplify outdated assumptions or present information without explanation.
For instance, if your runbook for a particular service still references an old deployment process, an agent summarizing an incident will confidently cite those outdated steps, and engineers unfamiliar with the change may follow them.
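One guard against this is a periodic staleness check that compares each runbook's last review date to the last time the underlying process changed. The sketch below assumes hypothetical data structures for those dates; a real team would pull them from its docs system and change history.

```python
from datetime import date

# Hypothetical shared-context staleness check: flag any runbook last reviewed
# before the service's deployment process last changed, so agents (and humans)
# stop citing it blindly.

def stale_runbooks(
    runbook_reviewed: dict[str, date],
    process_changed: dict[str, date],
) -> list[str]:
    """Return services whose runbook predates the latest process change."""
    return sorted(
        service
        for service, reviewed in runbook_reviewed.items()
        if reviewed < process_changed.get(service, date.min)
    )
```

Anything this check flags can be routed to the service owner for review, or annotated so an agent summarizing an incident marks the steps as unverified rather than citing them confidently.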
The governance layer
The final layer is the governance layer, which defines boundaries surrounding ownership, accountability, and risk. In this layer, you define which decisions can be automated, which must be made by a human, and how responsibility is assigned when the outcome is uncertain.
A common example is an agent that auto-approves and merges low-risk pull requests based on predefined criteria. Without clear governance around what qualifies as “low-risk,” a configuration change that appears minor could bypass human review and cause a production incident, with no one clearly accountable for the decision.
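Making "low-risk" explicit, in code or configuration, is one way to close that gap. Below is a hedged sketch of such a policy check; the file patterns and the 50-line threshold are illustrative assumptions, not a recommended standard.

```python
# Illustrative governance-layer policy: a pull request is "low-risk" only if
# every rule holds. Config and infrastructure files always require human
# review, because "minor-looking" changes there are often the ones that
# cause production incidents.

ALWAYS_HUMAN_REVIEW = (".yaml", ".yml", ".tf", "Dockerfile")

def is_low_risk(changed_files: list[str], lines_changed: int, tests_pass: bool) -> bool:
    """Decide whether an agent may auto-merge without human review."""
    if not tests_pass or lines_changed > 50:
        return False
    # Any touched config/infra file forces the PR back to a human reviewer.
    return not any(f.endswith(ALWAYS_HUMAN_REVIEW) for f in changed_files)
```

The useful property is not the specific thresholds but that the policy is written down and versioned: when an incident does occur, the team can point at the rule that let the change through and amend it, which is exactly the accountability the governance layer is meant to provide.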
You do not need to formally define all four layers immediately, but naming the layers allows you to determine where the friction is occurring.
If your standups are becoming overwhelming, you have a problem in your coordination layer. If agents continually present stale information, you have a problem in your shared context layer. Having the vocabulary enables you to intervene precisely, rather than introducing a broad process.
Modifying rituals without replacing them
The ultimate goal is not to create new rituals, but to modify existing ones so that they remain functional when the contributors are no longer solely human. Successful modifications are generally small and include clarifying how agents will participate, what agents will contribute, and where human judgment is still required.
Standups
Standups are most effective when updates are concise, relevant, and easily understandable. Agent-generated summaries may be thorough, but more information does not result in greater alignment. I learned this first-hand when an agent’s standup summary stated that a pull request was merged when it had actually been closed.
The team proceeded as if the work was complete, and it wasn’t until a subsequent review that someone identified the error. The agent had recognized the status change, but had not captured the context behind it. A reviewer had flagged a critical issue and the pull request was closed, not merged. That is the type of gap that can rapidly diminish trust if it occurs more than once. Teams that address this successfully treat agent updates as asynchronous inputs to the team’s meeting, rather than real-time contributions.
Before the meeting, the agent writes a summary of what happened overnight (what alerts fired, what tickets moved, etc.). The meeting itself is used to interpret the information provided, to make decisions, and to coordinate the next steps. Agents perform recall functions, while humans determine context and tradeoffs.
Design reviews
Automated analysis can rapidly identify edge cases and verify assumptions faster than a human reviewer. However, the risk is not that the analysis is incorrect. It is that the lines between automated feedback and deliberate design decisions are increasingly blurred.
I have observed design reviews where an agent identified multiple edge cases and the team debated one that the original designer had previously considered and intentionally excluded. No one could discern from the review thread which concerns were new and which had previously been accounted for.
The simple answer is to treat agent-generated content as analysis and not as judgment. Designers are accountable for the explanations of their decisions, descriptions of alternative considerations, and flags indicating uncertainty. This maintains accountability while allowing agents to expand the potential solutions.
Retrospectives
Retrospectives often benefit from automation in the data collection phase, such as aggregating incident timelines, identifying trends, and summarizing results across systems. Without a clear boundary, this can overwhelm a team with data that appears unrelated to the team’s direct experience.
One team I collaborated with had an agent-generated retro summary that was so comprehensive that the meeting became focused on reviewing the summary itself rather than discussing what the team had learned. The distinction that matters is between data preparation and reflection. Allow agents to compile the raw materials (timelines, metrics, patterns) and then allow the team to use the retro to engage in shared learning and forward-thinking improvements. Agent-generated data becomes a resource for the team’s discussions, not a substitute.
In all three types of rituals, the same rule applies: agents support coordination by performing recall, aggregation, and analysis functions. Humans retain responsibility for interpretation, prioritization, and judgment. When the boundaries between these roles are left undefined, teams experience overload. When they are intentionally defined, teams’ existing rituals continue to be effective, even as the team changes.

What happens when no one sets the boundaries?
It is beneficial to be candid about the failure modes. If a team over-designs how an agent integrates with its workflow, it can become bureaucratic and lose the ability to ship quickly. This includes creating excessive rules about what an agent can or cannot contribute, and approval processes that negate the purpose of automation. The goal is clarity, not control.
On the other end, teams that take no action will observe their engineers quietly developing workarounds to bypass the limitations imposed by agents. Some engineers will disregard agent-generated outputs altogether. Other engineers will blindly trust the outputs generated by the agents. Both are symptoms of the same issue: the lack of mutual expectations about how agents contribute to the workflow of the team.
There is also a human dimension. Engineers may experience diminished self-worth when an agent completes tasks faster or more reliably than they can. Others may feel relieved to hand repetitive tasks to agents, yet uncertain about the value of their own contributions.
If you are managing a team transitioning to a hybrid model, these are not issues that will resolve themselves. Discuss these dynamics in one-on-one sessions. Acknowledge the shift in retrospectives. Frame your team’s work in terms of the judgments and decisions only humans are capable of making, rather than the quantity of output. Leaders who publicly acknowledge these dynamics provide a safe environment for employees to adapt, rather than resist.
Redesign how your AI-human engineering teams collaborate
Agents typically reveal collaboration issues, rather than generate them. Agents highlight the existing assumptions of ownership, communication, and shared context. That can be an uncomfortable revelation, but it is an opportunity.
The most successful response will not be additional tools or processes, but more intentional design of collaboration: establish clear expectations of responsibility, define how information flows, maintain shared context deliberately, and set boundaries for decision-making.
Teams that successfully navigate this transition will not be the ones with the most advanced AI tools. They will be the teams that invest in designing how humans and agents collaborate, and treat that design as seriously as the software they deliver.
If you have not yet engaged in that dialogue with your team, tomorrow’s standup meeting is a good place to initiate that conversation.