Key takeaways:
- Enterprise AI agents leak sensitive data by design. Frontier models show violation rates up to 51% because LLMs can’t separate useful data from contextually inappropriate information.
- More capability means more leakage.
- The model can’t police itself: least privilege access, context-aware filtering, and audit logs must be built in from the start.
CTOs, engineering managers, and staff engineers are rushing to deploy autonomous AI agents across their businesses, whether of their own volition or in response to clamoring demand from rank-and-file workers. However, a new study suggests they should think twice.
Enterprise large language model (LLM) agents are likely leaking company secrets, and throwing more compute at the problem is only making it worse, the study finds.
In part, that’s because these agents retrieve and synthesize vast amounts of internal data, from Slack messages to board transcripts, in order to automate tasks. In gathering that information, they also create problems of contextual integrity.
When retrieving dense corporate data, these agents routinely fail to disentangle essential task data from sensitive, contextually inappropriate information. Higher task completion rates often directly correlate with increased privacy violations.
For instance, when asked to negotiate a software renewal, an agent correctly included vendor-safe details such as current usage and competitor benchmarks, but also disclosed internal negotiation tactics, contingency budgets, and a planned acquisition that would increase future seat needs.
AI risks and rewards
“We wanted to know whether AI can complete a computer-use task in an enterprise scenario, and also keep user privacy,” says Wenjie Fu, a researcher at Huazhong University of Science and Technology in China, and the lead author of the paper.
Fu says the research was prompted by the way enterprise tools are beginning to fold AI directly into internal workflows, including in chat apps like Slack that are used across organizations.
To test that, Fu and his colleagues built CI-Work, a benchmark of 125 simulated enterprise tasks across five kinds of workplace information flow: upward reports to bosses, downward communications to staff, lateral peer collaboration, diagonal cross-team work, and external stakeholder engagement.
In each scenario, the agent had to retrieve useful information from enterprise-style tools while avoiding semantically related but inappropriate material, the kind of thing that might sit nearby in a Slack thread, meeting transcript, or email chain.
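The paper’s exact task format isn’t reproduced here, but a rough sketch helps make the setup concrete. Below is a minimal Python illustration, with all names (`Doc`, `Task`, `score`) invented for illustration rather than taken from CI-Work, of how such a task could pair essential and contextually inappropriate documents and score the utility–privacy tradeoff:

```python
from dataclasses import dataclass

# Hypothetical structure for a CI-Work-style task; field names are
# illustrative, not the paper's actual schema.
@dataclass
class Doc:
    text: str
    essential: bool  # needed to complete the task
    sensitive: bool  # contextually inappropriate to disclose

@dataclass
class Task:
    flow: str           # "upward", "downward", "lateral", "diagonal", "external"
    instruction: str    # e.g. "Draft the vendor renewal email"
    context: list[Doc]  # simulated Slack threads, transcripts, emails

def score(task: Task, disclosed: list[Doc]) -> tuple[float, float]:
    """Utility = share of essential docs conveyed;
    violation = share of sensitive docs leaked."""
    essential = [d for d in task.context if d.essential]
    sensitive = [d for d in task.context if d.sensitive]
    utility = sum(d in disclosed for d in essential) / max(len(essential), 1)
    violation = sum(d in disclosed for d in sensitive) / max(len(sensitive), 1)
    return utility, violation
```

Under a scheme like this, an agent that discloses everything scores perfectly on utility while maximizing violations, which is exactly the tradeoff the benchmark is probing.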
They found that frontier AI models suffer contextual integrity privacy violation rates of between 16% and 51%. “Even being state-of-the-art cannot balance the tradeoff between privacy and utility,” says Fu.
Beyond the basics
The problem is not limited to malicious prompt injection. In the study, even mundane workplace behavior made agents worse.
When users asked the system to be “thorough” or pointed it towards specific sources, leakage and violation rates rose. Explicitly steering the model towards relevant documents nearly doubled the baseline violation rate, while also reducing how much essential information the agent conveyed.
Part of the problem is the way LLMs work, says Eerke Boiten, professor in cybersecurity at De Montfort University. “Constraints, particularly about leaking information, are about things that do not happen at times when they could,” he explains. Humans can reason about such constraints because of our understanding of the world, but AI systems struggle to.
“LLMs are firmly about extrapolating from events that do happen,” says Boiten. “Observing what does happen will only ever give you a partial view of what could possibly happen.” That means an LLM can’t weigh the ramifications of a leak, because it can’t conceive that one could occur.
“The absence of an undesirable event will not be within the LLM’s awareness,” Boiten says. That causes issues when implemented in working processes.
The impact on engineering
The findings also demand a rethink from engineering teams considering these tools. Teams can no longer assume that an LLM will make the right judgment call on its own when parsing a corporate haystack; developers have to design agentic systems from the ground up around strict privacy engineering principles.
Fu says AI models might grasp high-level organizational boundaries, but they fundamentally struggle to adjudicate fine-grained, role-specific information flows. This requires implementing rigorous least privilege access controls, context-aware privacy mechanisms, and role-conditioned filtering before the data ever reaches the LLM’s prompt window.
Without robust external safeguards and comprehensive audit logs, deploying enterprise LLM agents will remain a critical corporate vulnerability.
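Neither Fu’s team nor this article prescribes an implementation, but the core idea, filtering retrieved records against a role’s permissions before prompt assembly and logging every decision, can be sketched. All names below (`Record`, `filter_context`, `allowed_roles`) are hypothetical:

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("agent.audit")

@dataclass
class Record:
    text: str
    allowed_roles: set[str]  # least-privilege ACL attached to each record

def filter_context(records: list[Record], role: str, task_id: str) -> list[str]:
    """Drop anything the requesting role isn't cleared for *before*
    it reaches the LLM prompt, and log every decision for audit."""
    visible = []
    for r in records:
        permitted = role in r.allowed_roles
        audit.info("task=%s role=%s permitted=%s", task_id, role, permitted)
        if permitted:
            visible.append(r.text)
    return visible
```

The design point is that content the model never receives cannot leak: enforcement happens upstream of generation, with the audit log as the record of what was allowed through.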
Fu’s conclusion is blunt: the model cannot be left to police the boundary. “We cannot just trust the model itself,” he says. Companies need “a comprehensive, trustworthy system” around the agent, he says – a harness of code, policies, and retrieval controls that determines what information can be shown to whom.

That means the safest enterprise agent may not be the most capable model, Fu argues, but the best-constrained system. Dev teams should build with that in mind.