The tech giant just gave us an indication of how important internal AI assistants are – but what can the rest of us learn from it?
The adoption of AI coding assistants has been widespread – but we rarely get insight into how the biggest technology companies are using these tools themselves.
Now, a new study published by eight engineers at Meta – the parent company of Facebook, WhatsApp, Instagram and other services – shows the types of problems the company is looking to solve internally with this new breed of tools.
What is WhatsCode?
In a paper describing an internal assistant called WhatsCode, the Meta engineers explain how they increased coverage of WhatsApp’s internal privacy rules – the rules governing how a feature can and can’t handle data – by 3.5x over 25 months, generating more than 3,000 accepted code changes along the way.
WhatsCode, as described, is closer to an “agentic workflow”, where software doesn’t just suggest snippets but takes on bounded tasks, proposes changes, and routes them through review. Meta engineers have built a retrieval-augmented generation (RAG) setup, in which relevant context is pulled from an internal vector database and folded into prompts sent to the Llama 3 large language model (LLM), or to other models, depending on the task.
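That description maps onto a fairly standard retrieval-augmented, human-in-the-loop pattern. The sketch below is a rough illustration of that shape, not Meta’s actual system – every name in it (retrieve_context, call_model, route_for_review, the file path) is a hypothetical placeholder.

```python
# Illustrative sketch of a retrieval-augmented, human-reviewed code-change flow.
# All names here are hypothetical placeholders, not Meta's internal APIs.

from dataclasses import dataclass


@dataclass
class ProposedChange:
    file_path: str
    diff: str
    task: str


def retrieve_context(query: str, top_k: int = 5) -> list[str]:
    """Look up the most relevant internal code/policy snippets.
    (Placeholder: a real system would embed the query and search a vector database.)"""
    return [f"<snippet {i} relevant to: {query}>" for i in range(top_k)]


def call_model(prompt: str) -> str:
    """Send the assembled prompt to an LLM (e.g. a Llama-family model) and
    return a suggested diff. (Placeholder for the actual model call.)"""
    return "<diff proposed by the model>"


def propose_change(task: str) -> ProposedChange:
    # 1. Retrieval: pull codebase and policy context relevant to the task.
    context = retrieve_context(task)
    # 2. Augmentation: combine the task with that context into one prompt.
    prompt = f"Task: {task}\n\nContext:\n" + "\n".join(context)
    # 3. Generation: ask the model for a bounded, reviewable code change.
    diff = call_model(prompt)
    return ProposedChange(file_path="app/privacy_rules.py", diff=diff, task=task)


def route_for_review(change: ProposedChange) -> None:
    # 4. Governance: nothing lands without a human decision.
    print(f"Sending change to human review: {change.task}\n{change.diff}")


if __name__ == "__main__":
    change = propose_change("Annotate message-export feature with data-handling rules")
    route_for_review(change)
```

The detail that matters for the rest of this story is the final step: the model only proposes, and a human reviewer decides whether anything lands.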
So far this approach has proved relatively successful in tackling thorny code issues. (The authors of the paper, and Meta’s press office, did not respond to repeated interview requests for this story.)
“It’s a super interesting paper, because it’s really looking at the recent agentic workflow of developers, not just using ChatGPT in a chat type of environment, or assistant type of environment,” says Margaret-Anne Storey, a professor of computer science at the University of Victoria who studies how developers work. “There’s not a lot of studies of that kind of workflow happening right now.”
Storey points to the paper’s attempt to quantify the extent to which AI proposals pass human judgement – what she describes as “looking at the difference between when developers just say, ‘Yes, I accept that change,’ versus ‘No, I don’t quite like the change that is being done’ and then stepping in and then reviewing it.” Within Meta, reviewers reportedly accepted around 60% of the changes outright, while the remaining 40% drew edits or pushback.
A measure of success
The AI system is far from perfect, but does act as a reminder of what makes AI code assistants viable. They need strict boundaries, careful oversight – and lots of organizational support.
Matt Burch, an engineer and product builder, argues that’s the real story here. “The thing that I really noticed about this is the overwhelming concept of governance,” he says. Burch is blunt about what AI still gets wrong: “AI is not great at codebase hygiene. It’s not great at making sure that a certain file isn’t, like, 3,000 lines long.”
What this research shows is that bigger context windows don’t magically produce reliable software engineering. What does help, Burch says, is the deployment he sees in the WhatsApp case: “they have it governed down to the nth degree,” where “you still have human in the loop” for complex choices.
Storey stresses that WhatsCode appears to be aimed at the very specific case of privacy-related changes. “The thing the Meta paper has is that they have a very specific context in which they’re using the generative AI tool, and they have a lot of data to learn from, so that their model is very specific to their codebase and very specific to this task,” Storey says. That means while it works well for them, it might not work for others.
Lessons to learn
Storey’s warning is less about AI hallucinating and more about teams hollowing out their own understanding. In startup settings she’s worked with, she says, there’s pressure to move fast – until it backfires. “When they use gen AI to go really fast, at some point they find that they can’t make changes right. Like they don’t understand the code, so they start to lose control.”
For engineering managers, the harder part isn’t turning AI on, it’s changing the organization around it. Burch thinks adoption has to be mapped across the whole PDE (product, design and engineering) pipeline, not dumped on developers at the end. “Unless your whole PDE pipeline is built on a system of AI augmentation, you’re going to find a lot of challenges with adoption,” he says.

Storey is more focused on the “productivity paradox”, where leaders demand speed but don’t create the conditions for safe learning. “They need to be having those conversations with their developers and listening to who it’s working for, and who it’s not.” That also means taking the eye-popping findings with a heavy pinch of salt. “Metrics are not enough,” she says. “They need to talk to their developers.”