From self-driving cars to excavators: Bedrock Robotics CTO Kevin Peterson shares foundational wisdom for creating AI for the real world.
“Build the machine that builds the machine.” It’s a phrase used by Bedrock Robotics’ CTO, Kevin Peterson, to describe his team’s approach to custom artificial intelligence (AI) that can learn from data, adapt its behavior to different situations, and self-diagnose problems.
The move toward autonomous machines is gaining momentum in the physical world, from self-driving cars to drones, manufacturing, and other areas of robotics. And with it, the field of machine learning (ML) – once a black box accessible only to the most elite data scientists – is now seeing increased interest as organizations seek to build or fine-tune models for their own unique purposes.
One such purpose is to automate the many manual labor gaps in the construction industry. “Oftentimes it’s hard to find people to do that work,” says Peterson. “Projects get behind, and there’s a very high cost due to delays.”
Peterson – who formerly led perception at self-driving car company Waymo and served as an autonomy architect at construction equipment manufacturer Caterpillar (CAT) – now leads a team at Bedrock Robotics that is creating ML models to automate construction equipment. With their technology, excavators can operate in the field without human controls.
Through his long tenure in AI/ML, Peterson has accumulated plenty of hard-won lessons about both data and leadership. Lately, he’s improved engineering team efficiency by tuning and iterating on one large model that learns and adapts across many tasks, rather than relying on many small, decoupled, fit-for-purpose models. He has found that, in general, developing safe, autonomous AI/ML systems requires high operational excellence paired with a collaborative team dynamic.
One general model can be effective
“Ten years ago, we were really trying to control the architecture and make individual pieces that talk to each other,” says Peterson. He recounts how, at Waymo, his team created various bespoke models for specific areas like fetching, tracking, and classification, then chained them together to talk to one another.
However, this decoupled approach required APIs between models to network them, which incurred a maintenance burden and made it challenging to evolve the system as a whole. “If you don’t aggressively replace components, you accumulate tech debt,” says Peterson.
Now, at Bedrock, the design strategy is slightly different. Their approach is largely to deploy a single end-to-end model and give it the autonomy to choose its specific capabilities at training time. “The model itself figures out the best representation for its problem,” says Peterson. “We’ve baked Bedrock on that thesis.”
To achieve this, the system feeds raw sensor inputs into a single large central model and trains it as end-to-end as possible. According to Peterson, this approach gives the model better reasoning power in the moment while remaining sensor-driven and evaluation-constrained.
This design doesn’t eliminate system boundaries or safety checks altogether, but it does reduce the number of interfaces the model has to reason across. As a result, Peterson says a very small team can build and maintain systems like this, since much of the complexity shifts away from managing interfaces and into data curation and evaluation.
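To make the contrast concrete, here is a minimal PyTorch sketch of the two shapes such a system can take. The module names, layer choices, and dimensions are hypothetical illustrations, not Bedrock’s or Waymo’s actual architecture; the point is only that the modular pipeline exposes an interface at every hand-off, while the end-to-end model keeps its intermediate representations internal.

```python
import torch
import torch.nn as nn

# Decoupled approach: separate models connected by explicit interfaces.
# Each hand-off (detections, tracks, plan) is an API that must be
# versioned and maintained, which constrains how each piece can evolve.
class ModularPipeline(nn.Module):
    def __init__(self, detector: nn.Module, tracker: nn.Module, planner: nn.Module):
        super().__init__()
        self.detector, self.tracker, self.planner = detector, tracker, planner

    def forward(self, sensor_data: torch.Tensor) -> torch.Tensor:
        detections = self.detector(sensor_data)   # interface #1
        tracks = self.tracker(detections)         # interface #2
        return self.planner(tracks)               # interface #3

# End-to-end approach: one large model maps raw sensor input to control
# outputs and learns its own internal representation during training.
class EndToEndModel(nn.Module):
    def __init__(self, d_model: int = 512, n_controls: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.LazyLinear(d_model), nn.ReLU())
        self.backbone = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True),
            num_layers=6,
        )
        self.control_head = nn.Linear(d_model, n_controls)

    def forward(self, sensor_data: torch.Tensor) -> torch.Tensor:
        x = self.encoder(sensor_data)              # raw inputs straight in
        x = self.backbone(x)                       # model picks its own representation
        return self.control_head(x.mean(dim=1))    # control outputs straight out
```

In the second version there is no detection or tracking API to version and maintain; the intermediate representation lives inside the model and can change freely as the training data and objectives evolve.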
Large models are surprisingly adaptable
Bedrock’s end goal is to create big machine learning models with broad applications for robotics. While they’re currently focusing on automating excavators, the plan is to adapt the model from machine to machine over time and expand from construction machinery to other areas, like agricultural machines.
While working in the autonomous trucking program at Waymo, Peterson recalls that it only took ingesting 10% or so more truck-specific data to get a model originally designed for cars working well for trucking.
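As a hedged sketch of what that kind of adaptation can look like, the snippet below continues training an existing model on a mixture that is mostly original-domain data plus a small slice of new-domain data. The stand-in datasets, the checkpoint path, and the exact 10% ratio are illustrative assumptions rather than Waymo’s or Bedrock’s actual recipe; EndToEndModel refers to the toy class from the earlier sketch.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Stand-in datasets of (sensor, control) pairs; in practice these would be
# logged driving data. Shapes and sizes here are arbitrary placeholders.
car_data = TensorDataset(torch.randn(9000, 16, 64), torch.randn(9000, 8))   # original domain
truck_data = TensorDataset(torch.randn(900, 16, 64), torch.randn(900, 8))   # ~10% new domain

# Mix the small new-domain slice into the existing training set rather
# than training a truck-specific model from scratch.
mixed = ConcatDataset([car_data, truck_data])
loader = DataLoader(mixed, batch_size=64, shuffle=True)

model = EndToEndModel()                                        # from the earlier sketch
model.load_state_dict(torch.load("car_model_checkpoint.pt"))   # hypothetical prior checkpoint
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)     # small LR: adapt, don't overwrite

model.train()
for sensors, controls in loader:
    optimizer.zero_grad()
    loss = F.mse_loss(model(sensors), controls)
    loss.backward()
    optimizer.step()
```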
The takeaway is that a large general model can be adapted across use cases (provided there is enough overlap in their physical dynamics) with far less effort than training from scratch. However, this doesn’t mean Bedrock isn’t trying to perfect certain areas before becoming a platform for other machines.
“I really believe you develop products by starting with concrete use cases, before becoming a platform,” says Peterson. “The dream is to get to a general model, but in the physical world, we’re balancing safety and capability.”
New models need data and evaluations
Part of developing first-of-its-kind AI for a new use case is collecting fresh, novel data to train your models. Sometimes, this must be collected in the field – quite literally in the case of construction. “We had to build up a lot of datasets that didn’t exist,” says Peterson.
Bedrock Robotics’ technology is trained on data collected from sensors that monitor how heavy machinery operates in very large infrastructure projects, like building data centers, power stations, or factory foundations. In this high-stakes construction environment, safety is a top priority. So, they’ve had to gather massive amounts of data upfront and run many simulations before deploying.
Starting with a highly controlled dataset greatly benefits safety, says Peterson. Because the model’s scope is limited to the domain in which it operates, there is less risk of it becoming confused.
Data collection should happen upfront, be used to train the model offline, and then be validated with rigorous evaluations and testing early on, adds Peterson. In ML, evaluations help teams monitor and stay in control of complex systems, which is especially important for AI that interacts with the real world.
Therefore, project leaders should track specific metrics to understand how models are being used and how accurately they perform in production. “We measure performance characteristics on a very specific data set and use cases,” says Peterson. In construction, for example, this might mean measuring how well the model performs at digging a trench or loading a truck. If you’re doing safety right, he adds, you’ll never observe failures like collisions in production.
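A minimal sketch of what such an evaluation harness might look like, assuming a fixed set of recorded scenarios scored per task, with safety treated as a hard gate rather than one metric among many. The task names, fields, and structure here are hypothetical examples, not Bedrock’s actual evaluation suite.

```python
from dataclasses import dataclass

@dataclass
class ScenarioResult:
    task: str               # e.g. "dig_trench", "load_truck"
    completed: bool         # did the model finish the task?
    accuracy: float         # task-specific score in [0, 1]
    safety_violations: int  # collisions, geofence breaches, etc.

def evaluate(results: list[ScenarioResult]) -> dict[str, float]:
    """Aggregate per-task metrics from a fixed, versioned evaluation set."""
    report: dict[str, float] = {}
    for task in {r.task for r in results}:
        runs = [r for r in results if r.task == task]
        report[f"{task}/completion_rate"] = sum(r.completed for r in runs) / len(runs)
        report[f"{task}/mean_accuracy"] = sum(r.accuracy for r in runs) / len(runs)
    # Safety is a gate, not a trade-off: any violation blocks deployment.
    report["safety_violations_total"] = sum(r.safety_violations for r in results)
    return report

results = [
    ScenarioResult("dig_trench", True, 0.92, 0),
    ScenarioResult("dig_trench", True, 0.88, 0),
    ScenarioResult("load_truck", False, 0.41, 0),
]
report = evaluate(results)
assert report["safety_violations_total"] == 0, "do not deploy"
```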
All that said, just because you need unique data for a novel ML project doesn’t mean you must reinvent the wheel for other components of the software infrastructure. Bedrock Robotics’ process was to start with an open-source foundational model and then retrain it with their domain-specific data, saving considerable upfront engineering efforts.
The cultural side of the equation
Last but not least is the most important aspect: the people side of AI/ML. When building machine learning products, leaders must create a healthy engineering culture that balances two key facets: technical excellence and collaboration. “They both matter a ton,” says Peterson.
The first aspect involves thinking from a product management point of view: setting milestones, writing design documentation and proposals, and pushing the team to continually evolve the product.
The other is the team element: “People have to just love working together – many people don’t think about that aspect,” says Peterson. “We try hard to hire people you’d love to have a beer with, who are compassionate, and have a relatively low ego.”
In ML development, training the actual algorithm is just the start – 99% of the work is bug-fixing, says Peterson. So, he stresses that the culture should incentivize and reward those who maintain the infrastructure that’s already there. “That matters as much as the algorithm side, so you have to be careful to balance both.”
One interesting takeaway is that you don’t always have to throw more people at the problem. Peterson recalls that at Waymo, a common response to engineering obstacles in AI/ML was to hire more engineers or grow teams with additional individual contributors.
Instead, Peterson encourages a more proactive approach to automating the discovery of problems. Rather than manually reviewing logs, he suggests using tools that automatically cluster similar failures, allowing teams to automate parts of the incident triage process.
“I strongly believe that we need to think about tools behind the scenes that let us inspect our tools, and cleave off problems, in a very automated way,” says Peterson.
In other words, if you build the machine that builds the machine, you reap increased efficiency and potentially reduce the headcount required to maintain ML systems.
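As a rough illustration of that idea, failure reports can be vectorized and clustered automatically so engineers triage groups of related incidents instead of scanning raw logs. The sketch below uses scikit-learn’s TF-IDF and k-means as generic stand-ins, with made-up failure descriptions; the article doesn’t specify which tools Peterson’s teams actually use.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical failure descriptions pulled from logs or incident reports.
failures = [
    "bucket collided with trench wall during dig cycle",
    "trench wall contact while lowering bucket",
    "truck load overshoot, material spilled past bed",
    "spilled material while dumping into truck bed",
    "lost GPS fix, paused operation near site boundary",
]

# Represent each failure as a TF-IDF vector, then group similar ones
# so triage happens per cluster instead of per log line.
vectors = TfidfVectorizer().fit_transform(failures)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

for cluster in sorted(set(labels)):
    print(f"cluster {cluster}:")
    for text, label in zip(failures, labels):
        if label == cluster:
            print(f"  - {text}")
```

Each new failure then either lands in an existing bucket or flags a genuinely new problem for a human to investigate, which is where the machine-that-builds-the-machine idea pays off.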

The machine that builds the machine
Although Bedrock Robotics works in quite a niche field, the lessons here are relevant to software engineering teams outside the hardware world as well. As AI agents become more autonomous and start taking real-world actions, those teams face similar challenges around data collection and retraining, all while maintaining accuracy and safety.
It will take a combination of operational excellence, rigorous testing, and smart architectural design to safely grant automation higher degrees of power. This might mean opting for less architectural modularity, as Bedrock has done, or deliberately collecting novel datasets that go beyond your off-the-shelf algorithms.
No matter the specifics, what’s certain is that it’ll take strong leaders to guide that team (the internal machine, if you will) to build the next generation of intelligent systems operating in the real world.