Evan Morikawa on how OpenAI scaled ChatGPT

Ahead of his talk at LeadDev West Coast in October, OpenAI’s Evan Morikawa goes behind the scenes on scaling ChatGPT, the constant need to reset as a leader, and how GPUs don’t grow on trees.

By Anastasija Kovacevic & Scott Carey

Speakers: Evan Morikawa

March 28, 2024

It was one of the most significant technology launches of a generation: OpenAI decided to throw open the floodgates and give users access to ChatGPT, its tool powered by a large language model. Since then, more than 100 million people have prompted the generative AI tool to answer questions, write computer code, and even write their resumes.

Ahead of his behind-the-scenes peek at how they did it at LeadDev West Coast later on this year, we checked in with Evan Morikawa, an Applied Engineering Manager at OpenAI, to hear more about how his team overcame this unique scaling challenge.

The conversation below has been edited for clarity and brevity. To see Evan’s full talk, you will need to buy a ticket for LeadDev West Coast, taking place October 16-17 in Oakland, California.

Scott Carey, LeadDev: Evan, we’re so excited to welcome you to LeadDev West Coast later on this year. Can you just tell us a little bit about what you’re going to be talking about?

Evan Morikawa, OpenAI: I would like to tell everyone a little bit about how we launched and scaled ChatGPT, including some of the engineering challenges and a little bit of the backstory under the hood with regards to what happened to make that grow.

Some of these aspects, I think, will be pretty familiar to anybody who’s scaled engineering teams and backend infrastructure systems, but I think there are some unique challenges about the way these large language models work and behave that I will get into.

Scott: What was one challenge that really struck you when you were trying to scale ChatGPT to the world?

Evan: GPU supply is by far one of our biggest challenges, and remains a challenge today. I will go into much more depth about that during the talk, but when you’re multiplying billions and billions of numbers together, you have to have this very specialized hardware. I don’t think anybody, including us, was really prepared to make this happen, supply chain wise. As you can imagine, it’s fairly difficult to turn around new chips on a dime. I think we’ve all seen that over the past several years and it only continues today.

Scott: Was there anything that surprised you about the ChatGPT launch?

Evan: I would say that we had a pretty strong hunch about a couple of things going into it. Internally, we had already had the chance to play with these things and felt like they were really fun, but you don’t really know how anybody’s going to react. Several months prior to the launch, we launched the image generation tool Dall-E, and that was also an extremely cool, fun thing internally. But how that would take off was completely unpredictable.

We did know that this was going to be the first time that we had no waitlist on a product, which was an intentional decision when launching here because historically, capacity constraints and safety reasons led us to use a waitlist, but no one likes waitlists.

That being said, when we did launch, you could already talk to these models through our API developer playground, so in effect, we came into this thinking that not a huge amount would change. The models were slightly newer and safer, and we had not released GPT-4 yet. This was hitting a nerve with certain people here because it was not exactly a predictable thing.

Scott: As an engineering leader, when you go through a launch of that scale, how do you re-adjust after something like that?

Evan: You readjust continuously. Everything was basically changing day to day at that point, and not just on the technology and infrastructure side. We very quickly realized that we also needed to grow the size of the team.

As you grow any sort of organization, you hit a series of step functions: the complexity of your team structures gets larger, there are more people to coordinate with, and there are a lot of new people. All of those things definitely hit us. Trying to get ahead of a lot of that has been a large personal focus of mine, as well as of a lot of the other teams, in addition to all of the engineering that our teams have had to do to make that happen.

Scott: What’s one thing that you’re hoping the audience takes away from your talk?

Evan: I’d like to demystify some of this whole “AI as a magical black box” thing. Certainly it’s gotten a lot of attention as a field recently, but I still think there is a feeling that the technology and infrastructure behind this is a bit of a black box. In fact, there’s a huge amount of research taking place to unveil what is happening and make it interpretable.

This was one of the earliest things that caused me to join OpenAI back in 2020. I generally thought I knew how computers worked, sort of, except for this area, which still felt a little bit magical. At the end of the day, when you break things down, and you’re forced to look at the realities of scaling a system like this, you treat it somewhat like any other engineering system, with a couple of unique quirks to it, like the implementations of how various chips and things like that work.

The research side has lots of different ways of thinking about these models, but from the engineering, deployment, inference, and scaling side, it has been quite approachable for me and the team. In fact, for nearly all of our Applied Engineering Group, having deep machine learning experience has not been a prerequisite. Most of our challenges are not theoretical, statistical, or mathematical. They’re about really deeply understanding some part of a system, and building fault-tolerant distributed systems and reliable pieces of software that other humans know how to read, interpret, and build upon.