Ahead of his talk at LeadDev West Coast in October, OpenAI’s Evan Morikawa goes behind the scenes on the challenges they faced scaling ChatGPT, the constant need to reset as a leader, and how GPUs don’t, in fact, grow on trees.
It was one of the most significant technology launches of a generation when OpenAI decided to throw open the floodgates and give users access to its large language model-powered ChatGPT tool. Since then, more than 100 million people have prompted the generative AI tool to answer questions, write computer code, and even write their resumes.
Ahead of his behind-the-scenes peek at how they did it at LeadDev West Coast later this year, LeadDev’s Editor in Chief, Scott Carey (SC), checked in with Evan Morikawa (EM), Applied Engineering Manager at OpenAI, to hear more about how his team overcame this unique scaling challenge.
The conversation below has been edited for clarity and brevity. Evan’s full talk will be delivered at LeadingEng West Coast on October 18 in Oakland, California.
SC: We’re so excited to welcome you to LeadDev West Coast later this year. What are you going to be talking about?
EM: I would like to tell everyone a little bit about how we launched and scaled ChatGPT, including some of the engineering challenges and a little bit of the backstory under the hood with regard to what happened to make that grow.
Some of these aspects, I think will be pretty familiar to anybody who’s scaled engineering teams and back-end infrastructure systems, but I think there are some unique challenges about the way these large language models (LLMs) work and behave, that I will get into.
SC: What was one challenge that really struck you when you were trying to scale ChatGPT to the world?
EM: GPU supply is by far one of our biggest challenges, and remains a challenge today. I will go into much more depth about that during the talk, but when you’re multiplying billions and billions of numbers together, you have to have this very specialized hardware. I don’t think anybody, including us, was really prepared to make this happen, supply chain wise. As you can imagine, it’s fairly difficult to turn around new chips on a dime. I think we’ve all seen that over the past several years and it only continues today.
SC: Was there anything that surprised you about the ChatGPT launch?
EM: I would say that we had a pretty strong hunch about a couple of things going into it. Internally, we had already had the chance to play with these things and felt like they were really fun, but you don’t really know how anybody’s going to react. Several months prior to the launch, we launched the image generation tool Dall-E, and that was also an extremely cool, fun thing internally. But how that [ChatGPT] would take off was completely unpredictable.
We did know that this was going to be the first time that we had no waitlist on a product, which was an intentional decision when launching here because historically, capacity constraints and safety reasons led us to use a waitlist, but no one likes waitlists.
That being said, when we did launch, you could talk to these models already through our API developer playground, so in effect, we came into this thinking that not a huge amount would change. The models were slightly newer and safer, and we had not released GPT-4 yet. This was hitting a nerve with certain people here because it was not exactly a predictable thing.
SC: As an engineering leader, when you go through a launch of that scale, how do you readjust after that?
EM: You readjust continuously. Everything was basically changing day-to-day at that point, and not just on the technology and infrastructure side. We very quickly realized that we also needed to grow the size of the team.
As you grow any sort of organization, you hit a series of step functions: the complexity of your team structures gets larger, there are more people to coordinate with, and there are a lot of new people. All of these things definitely hit us. Trying to get ahead of a lot of that has been a large personal focus of mine, and of a lot of the other teams, in addition to all of the engineering that our teams have had to do to make that happen.
SC: What’s one thing that you’re hoping the audience takes away from your talk?
EM: I’d like to demystify some of this whole “AI is a magical black box” thing. Certainly, it’s got a lot of attention as a field recently, but I still think there is a feeling that the technology and infrastructure behind this is a bit of a black box. In fact, there’s a huge amount of research taking place to unveil what is happening and make it interpretable.
This was one of the earliest things that caused me to join OpenAI back in 2020. I generally thought I knew how computers worked, sort of, except for this area, which still felt a little bit magical. At the end of the day, when you break things down, and you’re forced to look at the realities of scaling a system like this, you treat it somewhat like any other engineering system, with a couple of unique quirks to it, like the implementations of how various chips and things like that work.
The research side has lots of different ways of thinking about these models, but from the engineering, deployment, inference, and scaling side, it has been quite approachable for me and the team. In fact, for nearly all of our applied engineering group, having deep machine learning experience has not been a prerequisite. Most of our challenges are not theoretical, statistical, or mathematical. They’re about really deeply understanding some part of a system, and about building fault-tolerant distributed systems and reliable pieces of software that other humans know how to read, interpret, and build upon.