#define AI Engineer - Greg Brockman, OpenAI (ft. Jensen Huang)
Channel: aiDotEngineer
Published at: 2025-08-10
YouTube video id: avWhreBUYF0
Source: https://www.youtube.com/watch?v=avWhreBUYF0
Well, hello. Hello. Is the mic working for you? Check, check, check. One, two, three. All right, first hard technology problem of the day down. Yeah. Well, the Wi-Fi is the other one, as everyone here knows. So, Greg, welcome to AI Engineer. Thank you so much for taking the time. Thank you for having me. We're going to go a little bit chronologically, and a lot of people sent in questions, and I've grouped them up for you, so we'll get right into it. So, you know, I did some deep research on you. You started with Deep Research. With Deep Research. I called it Peep Research, because you were researching a person. You actually did theater growing up, and chemistry and math, and you wrote a calendar scheduling app, and that's what got you into coding. But what really inspired your love for coding? Why are you the coding guy? Well, the funny thing is, I thought I was going to be a mathematician when I grew up. Yeah. You know, I'd read about people like Galois and Gauss, you know, working on these hundred-, 200-, 300-year time horizons. And I was like, that's what I want to do. If anything that I come up with is ever used while I'm still alive, it wasn't long-term enough. It wasn't abstract enough. And I was writing this chemistry textbook after high school, sent it to one of my friends who'd done something similar in math, and he said, "No one is going to publish this. You can either self-publish," and I was like, "Ah, sounds like a lot of work, a lot of capital," "or you could make a website." Mhm. And I was like, "Guess I'm going to learn how to make a website." And so I literally went on W3Schools and did their PHP tutorial. How many people here remember W3Schools? Yeah, decent number of hands. And I remember the very first thing I built was a table sorting widget, right? I had this picture in my head of what it would be.
And I remember the moment that I clicked the column and it sorted according to that column, which was exactly the thing that I wanted. And I was like, that was magic, right? This is so cool. Because the thing about math is that you think hard about a problem, you understand it, you write it down in an obscure way, you call it a proof, and then like three people will ever care, right? But in programming, you write it down in an obscure way, we call it a program, and then maybe only three people ever read that program and care about the code, but everyone gets the benefit. No one has to understand the details. That thing that was in your head, it's real. It's in the world. And I was like, that's the thing I want to do. Forget about that hundred-year time horizon. I just want to build. You do just want to build. So you were so good at it that somehow, somewhere, you got cold emailed by Stripe while you were still in college. That's right. What's the story? First of all, how did they find you, and what was it that convinced you to drop out to join them? Well, I had mutual friends with all the people at Stripe, the, you know, giant company of like three people at the time. And they had done the usual thing where they asked someone at Harvard who around campus they should talk to, who they might recruit, where my name came up. They asked the same of people at MIT, because I'd been at Harvard and actually dropped out to go to MIT, so I had the advantage of, I guess, getting upvotes on both sides. But I remember when I met Patrick: I had just flown in, it was late at night, it was storming, and I showed up and we just started talking about code, right? And it was just one of those moments where you're like, this is the kind of person that I've wanted to work with and been looking for.
And so I ended up dropping out of MIT, flew out, and I've been out here ever since. Yeah. We have some guest questions sprinkled along the way, as you know. So, question from someone named Matthew Brockman. I've heard of him. CTO of Julius AI. When do you think our parents will give up on the dream of you finishing your degree? Maybe Harvard or UND will take you back. Yes. Well, never. It was definitely, you know, no matter where you're going, if you tell your parents you're leaving Harvard, it's going to be hard. If you tell your parents you're leaving school altogether, it's going to be difficult. And I think, to their credit, even though it was difficult, they were like, we trust you. You must see something and understand something from where you sit that's hard for us to see from halfway across the country. But yeah, I did Stripe, and had a good time, and actually learned things, and it turned out to be a real company, not just, you know, dropping out and doing nothing. So I think they really have warmed up to it. I think they're very proud of you. Yes. Absolutely. So you were with Stripe from 4 to 250 people, eventually as the first CTO. One thing I found recently that Hacker News maybe doesn't know is that apparently the Collison installation only happened like a handful of times. It wasn't really a thing at Stripe. I think that's true. Yeah, it's the thing that survived as an urban legend because it's so cool. It's like, you're so customer obsessed. Anyway, so what else do people get wrong about early Stripe? What do we want to clear the air on? Yeah. Well, I think people don't understand how hard it was, right?
I remember, first of all, the kind of thing we did a lot of is that we added all of our customers on Gchat. And so it was very much the case that we were in constant contact with them. So even if you're not literally sitting over their shoulder, you're doing the next best thing. But I remember one day we realized that the payment backend that we were on just wasn't going to scale. We absolutely needed to be on Wells Fargo, and we got the deal done, but now we needed to do a technical integration. And they said, well, this technical integration is going to take like nine months, because that's how long it takes. And we're like, that's crazy. We're a startup. We can't sit around waiting nine months to get this thing done. And so actually, in 24 hours, we completed it, by basically treating it like a college problem set. I was implementing everything. John was working from the top of this test script and testing everything and saying, this is broken. Darragh was starting from the bottom and working his way up. And in the morning, we got on with the certifying person, and we sent some test messages, and there was an error, and the person's like, all right, I'll see you next week. Because that's how all their customers operate, right?
There's an error, so clearly you need to send it to your dev team. And we were like, no, no, no, there must just be some sort of glitch in the system. And Patrick was just talking to keep her on the line, and frantically I was there editing the code. And so we got like five turns in, and we actually failed, but fortunately she was nice enough to reschedule two hours later, and then we passed. And so you realize that was like six weeks' worth of normal dev work that you got done in that moment, because you didn't just accept the arbitrary constraints of how other organizations would work. Yeah. Do you think there's a lot more opportunity like that in most jobs? How do you advise other people to be that fast, or to cut that many cycles? Yes. I mean, the way I think about it is that if you think from first principles, you can find where things need to be slow, or done the way that they're normally done, or whatever. Those things exist, right? The general principle of, ah, just don't worry about the constraints and just do the thing, I think that is not 100% true. I think it's really about mapping to where there is unnecessary overhead that's there for constraints that are no longer applicable, that don't apply to your specific circumstance. And I think this is especially true in this world that we're in now, with AI that's accelerating productivity so much. Yeah. Just fire off a Codex task. Why not, right? One last thing about your pre-OpenAI life: independent study. It's a recurrent theme from high school; you did it then. I did. And your sabbatical as well. So you've done it repeatedly. What makes independent study effective? I think there are a lot of people who don't do a good job of it and kind of waste a year. What do you do that makes it so effective?
Well, I think it was a key part of how I grew up. In sixth grade, my dad taught me algebra, and in seventh grade I showed up at the high school, which is the first time that you track into advanced math, pre-algebra, and we went to the teacher, like, can he skip this and go directly to the eighth-grade course? And the teacher looked at my mom and me very condescendingly and was like, every parent believes that their child is special. And after like a month of being in this teacher's class, where I was paying no attention and just playing calculator games in the back, and she'd try to trip me up and call on me to answer questions from the whiteboard and I would just get them all right, she was like, all right, fair enough, your child should be in the next year. But then when I was in eighth grade, there was no more math left in my middle school. I didn't have a car, so I had to do online courses. And in that one year, I ended up doing three years' worth of high school math. And so I think that, for me, a lot of it is about: if you're excited about something independently, if it's something you want to do, you can break the constraints there as well. You can do three years of math in one year, and then it compounds, because the next year I was at my high school, finished the math there, and then all through 10th, 11th, and 12th grade I had no more math. But I did have a car, and I was able to go to the University of North Dakota and take whatever classes I wanted there. And so that kind of compounded and compounded into learning programming. And the way I learned programming was very much self-study, just building things and experiencing things out in the world.
And so the thing I would advise is: if you have an opportunity to explore and you have a passion, if you're actually enjoying it, just go deep, right? And by the way, it's not always fun, right? It's very easy to feel like, I got kind of bored. But if you just push through those hurdles, then I think the reward is worth it. Yeah. You self-studied machine learning too; that was a whole period of your life. Any particular highlights from there? It sounds like you talked to Geoff Hinton at one time. I did talk to Geoff Hinton. Yes. And did that help, or what was the most helpful thing? Because you became a machine learning practitioner. Well, when I started out, I'd been at Stripe, and I was reading Hacker News posts about deep learning. It felt like there was a new deep learning post every day, and this was 2013, 2014, and I was like, what is deep learning? And I knew like one person in the field, and so I talked to them, and they introduced me to some more people, and then they introduced me to more people. And the thing that surprised me was I kept getting introduced to a bunch of my smartest friends from college, and I was like, that's interesting. All of these people ended up in this field; what's going on? And I started to realize that there was something real that was building, right, that was being developed, that people were really making these systems do materially new things that computers were not able to do before. And I was like, that is the thing. And so after I left Stripe, I knew I wanted to do something in AI, start an AI company, but I didn't really know how to contribute, what my skills would be useful for. And so I was in New York, and I was like, you know what, I'll build a GPU rig and see if I can do some Kaggle competitions.
And so I went on Newegg and bought some Titan X cards. And it was really cool, you know, physically assembling this machine. You can find a tweet from 2015 when I powered it on: you see all this green and all the fans going, and I was like, this is what computers are meant to be. I think many folks in the audience have had that experience as well. Awesome. Okay, so what convinced you that AGI was possible? You had a point where you were sort of disillusioned with it. You tried to write a chatbot; it didn't work. But what made you go all in on it? Yeah. Well, part of the journey for me was reading Alan Turing's 1950 paper, Computing Machinery and Intelligence. This is the Turing test paper. How many people have read it? Fewer hands than for W3Schools, but it's equally worth reading. The thing that is so fascinating to me is he lays out in the beginning, okay, the Turing test, this idea of: does a machine think? Is it intelligent? And you can say it's intelligent if a human can't tell the difference between talking to it and talking to a human. Fine. But the thing that has not really become as embedded in the pop culture, but to me was so astounding, was he said: well, how are you going to program an answer to this? You will never be able to write down all the rules. But what if you could build a child machine that learns like a human child, and then you just apply rewards and punishments, and boom, it's going to be able to pass the test. And I was like, that is the kind of technology that we have to build. Because as a programmer, you have to understand everything. You have to understand the rules of how to solve the problem. But what if the machine can understand things and solve problems that you yourself cannot understand? That feels fundamental, right?
That feels like how you actually solve problems that are important to humanity. And this was 2008 or so that I read this, and I went to my professor, who was an NLP professor, and I asked if I could do some research with him, and he said, yeah, here are some parse trees. And I was like, okay, this is not what Turing was talking about. Yeah. This is like WordNet and the whole thing. Exactly. So, you know, definitely a little bit of a trough of sorrow there. But with deep learning, the thing about deep learning that's magic is that it really started to show promising results in 2012 with AlexNet, right? It just blew everyone out of the water in the ImageNet competition. And so suddenly you have this general learning machine. You know, it's got a little bit of a prior in there, of convolutions, but it's better than 40 years' worth of computer vision research, right? People trying to write down all the rules as well as possible. And then people are like, well, okay, it works in vision, but it's never going to work in my field. It's never going to work in machine translation, never going to work in NLP, never going to work in this or that. And suddenly it starts being the best in all of those areas. Suddenly the walls between these departments are being torn down, and you're like, that is what Turing was talking about. And so I think for me, it was just seeing the type signature of this technology. And by the way, this technology is not new, right? If you go back and read the McCulloch-Pitts neuron paper from like 1943 or so... I told him to give it as homework to people. Okay. Yeah, there you go. Yes. Class assigned. The images in there look just like the kinds of images that you see now, of layers of neurons and things like that.
And so you just realize there's something deeply fundamental about what we're doing. And you can find papers from the 1990s talking about what caused the deep learning winters: that it was these neural net people, they have no new ideas, they just want to build bigger computers. And I'm like, yes, that's what we need to do. And so I think that all of this together just feels like we are, to some extent, continuing this wave, this 70-year history. And in many ways, the whole computing industry has really been building up to the point where you can have machines that are able to perform the kinds of tasks that we're just starting to scratch the surface of: to solve new problems that humans cannot, to be assistive to us in our daily lives, to not have to be typing with our, you know, meat sticks, but instead to have something that you can interact with just like a person, where the machine comes much closer to you rather than you coming closer to it and having to learn assembly language or whatever it is. And so to me, it felt like all of the factors were lined up, and now we just need to build. Yeah. I like that consistent theme that you keep coming back to: we just need to build. So in 2022 you wrote that it's time to become an ML engineer. Actually, I have a personal friend who read that post and cold emailed you and joined OpenAI and all that. You said that great engineers are able to contribute at the same level as great researchers to future progress. Is that still true today? You know, I think a lot of engineers look at the researchers who are making millions of dollars and they're like, how do I contribute as much? I think it's absolutely, if not even more, true.
I think that if you look at the phases of deep learning research since 2012, at the beginning it really was, and this is kind of what I expected when we started OpenAI, research scientists who had gotten a PhD and who would come up with ideas and test them out. And there's engineering to be done: if you actually look at AlexNet itself, it's fundamentally the engineering of, let's get fast convolutional kernels on a GPU. And a fun fact is that people who were in the lab with Alex at the time actually felt very bad for him, because they were like, he has some fast conv kernels for some image dataset that doesn't really matter. But Ilya was like, well, clearly we just need to apply this to ImageNet. It's going to be great, right? So it's the combination of great engineering together with the idea of what to do with it, right? That's what makes the magic work. And the thing that I think is still true, and even more true, is that the engineering required is now not just let's build some kernels, but let's build a system. Let's actually scale to 100,000 GPUs. Let's do this crazy RL system that orchestrates things in all sorts of ways. So if you don't have the idea, you're dead in the water; there's nothing to do. But if you don't have the engineering, that idea is not going to live and see the light of day. And so you need to have both of these coming together harmoniously. Yeah. I think that Ilya and Alex's relationship is really emblematic of the research and engineering partnership that now is the philosophy at OpenAI. That's right. Yeah.
And if you look at how OpenAI operates, I think from the very beginning we had this ethos of engineering and research being valued and working together as partners, and that is something that we really work at every day. Yeah. It's my explicit goal to try to throw curveballs in this stuff. So, in terms of the relationship between engineering and research, what did OpenAI do wrong in the early days that you do well now? Well, the way I think about the relationship between engineering and research is that you never fully solve it, right? You just solve the current level of problem, and then you move on to the next level of sophistication. And I noticed that actually the kinds of problems that we ran into were basically the same problems that had been run into at every other lab; either we would be further along, or there'd be a slightly different variant of it. So I think there's something deeply fundamental about this. At the very beginning, I could really see people who came from the engineering world and people who came from the research world just thinking about system constraints very differently. As an engineer, you're like, hey, if I've got an interface, you should not care what's behind that interface. We agreed on the interface; I can implement it however I want. Whereas if you're a researcher, you're like, if there's a bug anywhere in the system, all I'm going to get is slightly degraded performance. I'm not going to get an exception, not going to get indications of where. And so I am responsible for understanding everything. The interfaces don't matter unless they're truly rock solid and I can just never think about them, which is a pretty high bar.
So then I am actually responsible for this code, and that causes friction, right? Because then how do you actually work together? And I saw a project very early on where the people from the engineering background would write the code, and then there'd be this big debate over every single line, and I was just like, this is never going to move; it's going to be so slow. Instead, the way that we ended up proceeding, and I actually worked on that directly, was that I'd come up with like five ideas at a time, and someone from the research side would say, these four are bad. I'd be like, great, that's all I wanted, right? And so the value that I think we've really realized is critical, and that I tell people coming into OpenAI from the engineering world, is technical humility. Right? You're coming in because you have skills that are important, but it's a totally different environment from something like a traditional web startup, and figuring out when those intuitions apply, and when to leave them at the door, is super hard. And so the most important thing is to come in and really, really listen, and kind of assume that there's something that you're missing until you deeply understand the why. And then, at that point, great: make the change, change the architecture, change the abstractions. But that kind of approach, of really reading and listening and understanding with that humility, is a really key determiner. Yeah. Awesome. We're going to tell some stories from recent launches of OpenAI, the greatest hits. So one of the things that is kind of interesting is just scaling in general: everything breaks at different orders of magnitude. So when ChatGPT launched, you got a million users in five days. This year, when 4o image gen launched, you got 100 million users in five days. How do those two periods compare? They echo very similarly in a lot of ways.
You know, the thing about ChatGPT: it was supposed to be a low-key research preview, and we put it out very, you know, sort of chill, and then suddenly everything was down. We kind of anticipated that ChatGPT would be a very popular thing, but we thought that GPT-4 would be necessary to get it. You'd had it internally as well, so you just weren't impressed by it. Exactly, right. That's the other thing about this field: you update so quickly, right? You see magic and you're like, this is the most amazing thing I've ever seen, and then you're like, well, why can't it merge, you know, 10 PRs for me? Exactly. And the image gen moment was very similar, in terms of it was just so loved and so popular, and it went viral in ways where the numbers were just off the charts. And so internally, we actually did something that we really, really try not to do, which is we pulled a bunch of compute from research for both of these launches, because that's mortgaging the future. But if you can actually deliver and keep up with demand, then of course people get to experience the magic, and I think that's something that is really worthwhile, and it's really important to maximize those moments. So I think we really have that same ethos of really serving the user, really trying to push the technology, and just doing things that are materially new that no one's ever seen before. And then, whatever it takes to get those out into the world and make them successful, that's what we do. Amazing. Well, I mean, incredible job. The GPT-4 launch: so I am told your wife drew the joke website. That's true. Yeah, fun Easter egg. My handwriting was so bad that even our AI couldn't tell what to do with it. So, apparently, did you improvise some of this? I heard there was some ad-libbing.
Yeah, definitely. Usually when I do these kinds of demos, I've tested the general shape of them ahead of time. But it's very easy in this field to have ones where, if you slightly typo a character or something, then the demo will not work. I don't like doing those; I like to have some robustness to it. So there's always variation in terms of what actually ends up being shown. To me, this was the first time I think the world ever saw vibe coding. It is now a thing. What are your thoughts on vibe coding? Well, I think that vibe coding is amazing as an empowerment mechanism, right? I think it's sort of a representation of what is to come. And I think that the specifics of what vibe coding is are going to change over time, right? You look at even things like Codex: to some extent, I think our vision is that as you start to have agents that really work, where you can have not just one copy, not just 10 copies, but a hundred or a thousand or 10,000 or 100,000 of these things running, you're going to want to treat them much more like a coworker, right? You're going to want them off in the cloud doing stuff, able to hook up to all sorts of things. You're asleep, your laptop's closed, and it should still be working. The current conception of vibe coding is an interactive loop. So my prediction of what will happen is that there's going to be more and more of that happening, but I think that the agentic stuff is going to really intercept and overtake it. And I think that all of this is just going to result in way more systems being built. And the thing that I think is also very interesting is that a lot of the vibe coding demos are the cool, flashy stuff.
For example, making the joke website: it's making an app from scratch. But the thing that I think will really be new and transformative, and is starting to really happen, is being able to transform existing applications, to go deeper. I think so many companies are sitting on legacy codebases, and doing migrations and updating libraries and changing your COBOL to something else is so hard, and is actually just not very fun for humans. And I think we're starting to get AI that are able to really tackle those problems. And so, while vibe coding started with the "just make cool apps" kind of thing, which I love, it's starting to become much more like serious software engineering. And going even deeper, to making it possible to just move so much faster as a company, that's where I think we're headed. Yep. Speaking of Codex, I've heard that it's kind of your baby a little bit. And I think on the livestream you were talking a lot about making things modular and well-documented and all that good stuff. How do you think Codex changes the way that we code? Well, I definitely think it's an overstatement to say it's my baby. I think there's a really incredible team, and I've been trying to support them and their vision. But I think that the direction is something that is just so compelling and incredible to me. And, sorry, could you repeat the question? How does Codex change the way that we code? I see. Yeah. The thing that has been most interesting to see is when you realize that the way you structure your codebase determines how much you can get out of Codex, right?
Basically all of our existing codebases are matched to the strengths of humans. But you can instead match them to the strengths of models, which are very lopsided, right? Models are able to handle way more diversity of stuff, but are also not necessarily able to connect deep ideas as much as humans are right now. And so what you kind of want to do is make smaller modules that are well tested, that have tests that can be run very quickly, and then fill in the details. The model will just do that, right? And it'll run the tests itself. And the connections between these different components, kind of the architecture diagram, that's actually pretty easy to do; it's filling out all the details that is often very difficult. And if you actually do that... you know, what I described also sounds a lot like good software engineering practice, but sometimes, because humans are capable of holding more of this conceptual abstraction in our heads, we just don't do it, right? It's a lot of work to write these tests and to flesh them out. And, you know, the model is going to run these tests like a hundred or a thousand times more than you will, and so it's going to care way, way more. So in some ways, the direction we want to go is to build our codebases for more junior developers in order to actually get the most out of these models. Now, it'll be very interesting to see, as we increase the model capability, whether this particular way of structuring codebases remains constant. And I kind of think that it's a pretty good idea regardless, because, again, it starts to match what you should be doing for maintainability for humans.
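The workflow described here, small modules whose spec lives in fast tests so an agent can fill in the details and verify itself, can be sketched roughly as below. This is a minimal illustration, not anything from the talk; the module and function names (`slugify`, `test_slugify`) are hypothetical.

```python
# A hedged sketch of the "small, well-tested modules" idea: keep each module
# narrow, give it fast deterministic tests, and let an agent iterate on the
# implementation until the tests pass. Names here are illustrative.

import re

def slugify(title: str) -> str:
    """One small, self-contained module boundary; the spec lives in the tests."""
    slug = title.strip().lower()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)  # collapse punctuation and spaces
    return slug.strip("-")

def test_slugify() -> None:
    # Fast tests an agent can cheaply run hundreds of times per edit.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  AI  Engineer  ") == "ai-engineer"
    assert slugify("---") == ""

if __name__ == "__main__":
    test_slugify()
    print("all tests passed")
```

The point is less the function than the shape: the module is small enough that the tests fully describe it, and the tests run in milliseconds, which is exactly the loop an agent exploits.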
Um, but yeah, I think that to me the really exciting thing to think about for the future of software engineering is: which of the practices we've been cutting corners on do we actually need to bring back in order to get the most out of these systems? Yeah. Um, can you put ballpark numbers on the amount of productivity you guys are seeing with Codex internally? Um, yeah, I don't know what the latest numbers are. I mean, there's definitely a double-digit percent of our PRs, low double digits, written entirely by Codex. And that's super cool to see. But it's also not the only system that we use internally, and to me it's still the very, very early days. It's been exciting to see some of the external metrics, like I think we had 24,000 PRs that were merged in the last day in public GitHub repositories. So this stuff is all just getting started. Yeah, it's doing a lot of work. Uh, guest question from Dylan Patel on scaling and reliability. So as we're doing more tasks that take longer and utilize more GPUs, they're also just unreliable. They fail a lot, right? This is well known, and it causes training to fail as well. You've mentioned that you can sort of just restart a run and that's okay, but how do you deal with this when you have to train long-horizon agents? Because you can't really restart something that has a trajectory that's halfway done and maybe nondeterministic. Yeah. I mean, I think there's a bunch of problems that you solve, and then you make the models more capable, and then you have to re-solve them. And so, yeah, when the rollouts are short, you know, 30 seconds, you kind of don't care that much about this problem. If they're going to be days, now you really care about this problem. Yep.
And you have to start thinking about how to snapshot state and a bunch of things like that. Um, the short answer is that there's this ladder of complexity that you keep climbing with these training systems. A couple of years ago, all that we cared about was just doing good old-fashioned pretraining, right? And that's very checkpointable. And even there it's not trivial: if you go from checkpointing once in a while to wanting to checkpoint every single step, now you need to think really hard about how you're going to avoid copies and blocking and all these things. Then for something like these more complicated RL systems, there's still checkpointing, in the sense that maybe you care about checkpointing your cache so you don't have to recompute everything. And the nice thing about our systems is that language models' state is very explicit, right? It's something that actually can be stored, something you actually can handle. Whereas if you have tools that you're hooked up to that are themselves stateful, maybe those are not something you can restart and recover from. And so you have to consider the whole system end to end and think about what checkpointability looks like. And there's also a question of maybe it just doesn't matter, right? Maybe it's fine that you restart the system and you get some little wiggle in your graph, but these models are smart. Yeah. Right. They can handle it. Um, one thing we're looking at, launching tomorrow, is maybe you can take over the VM and checkpoint the VM state and restart it. Yep. Um, I think we have a call-in question from Paris. Um, if someone can play the video... Oh, I wish I could be there to ask you in person.
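The "avoid copies and blocking" point above can be sketched as a minimal non-blocking checkpoint loop: snapshot the state quickly in memory, then persist it on a background thread so the training step doesn't stall on I/O. This is a toy illustration with hypothetical names, not OpenAI's actual training stack; real systems use framework-specific async checkpoint machinery.

```python
import copy
import threading

def save_to_disk(state, step):
    # Stand-in for a real serializer writing to durable storage.
    pass

def checkpoint_async(state, step):
    """Take a fast in-memory snapshot, then persist it off the critical path."""
    snapshot = copy.deepcopy(state)  # the only work done on the training thread
    t = threading.Thread(target=save_to_disk, args=(snapshot, step))
    t.start()                        # slow I/O happens in the background
    return t

# Toy training loop: every step is checkpointed without blocking on writes.
state = {"weights": [0.0, 0.0], "step": 0}
writers = []
for step in range(3):
    state["weights"] = [w + 0.1 for w in state["weights"]]
    state["step"] = step
    writers.append(checkpoint_async(state, step))
for t in writers:
    t.join()                         # drain pending writes at shutdown
```

The snapshot copy is what makes per-step checkpointing safe here: the training loop can keep mutating `state` while the previous step's copy is still being written out.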
One of the questions that I have is: in this new world, the workloads in the data center, in the AI infrastructure, are going to be incredibly diverse. On one hand, agents that are doing deep research: they're thinking, they're reasoning, they're planning, they're working with other agents, they're working with a lot of memory, they have large context, and some of it you also want to think as fast as possible. So how do you create an AI infrastructure that is optimized for workloads that have a lot of prefill, a lot of decode, a lot of something in between? And on the other hand, the type of workloads that I'm super excited about: these multimodal vision and speech AIs that are essentially your R2-D2, your companion. It's on all the time, it's instantly available to you. So these two workloads: one of them is super compute-intensive and might take a long time, test-time scaling and all that; the other wants to be very low latency. What does a future AI infrastructure look like that's as flexible as possible, as performant as possible, low latency, high throughput, all of that? It's just incredibly complex. So how do you think through that, and what kind of AI infrastructure would you think would be ideal going forward? Well, with lots and lots of GPUs, of course. So, if I were to summarize, Jensen wants you to tell him what to build. What would be your dream? But also, there are just two needs, two kinds of infra: there's long compute and there's real time. Yes. Yes. I mean, it is hard, right? Because this co-design problem is a mind-boggling one.
And so, you know, I'm a software person by background, and we think we're off here just writing the software for AGI, and then you realize you have to do these massive infrastructure projects, right? That's not how we set out, but it actually kind of makes sense in the end. If we're going to build something that's going to be transformative to the world, then yeah, it's probably going to require maybe the biggest physical machines humanity has ever created; that kind of type-checks. Um, so I think there are two answers. The naive answer is: okay, you want two kinds of accelerators. You want one that's really compute-optimized and one that's very latency-optimized. Throw tons of HBM on one of those and tons of compute on the other, and you're all good. Um, now, one thing that's really difficult is predicting the ratios, right? Now you have a new problem you have to think about. And if you get the balance wrong, suddenly you're going to have a whole part of your fleet that's just useless. Yep. And that sounds really scary. Um, now the thing is, because of the way these things work, there are no hard requirements in this field, no hard constraints.
There's just sort of this linear program that people are optimizing, and so, yeah, if you give our engineers some misbalance of resources, we will find ways to utilize it, maybe at great pain. An example of this is that you've seen the whole field move towards mixture of experts, and to some extent what mixture of experts is saying is: well, we have all this DRAM sitting around that isn't being used for anything because the balance is wrong. Fine, we'll fill it up with parameters, it won't actually cost any extra compute, and we'll just get extra ML compute efficiency out of it. Boom, there you go. And so I think there is some of that: if you get the balance wrong, it's actually not the end of the world. Um, homogeneity of accelerators is a very nice default to start, but I think ending up with purpose-built accelerators is also not super crazy, and the more we move to these worlds where the dollars of capex for this infrastructure start to become so eye-watering, then starting to hyper-optimize for some of these workloads is pretty reasonable. But I think the jury is still a little bit out, because if you think about it, the research is just moving so fast, and to some extent that dominates everything else. Um, okay, I wasn't planning to ask this, but you just brought up the research stuff. Can you rank current scaling bottlenecks for GPT-6? Ah: compute, data, algorithms, power, money. Yes. I mean, which one's number one and two? Which one are you most rate-limited on? I mean, look, I think we are in a world where basic research is back. I think that is really amazing, right? There was this period... Yeah, basic research. Um, there was a period where it felt like, all right, we've got a transformer, let's just scale it, you know. And I find those problems very exciting. I have a lot of fun when you've got a very well-defined hard problem.
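The mixture-of-experts point above, more parameters without more per-token compute, can be sketched as a toy top-1 router. This is pure illustration with made-up names, not any real MoE implementation: each "expert" is a one-parameter function, the gate picks exactly one expert per input, so total parameter count grows with the number of experts while the work done per input stays constant.

```python
# Toy top-1 mixture of experts: total parameters scale with the number of
# experts sitting in memory, but each input only runs through one expert,
# so per-token compute stays roughly fixed. Purely illustrative.

def make_expert(scale):
    # Each "expert" is just a single-parameter function here.
    return lambda x: x * scale

experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]  # 4x the parameters

def route(x):
    # Stand-in gating rule: deterministically pick one expert per input.
    return experts[int(x) % len(experts)]

def moe_forward(x):
    # Only the selected expert runs: one multiply per input,
    # no matter how many experts (parameters) exist in memory.
    return route(x)(x)

print(moe_forward(2.0))  # routed to expert index 2 (scale 3.0) -> 6.0
```

The memory-for-compute trade is visible in the structure: adding experts to the list grows the "parameter" footprint, but `moe_forward` still executes exactly one expert call.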
You want to just move the number up and to the right. Um, but it also is a little intellectually dissatisfying in some ways. It feels like there's more to life than just the "Attention Is All You Need" paper in vanilla form. Um, and so I think what we've started to see is that we're operating at a scale now where we've pushed the compute and we've pushed the data so far that algorithms are again important, really almost the long pole in terms of future progress. And so all of these things are important poles of the tent, and on any one day it might look a little lopsided one way or another, but fundamentally I think you want to keep them all in balance. Um, and it's really exciting to see things like the RL paradigm. That's something that we invested in very deliberately for multiple years. When we trained GPT-4, I think what was really interesting was that when we talked to GPT-4 for the first time, we were like, is this an AGI? It's clearly not an AGI, but it's really hard to say why, right? There's something about it. It's so fluid and smooth, but somehow it falls off the rails. And it's like, well, we've got to solve that reliability problem. And you realize: it has never actually experienced the world, right? It's like someone who's just read all the books, has observed the world but never experienced it itself, just watching it through a pane of glass or something. And that to me was something where we were just like, okay, clearly we need a different paradigm. And we just pushed on it until we made it really work.
And I think that remains true today: there are other very clear missing capabilities that we just need to keep pushing on, and we will get there. Awesome. Um, broadening out from just OpenAI things... well, honestly, I'm just going to let... So we asked Jensen for one question. He's an overachiever, so he sent in two. So let's play a second video. The AI-native engineers in the audience are probably thinking: in the coming years, OpenAI will have AGIs, and they will be building domain-specific agents on top of the AGIs from OpenAI. And so some of the questions that I would have on my mind would be: how do you think their development workflow will change as OpenAI's AGIs become much more capable? They would still have plumbing, workflows, pipelines that they would create, flywheels that they would create for their domain-specific agents. These agents would of course be able to reason, plan, use tools, have memory, short-term and long-term, and they'll be amazing, amazing agents. But how does it change the development process in the coming years? Yeah, I think this is a really fascinating question, right? I think you can find a wide spectrum of very strongly held opinions that are all mutually contradictory. Um, my perspective is that, first of all, it's all on the table, right? Maybe we reach a world where the AIs are so capable that we just let them write all the code. Maybe there's a world where you have one AI in the sky. Maybe you actually have a bunch of domain-specific agents that require a bunch of specific work to make happen. I think the evidence has really been shifting towards this menagerie of different models. Um, and I think that's actually really exciting, right, that there are different inference costs, just even from a systems perspective.
Um, there are different trade-offs, like distillation works so well. So there's actually a lot of power to be had by models that are able to use other models. And I think that is going to open up just a ton of opportunity, because, you know, we're heading to a world where the economy is fundamentally powered by AI. We're not there yet, but you can see it right on the horizon. They're working on it all. Exactly. I mean, that's what people in this room are building; that is what you are doing. And the economy is a very big thing. There's a lot of diversity in it, and it's also not static, right? I think when people think about what AI can do for us, it's very easy to only look at what we are doing now, how AI slots in, and the percentage of human versus AI. But that's not the point, right? The point is: how do we get 10x more activity, 10x more economic output, 10x more benefit to everyone? Um, and I think the direction we're heading is one where the models will get much more capable, there'll be much better fundamental technology, there are just going to be way more things we want to do with it, and the barrier to entry will be lower than ever. And so things like healthcare, where you can't just barge in; it requires responsibility to think about how to do it right. Things like education, where there are multiple stakeholders, you know, the parent, the teacher, the student; each of these requires domain expertise, careful thought, and a lot of work. Um, and so I think there is going to be just so much opportunity for people to build. And I'm just so excited to see everyone in this room, because that's the right kind of energy. Thank you for encouraging us and being an inspiration. Thank you so much. Great, everybody. Thank you. [Music]