#define AI Engineer - Greg Brockman, OpenAI (ft. Jensen Huang)
Channel: aiDotEngineer
Published at: 2025-08-10
YouTube video id: avWhreBUYF0
Source: https://www.youtube.com/watch?v=avWhreBUYF0
Well, hello. Hello. Is the mic working for you? Check, check, check. One, two, three. All right, first hard technology problem of the day down. Yeah. Well, the Wi-Fi is the other one, as everyone here knows. So, Greg, welcome to AI Engineer. Thank you so much for taking the time. Thank you for having me. We're going to go a little bit chronologically, and a lot of people sent in questions, and I've grouped them up for you, so we'll get right into it. So, you know, I did some deep research on you. You started with Deep Research. With Deep Research. I called it Peep Research, because you were researching a person. You actually did theater growing up, and chemistry and math, and you wrote a calendar scheduling app, and that's what got you into coding. But what really inspired your love for coding? Why are you the coding guy? Well, the funny thing is, I thought I was going to be a mathematician when I grew up. Yeah. You know, I'd read about people like Galois and Gauss, you know, working on these hundred-, 200-, 300-year time horizons. And I was like, that's what I want to do. If anything that I come up with is ever used while I'm still alive, it wasn't long-term enough. It wasn't abstract enough. And I was writing this chemistry textbook after high school, sent it to one of my friends who'd done something similar in math, and he said, "No one is going to publish this. You can either self-publish," and I was like, "Ah, sounds like a lot of work, a lot of capital," "or you could make a website." Mhm. And I was like, "Guess I'm going to learn how to make a website." And so I literally went on W3Schools and did their PHP tutorial. How many people here remember W3Schools? Yeah, decent number of hands. And I remember the very first thing I built was a table sorting widget, right? I had this picture in my head of what it would be.
And I remember the moment that I clicked the column and it sorted according to that column, which was exactly the thing that I wanted. And I was like, that was magic, right? This is so cool. Because the thing about math is that you think hard about a problem, you understand it, you write it down in an obscure way, you call it a proof, and then like three people will ever care, right? But in programming, you write it down in an obscure way, we call it a program, and then maybe only three people ever read that program and care about the code, but everyone gets the benefit. No one has to understand the details. That thing that was in your head, it's real. It's in the world. And I was like, that's the thing I want to do. Forget about that hundred-year time horizon. I just want to build. You do just want to build. So you were so good at it that somehow, somewhere, you got cold emailed by Stripe while you were still in college. That's right. What's the story? First of all, how did they find you, and what was it that convinced you to drop out to join them? Well, I had mutual friends with all the people at Stripe, the, you know, giant company of like three people at the time. And they had done the usual thing where they asked someone at Harvard who around campus they should talk to, who they might recruit, where my name came up. They asked the same of people at MIT, because I'd been at Harvard and actually dropped out to go to MIT, so I had the advantage of, I guess, getting upvotes on both sides. But I remember when I met Patrick: I had just flown in, it was late at night, it was storming, and I showed up and we just started talking about code, right? And it was just one of those moments where you're like, this is the kind of person that I've wanted to work with and been looking for.
And so I ended up dropping out of MIT, flew out, and I've been out here ever since. Yeah. We have some guest questions sprinkled along the way, as you know. So, question from someone named Matthew Brockman. I've heard of him. CTO of Julius AI. When do you think our parents will give up on the dream of you finishing your degree? Maybe Harvard or UND will take you back. Yes. Well, never. It was definitely, you know, no matter where you're going, if you tell your parents you're leaving Harvard, it's going to be hard. If you tell your parents you're leaving school altogether, it's going to be difficult. And I think, to their credit, even though it was difficult, they were like, we trust you. You must see something and understand something from where you sit that's hard for us to see from halfway across the country. But yeah, I did Stripe, and had a good time, and actually learned things, and it turned out to be a real company, not just, you know, dropping out and doing nothing. So I think they really have warmed up to it. I think they're very proud of you. Yes. Absolutely. So you were with Stripe from 4 to 250 people, eventually as the first CTO. One thing I found recently that Hacker News maybe doesn't know is that apparently the Collison installation only happened like a handful of times. It wasn't really a thing at Stripe. I think that's true. Yeah, it's the thing that survived as an urban legend because it's so cool. It's like, you're so customer obsessed. Anyway, so what else do people get wrong about early Stripe? What do we want to clear the air on? Yeah. Well, I think people don't understand how hard it was, right?
I remember, first of all, the kind of thing we did a lot of is that we added all of our customers on Gchat. And so it was very much the case that we were in constant contact with them. So even if you're not literally sitting over their shoulder, you're doing the next best thing. But I remember one day we realized that the payment backend that we were on just wasn't going to scale. We absolutely needed to be on Wells Fargo, and we got the deal done, but now we needed to do a technical integration. And they said, well, this technical integration is going to take like nine months, because that's how long it takes. And we're like, that's crazy. We're a startup. We can't sit around waiting nine months to get this thing done. And so actually, in 24 hours, we completed it, by basically treating it like a college problem set. I was implementing everything. John was working from the top of this test script and testing everything and saying, this is broken. Darragh was starting from the bottom and working his way up. And in the morning, we got on with the certifying person, and we sent some test messages, and there was an error, and the person's like, all right, I'll see you next week. Because that's how all their customers operate, right?
There's an error, so clearly you need to send it to your dev team. And we were like, no, no, no, there must just be some sort of glitch in the system. And Patrick was just talking to keep her on the line, and frantically I was there editing the code. And so we got like five turns in, and we actually failed, but fortunately she was nice enough to reschedule two hours later, and then we passed. And so you realize that was like six weeks' worth of normal dev work that you got done in that moment, because you didn't just accept the arbitrary constraints of how other organizations would work. Yeah. Do you think there's a lot more opportunity like that in most jobs? How do you advise other people to be that fast, or to cut that many cycles? Yes. I mean, the way I think about it is that if you think from first principles, you can find where things need to be slow, or done the way that they're normally done, or whatever. Those things exist, right? The general principle of, ah, just don't worry about the constraints and just do the thing, I think that is not 100% true. I think it's really about mapping to where there is unnecessary overhead that's there for constraints that are no longer applicable, that don't apply to your specific circumstance. And I think this is especially true in this world that we're in now, with AI that's accelerating productivity so much. Yeah. Just fire off a Codex task. Why not, right? One last thing about your pre-OpenAI life: independent study. It's a recurrent theme from high school; you did it then. I did. And your sabbatical as well. So you've done it repeatedly. What makes independent study effective? I think there are a lot of people who don't do a good job of it and kind of waste a year. What do you do that makes it so effective?
Well, I think it was a key part of how I grew up. In sixth grade, my dad taught me algebra, and in seventh grade I showed up at the high school, which is the first time that you track into advanced math, pre-algebra, and we went to the teacher, like, can he skip this and go directly to the eighth-grade course? And the teacher looked at my mom and me very condescendingly and was like, every parent believes that their child is special. And after like a month of being in this teacher's class, where I was paying no attention and just playing calculator games in the back, and she'd try to trip me up and call on me to answer questions from the whiteboard and I would just get them all right, she was like, all right, fair enough, your child should be in the next year. But then when I was in eighth grade, there was no more math left in my middle school. I didn't have a car, so I had to do online courses. And in that one year, I ended up doing three years' worth of high school math. And so I think that, for me, a lot of it is about: if you're excited about something independently, if it's something you want to do, you can break the constraints there as well. You can do three years of math in one year, and then it compounds, because the next year I was at my high school, finished the math there, and then all through 10th, 11th, and 12th grade I had no more math. But I did have a car, and I was able to go to the University of North Dakota and take whatever classes I wanted there. And so that kind of compounded and compounded into learning programming. And the way I learned programming was very much self-study, just building things and experiencing things out in the world.
And so the thing I would advise is: if you have an opportunity to explore and you have a passion, if you're actually enjoying it, just go deep, right? And by the way, it's not always fun, right? It's very easy to feel like, I got kind of bored. But if you just push through those hurdles, then I think the reward is worth it. Yeah. You self-studied machine learning too; that was a whole period of your life. Any particular highlights from there? It sounds like you talked to Geoff Hinton at one time. I did talk to Geoff Hinton. Yes. And did that help, or what was the most helpful thing? Because you became a machine learning practitioner. Well, when I started out, I'd been at Stripe, and I was reading Hacker News posts about deep learning. It felt like there was a new deep learning post every day, and this was 2013, 2014, and I was like, what is deep learning? And I knew like one person in the field, and so I talked to them, and they introduced me to some more people, and then they introduced me to more people. And the thing that surprised me was I kept getting introduced to a bunch of my smartest friends from college, and I was like, that's interesting. All of these people ended up in this field; what's going on? And I started to realize that there was something real that was building, right, that was being developed, that people were really making these systems do materially new things that computers were not able to do before. And I was like, that is the thing. And so after I left Stripe, I knew I wanted to do something in AI, start an AI company, but I didn't really know how to contribute, what my skills would be useful for. And so I was in New York, and I was like, you know what, I'll build a GPU rig and see if I can do some Kaggle competitions.
And so I went on Newegg and bought some Titan X cards. And it was really cool, you know, physically assembling this machine. You can find a tweet from 2015 when I powered it on: you see all this green and all the fans going, and I was like, this is what computers are meant to be. I think many folks in the audience have had that experience as well. Awesome. Okay, so what convinced you that AGI was possible? You had a point where you were sort of disillusioned with it. You tried to write a chatbot; it didn't work. But what made you go all in on it? Yeah. Well, part of the journey for me was reading Alan Turing's 1950 paper, Computing Machinery and Intelligence. This is the Turing test paper. How many people have read it? Fewer hands than for W3Schools, but it's equally worth reading. The thing that is so fascinating to me is he lays out in the beginning, okay, the Turing test, this idea of: does a machine think? Is it intelligent? And you can say it's intelligent if a human can't tell the difference between talking to it and talking to a human. Fine. But the thing that has not really become as embedded in the pop culture, but to me was so astounding, was he said: well, how are you going to program an answer to this? You will never be able to write down all the rules. But what if you could build a child machine that learns like a human child, and then you just apply rewards and punishments, and boom, it's going to be able to pass the test. And I was like, that is the kind of technology that we have to build. Because as a programmer, you have to understand everything. You have to understand the rules of how to solve the problem. But what if the machine can understand things and solve problems that you yourself cannot understand? That feels fundamental, right?
That feels like how you actually solve problems that are important to humanity. And this was 2008 or so that I read this, and I went to my professor, who was an NLP professor, and I asked if I could do some research with him, and he said, yeah, here are some parse trees. And I was like, okay, this is not what Turing was talking about. Yeah. This is like WordNet and the whole thing. Exactly. So, you know, definitely a little bit of a trough of sorrow there. But with deep learning, the thing about deep learning that's magic is that it really started to show promising results in 2012 with AlexNet, right? It just blew everyone out of the water in the ImageNet competition. And so suddenly you have this general learning machine. You know, it's got a little bit of a prior in there, of convolutions, but it's better than 40 years' worth of computer vision research, right? People trying to write down all the rules as well as possible. And then people are like, well, okay, it works in vision, but it's never going to work in my field. It's never going to work in machine translation, never going to work in NLP, never going to work in this or that. And suddenly it starts being the best in all of those areas. Suddenly the walls between these departments are being torn down, and you're like, that is what Turing was talking about. And so I think for me, it was just seeing the type signature of this technology. And by the way, this technology is not new, right? If you go back and read the McCulloch-Pitts neuron paper from like 1943 or so... I told him to give it as homework to people. Okay. Yeah, there you go. Yes. Class assigned. The images in there look just like the kinds of images that you see now, of layers of neurons and things like that.
And so you just realize there's something deeply fundamental about what we're doing. And you can find papers from the 1990s talking about what caused the deep learning winters: that it was these neural net people, they have no new ideas, they just want to build bigger computers. And I'm like, yes, that's what we need to do. And so I think that all of this together just feels like we are, to some extent, continuing this wave, this 70-year history. And in many ways, the whole computing industry has really been building up to the point where you can have machines that are able to perform the kinds of tasks that we're just starting to scratch the surface of: to solve new problems that humans cannot, to be assistive to us in our daily lives, to not have to be typing with our, you know, meat sticks, but instead to have something that you can interact with just like a person, where the machine comes much closer to you rather than you coming closer to it and having to learn assembly language or whatever it is. And so to me, it felt like all of the factors were lined up, and now we just need to build. Yeah. I like that consistent theme that you keep coming back to: we just need to build. So in 2022 you wrote that it's time to become an ML engineer. Actually, I have a personal friend who read that post and cold emailed you and joined OpenAI and all that. You said that great engineers are able to contribute at the same level as great researchers to future progress. Is that still true today? You know, I think a lot of engineers look at the researchers who are making millions of dollars and they're like, how do I contribute as much? I think it's absolutely, if not even more, true.
I think that if you look at the phases of deep learning research since 2012, at the beginning it really was, and this is kind of what I expected when we started OpenAI, research scientists who had gotten a PhD and who would come up with ideas and test them out. And there's engineering to be done: if you actually look at AlexNet itself, it's fundamentally the engineering of, let's get fast convolutional kernels on a GPU. And a fun fact is that people who were in the lab with Alex at the time actually felt very bad for him, because they were like, he has some fast conv kernels for some image dataset that doesn't really matter. But Ilya was like, well, clearly we just need to apply this to ImageNet. It's going to be great, right? So it's the combination of great engineering together with the idea of what to do with it, right? That's what makes the magic work. And the thing that I think is still true, and even more true, is that the engineering required is now not just let's build some kernels, but let's build a system. Let's actually scale to 100,000 GPUs. Let's do this crazy RL system that orchestrates things in all sorts of ways. So if you don't have the idea, you're dead in the water; there's nothing to do. But if you don't have the engineering, that idea is not going to live and see the light of day. And so you need to have both of these coming together harmoniously. Yeah. I think that Ilya and Alex's relationship is really emblematic of the research and engineering partnership that now is the philosophy at OpenAI. That's right. Yeah.
And if you look at how OpenAI operates, I think from the very beginning we had this ethos of engineering and research being valued and working together as partners, and that is something that we really work at every day. Yeah. It's my explicit goal to try to throw curveballs in this stuff. So, in terms of the relationship between engineering and research, what did OpenAI do wrong in the early days that you do well now? Well, the way I think about the relationship between engineering and research is that you never fully solve it, right? You just solve the current level of problem, and then you move on to the next level of sophistication. And I noticed that actually the kinds of problems that we ran into were basically the same problems that had been run into at every other lab; either we would be further along, or there'd be a slightly different variant of it. So I think there's something deeply fundamental about this. At the very beginning, I could really see people who came from the engineering world and people who came from the research world just thinking about system constraints very differently. As an engineer, you're like, hey, if I've got an interface, you should not care what's behind that interface. We agreed on the interface; I can implement it however I want. Whereas if you're a researcher, you're like, if there's a bug anywhere in the system, all I'm going to get is slightly degraded performance. I'm not going to get an exception, not going to get indications of where. And so I am responsible for understanding everything. The interfaces don't matter unless they're truly rock solid and I can just never think about them, which is a pretty high bar.
So then I am actually responsible for this code, and that causes friction, right? Because then how do you actually work together? And I saw a project very early on where the people from the engineering background would write the code, and then there'd be this big debate over every single line, and I was just like, this is never going to move; it's going to be so slow. Instead, the way that we ended up proceeding, and I actually worked on that directly, was that I'd come up with like five ideas at a time, and someone from the research side would say, these four are bad. I'd be like, great, that's all I wanted, right? And so the value that I think we've really realized is critical, and that I tell people coming into OpenAI from the engineering world, is technical humility. Right? You're coming in because you have skills that are important, but it's a totally different environment from something like a traditional web startup, and figuring out when those intuitions apply, and when to leave them at the door, is super hard. And so the most important thing is to come in and really, really listen, and kind of assume that there's something that you're missing until you deeply understand the why. And then, at that point, great: make the change, change the architecture, change the abstractions. But that kind of approach, of really reading and listening and understanding with that humility, is a really key determiner. Yeah. Awesome. We're going to tell some stories from recent launches of OpenAI, the greatest hits. So one of the things that is kind of interesting is just scaling in general: everything breaks at different orders of magnitude. So when ChatGPT launched, you got a million users in five days. This year, when 4o image gen launched, you got 100 million users in five days. How do those two periods compare? They echo very similarly in a lot of ways.
You know, the thing about ChatGPT: it was supposed to be a low-key research preview, and we put it out very, you know, sort of chill, and then suddenly everything was down. We kind of anticipated that ChatGPT would be a very popular thing, but we thought that GPT-4 would be necessary to get it. You'd had it internally as well, so you just weren't impressed by it. Exactly, right. That's the other thing about this field: you update so quickly, right? You see magic and you're like, this is the most amazing thing I've ever seen, and then you're like, well, why can't it merge, you know, 10 PRs for me? Exactly. And the image gen moment was very similar, in terms of it was just so loved and so popular, and it went viral in ways where the numbers were just off the charts. And so internally, we actually did something that we really, really try not to do, which is we pulled a bunch of compute from research for both of these launches, because that's mortgaging the future. But if you can actually deliver and keep up with demand, then of course people get to experience the magic, and I think that's something that is really worthwhile, and it's really important to maximize those moments. So I think we really have that same ethos of really serving the user, really trying to push the technology, and just doing things that are materially new that no one's ever seen before. And then, whatever it takes to get those out into the world and make them successful, that's what we do. Amazing. Well, I mean, incredible job. The GPT-4 launch: so I am told your wife drew the joke website. That's true. Yeah, fun Easter egg. My handwriting was so bad that even our AI couldn't tell what to do with it. So, apparently, did you improvise some of this? I heard there was some ad-libbing.
Yeah, definitely. Usually when I do these kinds of demos, I've tested the general shape of them ahead of time. But it's very easy in this field to have ones where, if you slightly typo a character or something, then the demo will not work. I don't like doing those; I like to have some robustness to it. So there's always variation in terms of what actually ends up being shown. To me, this was the first time I think the world ever saw vibe coding. It is now a thing. What are your thoughts on vibe coding? Well, I think that vibe coding is amazing as an empowerment mechanism, right? I think it's sort of a representation of what is to come. And I think that the specifics of what vibe coding is are going to change over time, right? You look at even things like Codex: to some extent, I think our vision is that as you start to have agents that really work, where you can have not just one copy, not just 10 copies, but a hundred or a thousand or 10,000 or 100,000 of these things running, you're going to want to treat them much more like a coworker, right? You're going to want them off in the cloud doing stuff, able to hook up to all sorts of things. You're asleep, your laptop's closed, and it should still be working. The current conception of vibe coding is an interactive loop. So my prediction of what will happen is that there's going to be more and more of that happening, but I think that the agentic stuff is going to really intercept and overtake it. And I think that all of this is just going to result in way more systems being built. And the thing that I think is also very interesting is that a lot of the vibe coding demos are the cool, flashy stuff.
For example, making the joke website: it's making an app from scratch. But the thing that I think will really be new and transformative, and is starting to really happen, is being able to transform existing applications, to go deeper. I think so many companies are sitting on legacy codebases, and doing migrations and updating libraries and changing your COBOL to something else is so hard, and is actually just not very fun for humans. And I think we're starting to get AI that are able to really tackle those problems. And so, while vibe coding started with the "just make cool apps" kind of thing, which I love, it's starting to become much more like serious software engineering. And going even deeper, to making it possible to just move so much faster as a company, that's where I think we're headed. Yep. Speaking of Codex, I've heard that it's kind of your baby a little bit. And I think on the livestream you were talking a lot about making things modular and well-documented and all that good stuff. How do you think Codex changes the way that we code? Well, I definitely think it's an overstatement to say it's my baby. I think there's a really incredible team, and I've been trying to support them and their vision. But I think that the direction is something that is just so compelling and incredible to me. And, sorry, could you repeat the question? How does Codex change the way that we code? I see. Yeah. The thing that has been most interesting to see is when you realize that the way you structure your codebase determines how much you can get out of Codex, right?
Basically all of our existing codebases are matched to the strengths of humans. But you can instead match them to the strengths of models, which are very lopsided, right? Models are able to handle way more diversity of stuff, but are also not necessarily able to connect deep ideas as much as humans are right now. And so what you kind of want to do is make smaller modules that are well tested, that have tests that can be run very quickly, and then fill in the details. The model will just do that, right? And it'll run the tests itself. And the connections between these different components, kind of the architecture diagram, that's actually pretty easy to do; it's filling out all the details that is often very difficult. And if you actually do that... you know, what I described also sounds a lot like good software engineering practice, but sometimes, because humans are capable of holding more of this conceptual abstraction in our heads, we just don't do it, right? It's a lot of work to write these tests and to flesh them out. And, you know, the model is going to run these tests like a hundred or a thousand times more than you will, and so it's going to care way, way more. So in some ways, the direction we want to go is to build our codebases for more junior developers in order to actually get the most out of these models. Now, it'll be very interesting to see, as we increase the model capability, whether this particular way of structuring codebases remains constant. And I kind of think that it's a pretty good idea regardless, because, again, it starts to match what you should be doing for maintainability for humans.
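The workflow described here, small modules whose spec lives in fast tests so an agent can fill in the details and verify itself, can be sketched roughly as below. This is a minimal illustration, not anything from the talk; the module and function names (`slugify`, `test_slugify`) are hypothetical.

```python
# A hedged sketch of the "small, well-tested modules" idea: keep each module
# narrow, give it fast deterministic tests, and let an agent iterate on the
# implementation until the tests pass. Names here are illustrative.

import re

def slugify(title: str) -> str:
    """One small, self-contained module boundary; the spec lives in the tests."""
    slug = title.strip().lower()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)  # collapse punctuation and spaces
    return slug.strip("-")

def test_slugify() -> None:
    # Fast tests an agent can cheaply run hundreds of times per edit.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  AI  Engineer  ") == "ai-engineer"
    assert slugify("---") == ""

if __name__ == "__main__":
    test_slugify()
    print("all tests passed")
```

The point is less the function than the shape: the module is small enough that the tests fully describe it, and the tests run in milliseconds, which is exactly the loop an agent exploits.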
Um, but yeah, I think that to me the really exciting thing to think about for the future of software engineering is: which of the practices we've been cutting corners on do we actually need to bring back in order to get the most out of these systems? Yeah. Um, can you put ballpark numbers on the amount of productivity you guys are seeing with Codex internally? Um, yeah, I don't know what the latest numbers are. I mean, there's definitely a double-digit percent of our PRs, low double digits, written entirely by Codex. And that's super cool to see. But it's also not the only system that we use internally, and to me it's still the very, very early days. It's been exciting to see some of the external metrics, like I think we had 24,000 PRs that were merged in the last day in public GitHub repositories. So this stuff is all just getting started. Yeah, it's doing a lot of work. Uh, guest question from Dylan Patel on scaling and reliability. So as we're doing more tasks that take longer and utilize more GPUs, they're also just unreliable. They fail a lot, right? This is well known, and it causes training to fail as well. You've mentioned that you can sort of just restart a run and that's okay, but how do you deal with this when you have to train long-horizon agents? Because you can't really restart something that has a trajectory that's halfway done and maybe nondeterministic. Yeah. I mean, I think there's a bunch of problems that you solve, and then you make the models more capable, and then you have to re-solve them. And so, yeah, when the rollouts are short, you know, 30 seconds, you kind of don't care that much about this problem. If they're going to be days, now you really care about this problem. Yep.
And you have to start thinking about how to snapshot state and a bunch of things like that. Um, the short answer is that there's this ladder of complexity that you keep climbing with these training systems. A couple of years ago, all that we cared about was just doing good old-fashioned pretraining, right? And that's very checkpointable. And even there it's not trivial: if you go from checkpointing once in a while to wanting to checkpoint every single step, now you need to think really hard about how you're going to avoid copies and blocking and all these things. Then for something like these more complicated RL systems, there's still checkpointing, in the sense that maybe you care about checkpointing your cache so you don't have to recompute everything. And the nice thing about our systems is that language models' state is very explicit, right? It's something that actually can be stored, something you actually can handle. Whereas if you have tools that you're hooked up to that are themselves stateful, maybe those are not something you can restart and recover from. And so you have to consider the whole system end to end and think about what checkpointability looks like. And there's also a question of maybe it just doesn't matter, right? Maybe it's fine that you restart the system and you get some little wiggle in your graph, but these models are smart. Yeah. Right. They can handle it. Um, one thing we're looking at, launching tomorrow, is maybe you can take over the VM and checkpoint the VM state and restart it. Yep. Um, I think we have a call-in question from Paris. Um, if someone can play the video... Oh, I wish I could be there to ask you in person.
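The "avoid copies and blocking" point above can be sketched as a minimal non-blocking checkpoint loop: snapshot the state quickly in memory, then persist it on a background thread so the training step doesn't stall on I/O. This is a toy illustration with hypothetical names, not OpenAI's actual training stack; real systems use framework-specific async checkpoint machinery.

```python
import copy
import threading

def save_to_disk(state, step):
    # Stand-in for a real serializer writing to durable storage.
    pass

def checkpoint_async(state, step):
    """Take a fast in-memory snapshot, then persist it off the critical path."""
    snapshot = copy.deepcopy(state)  # the only work done on the training thread
    t = threading.Thread(target=save_to_disk, args=(snapshot, step))
    t.start()                        # slow I/O happens in the background
    return t

# Toy training loop: every step is checkpointed without blocking on writes.
state = {"weights": [0.0, 0.0], "step": 0}
writers = []
for step in range(3):
    state["weights"] = [w + 0.1 for w in state["weights"]]
    state["step"] = step
    writers.append(checkpoint_async(state, step))
for t in writers:
    t.join()                         # drain pending writes at shutdown
```

The snapshot copy is what makes per-step checkpointing safe here: the training loop can keep mutating `state` while the previous step's copy is still being written out.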
One of the questions that I have is: in this new world, the workloads in the data center, in the AI infrastructure, are going to be incredibly diverse. On one hand, agents that are doing deep research: they're thinking, they're reasoning, they're planning, they're working with other agents, they're working with a lot of memory, they have large context, and some of it you also want to think as fast as possible. So how do you create an AI infrastructure that is optimized for workloads that have a lot of prefill, a lot of decode, a lot of something in between? And on the other hand, the type of workloads that I'm super excited about: these multimodal vision and speech AIs that are essentially your R2-D2, your companion. It's on all the time, it's instantly available to you. So these two workloads: one of them is super compute-intensive and might take a long time, test-time scaling and all that; the other wants to be very low latency. What does a future AI infrastructure look like that's as flexible as possible, as performant as possible, low latency, high throughput, all of that? It's just incredibly complex. So how do you think through that, and what kind of AI infrastructure would you think would be ideal going forward? Well, with lots and lots of GPUs, of course. So, if I were to summarize, Jensen wants you to tell him what to build. What would be your dream? But also, there are just two needs, two kinds of infra: there's long compute and there's real time. Yes. Yes. I mean, it is hard, right? Because this co-design problem is a mind-boggling one.
And so, you know, I'm a software person by background, and we think we're off here just writing the software for AGI, and then you realize you have to do these massive infrastructure projects, right? That's not how we set out, but it actually kind of makes sense in the end. If we're going to build something that's going to be transformative to the world, then yeah, it's probably going to require maybe the biggest physical machines humanity has ever created; that kind of type-checks. Um, so I think there are two answers. The naive answer is: okay, you want two kinds of accelerators. You want one that's really compute-optimized and one that's very latency-optimized. Throw tons of HBM on one of those and tons of compute on the other, and you're all good. Um, now, one thing that's really difficult is predicting the ratios, right? Now you have a new problem you have to think about. And if you get the balance wrong, suddenly you're going to have a whole part of your fleet that's just useless. Yep. And that sounds really scary. Um, now the thing is, because of the way these things work, there are no hard requirements in this field, no hard constraints.
There's just sort of this linear program that people are optimizing, and so, yeah, if you give our engineers some misbalance of resources, we will find ways to utilize it, maybe at great pain. An example of this is that you've seen the whole field move towards mixture of experts, and to some extent what mixture of experts is saying is: well, we have all this DRAM sitting around that isn't being used for anything because the balance is wrong. Fine, we'll fill it up with parameters, it won't actually cost any extra compute, and we'll just get extra ML compute efficiency out of it. Boom, there you go. And so I think there is some of that: if you get the balance wrong, it's actually not the end of the world. Um, homogeneity of accelerators is a very nice default to start, but I think ending up with purpose-built accelerators is also not super crazy, and the more we move to these worlds where the dollars of capex for this infrastructure start to become so eye-watering, then starting to hyper-optimize for some of these workloads is pretty reasonable. But I think the jury is still a little bit out, because if you think about it, the research is just moving so fast, and to some extent that dominates everything else. Um, okay, I wasn't planning to ask this, but you just brought up the research stuff. Can you rank current scaling bottlenecks for GPT-6? Ah: compute, data, algorithms, power, money. Yes. I mean, which one's number one and two? Which one are you most rate-limited on? I mean, look, I think we are in a world where basic research is back. I think that is really amazing, right? There was this period... Yeah, basic research. Um, there was a period where it felt like, all right, we've got a transformer, let's just scale it, you know. And I find those problems very exciting. I have a lot of fun when you've got a very well-defined hard problem.
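The mixture-of-experts point above, more parameters without more per-token compute, can be sketched as a toy top-1 router. This is pure illustration with made-up names, not any real MoE implementation: each "expert" is a one-parameter function, the gate picks exactly one expert per input, so total parameter count grows with the number of experts while the work done per input stays constant.

```python
# Toy top-1 mixture of experts: total parameters scale with the number of
# experts sitting in memory, but each input only runs through one expert,
# so per-token compute stays roughly fixed. Purely illustrative.

def make_expert(scale):
    # Each "expert" is just a single-parameter function here.
    return lambda x: x * scale

experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]  # 4x the parameters

def route(x):
    # Stand-in gating rule: deterministically pick one expert per input.
    return experts[int(x) % len(experts)]

def moe_forward(x):
    # Only the selected expert runs: one multiply per input,
    # no matter how many experts (parameters) exist in memory.
    return route(x)(x)

print(moe_forward(2.0))  # routed to expert index 2 (scale 3.0) -> 6.0
```

The memory-for-compute trade is visible in the structure: adding experts to the list grows the "parameter" footprint, but `moe_forward` still executes exactly one expert call.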
You want to just move the number up and to the right. Um, but it also is a little intellectually dissatisfying in some ways. It feels like there's more to life than just the "Attention Is All You Need" paper in vanilla form. Um, and so I think what we've started to see is that we're operating at a scale now where we've pushed the compute and we've pushed the data so far that algorithms are again important, really almost the long pole in terms of future progress. And so all of these things are important poles of the tent, and on any one day it might look a little lopsided one way or another, but fundamentally I think you want to keep them all in balance. Um, and it's really exciting to see things like the RL paradigm. That's something that we invested in very deliberately for multiple years. When we trained GPT-4, I think what was really interesting was that when we talked to GPT-4 for the first time, we were like, is this an AGI? It's clearly not an AGI, but it's really hard to say why, right? There's something about it. It's so fluid and smooth, but somehow it falls off the rails. And it's like, well, we've got to solve that reliability problem. And you realize: it has never actually experienced the world, right? It's like someone who's just read all the books, has observed the world but never experienced it itself, just watching it through a pane of glass or something. And that to me was something where we were just like, okay, clearly we need a different paradigm. And we just pushed on it until we made it really work.
And I think that remains true today: there are other very clear missing capabilities that we just need to keep pushing on, and we will get there. Awesome. Um, broadening out from just OpenAI things... well, honestly, I'm just going to let... So we asked Jensen for one question. He's an overachiever, so he sent in two. So let's play a second video. The AI-native engineers in the audience are probably thinking: in the coming years, OpenAI will have AGIs, and they will be building domain-specific agents on top of the AGIs from OpenAI. And so some of the questions that I would have on my mind would be: how do you think their development workflow will change as OpenAI's AGIs become much more capable? They would still have plumbing, workflows, pipelines that they would create, flywheels that they would create for their domain-specific agents. These agents would of course be able to reason, plan, use tools, have memory, short-term and long-term, and they'll be amazing, amazing agents. But how does it change the development process in the coming years? Yeah, I think this is a really fascinating question, right? I think you can find a wide spectrum of very strongly held opinions that are all mutually contradictory. Um, my perspective is that, first of all, it's all on the table, right? Maybe we reach a world where the AIs are so capable that we just let them write all the code. Maybe there's a world where you have one AI in the sky. Maybe you actually have a bunch of domain-specific agents that require a bunch of specific work to make happen. I think the evidence has really been shifting towards this menagerie of different models. Um, and I think that's actually really exciting, right, that there are different inference costs, just even from a systems perspective.
Um, there are different trade-offs, like distillation works so well. So there's actually a lot of power to be had by models that are able to use other models. And I think that is going to open up just a ton of opportunity, because, you know, we're heading to a world where the economy is fundamentally powered by AI. We're not there yet, but you can see it right on the horizon. They're working on it all. Exactly. I mean, that's what people in this room are building; that is what you are doing. And the economy is a very big thing. There's a lot of diversity in it, and it's also not static, right? I think when people think about what AI can do for us, it's very easy to only look at what we are doing now, how AI slots in, and the percentage of human versus AI. But that's not the point, right? The point is: how do we get 10x more activity, 10x more economic output, 10x more benefit to everyone? Um, and I think the direction we're heading is one where the models will get much more capable, there'll be much better fundamental technology, there are just going to be way more things we want to do with it, and the barrier to entry will be lower than ever. And so things like healthcare, where you can't just barge in; it requires responsibility to think about how to do it right. Things like education, where there are multiple stakeholders, you know, the parent, the teacher, the student; each of these requires domain expertise, careful thought, and a lot of work. Um, and so I think there is going to be just so much opportunity for people to build. And I'm just so excited to see everyone in this room, because that's the right kind of energy. Thank you for encouraging us and being an inspiration. Thank you so much. Great, everybody. Thank you. [Music]