NVIDIA's Artificial Intelligence Moat & Origins — With Bryan Catanzaro

Channel: Alex Kantrowitz

Published at: 2024-02-28

YouTube video id: 0hQJZh_Fzzw

Source: https://www.youtube.com/watch?v=0hQJZh_Fzzw

the Nvidia executive who started its AI
push joins us to talk about what makes
the company so indispensable unpacking
the secrets to its success that's coming
up right after this welcome to Big
technology podcast a show for
cool-headed nuanced conversation of the
tech world and Beyond we have a great
conversation for you today we're going
deep inside Nvidia with Bryan Catanzaro
he's the vice president of Applied deep
learning research at Nvidia and he's
going to be speaking at the company's
forthcoming GTC conference March 18th to
21st in San Jose and his conversation is
going to be specifically on practical
AI agents that reason and code at scale
so if you're in the area you're thinking
about heading out you can mark that down
let's get to the conversation Bryan
welcome to the show thanks I'm glad to
be here great to have you you're kind of
the guy that kicked off this whole AI
push within
Nvidia well you know it took a whole
company to to transform Nvidia into what
it is today so um there's tens of
thousands of people that deserve a lot
of credit but um I was uh honored to be
one of the people that helped Nvidia get
started a long time
ago exactly so maybe we can get into
that in a bit but I first want to talk
to you really about what makes Nvidia so
indispensable to developers building
with AI I mean that's been everything
that's been going on recently just comes
back to the fact that Nvidia has these
capabilities right and I think that a
lot of folks hear about it they
understand it philosophically we know
that Nvidia has the technology and that
other competitors have you know have
struggled to catch up but let's just
focus on Nvidia for a moment what is
exactly happening within your company
what do you offer that allows anyone
that wants to build AI and then run AI
models to do it
effectively Nvidia is an accelerated
Computing company and wait wait so what
does that mean accelerated Computing I
hear you guys talk about it all the time
we've been talking about it for a long
time and the world still doesn't quite
understand it so I'm glad to to try to
explain it um the idea is that the world
faces a lot of computational challenges
that can't be solved without faster
computers but uh building a computer uh
is not enough in order to actually
deliver acceleration all of the pieces
have to line up and plug together and be
fully optimized across the entire stack
so that uh people have the chance to do
things that they just couldn't do
otherwise computationally and AI is a
great example of that you know training
and deploying uh the awesome generative
models that are changing the world right
now is extraordinarily computationally
intensive it's the biggest computational
challenge the world has ever faced and
uh the reason that Nvidia is uh is is
providing something useful here is
because for for decades we've taken on
this mission of optimizing the entire
stack to build um software algorithms
libraries Frameworks compilers systems
networking chips the whole thing and
optimize them together for the most
important workloads uh and and that's AI
okay and so when I hear you recite that
list of different things you do the
layman who comes in and hears that for the
first time would say wait a second no
Nvidia just does the H100 chip so it's
obviously more than that so can you talk
a little bit about like people I think
you know they might get it wrong that
it's just the chip and that's the thing
that gets the headlines $20,000
to $40,000 Facebook has 350,000 of them
by the end of the year that's what
people need to run AI but it's it's
actually and I think this is the
important part the ecosystem it's not
just right the the chip but everything
around it so just talk about that as you
know in layman's terms if you can like
what surrounds that chip that Nvidia
provides well um there's a there's an
entire culture about um what is
accelerated Computing we have to be very
strategic about what we're going to
accelerate U and we have to make
decisions years in advance about how
we're going to build the software and
Hardware that's going to provide this
acceleration um I always like to say
that accelerated Computing implies
decelerated Computing because it's not
actually helpful to say I'm just going to
make a fast computer everybody makes a
fast computer right that's the goal of
every computer is to be fast so
acceleration really is about
specialization it's about being able to
focus and prioritize and say this is the
workload that matters most and I'm going
to optimize the entire stack for that
workload uh so in order to do that we
have an enormous amount of um Hardware
software and algorithms that that we're
working on in order to enable the
community to do things that they could
never do before sometimes we like to say
that we're building a time machine for
scientists and Engineers uh that allows
them to see into the future because of
the acceleration that comes from our
systems so it's interesting because if
you think about like a traditional chip
company and by the way I might be
totally off base on this and correct me
if I am
but what they do is take you you the
manufacturer will buy the chip put it
inside let's say their computer right
and then build the software around it
but you guys also build you not only
manufacture the chip but you build the
the algorithms and the software that
surrounds it so that enables companies
to get the most out of it
so just three questions on that one is
that a right characterization of what
you're doing um and
then well let's just start with that one
I'm not going to yeah I I think so I
mean the the core thesis that powers
Nvidia is that a chip could never be
enough you know um just just the same
way that a chip couldn't be enough for
my Apple phone for example you know
Apple makes awesome chips but the
experience of using my phone uh is a lot
more than the Chip And you know the way
that Apple's able to vertically
integrate and optimize um their entire
system in order to create an amazing
consumer experience I think is is pretty
incredible and and super valuable what
Nvidia is doing is uh not the same but
it's related in the sense that we
understand that the value of the
technology we create is only understood
in composition in context it's really
about are we delivering acceleration
transformative acceleration to the most
important computational workloads of our
time so why couldn't other companies
then just go and build their own
software to train using Nvidia chips or
other chips because it seems to me like
correct me if I'm wrong here also but if
I'm reliant on nvidia's software it's
closed Source right so I'm going to
train my model with it um it's kind of
difficult to switch to another chip so
why why particularly rely on the Nvidia
software yeah I mean we we have a lot of
open- source software as well as some
closed Source software um you know we we
make the decision about what to open
source based on what we think is going
to help the market most but I mean the
reason why people work with us is
because uh we deliver transformational
acceleration we enable people to do
things computationally that they just
couldn't do and we know that it would
never be enough to just provide a chip
that said it was really fast and had
like a lot of operations per second
inside the chip because the the gap
between um you know what a particular
chip can do and what the experience is of a
scientist or engineer trying to
invent the future that gap is
quite enormous and if any one of
the links in that chain whether we're
talking about systems design or
networking or data center design or the
compilers Frameworks libraries um
applications algorithms you know if any
any of those links um are to fail the
acceleration is lost and the value
therefore is lost and so Nvidia has a
unique way of approaching this problem
uh co-optimizing the entire stack in
order to deliver that acceleration to
the end uh scientists and researchers
that are trying to invent the future and that's
what differentiates us from other
companies now uh could other companies
do that I mean absolutely it's it's uh
it's not a secret in fact we've been
shouting it from the rooftops for
decades that this is what we do and that
it's different from being a chip company
but you know we're we're continuing to
test that thesis you know is there value
in accelerated Computing that's above
and beyond what you get just from
making and selling awesome chips and and
I think the answer to that is yes
and you know I think that's the reason
why we've been so successful right and
to me I think this is just for anyone
listening like I spent the whole week
making calls on Nvidia trying to figure
out because I was wrong I thought that
like the everything would kind of slow
down this year and I was like speaking
to customers and analysts tell me exactly
what I missed and and this was the thing
that I underestimated which is that it's
not just the chip but it's the chip the
software and everything that goes along
with it and that's why the company has
been so successful so
let's talk a little bit about like what
it actually goes into training an AI
model so do you have companies let's say
an open AI or whoever it might be that
says okay I'm ready to train a model do
they then get in touch with Nvidia and
be like this is what I'm looking to do
and then you help them figure out how
many chips they need what pieces of
software they need to to train and
everything else that comes along with
that we have um a really great set of
relationships with um institutions
around the world that are building Ai
and we're always trying to um help them
get the benefits of accelerated
Computing uh in whatever way makes sense
to them obviously um every institution
is going to have a different um
perspective on what they're trying to
do and they're going to have different
secrets that they need to keep as part
of their strategy about how they're going
to do that and we respect that uh
but at the same time you know we do try
to help them understand what's possible
uh with our systems and make sure that
you know they're actually getting the
the transformational acceleration that
we expect um so we do partner pretty
closely with with a lot of uh important
customers I think one of the things
that's special about Nvidia is that it
is a supporting and sustaining uh kind
of interaction when we work with our
customers um uh Nvidia technology is
integrated into the heart of many
different companies from amazing uh AI
institutions like OpenAI to very
uh established uh companies that do
manufacturing or consumer products um or
self-driving or you know basically every
aspect of of the world's economy Nvidia
is able to um uh provide technology at
the level that um makes sense for for
companies to use uh if they'd like um us
to to provide uh just systems and they
want to write all the software um you
know they can write as much software as
they want if they want to use all of our
software uh you know that's great too um
and we're just trying to help support
and sustain uh all of the different
companies as they use AI for their own
work right and so let just let's walk
through a little bit about like how this
actually happens so let's say I'm an
organization uh I come to Nvidia say I
have a bunch of data or maybe I even
don't have data and I'm looking to build
a large language model um what do I do
now
that's a great question you know the
the first thing that's on my mind is
like you know what data center are you
going to use to train this model in um
and uh you know that's that's a really
important question uh because it turns
out that uh the AI Market is growing
pretty fast because there's so many
institutions that are training these
huge models and you actually have to
have a building to put these machines in
and they're they're not small uh and you
need to hook it up to power you know so
that that would be one of my first
questions is like okay are are you ready
uh to stand this up or um are you going
to be working with a CSP like for
example AWS you know um and we love we
love to support our customers um through
through Cloud providers as well okay and
so then what happens next let's say I'm
set up you're set up okay so um uh we
will definitely point you to our
reference implementations of the various
uh llms and their training uh setups on
uh these clouds we'll show you um how to
scale it to you know many thousands of GPUs
efficiently um we'll we'll tell you what
kind of speed you should expect uh to
get while training the model and we'll
um also you know uh discuss things about
reliability um you know how do we make
sure that the the job is actually
progressing properly and yielding you
know intelligence um you know so we we
we definitely um help our customers with
things like that um I think also then
when the model's trained uh there's a
question about how do we deploy it you
know and we we'd love to um uh to to
help people deploy AI as well I think um
uh Jensen said in the earnings call this
week that somewhere around 40% of our
data center GPUs were going for
inference which I think is um you know
pretty amazing and and definitely a
shift from where things have been a few
years ago um and so uh we're spending a
lot more time helping our customers
accelerate the deployment of these
models as well making sure that they get
you know the best um uh uh speed uh so
that they can you know get get as much
out of the systems they're deploying
these models on as possible and what are
they using the models
for um I think that you know uh language
models are starting to be used in a lot
of different parts of uh a lot of
different companies um things like
question answering I think it's
really important to help
people understand answers to
their specific questions especially
relating to private data stores that
they need to answer questions
with um you know I think we're
seeing a lot of uh people use uh Ai and
office type settings I don't know if
you've um interacted with Microsoft
co-pilot at all but um it can be really
helpful uh at least to me when I'm
looking at um a summary of a meeting and
what the the action items are for
everybody at the meeting and other
sorts of office automation
tasks you know we're also pushing
forward with uh the use of these models
for our own internal work at Nvidia we
have a project called ChipNeMo that is
um uh using language models to help our
chip designers and and uh verification
efforts be more efficient as we build
our own products it's called
ChipNeMo yeah is that similar I mean I
was just speaking with ServiceNow
they're also using a
program from Nvidia called NeMo
that's correct yeah and that's what it
is they use it what is
that software that they use to propel
their question answering yeah NeMo is
kind of our most user-friendly
open-source software for training
and fine-tuning language models and
other kinds of conversational AI um it
it also has a lot of speech capabilities
as well and uh you know we've been
building it for quite a few years
because we believed that uh
conversational AI was really going to
transform uh industry we wanted uh to
make a platform for companies to build
their own build build and deploy their
own conversational AI so that's what
that's what NeMo is and so when we
talk about ChipNeMo we're talking about
using that for our own chip work
wait yeah how so how do you use it for
your own chip
work um at the moment a lot of it has to
do with improving communication between
uh chip designers so uh you have like a
thousand people working on this project
and there's a lot of interfaces that
need to be described and um you know
people have questions they don't know uh
exactly who to talk to so basically
we're making uh uh knowledge bases about
our own work that then people uh can use
to answer questions um and we found that
that um it's kind of like having a a
more senior engineer uh that you can
talk to all the time that that helps you
uh find the things you need to find in a
in a huge code base
um and so so that's that's the primary
thing that we're doing right now is is
augmenting uh the engineers on the team
with kind of um superpowers to
understand our own code better and and
interact with it better uh over time I
expect that chip Nemo is going to uh do
other things as well um you know
improving the quality of our designs you
know our hopper gpus for example have a
lot of circuits in them that were
designed by AI that we built ourselves
uh that have better um speed um and
power and cost characteristics then uh
we knew how to build with any other tool
and generative AI uh programs designed
some of the chips yes hopper Hopper uh
is designed with generative AI that's
insane it's wild yeah so let's dream a
little bit um obviously we we know that
that like knowledge repositories inside
uh companies is something that this
stuff is going to be really good for um
maybe a little bit of like uh consumer
agents or consumer chat Bots like chat
GPT is this where it ends like where do
you see it going I don't think this is
where it ends um you know I've been
thinking recently about um past
Revolutions in the media space um you
know we we got books uh which
transformed Society you know when
because we could distribute ideas and we
could reference the same ideas in a new
way because everybody could read the
same book you know and audio as
soon as we had audio recordings that created
an entirely new industry you know
recorded music industry which uh
continues to be totally vibrant and
important to our culture um movies uh TV
you know um every and video games you
know every time that we come up with a
new
technology we find a way to explore
ideas as humans and explore our culture
together uh in in a way that like helps
us solve problems better and also
creates um a new form of culture that uh
that we interact with and I think that
um the most exciting applications for AI
are ones we haven't really even dreamed
up yet in in the same way that it would
be it would have been hard um to imagine
how books were going to change the world
uh back when Gutenberg first made the
Press uh you know uh I think uh uh AI is
going to create a new form of media that
is much more interesting much more
engaging much more useful um and
ultimately we're going to use that to
refine our ideas and explore them
together in the way that we have with
other other media it's just going to be
much more interesting and useful when I
hear you say media leads me to believe
that you think that this is going to be
more of like a an agent or or a digital
friend that people will start
interacting with right because that's
media unless there's something else
I could be thinking
about that could
take the form of media yeah something
along those those lines I mean I I think
you know I'm expecting AI is going to
change the lives of all of us here on
planet Earth and when I think about how
8 billion people on this planet live you
know most of us aren't reading and
writing that much you know um but we we
do love uh Virtual Worlds people love
interacting in video games we love
interacting with each other and I think
um that the primary way that people are
going to interact with AI is going to be
in Virtual Worlds um because I think
that's going to be the most natural way
of interaction in the most useful way
and I think we're going to perceive that
as a new form of media that that really
touches you know all aspects of of our
work and our play you know it's it's
going to be uh something new so you're
you're a real believer in this metaverse
Vision that you'll just kind of end up
in a digital world and the people and
the scenery will all be AI generated or
maybe mostly AI generated and and you go
um I think that people um
we we have a culture it's very important
to us you know the the the ideas that we
share together and the sort of shared
Humanity that we have is more important
to us than uh the content of uh the the
things that we're interacting with so
for example um AI is probably going to
be really awesome at playing soccer but
do I think that people are going to go
to watch Robots play soccer um even if
they're robots are kicking around the
ball better than humans I don't think
it's as interesting because I don't
think that it is related to us you know
I think the primary thing that we're
interested in is ourselves we're trying
to understand ourselves and and how we
relate to other people I think AI is
going to give us new ways of doing that
um I think we are going to be uh
interacting in in Virtual Worlds you
know Nvidia has been uh uh a big
believer in Virtual Worlds for you know
the past 30 years you've been on
gaming before you were on AI and
we've had this initiative called the
Omniverse long before Meta
renamed itself um because we believe
that uh simulating the world and
providing um you know virtual agents a
place to interact with people um is is
hugely important to the future of
technology and you know I I see these
things coming together I think um
there's a lot of opportunities to use uh
Virtual Worlds to make AI stronger to
teach AI how to understand the real
world and and act better in the real
world and then of course um uh giving
humans the opportunity to interact with
AIS in much more natural and useful ways
I think a lot of that's going to happen
in a virtual world and is that the next
place that we go with this like the
world models like we just saw meta put
something out where like it's uh
generative or it's not even generative
software but it's AI software can kind
of guess like what would happen if you
black out like a certain frame in a
video um is that is that where the next
stage of this goes I think that's going
to be really helpful um you know these
these sorts of world models um you know
I was really impressed with the uh open
AI Sora project uh this week as well I
mean really fantastic results and I'm
thinking about you know uh how these
things work together so you know the
Sora project if you read their post they
talk a lot about how building a world
model is going to help uh make
artificial intelligence more useful
because it it you know it it's going to
understand how things interact in the
real world and then it's going to be
able to use that to to make better
decisions in order to do things um so so
I think that's great um and I also think
it's the other way around which is also
really important that you know having a
world model then allows us to synthesize
a world which then uh allows Virtual
Worlds to be richer more interesting
more interactive and I think that's
hugely valuable and how do you train a
world model like with text I get text
right like you put the text in it spits
the text out but teaching AI a way
to understand what the world looks like
is completely
different yeah um usually the way these
work is that there's some sort of
implicit learned representation of the
world we can call it a state space but
it's basically like um uh we're trying
like imagine if you could write down uh
every attribute of every object in the
world like where it is how how it's
moving what color it is you know if if
you if you're able to write down uh very
precisely where every object is uh then
uh you would have a good way to draw the
world right because you could you could
take that representation and then just
turn it into a picture but then you
could also use that to simulate you
could ask a question like okay if I took
um this particular action uh you know
what would the updated State space look
like you know so like for example if I
if I swing the baseball bat when the
ball's right there and I hit the
ball you know then that's going to be a
different future than if I swing the
bat when the ball's not there
and I miss it right so
with a world model like this you can
kind of ask those questions and and then
uh simulate uh how things go forward in
time now one of the tricky things is
that um writing down very precisely all
of the state of the world is basically
impossible you know it's uh it's way too
complex um and you know this this is for
example well known in weather
forecasting right the idea that like a
butterfly flapping its wings in Japan
could you know magnify over the
course of time into like a hurricane on
the other side of the world right um
because very small um changes in the
state space of the world could actually
have pretty large uh outcomes and um so
so one of the ways that these learned
neural network models are dealing with
this is that rather than having an
explicit representation of everything in
the world uh it's all being done
implicitly so the model learns both a
function that's that's able to go from
uh some implicit representation of the
world to to drawing it and also from
that implicit representation into the
future and and then also um you know is
learning that representation from uh
from the data it's trained on directly
so it's all learned and that way we
don't have to actually try to um
describe the world to the model it it
learns how to describe the world itself
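[The structure Catanzaro describes here, a compact state, a transition function that rolls the state forward given an action, and a decoder that renders the state back into an observation, can be illustrated with a toy sketch. In a real world model all three pieces would be neural networks learned from data; below they are hand-written, and the bat-and-ball setup and all its numbers are invented purely for illustration.]

```python
# Toy sketch of the world-model idea: state -> transition -> decode.
# In a learned world model these functions would be neural networks
# trained from data; here they are hand-written so the structure shows.

def transition(state, action):
    """Roll the world forward one step given an action."""
    pos, vel = state                        # ball position and velocity
    if action == "swing" and abs(pos) < 0.5:
        vel = -vel                          # bat connects: ball reverses
    return (pos + vel, vel)

def decode(state):
    """Render the implicit state as an observation (just the position)."""
    return state[0]

def rollout(state, actions):
    """Simulate a sequence of actions and return the observed frames."""
    frames = []
    for action in actions:
        state = transition(state, action)
        frames.append(decode(state))
    return frames

# Ball starts at position 5 moving toward the plate at velocity -1.
# Swinging when the ball arrives produces a different future than waiting.
hit = rollout((5.0, -1.0), ["wait"] * 5 + ["swing"] + ["wait"] * 2)
miss = rollout((5.0, -1.0), ["wait"] * 8)
print(hit[-1], miss[-1])  # the two futures diverge after the swing
```

[Asking "what would happen if I took this action" is then just a rollout with a different action sequence, which is the kind of counterfactual question the state-space framing makes answerable.]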
and I'm going to regret asking this
question but how does it learn that
um uh it starts to get metaphysical for
me a little bit you know uh these these
models are trained um using stochastic
gradient descent so it's um what we're
trying to do is um you know uh fit the
data that we're given as best we can by
um taking a lot of really small steps to
improve the model so um you know it's
kind of like so gradient descent is is
kind of like um walking down a mountain
the idea is that the
fastest way to walk down the mountain is
just look where you are at any moment
and find the the direction that's
pointed downward the steepest and and
take a step in that direction now you
can tell that this
algorithm isn't very smart
right because if there's a canyon
depending on how big your step is you
might actually step over the canyon or
you might you might get caught you know
going back and forth when really you
should be going down the canyon and in
the spaces that we're optimizing with
you know let's say a trillion dimensions
right these kinds of effects
are really interesting
and difficult to understand difficult
to relate to but the thing that
makes this work is that we don't
actually have to be super precise about
how we update the model for every bit of
data it learns we just kind of do a
rough guess about like this data is
pointing the model um to fit in this
particular direction we'll take a little
step that way and then we're we're going
to do that a lot you know and and that's
what we call this stochastic part so
it's it's a little bit random um but uh
it's actually kind of a a great
Philosophy for life I think that um you
know you could spend an awful lot of
time trying to be very precise about
what direction to go to make things
better but often the right thing to do
is just make a guess take a step and
then re-evaluate what the best direction
is next and then do that a lot you know
just just be really iterative really
flexible don't be too wedded to you know
the idea that you have about where to go
at the beginning of the process just let
the process kind of guide you as you as
you uh walk through it and that's that's
the algorithms that we use um uh to
train all neural networks um you know so
the specific contents of what the
networks are learning are difficult to
interpret we don't have a lot of tools
that help us understand that uh in the
same way that we don't really understand
um how our own brains work you know we
we don't really understand all the
things that are happening inside of our
heads in order to allow us to think um
it's too complicated for our
analytics at the
moment the thing is it works yeah it works
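[The mountain-walking description of stochastic gradient descent above can be condensed into a few lines. This is a minimal sketch, not anything from Nvidia's stack: we fit a single weight w so that y is approximately w times x, and each step looks at one randomly chosen example rather than the whole dataset. The data, learning rate, and true slope of 3.0 are all invented for the example.]

```python
import random

random.seed(0)

# Synthetic data with a true slope of 3.0 plus a little noise.
data = [(x, 3.0 * x + random.gauss(0, 0.1)) for x in range(1, 21)]

w = 0.0      # start somewhere on the "mountain"
lr = 0.001   # step size

for _ in range(5000):
    x, y = random.choice(data)     # one example at a time: the stochastic part
    grad = 2 * (w * x - y) * x     # gradient of the error (w*x - y)**2 in w
    w -= lr * grad                 # small step in the downhill direction

print(round(w, 2))  # close to the true slope of 3.0
```

[Each individual step is a rough noisy guess, but repeated many times the guesses average out and w walks down to the bottom of the error surface, which is exactly the "take a step, then re-evaluate" loop described above.]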
and we didn't build the brain but we did
build these systems and it's working and
we don't know how that's happening which
is which is wild okay I want to talk
about uh reasoning I want to talk about
robotics potentially and um and a few
other ways about how companies are going
to work with Nvidia and what might be
coming down the pike so we're going to
do that right after the break coming up
right after this and we're back here on
big technology podcast with Bryan
Catanzaro he's the vice president of
Applied deep learning research at Nvidia
uh Bryan just to start off like what
made you think okay AI is going to be
big enough that I should go to Jensen
CEO of Nvidia and say we need to really
work hard to make this part of our core
offering I had been spending my um
research career at at Berkeley as a PhD
student on uh the future of computing
and um we knew that Computing was going
to have to change back in 2005 or so it
was obvious that computers would have to
be different the standard way of making
computers wasn't working anymore we
would have to be more specific we'd have
to be more parallel and so I had been
spending my time as a grad student
thinking about what kinds of
applications could take advantage of uh
the computers that will be possible to
build um but then are going to provide
enormous amounts of value to humanity
and at the time AI was not a very big
field um uh and it wasn't actually super
popular to work in it but when I was
thinking about it I felt like um from
first principles it made sense to me
that this was something that had the
potential uh to really change the world
and you know nvidia's approach to uh
solving this I think was also you know
fairly careful and iterative you know so
um you know I I published my first paper
in 2008 on um machine learning on the
the GPU uh and Nvidia uh really jumped
in full steam ahead uh for the whole
company to to become an AI company in
2013 so it took about 5 years of um sort
of testing that thesis like is AI
actually going to be something that that
could really change the world and we
started getting some early indicators of
success you know um one of those was of
course the ImageNet competition in
2012 um which really shocked the world
with the quality of results and
wouldn't have been possible without
accelerated Computing you know the
results that they got were so
incredible because they built a very
fast system for training neural nets and
that wasn't generative right that
was just identifying what was in photos
that's correct yeah it wasn't generative
at the time but you know um the the idea
of generative AI is is fairly old I mean
when I was a grad student generative AI
was um a thing that we talked about all
the time it's just that we weren't using
neural nets for it we were using other
models like graphical models these are
other mathematical approaches that um
are a little bit more clever um but
don't scale as well and um so this was
this is another part of the thesis that
that um I had is that you know the the
thing that's really going to help AI
succeed is um scale you know if we can
apply huge data sets and huge amounts of
compute to AI then the results are going
to get much better and this is this is
also controversial um back then and and
even today some people really don't like
this idea because uh they would like AI
progress to be mostly uh held back by
our smarts like our mathematical um
skills in in like coming up with more
clever models to describe our data in
the world um but it it does seem uh
these days that there's a lot of
evidence that the most important thing
is having really good data sets to learn
from and then enormous computational
scale and so that was my
thesis and you know I was advocating
for that at Nvidia I wrote this little
prototype of a library for training
neural nets on the GPU which then
became cuDNN which was our very first
library for AI on the GPU and um
you know the process of of uh getting
the company to Rally around that and and
and build that as a product and ship it
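As a rough illustration of the kind of neural-network primitive a library like cuDNN accelerates, here is a toy CPU version of a 2D convolution plus a ReLU activation in plain Python. This is only a sketch for intuition: the function names and shapes are invented here, and the real cuDNN is a C library operating on GPU tensors many orders of magnitude faster.

```python
# Toy CPU versions of two primitives that GPU libraries such as cuDNN
# accelerate: a 2D convolution and a ReLU activation. Illustrative only.

def conv2d(image, kernel):
    """Valid-mode 2D convolution (cross-correlation, as in deep learning)
    of a 2D list `image` with a 2D list `kernel`."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # Dot product of the kernel with one image patch.
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

def relu(feature_map):
    """Element-wise max(0, x) activation."""
    return [[max(0.0, x) for x in row] for row in feature_map]

# A tiny 3x3 input and a 2x2 edge-like kernel.
img = [[1.0, 2.0, 0.0],
       [0.0, 1.0, 2.0],
       [2.0, 0.0, 1.0]]
ker = [[1.0, -1.0],
       [-1.0, 1.0]]
fmap = relu(conv2d(img, ker))
print(fmap)  # → [[0.0, 3.0], [0.0, 0.0]]
```

The inner loops here are exactly the kind of regular, data-parallel arithmetic that maps well onto a GPU, which is why a specialized library can speed them up so dramatically.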
you know it took some time but
because there were these you know
early indicators of success and there
was a lot of demand picking up
even back then it made sense for
the company to really pay attention
and then you know
Jensen himself is such a Visionary I
remember when he first started
interacting with me about this back in
2012 I um I felt like uh he was just so
hungry to learn you know so I felt like
I I gave him all the things that I
learned from my PhD in the course of
like an hour about like um how AI could
change nvidia's business and what uh
Nvidia could potentially build and my
ambitions for what that meant were like
a thousand times smaller than Jensen's
were you know um he he took it
immediately and then elaborated on it
and thought about where is this going um
you know one of the things he first said
back in 2012 was um uh this is an
entirely new way of writing software
rather than having humans enumerate all
the different cases that software needs
to understand we're going to have models
that learn from from our data how to
solve problems and um these days that
sounds like the truth right like we we
see that happening every day when we
interact with these models but you know
uh 12 years ago uh that was a pretty
bold thing to say and I was a little bit
nervous about it because um you know the
history of AI uh over the past you know
70 years had been one of over-promising
and under-delivering in a lot of ways
which then led to a lot of booms and busts
yeah a lot of like thinking it can do
something and then it just totally dries
up until it starts to prove itself
again and so when Jensen like
immediately glommed onto this and
started like um uh thinking about what
it what it could mean I wanted to slow
him down a little bit and be like Jensen
like this is a this is a big huge idea
but like I'm not sure if it's going to
happen now it might be 30 years from now
you know um but uh it turns out that
Jensen was right about this uh this was
the right time um to apply enormous data
and enormous compute to Ai and get these
results right but 2012 wasn't it I mean it
took another 10 years 11 years really
for the boom to come so what did it feel
like yeah go ahead oh I was gonna say I
think Nvidia is uh really good at decade
long technology Development I've seen
that happen at Nvidia many times you
know uh ray tracing I was in meetings in
2008 with Jensen on ray tracing and we
launched our first ray tracing GPU in
2018 you know it took 10 years of
continuous development and research in
order to make ray-traced virtual worlds
a reality and CUDA itself you know uh
the projects that led to CUDA
they started in the early 2000s CUDA was
released as a beta in 2006 this is the
software that all AI programming is done
with on the H100s pretty much and A100s
yeah CUDA is our
framework for programming the GPU and
making it do stuff that's interesting
and um and uh you know that that project
um was crazy for a long time you know
Wall Street hated it because it
subtracted value from our earnings
reports like they looked at the costs of
our products and they're like these
products are too expensive your margins
are too low you know back then the
margins were quite low and that's
because um you know there wasn't the
applications and the ecosystem that were
using Cuda yet in order for us to uh you
know build a strong business around it
but uh Nvidia continued investing in
Cuda in uh in the the libraries the
software the compilers the Frameworks
and of course also the chips uh for 10
years you know actually maybe more than
10 years before uh all of a sudden Cuda
became an overnight success you know
it's like 10 years of hard work that
everyone ignored and Wall Street
criticized Nvidia mercilessly for why
are you wasting your time on this you
know everybody knows the GPU is just for
gamers why are you trying to make the
GPU do something else and you know we
did it anyway and you know that's one of
the things I love about uh this company
I think it's one of the reasons why
we're successful at The Accelerated
Computing mission is that when we decide
to do something we do it out of our
convictions about how technology will
unfold and we base those convictions on
a speed-of-light analysis about what's
actually possible um to try to
keep ourselves honest and then you know
once we have that conviction we're
able to follow through what did you see
in those years that everybody gave up on
this I mean obviously there were big
advances that were made in things like
machine learning right that computer
vision and natural language processing
and that's where we had Facebook really
take the lead as the public spokesperson
for this stuff talking about uh image
recognition and they even built this
fake generative chatbot called M that I
had access to that basically would be
like it's supposed to be a large
language model we didn't even know what
a large language model was going to be
it was pre-Transformer
right but like you would talk to this
bot and it would talk back and they were
trying to figure out like what people
were interested in if they're going to
build a bot and they had this whole bot
platform that came out but overall like
everyone's telling you yeah this is not
worth building I mean it's maybe just
one or two companies that are using it
so why did you still think that I mean I
guess it's hard to predict what happened
next but why did you believe that that
was going to happen Nvidia really thinks
about about these problems from first
principles you know we know that um the
way that computers are built is changing
we know that because you know Moore's
law is slowing down that requires
more
specialization um we know that uh
there's a lot of opportunities to really
provide
transformational uh speedups to
important workloads if we specialize the
systems and the software for them and uh
we felt like what is more important than
this you know what's more important than
intelligence and does the world need
more intelligence absolutely the world
needs enormous amounts of intelligence
like the problems that we face um as a
planet uh I think we're going to need a
lot of intelligence to work through them
and so uh for us it was I think just
kind of an obvious thing to do um we had
a lot of conviction we we understood the
technology we also saw early indicators
of success from a lot of different um
directions you know a lot of different
companies a lot of research institutions
that um were talking to us and saying
hey we um we have these goals to like
train this huge language model like on
enormous amounts of text but you know
the the current systems are just too
slow and um you know there's this idea
you know back 10 years ago there was
this idea that unsupervised learning was
going to change the world but nobody
knew how you know unsupervised learning
meaning that rather than having humans
go in and label every picture is it a
cat is it a dog that's supervised
learning we're just going to show the
model all the pictures that we can find
and the model is going to learn for itself
something about pictures that then we
can use to solve problems you know that
idea has been around for decades um but
actually turning it into something that
worked uh you know that's only happened
over the past 10 years and I think it's
only happened because of um you know the
increases in scale that we've been able
to bring to the problem so during that
10 years you know we saw continuous
Improvement um even if the rest of the
world didn't see it I think one of
the things about technology
when it's growing on an exponential
curve is that the beginning of it
feels like nothing's happening outside
you know so exponential curves that
hockey stick kind of curve it looks
like nothing nothing nothing all of a
sudden huge success you know that's
that's kind of what the exponential
curve looks like but the interesting
thing about an exponential curve is that
the rate of progress is constant
you know it's always getting you know
let's say 10% better like every
year it's just 10% better right and so
so you can tell that you can see like
wow this technology it's continuing to
improve even if uh it's not reached the
point where it's useful for for the
world yet uh we we just have this
confidence that it would so people had
these basically large you know swaths of
text and they were like we want to build
something like a large language model
but it just wasn't available yet um did
you guys notice when uh in 2017 the
paper Attention Is All You Need comes
out from Google which is like the basis
so what was the reaction internally
because even within Google I'll say this
I've spoken with people within Google it
was not a yawn but it was like ah okay
not a like holy crap moment but I'm
curious what happened within Nvidia cuz
it's sort of you know your bread and
butter absolutely and that that paper
caught our attention immediately because
of the implications for our entire
business so you know I I told you
earlier accelerated Computing is not
about the chips um and this is a great
example of that like if we built a
system that is for let's say ResNet-50
which in 2017 was the most
widely talked about kind of
neural network these image
classification networks if we built
systems to accelerate that that would be
a really different kind of system than a
system designed to accelerate
Transformers and so we have to ask
ourselves this question you know what's
going to be the future what's going to
drive demand how are we going to build
the right technology to accelerate the
things that will matter a few years from
now and so of course we're always asking
ourselves that question you know is
there something coming along that's
going to um uh change the way that
people build Ai and if there is then we
need to think about what are the
implications for the systems that we're
building um so yeah we saw that paper
I have to say that the title is maybe a
little bit of a hard pill to swallow
because you know attention is all you
need it's like but is it you know it
kind of elicits that
reaction from a lot of people but the
thing that was really attractive about
um Transformers to us was that we knew
that they had really favorable
computational properties and um again
going back to this thesis that the model
is less
important than the data and the compute
that goes into training the model if you
have a a model that has really excellent
um uh compute properties that allows you
to scale really efficiently to
you know many thousands of GPUs the kinds
of results you can get from that um are
pretty pretty spectacular so we we saw
that early on that's what the
Transformer model did that's what this
paper attention is all you need sort of
architected absolutely and and so we saw
that it had the potential to do that and
so we were very curious about it and you
know um uh in my team we had our own
language models team back in
2017 and at the time we were using
recurrent neural networks which were
the standard way of doing things before
the Transformer paper came out and um
and so I I asked an intern uh hey can
you uh take a look at doing language
modeling with Transformers I'm hearing
good things about it uh it would be
great for us to have an independent
perspective on whether this is a good
idea and he came back uh you know a
month or two later with just really
astonishing results you know
there was no question that it was better
uh than the models that we were using
and also that it was more scalable so we
were able to to train bigger and smarter
models because of that scalability and
so um so that was really important um
for for us and then you know the whole
company kind of paid attention to the
way that Transformers were changing Ai
and and then started you know Building
Systems to help uh make that even better
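The core computation of that paper, scaled dot-product attention, can be sketched in a few lines of plain Python. This is a hand-rolled illustration of the published formula softmax(QKᵀ/√d)·V with made-up toy inputs, not NVIDIA's or Google's implementation; its uniformity (matrix multiplies plus a softmax) is exactly the "favorable computational property" that makes it scale so well on GPUs.

```python
# Minimal scaled dot-product attention -- the core operation of the
# Transformer -- in plain Python. Illustrative sketch only.
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Q, K: lists of seq_len vectors of size d; V: seq_len vectors of
    size d_v. Returns one output vector per query."""
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)  # attention weights, sum to 1
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Toy example: two queries, each most similar to a different key.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))
```

Because every query attends to every key with the same dense arithmetic, the whole thing batches into a handful of large matrix multiplications, unlike the step-by-step recurrence of the RNNs it replaced.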
should Google have open sourced it I
mean they haven't gotten the most value
out of it you know others have gotten
more value out of that
paper I can't really speculate on uh
Google's business or or you know whether
they should or shouldn't have done
things I think um
uh if Google had not open- sourced that
or had not uh published that paper um uh
but if we started seeing like incredible
language modeling results um uh we would
have figured out some sort of a model
that had good scalable properties that
um uh that could help with this um space
and you know there's not just one
Transformer variation I think ultimately
the community would have figured
something out because it's so important
you know yeah I think Google deserves a ton
of credit for doing that work first and
for publishing that paper so you're
basically building you know for this
world of AI you see the Transformer
model come out you shift you incorporate
it you start to see the GPTs from
OpenAI is that the next big moment on
this journey where you're like oh this
could be this could be because it was interesting
speaking again about like what people
saw from the outside we all knew that
you know OpenAI was doing text
generation but it didn't really click
for most people until it became a
chatbot so what did it look like for you
when it was just like you know you've
been watching this the whole time what
did it look like on your end yeah well
you know I'd been watching OpenAI's
work in language modeling uh since
before GPT I don't know if you recall
they had this sentiment neuron project
um which I thought was really cool
because it was doing
unsupervised uh modeling of text and
then they were able to find that um just
by showing the model a lot of text that
all of a sudden the model had started to
understand high level Concepts about
texts like for example what kind of
emotion is being expressed inside of
this text and that was a really
interesting thing because like I said
unsupervised learning the idea had been
around for a long time that um we would
make a lot more progress as a field if
we were able to do unsupervised learning
but um actually figuring out how to
practically get some value out of just
showing a lot of data to a model uh it
wasn't very obvious to everybody and uh
so when I saw that um that unsupervised
um sentiment neuron project from openai
I thought that was really interesting
and they followed that up with
the first GPT paper which kind of
applied Transformers to this and in the
process you know made a much better
sort of text analytics model the first
GPT-1 you know it was really kind
of using a generative model more for
classification than for generation it
was more like you know can we use a
generative uh pre-trained model to
understand text rather than can we use
it to create text because at the time uh
creating text seemed too hard and then
of course um you know GPT-2 came out and
had uh really astonishing text
generation capabilities and not just
that but also already had started to
learn things about the world that um
were very difficult to teach any AI
system before remember they have this
story about unicorns and South America
being studied by some University
professor and the model
could remember that like in South
America people speak Spanish and you
know there's a country in South America
called Peru and like there's mountains
in that country and you know it's like
wow the the amount of facts that this
model is able to recall after only being
trained on enormous amount of text it
was really shocking right what do you
think when people say that it's just
these models just predict the next word
don't get too excited about it I mean
the what the what you're describing it
seems like something
more yeah I mean it it's always possible
to to get very reductive with systems I
mean you could say that um I'm just meat
right I'm just I'm just a monkey made of
meat and like you know everything that's
happening in my head is also Just Energy
minimization like there's chemistry
happening in my head it's the equivalent
of telling me that yeah love is not love
it's just a chemical I think you're
totally right yes it's a chemical but
also there's something more here so
you're saying the same with LLMs yeah I mean I
think I mean so so the fact that
chemistry is involved uh in our own
Consciousness doesn't make our
Consciousness less interesting to me the
fact that like you know neural networks are
trained to predict the next word and and
you know and that may not be like the
ultimate Way of training them you know
we're learning how to do this right so
maybe maybe we'll come up with a better
way tomorrow I'm not attached to that
particular way but I also don't think
that understanding a little bit of how
something works takes away from the
magic right okay so we have just a few
minutes left I want to ask you a couple
more questions uh chat GPT when that
comes out I mean obviously you had
already been pretty impressed by GPT-1
and 2 we're already at GPT-3 and 3.5
right by the time ChatGPT comes
out in November 2022 and then this stuff
explodes your reaction like what what
was it like sitting where you were it
was just extraordinary I mean the amount
of change that chat GPT brought to the
world uh incredible I thought
it was kind of cheeky of OpenAI to
release it at the same time as the NeurIPS
conference because um you know usually
the AI world is entirely focused on like
the cool papers that are coming out at
the conference but instead the entire
world was focused on this chatbot you
know that was doing things that you know
no one had ever seen a chatbot do before
and uh uh you know to me that was a
statement that we were entering a new
era of AI where applied research um uh
starts to dominate you know so ChatGPT
didn't come out with a fully-fledged
academic paper that described exactly
what they did to make it so awesome um
but because the results were so
strong it kind of dominated the the
academic discussion and um I felt like
that that was really interesting in
terms of a watershed
moment for sort of the maturity of
the AI industry you know that that it
was now possible um to create systems
that would solve problems in ways that
we'd never seen before um uh if we we
applied some really good engineering and
and applied research to it um and so
that um you know definitely definitely
changed the world and and since then you
know uh my world has been just
continuously on fire you know every day
I open my email there's a new awesome
result it's really exciting times and
working at Nvidia one of my favorite
things about working at Nvidia is that
we get to um collaborate with people
from all sorts of companies and
institutions and we get to sort of
rejoice in the good work that's
happening around the industry because um
at the at the end of the day you know
it's it's really exciting to see AI
flourish that's our mission is is to
make AI flourish everywhere and um and
so so when I open my email and see all
these great results uh it it always
makes me happy do you think that we're
going to get to artificial intelligence
that's on par or greater than human
level
intelligence I don't really like that
question because I don't know what human
intelligence really is um for example I
think that Cardi B is extremely
intelligent um she is able to capture
the attention of hundreds of millions of
people by doing things that I'm not
exactly sure why they're so interesting
but they totally are right there's a lot
of people that would love to do that but
don't have the kind of intelligence that
she does in order to to make that work
what is Cardi B's SAT score I have no
idea it's not very interesting to me oh
yeah there's book smarts and emotional
smarts and other forms of brilliance
there's eight billion forms of
brilliance on this planet the thing
though is these models are getting good at
everything right they're making
music they're writing books they're
making videos so there's a world there
where you could say it can approximate
there's a chance I mean getting just to
the baseline of human intelligence is
one thing but there's a chance that this
stuff can maybe even exceed some of our
most talented people across all spectrums well
it you know um AI has been smarter than
humans at at many things for a long time
I mean when I was in high school Deep
Blue beat Garry Kasparov at chess right
did that mean that humans stopped
playing chess no actually it changed the
way humans played chess it made humans
play chess better because humans had new
tools to learn they had AI to help them
learn how to play chess and the reason
we play chess isn't to win you know we
play chess because it's part of our
culture because it's interesting because
we like the challenge because we like
the interaction you know that because
it's it's what we're doing as humans is
exploring you know what what does it
mean to exist um I don't think that AI
challenges that I you know I've I've
been in a lot of rooms with a lot of
smart people I don't think that it's
necessary for me to be the smartest
person in order to have value or to um
you know be interested or engaged in
something that's going on you don't want
to be the smartest person in the room
because you're not learning that way
right exactly so
I'm not I'm not threatened by this um AI
so my thesis is AI has always been
smarter than us at some things the
number of things that it's getting
better than us at is getting larger
but that doesn't threaten me I'm not
worried about being obsolete in the same
way that I don't think an oak tree is
obsolete what does it mean for a tree to
be obsolete like how do you measure the
worth of a tree like are we going to
just talk about how tall it is or like
you know how many leaves it has and
count them and say well this tree is
worth more cuz it has more leaves than
that other one it's just not a very
interesting question to me well it's so
interesting that you're going straight
to obsolescence where like some might say
this is actually you know if we if AI
equals human intelligence it's not a bad
thing like maybe there's actually like
yeah yeah it becomes a tool for us I I
think it is a tool for us um but it is
interesting it's the way that it's
portrayed will take these conversations
often to the obsolescence part I don't I
don't really fear that either I don't
either you know one person that I I
really love his thoughts on this is
Jürgen Schmidhuber and he has said
multiple times that a truly intelligent
AI is going to be first of all um not
very interested in living on the surface
of planet Earth because it can beam
itself over the radio at the speed of
light anywhere um and uh it can live
underground in fact underground in
different places is better there's more
resources outside of the crust of planet
Earth where we live and so um I I think
that um you know we don't we don't have
a lot to fear I think the the scariest
thing for me is um you know are we are
we going to um you know not figure out
how to use this technology um because I
think we desperately need it I think our
world desperately needs more
intelligence and and so that's our
mission yeah I've been emailing with
Jürgen trying to get him on the show so
you're reminding me to follow up here
I don't know maybe you can help me
put in a good
word Brian great speaking with you
thanks so much for joining great great
to have you on the show Alex all right
everybody thanks so much for listening
and we'll see you next time on big
technology
podcast