AWS CEO Matt Garman on Amazon's Big AI Chips Bet, Working With OpenAI, and Nuclear Energy

Channel: Alex Kantrowitz

Published at: 2024-12-05

YouTube video id: FDUDVvLbt9E

Source: https://www.youtube.com/watch?v=FDUDVvLbt9E

Welcome to Big Technology Podcast, a
show for cool-headed, nuanced conversation
of the tech world and beyond. We are
here in Las Vegas, Nevada at Amazon's
re:Invent conference with the CEO of AWS,
Matt Garman. Matt, so great to see you.
Welcome to the show.
Yeah, thank you for having me.
Great to be here. Great to be here with
you at your event. Let's talk a little
bit about infrastructure. You're the
kings of building data centers, right?
There's no one that does it better than
AWS,
but there are headlines in the AI world.
Elon Musk took 122 days to build
Colossus, a 100,000-plus GPU data center
used to train his latest AI models.
Does this show that scaling data centers
is now core to competing in AI? Is it
validation? What do you think about it?
Well, look, we've been building data
centers for almost two decades now. This
is something we spend a lot of time on,
and it's less something that we're out
there bragging to the press about.
You can brag. What's the size of yours?
Well, what we do is provide infinite
scale for customers. Our goal is largely
for customers not to have to think about
these things, right? We want them,
across their compute, across their
storage, across their databases, to be
able to scale to any size. And so take
something like S3 as an example. It's an
incredibly complex, very detailed system
that keeps your data durable and scales
infinitely, right? And customers largely
just put data in there and don't have to
think about it. Today, S3 actually
stores 400 trillion objects, which is an
enormous number that's hard to even get
your head around, but it's just
something where we keep scaling and we
keep growing for our customers. As you
think about AI
now, these are power-hungry, massive
data centers, for sure, and AWS is
adding tons and tons of compute all the
time for our customers. What we think
about, though, is less how fast you can
build one particular cluster, since the
absolute size of AWS dwarfs any other
particular cluster out there. We're
focused on how we deliver the compute
that customers need to go build their
applications. Take somebody like
Anthropic as an example. Anthropic has
what are widely considered to be the
most powerful AI models out there today
in their Claude set of models. We're
building together with them what we call
Project Rainier, using our
next-generation Trainium 2 chips. This
cluster that we're building for them in
2025 will be five times the size, in the
number of exaflops, of what they used to
train the current generation of models,
which are by far the most powerful ones
out there. It's going to be five times
that size next year, all built on
Trainium 2, delivering hundreds of
thousands of chips in a single cluster
for them so they can train the next
generation. That's the type of thing
where we work with customers, understand
what's interesting for them, and then
help them scale to whatever level they
need. And that's just one of our
customers; of course, we have hundreds
and hundreds of other customers as well.
Here's my point. You're so good at this,
right? Look at what you just talked
about with Anthropic, being able to help
them scale the way that you do. And that
would lead me to believe that Amazon
would have its own cutting-edge,
state-of-the-art model, one that would
lead, you know, and be better than the
OpenAIs and the Anthropics. This is your
core competency, and this is what makes
these models run. So why hasn't that
happened?
Our core competency is delivering
compute power for all of the people that
need it. For a long time we've been very
focused on how we build the capabilities
to let our customers build whatever they
want, and sometimes there are areas that
Amazon also builds in, and other times
there are not. Whether you think about
the database world, the storage world,
the data analytics world, or the ML
world, we build this underlying compute
platform that everybody can go build
upon. Sometimes we build services that
compete with others out there in the
market. Think about Redshift competing
with Snowflake, who's also a very
important partner of ours and a big
customer of ours, somebody we do a lot
of partnering with. And then there are
other times where there are applications
that people build on top of AWS that
Amazon doesn't go and build. So we
operate across that whole swath, and
sometimes we'll build and sometimes we
won't. But that's kind of the beauty of
AWS: our goal is to build that
infrastructure so that, whether or not
we build those things ourselves,
everybody can go build the broadest set
of applications possible on this
platform.
But I'm thinking about it for AI
specifically and in the world that you
play in. You have Google, they have
their own model, they sell cloud
services. You have Microsoft, okay, they
don't have their own models necessarily,
but they have this deal with OpenAI.
Pretty sure that OpenAI is exclusive to
Azure. Now this is where a lot of
growth is coming from and so
And I think it's a mistake, actually.
The interesting thing there is, and this
is where a lot of people started, I just
think it's fundamentally the wrong way
of thinking about it. A lot of times
people are thinking there's just going
to be this one model, and they want the
one model that's going to be the most
powerful, the one model to rule them
all. As you've seen over the last year,
there isn't one model that's the best at
everything. There are models that are
really good at reasoning. There are
models that provide open weights, so
that people can bring their own data and
fine-tune them and distill them and
create completely new models from that,
completely custom for customers. For
that, you may want to use a Llama model
or you may want to use a Mistral model.
There are customers who really want to
build the world's best images, and they
might use something like a Stability
model or a Titan model. There are
customers that need really complex
reasoning, and they might use an
Anthropic model. There's a whole ton of
these out there, and our goal is to help
customers use the very best. It doesn't
have to be one
thing. It's not just one. And we don't
think that there's one best database. We
don't think there's one best compute
platform or processor. We don't think
that there's one best model. It's across
that whole set. That's been our
strategy, and customers have really
embraced it as they think about how they
go build production applications. They
want
the stability, the operational
excellence, and the security that they
get with AWS. But they also want that
choice. It's incredibly important for
them. And I think choice is important
for customers. No matter if they're
building generative AI applications, no
matter if they're picking a database
offering, no matter if they're picking a
compute platform, they want that choice.
And that is something that AWS has from
the very earliest days really leaned
into. And I think it's an important part
of our strategy. And it's maybe not the
strategy that others have. Maybe others
say it's just this one model, and this
is the one we're going to lean into. But
that's not the strategy we've picked;
ours is built around choice. It's part
of why we have the broadest partner
ecosystem as well. It's why, as you walk
the halls here at re:Invent, they're
filled with partners who are building
their businesses on top of AWS. We're
leaning in and helping our joint
customers accelerate their journeys to
the cloud, their modernization efforts,
their AI efforts. I think that's a lot
of what makes AWS special.
Okay. And I'm going to move off this in
a moment, but the reason I'm asking
these questions is because you do have
at least a bet that big foundational
models are going to matter. That's the
$4 billion you just invested in
Anthropic. And I think that the strategy
that AWS has makes a lot of sense,
right? This Bedrock strategy: there are
a lot of different models in there,
people have their data in the cloud, and
they're going to build with the data
they have within AWS using Bedrock,
picking models.
But you're also limited by the fact
that OpenAI is not there. I don't think
Google's there. So wouldn't it make
sense in parallel to the bring your own
model strategy to also use this capacity
that you have to scale infrastructure to
get in the game yourself?
Look, what I'll say is never say never,
right? It's an interesting idea, and we
never close any doors. I think we're
always open to, frankly, a whole host of
things. We're always open to having
OpenAI be available in AWS someday, or
having Gemini models be available in AWS
someday. And maybe someday we will spend
more time focused on our own models, for
sure. I think all of that is open. Part
of what makes AWS special is that we're
always open. You know, take our
announcement earlier this year about
partnering deeply with Oracle, making
Oracle databases available in AWS. Lots
of people would have said, "Oh, that's
never going to happen, and it's against
your strategy." Our strategy is to
embrace all technologies, because
anything that customers can use, we want
to be available inside of AWS. And look,
sometimes it happens today, sometimes
tomorrow, sometimes weeks from now,
months from now, years from now. But our
goal is to make all of those
technologies available for our
customers.
Okay. I'm going to parse your language a
little bit, because you said that you
might be open to having OpenAI on
Bedrock within AWS. Are you talking to
them? Would you want to ask them to
come?
There's nothing to announce there
today, but I'm saying if customers
want that, that's something that we
would want and we'd love to make it
happen at some point.
Okay.
Yeah.
Well, maybe they're listening and they
want to make that move.
Yeah. But let's speak to the one that I
think is the biggest challenger to them,
the one that you have all this money in,
which is Anthropic.
Yeah.
So, what does the $4 billion that you
just invested in Anthropic get you?
Yeah.
And how does that make you
differentiated from other cloud
providers?
Well, there are a couple of things I
would say. One is, we make the
investments in Anthropic because we
think it's a good bet. They have a very
good team, they've made some incredible
traction in the market, and we really
like where they're innovating. So, you
know, we thought that's a good
investment.
And we're predominantly Claude-heads on
the show. It's a fantastic product.
Right, and Dario and his team are very
good, and they continue to attract some
of the best talent out there in the
market today. The other thing that we
get from that is a deep collaboration on
Trainium. We've made a big bet on
Trainium as an additional option for
customers. You know, the vast majority
of
We should define it. It's the chips that
Oh, sorry. Go ahead. I mean, I'll just
say: the chips that companies can use to
train their own models with.
That's right.
At AWS.
That's right. And so today, the vast
majority of AI processing, whether it's
inference or training, is done on NVIDIA
GPUs. We're a huge partner of NVIDIA,
and we will be for a really long time.
They make fantastic products, by the
way, and they continue to do that. And
when the Blackwell chips come out, I
think people are going to be very
excited about that next-generation
platform. But we also think that
customers want choice, and we've seen
that time and time again. We've done it
with general-purpose processors: we have
our own custom general-purpose processor
called Graviton. And so we actually went
and built our own AI chips as well.
They're called Trainium; we launched
Trainium 1 a couple of years ago, in
2022, and we're just announcing Trainium
2 here at re:Invent.
So that's news that's happening this
week.
We announced that in my keynote.
Yes.
We're filming this, so you may have
already had it. We're going to release
the transcript as your keynote hits, and
the podcast the day later.
Great. But this is brand-new news. Fresh
off the press, folks.
Fresh off the press. And so we'll have
Trainium 2, and Trainium 2 gives really
differentiated performance. We see 30 to
40% price-performance gains versus our
instances that are GPU-powered today. So
we're very excited about Trainium 2, and
customers are really excited about it.
What Anthropic gives us, back to your
question, is a leading frontier-model
provider that can work really deeply
with us to build the very largest
clusters that have ever been built with
this new technology, where we can learn
from them, right? Learn what's working,
what's not, and what are the things you
need accelerated, so that Trainium 3 and
Trainium 4 and Trainium 5 and Trainium 6
can all get better as we continue to go,
and so the software associated with the
accelerators gets better as well. I
think that's one of the places where
people who have tried to build
accelerated platforms before have fallen
down: the software support has not been
as good as NVIDIA's, whose software
support is fantastic. So that's a big
area where they're helping us, as we
help iron out the kinks and figure out
how we make sure that developers can
start to use these Trainium 2 chips in a
very seamless, high-performance way. So
we learn a lot from them as big users
that are really leaning in and helping
us learn, and they get benefits back:
they get that scale and the cost benefit
of running on this price-performance
platform, which gives them a huge win.
And we think that from that investment
we can both benefit as they deliver
better and better models over time.
There's an interesting thing that
happens when I speak with people who are
working in cloud, or working to train
models, or working to build their own
chips. There's always a preface: we love
working with NVIDIA, and we're also
building chips that compete with what
they do. So, how does that relationship
work out? They don't get upset that
you're trying to build the same thing? I
mean, they have a supply issue, but how
does it work with them?
Oh, no, I have a great relationship
with NVIDIA and Jensen, and this is a
thing that we've done before. We have a
fantastic relationship with Intel and
AMD, and we produce our own
general-purpose processors. It's a big
world out there, and there's a lot of
market for lots of different use cases;
it's not like one is going to be the
winner, right? There are going to be use
cases where people are going to want to
use GPUs. There are going to be use
cases where people are going to find
Trainium to be the best choice. There
are use cases where people find that our
Intel instances are the best choice for
them. There are ones where they find
that AMD instances are the best choice
for them. And there's an increasingly
large set where they find Graviton,
which is our purpose-built
general-purpose processor, is the right
fit for them. That doesn't mean we don't
have great relationships with Intel and
AMD. And it means we'll continue to have
a great relationship with NVIDIA,
because for them and for us it's
incredibly important for NVIDIA
GPU-powered instances to perform great
on AWS. And so we are doubling down on
our investment to make sure that NVIDIA
performs outstandingly in AWS. We want
it to be the best place for people to
run GPU-based workloads, and I expect it
will continue to be for a really long
time.
What's the buying process like with
Nvidia? Because you want as many chips
as you can get, I would imagine.
You have Elon who buys them by the
truckload. You have Zuckerberg who has
been buying lots and I think he wants to
power them with a nuclear submarine or
something like that. So, do you have to
jostle with the other companies to get
Nvidia chips or do you get every exact
quantity you want?
NVIDIA is very fair about how they go
about it. I mean, you can ask them about
how they internally allocate; that's not
really a question for me, it's for them.
But they're very fair in dealing with
us. We give long-term forecasts, and
they tell us what they can supply. We
all know that there have been shortages
in the last couple of years,
specifically as demand has really ramped
up. And they've been great about
ensuring that we get enough to support
our joint customers as much as possible.
What about your inference chips,
Inferentia?
Yeah.
Because last time I heard you speak, you
said that the activity within AI right
now, gen AI, is 50% training, 50%
inference. Does that ratio still hold?
And how are you going to put the chips
out there to allow companies to be able
to do cheaper inference? Because that's
the issue with generative AI: it works
well, but it's so expensive that
companies take proofs of concept and
only one-fifth actually make it out into
production.
Yeah, it's absolutely the case. And I
think we're still probably seeing about
that 50/50 ratio, though more and more
of it is inference rather than training,
and increasingly we'll see more of the
workload shift that way. Cost is a super
important factor that many of our
customers are definitely worried about
and thinking about on a daily basis. If
you think about where a lot of people
were, they went and did a bunch of these
gen AI tests, right? They did proofs of
concept, and they launched hundreds of
proofs of concept across the enterprise
without really paying attention to what
the value was going to be or anything
like that. Now they're looking at them
and saying, well, the ROI is not really
there; they're not really integrated
into my production environment; they're
just kind of these POCs I'm not getting
a lot of value out of; and they're
expensive, as you mentioned. So two
things people are thinking about are:
one, how do I lower the cost so that
it's much cheaper to run, and that's
your point about the cost of inference;
and two, how do I actually get more
value out of it, so the ROI equation
completely shifts and it makes more
sense. And it turns out it's probably
not all hundred of those; it's probably
two or three or five of those that are
really valuable. And so there are a
couple of things.
On the cost side, there are a few
things we're doing to help people lower
costs. Number one, I think Trainium 2
will have a material impact there. And
as these models have gotten bigger and
bigger, you mentioned Inferentia.
Originally we had a small chip called
Inferentia that would run really fast,
lightweight inference. Now, as you're
running models that have billions, tens
of billions, hundreds of billions,
trillions of parameters, they're way too
big to fit on these small inference
chips. Effectively, they're running on
the same training chips; they're all the
exact same things. So you run inference
today on H100s or H200s, or you run
inference today on Trainium 2s or
Trainium 1s. We may come out over time
with other Inferentia chips, if you
will, but they're really using a lot of
that same architecture, and they're
still really large servers. And so we
actually expect that Trainium 2 is going
to be a fantastic inference platform.
Our naming doesn't necessarily always
suit what these chips are for, but it's
going to be a fantastic inference
platform. Think about that 30 to 40%
price-performance benefit that customers
are going to get: if you can run
inference 30 to 40% cheaper compared to
the leading GPU-based platforms, that's
a pretty big price decrease. And then
there's also something we announced at
re:Invent: we're launching automated
model distillation inside of Bedrock.
What that lets you do is take one of
these really large models that's really
good at answering questions, feed it all
your prompts for the specific use case
you're going to want, and it'll
automatically tune a smaller model based
on those outputs, kind of teaching the
smaller model to be an expert only in
the area that you want with regard to
reasoning and answering. So you can get
these smaller, cheaper models, say a
Llama 8B model as opposed to a Llama
405B model: cheaper to run, faster to
run, and you can still get it to be an
expert at the narrow use case you want
it to be. That, combined with cheaper
infrastructure, we think is one of the
things that is really going to help
people lower their costs and be able to
do more and more inference in
production.
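The distillation workflow Garman describes can be sketched in a few lines. This is a toy illustration of the idea only, not Bedrock's actual implementation: the teacher, student, and prompts below are all invented, and the "student" is just a lookup table standing in for a smaller fine-tuned model.

```python
# Toy sketch of model distillation: a large "teacher" labels a set of
# domain prompts, and a small "student" is fit to those labels.
# Everything here (teacher, student, prompts) is invented for
# illustration; Bedrock's managed feature distills real LLMs.

def teacher(prompt: str) -> str:
    """Stand-in for a large, expensive model (e.g. a 405B-parameter LLM)."""
    if "refund" in prompt:
        return "Refunds are issued within 14 days."
    if "shipping" in prompt:
        return "Standard shipping takes 3-5 business days."
    return "Please contact support."

# Step 1: feed the teacher the prompts for your narrow use case.
use_case_prompts = [
    "How do I get a refund?",
    "What is your shipping time?",
    "Can I change my password?",
]
labeled = [(p, teacher(p)) for p in use_case_prompts]

# Step 2: "train" a student on the teacher's outputs. Here the student
# is a simple lookup built from the labeled pairs; a real student would
# be a smaller LLM fine-tuned on this data.
student = {p: answer for p, answer in labeled}

def student_model(prompt: str) -> str:
    # Cheap and fast, but an expert only in the distilled domain.
    return student.get(prompt, "Out of distilled domain.")

print(student_model("How do I get a refund?"))   # agrees with the teacher
print(student_model("Explain quantum physics"))  # outside the narrow use case
```

The trade-off is exactly the one described in the interview: the student loses generality but keeps the teacher's behavior on the narrow use case, at a fraction of the inference cost.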
Yeah, those small models seem to be the
cost solution. Sounds like you're a
believer.
That's right.
Absolutely.
One more question about Nvidia.
You've tested the new Blackwell chip. Is
it the real deal?
We have, you know, they're working on
getting the yields up and getting it
into production, but we're excited about
it. Also at re:Invent we're going to
announce the P6, which is the
Blackwell-based instance that's coming
early next year, and we're excited about
that. We're expecting about two and a
half times the compute performance out
of a Blackwell chip that you get out of
an H100, and so that's a pretty big win
for customers.
So you're in with Jensen's "the more you
buy, the more you save."
That's right. You know, that team has
executed quite well, and they continue
to deliver huge improvements in
performance, and we're happy to make
those available for customers.
Okay. Should we talk about ROI?
Sure.
All right. Two-year anniversary of
ChatGPT.
Um all these companies have rushed to
put Generative AI in their products.
Yeah.
To this point, there's a couple of
things that I've heard that have worked
well.
Yeah.
AI for coding. Mhm.
Um,
AI that is a customer service chatbot
with a little more juice.
Yep.
AI that can read unstructured documents
and make a little sense of them.
Yep.
Those are the three big ones. I haven't
heard much more outside of that.
Yeah.
We're talking about something that's
potentially added trillions of dollars
to public company market caps, something
that has had the largest VC funding
round and then probably the subsequent
three after that.
Yeah.
Are the three examples that I listed
enough to make this worth the money?
No, definitely not. But they are
super valuable right now and they're
just the tip of the iceberg. And that's
the thing: you just have to look at the
rest of the iceberg to realize how big
the opportunity is. And on those three,
look, I think those are actually massive
opportunities by themselves. We have a
number of announcements here at
re:Invent around Q Developer, making
developers, across their whole life
cycle, more productive.
Think about the first generation, just
using this as an example. The first
generation of Q Developer was just code
suggestions, right? And code suggestions
are super valuable, actually; they made
developers much more efficient writing
all that code. But it turns out
developers on average code about one
hour a day. The rest of their day is
spent doing documentation, writing unit
tests, doing code reviews, going to
meetings, upgrading existing
applications, doing all that stuff that
isn't writing code.
Maybe some ping pong in there.
Yeah. And so, as part of that, we're
actually launching a bunch of new agents
that do all of those things for you. You
can just type, you know, /test, and
it'll actually automatically write unit
tests for you as you're sitting there
coding. You can have a Q Developer agent
write documentation for you as you're
writing code, so you can have really
well-documented code and you're done;
you don't have to go think about it.
It'll even do code reviews and look for
where you have risky parts of your code,
where you maybe have open source parts
that you should go look at and think
about what the licensing rules are, or
even, from a deployment standpoint,
where you may want to think about how
you're deploying stuff. Things that you
would expect from somebody doing a code
review for you before you go do a
deployment, Q can now do all that for
you. Same on the contact center side,
right? We're doing a ton of
announcements around Connect, which is
our contact-center-in-the-cloud
offering, making it much more efficient,
so customers can get a ton out of that
contact center, all powered by
generative AI. And to your point, those
use cases, I think, get more and more
valuable as you add more capabilities.
If you think about where things are
going, it's a lot more like what I
described with code generation: a bunch
of the value comes from adding agents
that can do a bunch of these things.
Right now it's not just giving you code
suggestions; it's actually going and
doing stuff for you. It's writing
documentation for you. It's helping you
identify and troubleshoot where you have
operations issues. It says, ooh, you
have an operations issue, and it can
look at and understand your whole
environment. You can interact with it,
and you and Q together can go and say,
ooh, it looks like some permissions over
here were broken, and if you go fix
those, maybe it's something it can fix
automatically, so it'll fix your
application. So it's saving tons of time
across that whole development life
cycle. And I think that's where, as AI
gets more integrated into the core of
what a business is, the core of what you
do, that's where you get the value.
There's a startup, well, there are a
number of these doing this, but there's
a startup that we work with called
EvolutionaryScale, and they use AI to
try to discover new proteins and
molecules that may be more applicable to
solving certain diseases, right? Now
think about AI not just generating
stuff, but actually sitting there, so
that instead of being able to find tens
or hundreds of new molecules a year, you
can now find hundreds of thousands of
different proteins, test all of them,
figure out which are the most likely to
be successful, and get drugs to market
much faster. And that's a huge amount of
additional revenue. So if you think
about models and capabilities that can
do that, whether it's in healthcare and
life sciences, whether it's in financial
services, whether it's in manufacturing
automation, every single industry, in
our view, is going to be completely
remade by generative AI at its core. And
that's where we think you get that huge
value. I
have a question about this. I was
speaking with a developer friend who
said yes, AI can code, AI can do all
these things, probably looking at the
different things that these agents can
do. The problem is, and this applies
probably across the board, when you
trust things to generative AI, something
breaks, and then you've lost the skill
set to go in and fix it, because you've
relied so much on the artificial
intelligence. What do you think about
that? Isn't that a problem?
What's 3 * 4?
12.
Yeah, you still have Excel, but you
still know how to multiply. Like, I
would say, maybe, but you know, you're
still able to.
This is different than multiplication.
It is different. Again, it's different,
but I think the key parts of coding are
not the semantics of writing the
language, right? The key parts of coding
are thinking about how you break down a
problem, how you creatively come up with
solutions, and I think that doesn't
change, right? The tools change. They
can make you more efficient, but the
core of what the developer actually does
is not going to change. You're going to
want to think about, you know, there
aren't a lot of developers today that
know a lot about garbage collection.
It's just true. They don't, right?
Because Java just does it for them, and
they just don't have to worry about it.
That doesn't mean that, all of a sudden,
if it breaks, people don't know how to
do garbage collection. They can go
figure it out and do it; they just don't
do it as part of their daily jobs,
because it's not fun and it's not
value-added, and they can focus more on
writing code. This is what new languages
have done. And so, increasingly, I think
developers are going to get to do the
things that are exciting. They're going
to do the creative work. They're going
to get to figure out how to go solve
those interesting problems, and they're
going to be able to move much faster,
because they don't have to worry about
writing documentation. And someday, if
it breaks, they probably will know how
to write documentation, and figuring out
how to fix it is not rocket science;
it's just things they don't necessarily
want to do.
Okay. So, you're a believer in
reasoning. I know that AWS has some news
this week too: you're going to have
automated reasoning checks that look for
hallucinations before an answer goes
out. Is this something that, because
another issue when it comes to ROI is,
again, how can I trust it? It always
comes out with wrong answers. So talk a
little bit about your announcement this
week and how reasoning can solve some of
these issues.
This is a different reasoning than you might be thinking about, too. So, automated reasoning is a form of artificial intelligence that has been around for a while, and it's a thing that Amazon has adopted pretty significantly across a number of different places. What it does is it uses mathematical proofs to prove that something is operating as you intended. Okay, that's the historical use. An example of that is we actually use it internally to make sure that our permissioning system, when you change permissions, is actually behaving as expected. And so this AI system has this mathematical proof that can go say, okay, across all the places that permissions are applied, a surface area that's too large for you to actually go check everything, it can prove that they're applied in the right way, because it knows how the system is supposed to operate. It can go kind of mathematically prove, yes, your IAM permissions mean you can access this bucket, or you can't access this bucket.
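The permissions example can be sketched in miniature. The sketch below is an assumption-laden toy, not AWS's actual system: it invents a three-flag access policy and "proves" a safety property by exhaustively checking every state, which is the degenerate, enumerable version of what real automated-reasoning tools do symbolically over state spaces far too large to enumerate.

```python
from itertools import product

# Toy access-control policy (hypothetical): a user may read a bucket only
# if they hold an explicit read grant and the bucket is not restricted,
# unless they are an admin.
def allowed(is_admin: bool, has_read_grant: bool, restricted: bool) -> bool:
    return is_admin or (has_read_grant and not restricted)

# Property to prove: no non-admin can ever read a restricted bucket.
def property_holds(is_admin: bool, has_read_grant: bool, restricted: bool) -> bool:
    if not is_admin and restricted:
        return not allowed(is_admin, has_read_grant, restricted)
    return True

# The state space here is tiny (2^3 combinations), so checking every one
# constitutes a proof rather than a spot test.
assert all(property_holds(*combo) for combo in product([False, True], repeat=3))
print("property proven over all", 2 ** 3, "states")
```

Real systems express the policy and the property in a logic an SMT or theorem-proving engine can reason about, which is what makes the "surface area too large to check" claim tractable.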
Um, we took that and we said, can we apply that to AI to eliminate hallucinations? And it turns out you can't do it universally, but for selected use cases where it's important that you get the answer right, you can. So say, as an example, you're an insurance company, right? And you want to be able to answer questions about people. They say, "Hey, I have this problem. Is it covered?" Right? You don't want to say yes when the answer is no, or vice versa. So that's one where it's pretty important to get it right. Okay? What you do is you upload all your policies and all your information into the system, and we'll automatically create these automated reasoning rules. And then there's a process you go through, kind of 10, 15, 20, 30 minutes, where you as the developer answer questions about how it's supposed to interact, right? You tune it a little bit. Say, "Yep, that's how you'd answer that type of question," or "No," or "That's what this means." It'll ask you questions, and you kind of interact with it, and then it goes, "Okay, now I have a tuned model." Now, if you go ask it a question, you say, "Hey, you know, I ran my car through my garage door. Is that covered by my insurance policy?" It'll go and actually produce a response for you, and then it'll tell you that yes, this is provably correct, the answer is yes, and here are the reasons why and the documentation I have and why I feel confident in that. Or it'll tell you, actually, I don't know the answer. Here are some suggested prompts that I recommend you put back into the engine to see if you can get the answer correct, because I can't tell you. I came up with yes, but I actually don't know for sure that it's the right answer. Change the prompts, and it'll give you kind of tips and hints on how you can re-engineer your prompts or ask additional questions, and come back until you get an answer that's provably correct by automated reasoning.
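The three possible verdicts in that flow can be sketched as follows. This is an illustration of the shape of the check, not the actual Bedrock Automated Reasoning checks API; the rule table, claim names, and function are all made up for the example.

```python
# Hypothetical policy knowledge, as it might look after the tuning step:
# claim_type -> covered? (absence means the policy documents don't settle it)
POLICY_RULES = {
    "garage_door_collision": True,
    "flood_damage": False,
}

def check_answer(claim_type: str, model_answer: bool) -> str:
    """Label a model's yes/no answer against the derived rules.

    Returns 'provably_correct', 'provably_wrong', or 'unknown'; 'unknown'
    is the case where the system would instead suggest re-engineered prompts.
    """
    ground_truth = POLICY_RULES.get(claim_type)
    if ground_truth is None:
        return "unknown"
    return "provably_correct" if model_answer == ground_truth else "provably_wrong"

print(check_answer("garage_door_collision", True))   # provably_correct
print(check_answer("garage_door_collision", False))  # provably_wrong
print(check_answer("roof_hail_damage", True))        # unknown -> re-prompt
```

The key design point from the interview survives even in this toy: the system never silently passes an unverifiable "yes" through; it either proves the answer against the rules or says it cannot.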
Right? So by this kind of mechanism, you're systematically able to mathematically prove that you got the right answer coming out of this, and completely eliminate hallucinations for that area. Right? It doesn't mean that we've eliminated hallucinations everywhere, just for that area.
Yeah, if you go ask it, you know, who's the best pitcher on the Mets, it may or may not answer your reasonable question.
Right. Maybe there is no correct answer to that one.
Although, pretty good season this year.
But let me ask you, what you're talking about is also very similar to what Marc Benioff talked about on the show last week, where he said that because companies have large stores of information within his platform, agents will be able to go in, pull it out, and then present it, and sort of help create a linkage to go from step A to step B. And it was interesting to me, because I had always thought agents were going to be something maybe built by Anthropic, where it's my individual agent that goes out into the world and does what I need. And I think both you and Benioff, correct me if I'm wrong, have this idea that the agent is going to be something that I'm going to interact with when I'm speaking with the company, or that's actually going to perform tasks at work. Maybe that's going to happen before consumers get them.
Yeah, I think that's right. I think that agents are going to be a really powerful tool. Actually, another thing that we're launching this week: you know, one of the things about agents today is they're quite good at doing relatively simple tasks, right? What they're very good at is tasks that are pretty well defined in a particular narrow slice, where they go accomplish something. And so what a lot of people are doing is starting to launch a bunch of agents, right? One that's very good at going and doing one particular task, another one that's good at another task, another one that's good at another task. But increasingly, you actually need those agents to interact with each other, right? So we have an example in my keynote where we talk about, if you're thinking about, should I launch a coffee shop? Say you're a global coffee chain, and you want to say, I'm going to launch a new location here. You might have an agent that goes out and investigates what the situation is at a particular location. You might have another agent that goes and looks at what the competitors are in that area. You may have another agent that goes and does a financial analysis of that particular area. Another one that looks at the demographics of that zone, etc. And that's great. So now you have half a dozen of these agents that go and do a bunch of these things. Saves you some time, but they actually kind of interact with each other, right? Like, the demographics may change your financial analysis, as an example. And so that's super hard. And then if you want to do it across 100 different locations to see where the best one is, that's also hard to do, and it's super hard to coordinate, because those also may be interrelated. Like, you know, putting a coffee shop here and then another one two blocks down may interact with each other. They can't be independent. So we launched a multi-agent collaboration capability where you basically have this kind of super agent brain that can actually help collaborate across all of them, break ties between them, and help pass data back and forth between them. And so we think this is going to be a really powerful way for people to accomplish much more complicated things out there in the world, where again, there's a foundation model under the covers that's driving a bunch of this reasoning, breaking these jobs into individual parts, and then the agents go and actually accomplish a bunch of this work.
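The supervisor pattern described above can be sketched in a few lines. This is a minimal sketch under invented assumptions (the agent names, the scoring formula, and the two candidate sites are all made up), not the actual AWS multi-agent collaboration API: narrow agents each do one well-defined task, and a supervisor passes their outputs between them and breaks the tie.

```python
# Each agent is narrow: one well-defined task over one slice of the problem.
def location_agent(site: str) -> dict:
    return {"foot_traffic": {"downtown": 9, "suburb": 4}[site]}

def competitor_agent(site: str) -> dict:
    return {"competitors": {"downtown": 6, "suburb": 1}[site]}

def finance_agent(site: str, context: dict) -> dict:
    # The interrelation from the interview: the financial analysis depends
    # on what the other agents found, so the supervisor hands it their data.
    score = context["foot_traffic"] * 2 - context["competitors"] * 3
    return {"finance_score": score}

def supervisor(sites: list) -> str:
    """Run the sub-agents per site, share data between them, break the tie."""
    results = {}
    for site in sites:
        ctx = {}
        ctx.update(location_agent(site))
        ctx.update(competitor_agent(site))
        ctx.update(finance_agent(site, ctx))
        results[site] = ctx["finance_score"]
    return max(results, key=results.get)  # pick the best-scoring location

print(supervisor(["downtown", "suburb"]))  # -> suburb
```

The design choice the feature addresses is visible even here: without the supervisor holding the shared context, the finance agent couldn't react to the demographics and competitor findings, and comparing 100 sites would mean coordinating the cross-talk by hand.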
Okay. I'm just going to say, before we go to break, I appreciate how much news you're weaving into this. This is the ultimate number of keynote announcements that have been introduced into a podcast. So thank you for that. All right.
Welcome to re:Invent.
Exactly. All right, we're going to take a quick break and come back with Matt Garman, the CEO of AWS.
And we're back here on Big Technology Podcast with Matt Garman of AWS. Let's talk about some broader, more earth-centric topics, starting with nuclear.
Yeah.
Uh, you have invested $500 million in a company called X-energy to do nuclear. Um, you're also part of, I would say, a wave of companies that are reanimating nuclear energy in the United States. And part of that is because these nuclear plants had excess capacity that they needed to get off their hands. Um, I just want to ask you a broad question about whether we should really believe in this moment for nuclear, because on one hand, it's, for the moment, cleaner than fossil fuels. On the other hand, we don't really know what happens with nuclear waste. We can't get rid of it. It has to sit in silos that could be damaging for the planet over time. So, and this is sort of a sensitive one, but is it really an improvement to go to nuclear? And how can we be sure, given the long-term effects here?
Yeah, look, I think nuclear is a fantastic option for clean energy. It is a carbon-zero energy source that has a ton of potential. As you look at the energy needs over the next couple of years, and really the next couple of decades, whether it's from technology or more broadly in the world, whether it's electric cars or just the general electrification of lots of things in our world, we're going to need a lot more energy. And, you know, we at Amazon are one of the biggest investors in renewable energy in the world. In the last five years, we've done over 500 renewable projects, where we've added and paid for new energy to the grid, whether they're solar or wind or others. And we'll continue to invest in those projects. I think they're super valuable. But there's probably not going to be enough of those soon enough for us to really get to where we want to get from a clean energy perspective. And so I think nuclear is a huge portion of that. Um, you know, look, there's always the fear-mongering from back in the 60s and 70s about what nuclear used to be. Nuclear is an incredibly safe technology today. It's much different today. Turns out technology has changed in the last 50 years. It's improved a lot. And so there is a ton of improvement in that space. And we think that it is both a very safe, very eco-friendly energy source that is going to be critical for our world as we keep ramping our energy needs. And we think that as part of that portfolio, right, you're going to have solar, you're going to have wind, you're going to have others, but nuclear is going to play an important role. And we're excited about what that potential looks like. You mentioned X-energy. We do think that, probably starting somewhere in 2030 and beyond, these small modular reactors, which is what X-energy builds, are going to be a huge component of this. And they'll be part of that portfolio of offerings. But today, all these nuclear plants that people build are really large implementations, right? They're billions and billions of dollars to go build, and they produce lots of energy, which is great. Um, but obviously all that energy is in one location, and then you have to invest in a ton of transmission to get the energy to the actual place you need it to go. And they're big projects. These small modular reactors are much smaller. You can actually produce them almost like you produce gas turbines, in a factory-type setting, eventually. And you can put them where you need them, right? So you can actually put them next to a data center, where transmission is not going to have to be an important factor. And so we think that's a great solve for a portion of the world's energy needs as we continue to evolve over time, and it's one of the components of an energy portfolio that we're very excited about.
Okay. So we'll be watching that closely.
Yeah.
On the state of the economy: AWS had a few quarters of stagnant growth. It was still impressive growth, but it flatlined for a moment, and part of that was because customers
Not quite flatlined, but it was down from where it had been.
Okay. But I'm just talking about the growth percentage.
Yeah. Yeah.
Um, part of that was because the economy was in a rough moment. Everybody was looking for efficiency. And so what you did was, I think, you made some deals with customers to help get their bills down, or help get them the most out of what they were doing, so they could effectively live that efficiency motto.
That's right.
Um, what does it look like right now? Is the economy back, or are people still in efficiency mode?
Yeah, I'd say. And by the way, it wasn't even just deals. We went and proactively jumped in with our customers and helped them figure out how they could reduce their bills. We looked at where they could consolidate resources, where they could move to cheaper offerings, where they could maybe do more with less. And we were really proactive about helping customers reduce those costs, because from our view, it was important for them as they thought about getting their economics in the right place, it was the right thing to do for them, and it built that long-term trust. Now, number one, a lot of customers have been optimized, right? And there's only so much you can squeeze out of an optimized place. Customers are still looking for optimizations, but a lot of that work has been done, and they're using some of that optimization to help fund some of the new development they want to do. A lot of that is in the area of AI. Much of that is in the area of migration and modernization, where they're moving from on-prem into a cloud world. And so some of those optimizations they did are helping them fund some of that work: moving more of their workloads to the cloud, and letting them go and build new AI experiences in AWS. And so that is where you've seen our growth start to come back up on a percentage basis. Some of that is customers leaning into those new experiences and doing some of those modernization migrations.
Okay, I want to wrap on a culture question.
Okay.
Andy Jassy recently emailed the company, and he said that as a consequence of scale, and I'm going to get it exactly as he said it, uh, he says there have been pre-meetings for the pre-meetings for the decision meetings. A longer line of managers feeling like they need to review a topic before it moves forward. Owners of initiatives feeling less like they should make recommendations because the decision will be made elsewhere. Uh, was that going on within AWS, and what is the process to change that?
Uh, yeah, I think it's across Amazon. So it wasn't specific to the rest of Amazon; it was definitely inside of AWS too. Uh, and you know, I think, look, it's kind of a natural evolution. We have these leadership principles inside of Amazon, and a couple of those that are important for us are things like being customer obsessed and really understanding the customer. And in order to really understand the customer, you've got to be close to the customer. The more layers you have, the more removed you are from customers, so a flatter organization keeps you closer. We just, as we were growing, went through an era of explosive growth in the number of people, the size of the company, and the size of the business. And throughout that, we just didn't always have the organizational structure exactly right. And so, you know, we believe that a flatter organizational structure is better. The closer you are to the customers, the better decisions you're going to make, and the faster decisions you're going to make. And you really want ownership to be pushed down to the people who are actually making some of those decisions. When you have a very hierarchical organization, where people don't feel like they have that ownership to make decisions, you go slow. And for us, speed really matters. And so I think Andy was just highlighting some observations we had. You know, I think he's incredibly thoughtful on these points, which I appreciate. We've had a lot of debate here, where nothing is broken, but you could see really early warning signs or stress around it. And for us, culture is so important, and doing things the right way, being customer obsessed, having the right level of ownership, is so important to what makes Amazon so special. And so we're getting ahead of there being any problems. It's not like there was any burning problem. And we obviously could have done nothing and kind of let things go for a while. But for us, it's not the Amazon way.
It's not the Amazon way.
And so we're just being proactive, identifying that, like, hey, look, this is super important for us. So let's just be aware of it. Let's be upfront about it, think about it, and be very intentional as we think about organizational structures and where we can land. And I think all of that has been received really well, because it turns out customers don't complain about many of those things. They are really focused on ownership. They love us being customer obsessed. And most of that has been quite well received.
So you can be a big company but not have
big company culture.
That's right.
Um, okay, last one before we go. Uh, you've said that less than 20% of all workloads have moved to the cloud so far. What is the max number? Can that be 100% in time?
I was going to say 100.
No, but what's realistic?
Um, yeah, you know, I think it's a good question. If you think about how many workloads are out there, uh, I don't know what the max is. I'm very bad at picking the maximum size of these things. But I actually think that, at a minimum, that percentage could flip, and it could be 80/20 versus the 20/80 where it is today, or even less. Um, I think
there's a massive number of applications that just haven't moved. If you think about line-of-business applications, workloads that are in telco networks, workloads that are running inside of hospitals, it's not even just traditional data center workloads. There are a lot of these other workloads that would be much more valuable: they'd be much more connected, much more able to take advantage of advancements in AI, if they were connected into the cloud world and running there. And so I think there's a huge opportunity for us to continue to expand what it means to be in the cloud, and to continue to migrate many of these workloads that just haven't moved. There's a massive opportunity, and I think kind of flipping that percentage over time could be an interesting opportunity for us. And the size of the pie is getting bigger too. I think that's the other exciting thing about generative AI: the total amount of compute workloads is actually significantly accelerating.
And the timeline to flip?
Uh, yeah, I still think we're a ways out for the whole thing to flip. There's just a massive amount of workloads out there, but we'll keep working on them and keep going as fast as we can.
Matt Garman, great to meet you. Thanks so much for coming on the show.
Yeah, thank you.
All right, everybody, thank you for listening. We'll be back on Friday breaking down the news, and we'll see you next time on Big Technology Podcast.