Amazon's Longterm AI Vision — With AWS VP Matt Wood
Channel: Alex Kantrowitz
Published at: 2024-07-10
YouTube video id: JTcL6zgEK34
Source: https://www.youtube.com/watch?v=JTcL6zgEK34
The VP of AI products at Amazon Web Services joins us to discuss what people are actually building with the technology and whether it's worth the investment. All that and more is coming up right after this.

Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond. Well, a year later, we have Matt Wood back with us today. He's the VP of AI products at Amazon Web Services. Last year we spoke at the AWS Summit in New York City all about Amazon's AI strategy, and we have a great opportunity now to talk a little bit more about where the AI field is heading a year later, looking back at what we discussed last year, but also really where the field is at the moment and where it's heading. Matt, welcome back to the show. Great to see you.

Good to see you too. Thanks for having me back. This is awesome, and congrats on the growth of the show. It's been amazing. I listen every week, so it's a pleasure to be here.

Thanks so much, that's awesome. So you'll have some context here. Let me ask you what I think is the most pressing question I have first, which is: when we spoke last year, you said, "I wouldn't be surprised if just the AI part of our cloud computing business was larger than the rest of AWS combined in a couple years." So I'm actually curious where it is today. But before we get into that, here's the sort of disconnect I have. Obviously we spoke last year, there was all this potential with AI, and we've been talking about it on the show a lot. And yet I was just speaking with a colleague of yours who referenced this Gartner study that said only 21% of AI proofs of concept, so the different programs and products within companies, actually go into production. That's a one-in-five rate, which is not great given how much effort and money it takes to get these things going. So talk a little bit about the potential and the state of AI building today, why there's that disconnect, and where we might be heading.

Yeah, happy to talk through it from my perspective. I've been very fortunate over the past year or two to talk to literally hundreds of customers in every single industry, and I have honestly not seen this level of energy and enthusiasm for any technology probably since the advent of the cloud. Most customers are investing very diligently, and they're making good progress. There is a group which is moving slightly faster than the average, which is somewhat counterintuitive, and that group is actually the regulated industries. So it's folks in financial services and insurance and healthcare and life sciences and manufacturing, and they're able to move a little bit faster in part because all the regulations they've had to comply with over the past 20 years, which probably felt at the time like a bit of a headwind, have actually driven the right set of behaviors for that group to be successful with generative AI. They have all of the governance of their data figured out. They understand the quality of their data. They understand which data can be used where, by whom, and for what purpose. They have very, very large amounts of private text data, exabytes of the stuff in some cases, market reports or clinical trial results or life insurance documents, those types of things that the models have never seen before but are really good at reading and summarizing and connecting the dots and finding disconnects. And they're just earlier in their digital transformation journey, so they've probably looked across and felt like they were sitting to the side as other areas like retail and transportation and hospitality and media went through this very aggressive digital transformation over the past 10 years or so, driven by the web and driven by mobile and a lot
of other factors, including the cloud. And these organizations are looking not just to use generative AI to catch up, but to actually leapfrog ahead, using the data that they have, which is privately held. So that's one area I think is probably a little counterintuitive. I don't think I would have guessed, even a year or two ago, that 160-year-old life insurance companies would be in the vanguard of really delivering value through generative AI. They have these very, very large document stores of 90-year-old life insurance documents which are probably going to pay out in the next decade or so. They've been scanned at some point, but no one's ever read them, and they're not sure what level of risk is associated with those documents inside their business. So they're able to use generative AI to piece that risk together and understand it more completely.

So I thought you were going to say that the companies with structured data, who have it very organized and have this partnership already, are going to be the ones that benefit the most, which makes sense. But then there are also some of the more glaring issues: change management is difficult, the models still cost too much to run, and they're not quite good enough yet. Last year, and we're going to get to it, we were talking about agents and all these other advanced use cases, and they've clearly not hit the way they were supposed to. So what do you think about these limitations? Aren't they the main things holding back the field, versus just getting their data in order?

Well, there are limitations to the technology today, and part of being successful with it is understanding those limitations at a deep level. You referenced, and I'm not familiar with the details, but you referenced 20% of prototypes going into production. Honestly, that sounds pretty good to me. Think of just the amount of experimentation happening inside organizations around generative AI, just the number of experiments being run on AWS for different companies in the regulated industries and all the other industries I mentioned. Bedrock, which is the service we make available to customers to build generative AI applications, is one of our fastest-growing services ever, and all-up AI and machine learning at AWS is already a multi-billion-dollar business in terms of ARR. So there is a lot happening, and I think that 20% is actually pretty good, because the denominator is absolutely massive. When technology shifts happen, you really do want customers to be able to innovate, to experiment really safely and really quickly with that technology, to find out what works and what doesn't. And we're dealing here with a technology which is just in its very earliest days. It's much more like a discovery than an invention. We discovered that if you build these very sophisticated mathematical models, there is emergent behavior within them that resembles reasoning, that resembles intelligence. We're applying that in some places, and some of those applications will turn out to be successful, and it is no surprise to me at all that some of those experiments turn out not to be, because if you're experimenting in the right way, a lot of experiments are going to fail. That's partly why customers turn to AWS for running the majority of these workloads: they're able to broadly democratize the way these applications are built using generative AI, they're able to validate the ones that work really quickly, and then when they find that 20% that works, they're able to take it into
production very quickly, at very large scale, with the right cost structure around it as well. So I think the 20% is a little misleading if you think the denominator is small, but that denominator is massive because there's just so much experimentation happening, and we see it on AWS and inside Amazon as well.

And look, this is where I always get tripped up, because we talk about these big things, emergent behaviors, models being able to reason, how it's a discovery, and then we talk about, okay, so what practically are they doing, and it's like, well, they're combing through insurance documents. You know, shout out to all the folks working in insurance, and I'm sure we have some listening to the show, but I'm like, man, if people in the tech field made this discovery that models are intelligent and can think for themselves, that's one side of this. But then we ask where it's applied, and it's like, well, insurance adjusters are a little bit more efficient. The market and the industry have started building around these discoveries and use cases, the reasoning, the emergent behaviors, but when you ask about the practical side, it's the most boring applications you could possibly imagine. So is that going to change?

It's interesting you say boring, because boring workloads are boring because there are so freaking many of them. They're everywhere. And so yes, I absolutely believe there will be large step-function changes in a significant number of industries that are going to drive orders-of-magnitude improvements for the organizations that work on them and society at large. One example is computational biology, and we can talk about that in more detail, but the work that's going on there in terms of using generative AI, at the likes of the Dana-Farber Cancer Institute or Genomics England or Pfizer, or the work we've done with a startup called EvolutionaryScale, to design entirely new molecules, entirely new antibodies that are manufacturable, that can go on and find new drug targets. That is a major opportunity and a step function. It's early, for sure. The company just came out of stealth and just published their paper, which is a great paper; I recommend everybody read it, just for background on what's happening in that field. But I absolutely believe there will be many different step functions forward, in multiple different industries, of that format. I also think there is a huge number, some of it long-tail, of what you'd call boring workloads that are going to be completely reimagined through the use of generative AI, and that's okay. You actually want a lot of that boring work to be automated. You want a lot of that work to be improved. You want to be able to take the boring work, which inside some organizations is seen as just a cost center, turn that on its head, and channel it into something which drives invention and growth. This is exactly what we saw with cloud computing in the early days as well. I literally could have said that sentence, and in fact I think I did, with the advent of cloud computing: there's a huge number of workloads inside many, many enterprises that can take advantage of not just the cost savings in the cloud but the agility in the cloud, and take something which is traditionally considered a cost center, building out data centers which offer no differentiated value, and turn it on its head, driving the right cost structure and the right agility to use that infrastructure for new product creation, new invention, and reimagination of all of these different products. And so what we consider boring today is going to be rechanneled, in my opinion, into much higher-leverage growth opportunities for many organizations.

And there's such a big change-management component to it as well, right? We talk about the models: there's a cost, there's a capability of the model. But one thing about trying to reimagine how boring work is done is that there are a lot of people who are used to that work. Let's say this AI can revolutionize the way they do work: what percentage of the workforce do you think is ready to take advantage of it?

It's a good question. I'm not sure I would peg it as "ready." I suspect that whilst there will be these step-function changes over the long period, in the shorter outlook it's going to feel a lot more incremental than we're probably used to. There's an old adage, a story that folks tell, that when we finally discover that there's life on another planet in another galaxy, we all have this idea that it will be a huge, society-shifting event. But in reality, I suspect there will just be lots and lots of small iterative announcements, and when the NASA press release comes out that there's life on another planet, it will seem really obvious at that point. From now to when that eventually happens, yeah, that's a really big jump, but we'll get there incrementally, not in one big shift. I think the same thing will apply here. Over the long term there will be big shifts in how we deliver products and technology and how we interact with data and information and each other, but it'll probably appear incrementally. Having patience and having a long-term view allows you to drive more of that value incrementally, allows you to experiment more, and allows you to have big goals and iterate your way to greatness. Having that long-term view, I think, is, to get back to your question, one of the most important cultural shifts that organizations will need to make. You're going to need the right teams, for sure, the right talent, the right technology, and you're going to need to partner with the right organizations. But having the ability to take a long-term view, so that you can allow those creative, inventive builders to use that technology to iterate and improve and experiment and invent, requires discipline from a leadership perspective. It requires you to set up small-blast-radius experiments, and it requires organizations to be very tolerant of failure, because when an experiment fails, you've learned something, if you've set it up right, and that learning is disproportionately valuable at this point in the technology cycle. So that cultural element you outlined is absolutely critical. I'd actually say it's more like 50% technical, 50% cultural in terms of the weighting of the elements of investment that are going to be required to be successful. So I'm not sure exactly what percentage right now is ready. If I had to put a number on it, I would say it's probably 25% to 35% in most large enterprises. But over time, if you look 3 years out, 5 years out, 10 years out, whatever it might be, with that long-term horizon, my guess is it's going to be 100%.

Yep, okay. I'm going to ask a follow-up on that, but first: you believe in aliens?

I think you have to believe in aliens if you understand just how big the universe is. It
just seems incredibly unlikely that we have hit the absolute only magical sweet spot in the whole universe to encourage carbon to animate and dance around as we humans do every day. The probability of it being limited to Earth seems very, very unlikely, although I acknowledge the paradox that if there's life out there, you know, where is it? So that's why I lean that way.

Yeah, they could also all be dead, and we might be, right now, the only living... I mean, I think there probably are some sort of life forms out there that have existed in the universe, either before us or that will come after, but to have them exist concurrently, that's the question.

I agree, yeah.

Yeah, go ahead. I just want to get back to the AI stuff. I guess we could do another show on aliens.

I would love that.

Goodness. All right, so what you're saying about patience, incrementality, 25% of organizations being ready, and replacing the boring stuff, that all sounds good, but it also makes me wonder if we're going to end up in a sort of trough of disillusionment with this technology, because there's been so much hype and so much money poured into it that are demanding almost a revolution now, and what you're describing isn't a revolution, or isn't a quick-moving revolution. It might be a slow-moving, incremental sea change, but not something that happens immediately. It's not something that the Wall Street types, for instance, will be thrilled to hear, that it's just going to take a while, because they think in quarters. So do you think there's a risk here, within the next few years, of the public perception of this technology turning a little bit because of the incremental nature of it?

I think it would be a possibility if, and it's a huge if, the technology wasn't poised to improve. If you believe that what we have today is pretty much what we're going to have to work with, with only small incremental improvements over the next three to five years, then I suspect folks will feel like the promise on this occasion hasn't been delivered on. But technology tends to follow an S-curve over time: you get to the top right-hand corner of that S-curve, you end up with the technology at its full capability, and you get decreasing improvements over time. You never really know where you are on the S-curve until you're looking backwards, so it's hard to judge where we're at. I think most people would say we're probably in that middle, high-gradient section, just because there's so much happening, so many improvements, new models and new techniques and new technologies from academia and the public and private sectors, and I have no doubt that by the time we finish this conversation there'll be another technique out there worthy of our attention. But my guess is that it's more likely we're at the bottom left-hand corner. I don't think we've hit the hockey-stick inflection point yet of what this technology is capable of. It's still very, very early. At some point we're going to hit that inflection point. It always happens with technology shifts; it can take more or less time depending on the shift and the speed of the technology, and the thing that triggers the bend in the S-curve is different in a number of different ways. If you look at the maturation of the internet itself, that hockey-stick inflection point, I think, really landed with the development of SaaS-style Web 2.0 applications, whether it was webmail or finance systems or hotel booking systems, whatever it was. The capability of having access to those types of services, and the fact that you could integrate them through APIs and do interesting things with them, meant that every new service added to the internet made all of the other services more valuable, and that's what pushes you up the S-curve in many cases. The same thing happened with the mobile transformation, where we had these remarkable new devices and these applications that more and more people and organizations invested in. They became more and more sophisticated, and over time the operating systems on which those applications ran allowed them to interoperate and interact in interesting ways, both with the operating system and with each other. So every net new application added makes all of them more useful, makes the whole system more useful. The whole device in your pocket gets better over time without you having to do anything, and that pushes you up the S-curve as well. I don't think we're at that point with generative AI yet. We have a really robust set of really interesting, really powerful models which are going to mature over time, but customers will, I'm sure, find interesting ways to combine those different models. There isn't one model to rule them all; each model has different sweet spots, and it's my expectation that most customers will invest not in building the foundation models but in fine-tuning and improving those individual models and customizing them in interesting ways for their own use cases. Those capabilities are interesting in isolation, but part of what will push us up the S-curve, and what we're seeing with customers at AWS and at Amazon, is that combining those models together, leaning into the sweet spot of each, allows you to build systems that in aggregate have a compounding effect on
intelligence. It's not additive; it's a multiplier. And so that's going to push us a little bit further up the S-curve. I think another really interesting area, and the one that's probably closest to SaaS applications and mobile apps, is what you mentioned earlier: agents. I think agents have a good chance of being the apps of the generative AI era, and as we add more of them and find ways to orchestrate multiple agents together, and there are already customers building multi-agent systems on AWS today that combine specialties, agents that can goal-seek on your behalf and collaborate or contest with each other in interesting ways, every new agent added to the system drives you up the S-curve. It makes all the other agents more useful at the same time, without you having to do anything.

But are agents an actual thing in production now?

Yeah, I think so. We have...

I'm just going to say: last year we spoke about, you made an announcement about how agent-building technology was on its way, and a full year has gone by, and I haven't seen one example of a realistic agent going out there and taking action for people.

Well, I've certainly seen some. I use some day to day.

Yeah, for sure.

I'd recommend you check out a couple of things that may be interesting to you and the audience. One is a startup company called NinjaTech; you can check them out at ninjatech.ai.
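The multi-agent orchestration being described, where a coordinator interprets a request and hands it off to specialized agents that can in turn enlist each other, can be sketched in a few lines. This is a toy illustration, not NinjaTech's or AWS's actual implementation: the agent names and the keyword-based routing are assumptions for the example, and a real system would route with a model rather than string matching.

```python
# Toy multi-agent coordinator: a router inspects the request and
# dispatches it to a specialized agent; agents may enlist each other.
from typing import Callable, Dict

def web_agent(task: str) -> str:
    # Would fetch supporting data from the web in a real system.
    return f"[web] collected sources for: {task}"

def researcher_agent(task: str) -> str:
    # A specialist that decides it needs web data and enlists the web agent.
    evidence = web_agent(task)
    return f"[research] summary of '{task}' using {evidence}"

def scheduler_agent(task: str) -> str:
    # Would read and update a calendar in a real system.
    return f"[schedule] booked: {task}"

AGENTS: Dict[str, Callable[[str], str]] = {
    "research": researcher_agent,
    "schedule": scheduler_agent,
    "web": web_agent,
}

def coordinate(request: str) -> str:
    """Route a natural-language request to the best-matching agent."""
    lowered = request.lower()
    for keyword, agent in AGENTS.items():
        if keyword in lowered:
            return agent(request)
    return f"[assistant] answered directly: {request}"

print(coordinate("research quantum error correction"))
print(coordinate("schedule lunch with Alex on Friday"))
```

The point of the sketch is the compounding effect described above: adding one more specialist to the registry makes every request that can use it, directly or via another agent, more capable.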
They have an assistive system you can interact with in natural language, as you may be familiar with, but they also have, under the hood, a set of specialized agents that can perform different tasks on your behalf. They have a researcher agent, a scheduling agent, a web agent, all sorts of different agents. Just by asking your question, they interpret it, and an agent looks at the request and says, hey, this looks like you're doing some research, let me set a problem for my research agent. That research agent runs off and does its thing, and it might say, oh, there may be some web data that will be useful, I'll set my web agent off to collect that data, and so on and so forth. It pulls all the information back together and allows you to interact with your calendar and your schedule or your email, or whatever it might be, at levels which are much more automated than you could achieve with just a standard assistive chatbot.

If one example is so useful, then why hasn't it broken out into the public yet?

I just think it's very early. Today, agent systems, or agentic systems as they're sometimes called, are still relatively early, but I think they are breaking out, to be fair. NinjaTech is seeing remarkable growth; they have hundreds of thousands of monthly active users. We've also built some really powerful and popular agents on AWS. We have an assistant for builders that we call Amazon Q, and Amazon Q allows you to generate code if you're building software. It will take a question and give you answers and guidance on how to build on AWS, and all the things you would expect. That's useful; that gets you a bump in productivity. In terms of the amount of automatically generated code that customers accept, it's usually between 35 and 50%, and it's higher on Q than on any other comparable service. But the thing that really drives productivity for developers is what we call the developer agents inside Q. With the developer agents, you don't just ask a question about what code to write, or write a comment and get the function back. You actually set a task for Q. You say, hey Q, I want to add this feature to my software. Q looks at the software across your repository, it looks at the changes you've made inside your development environment, it understands the type of change or feature you want to make, and it goes off, looks at all of that information, and makes a strategy. It doesn't just generate the code; it makes a strategy for how to add that feature. It picks which functions need to be updated, which modules need to be added, which tests need to be run, which documentation needs to be added. You get a chance to review that strategy, and at some point you can just say, hey Q, go for it, and Q will work diligently through its to-do list to create a set of software changes that you can choose to commit, which add that feature to your code. So if you can imagine a developer going from having to write or generate that code manually to having tens or dozens or, over time, hundreds of those developer agents doing the work on their behalf, you get this combinatorial explosion of productivity. We do the same thing for code transformation. If you want to move between different versions of Java, we support that today. You just say, hey, update this to be compatible with Java 17, or whatever you're running. It will make that same kind of strategy, work diligently through it, and then let you review the results, and you can choose to accept those and commit them back. That's a fixed-cost effort that most organizations have to go through: we need to move software project A from Java X to Java Y, it's going to take 10 people, it's going to take three months, and it's just a cost of doing business. We're going to have to pay that cost in people and in productivity. And this is a task that no developer really likes to do. It's toil work.

The boring stuff that we talked about.

It's boring, exactly, but it's super impactful because there's so much of it. So you move from a world where you have this fixed cost that you just have to pay, a cost center just like we were talking about earlier, to a point where that is taken off the table. It's completed automatically, and those same developers get back to doing things which are much more productive.

So you named it Q because of Q from Star Trek, not QAnon, right?

It's neither.

But what was the inspiration for the Q?

It's based on a quartermaster, the idea of a quartermaster where you get your gadgets.

Okay. You guys couldn't have picked a different letter? It's a very controversial letter these days.

Yeah, I think it'll work out okay.

Okay. We're here with Matt Wood, the VP of AI products at Amazon Web Services. On the other side of the break, we're going to talk a little bit about Amazon's products and also where the models are going next, so stay tuned. We'll be back right after this.

And we're back here on Big Technology Podcast with Matt Wood, the VP of AI products at Amazon Web Services. All right, Matt, so last year we were talking a little bit about Bedrock, which is basically a tool that Amazon Web Services customers can use to develop with AI models, and the idea you explained to me was that Amazon's play for generative AI was that people who want to develop with AI could go in and pick their own models through Bedrock.
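Concretely, picking a model per workload through Bedrock might look like the sketch below, using the boto3 `bedrock-runtime` Converse API. The model IDs and the task-to-model mapping are illustrative assumptions, not an AWS recommendation; the IDs actually available vary by region and account.

```python
# Sketch: map the model to the workload on Bedrock.
# Model IDs and the task mapping below are illustrative assumptions.

DEFAULT_MODEL = "anthropic.claude-3-5-sonnet-20240620-v1:0"

MODEL_FOR_TASK = {
    "reasoning": "anthropic.claude-3-5-sonnet-20240620-v1:0",   # heavy analysis
    "summarization": "anthropic.claude-3-haiku-20240307-v1:0",  # fast and cheap
    "low_cost": "amazon.titan-text-lite-v1",                    # cheapest tier
}

def pick_model(task: str) -> str:
    """Choose a Bedrock model ID for a given workload type."""
    return MODEL_FOR_TASK.get(task, DEFAULT_MODEL)

def ask(task: str, prompt: str) -> str:
    """Send a prompt to the model chosen for this workload.

    Requires AWS credentials with Bedrock access; boto3 is imported
    lazily so the routing above stays testable without an AWS account.
    """
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.converse(
        modelId=pick_model(task),
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]
```

Moving a workload onto a different model is then a one-line change to the mapping, which is roughly the optionality argument being made here.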
uh it could be Facebook's Lambda or Amazon's proprietary models or any host of other models and then they could build uh that way but Bedrock has not integrated open ai's uh GPT models yet or Google's Gemini models yet and I was speaking with someone in the know who was basically like look like what they're offering is not really Choice it's like one model that works well which is anthropics and they're leaving out the other state-of-the-art models which is you know open ai's GPT 40 and then Gemini and ultimately that means that the offering is limited and in some ways behind and I'm curious what you think about that argument uh I would obviously disagree that it's behind I think um you know the the interesting thing about these models is that you know they can they can be very seductive uh when you look at a a model in isolation you know you can you can read the benchmarks and you know tribes are forming around these models and all those sorts of things but uh what we see time and again with u customers Enterprises startups who are actually building with this in uh in meaningful ways is that um they have a huge number of different workloads you I work with uh with some customers and they're very generous and they send me their their road map of all the things across the company that they want to be able to apply generative AI to and it's a spreadsheet of you know 5 600 rows of all the different things that they want to do with generative Ai and you know it's it's kind of um it's kind of intuitive if you if you play that out that there isn't going to be it seems very unlikely that there's going to be a single model that's going to be the best fit for all of those different workloads you know some of those different workloads have different requirements some have requirements that are you know heavy on reasoning or heavy on the ability to be able to do analysis others need to be really good at summarization others need to be really really fast others need to be very 
low cost and so there's there this this multiplicity of use cases that have different uh operational characteristics whether it is intelligence or latency or or cost whatever it might be and customers want to be able to usually map the model to the mission they want to be able to find the right model for their use case because if you have a small number of models or just a single model available to you it ends up having to play the role of kind of a Swiss army knife and a Swiss army knife sounds great it's great in a pinch but in reality you almost never want a Swiss army knife what you actually want is a broad tool B tool belt with all the specialized Tools in there that are a perfect fit for what you're trying to do if a contractor turned up at your home to do some Renovations and all they had was a Swiss army knife I think you'd be pretty disappointed with their preparation probably pretty disappointed with with their with their work with their work quality as well them home that's right exactly same thing with AI models you want to be able to match the right model to what it is that you're trying to do so you can lean into the advantage of that model in whatever it might be now some of those models you really do want you know as much intelligence and as much reasoning capability as possible and uh on Bedrock we make available the uh the anthropic models the particularly Claude 3 and the new clae 3.5 improvements which drive you know not just great A great experience for you know uh uh High Intelligence requirements but are the best performing models out there you know Haiku Claude 3.5 Haiku outperforms all other models on the planet and so uh that's that's great and you also want models which are really really specialized for a specific task and so we're making The evolutionary scale models that I talked about earlier they're available on AWS today and we're going to bring them to bedrock later this year we have summarization models we have models which are 
specifically tuned to build agentic systems, we have models that are specifically tuned for reasoning, we have other models that are just really, really cheap, we have models that are multimodal and can handle different modalities, we have single-modality models, we have large models, we have small models. And time and time again we have seen at AWS, and this is an insight that I think maybe some other providers have not yet had, but because of our background in cloud computing we really recognize the value of optionality for customers. Every single time we have ventured into a new domain, customers have time and again told us that they value the optionality of having purpose-built solutions. So being model-agnostic is definitely a crucial aspect, basically being able to swap in any model. I look at it, I don't think swapping models is quite the same thing; I look at it more like, for each individual use case, you want to find the right model. Well, it's working well. You know, Bedrock is one of our fastest-growing services ever. We have tens of thousands of customers using it today, and it's growing like crazy. And it's really based on this observation that we carried over from our cloud computing work. When we launched EC2, which is our Elastic Compute Cloud, our compute platform on AWS, we launched with just a single compute type in a single Availability Zone. Just one, that was it, that's all you could use. But, you know, because we saw it internally at Amazon, and customers very quickly told us, one single choice was not what they needed, and so today we have over 400 different instance types. But if this choice is working so well, I want to ask you a question I've been meaning to ask for quite some time, which is that maybe it's the limitations of the models on the platform, or maybe
it's the evolution of the models. Amazon worked on, I mean, Bloomberg worked on something called BloombergGPT on AWS, and this is from Ethan Mollick, he's a professor at Wharton who studies this stuff. He says: remember BloombergGPT, which was a specially trained finance LLM drawing on all of Bloomberg's data? It made a bunch of firms decide to train their own models to reap the benefits of their special information and data. Here's what he says: you may not have seen that GPT-4, the old pre-Turbo version, with a small context window, without specialized finance training or special tools, beat it on almost all finance tasks. So I guess I'm curious, from your perspective: is it the fact that you didn't have the right models, or that the models are advancing so fast that something that could take that much effort to train could eventually be surpassed by the next evolution of model from OpenAI? Well, for context, those two models were what, 12, maybe 18 months apart, something like that, and today it looks like models have a shelf life of probably about six months if you're training on kind of open web data. And it's partly why we like working with our friends at Anthropic so much. They are committed to continual and consistent improvement of all of their different models, and, you know, they launched the... Then why would I develop a bespoke model if I could be surpassed by an off-the-shelf model? Well, again, I suspect, I don't know for sure, but I suspect that for general world-knowledge questions you actually do want a model which is trained on world knowledge; that's really useful. But that world knowledge is very, very broad, and it's not particularly deep, and most organizations operate at depth. And so there will be questions, for sure, that you can pose to multiple different models, and for larger, more modern world models I'm sure you can find
examples where they will outperform specialized models. And I am absolutely positive that the inverse is also true: you can find older, smaller, specialized models that will offer much better, higher-quality, lower-hallucination results on specific tasks at the depth that most organizations need. And so it's an and, not an or. If you follow this train of thought where there is a single model that is going to, quote unquote, win, I just think it's self-limiting, because you'll always end up with that being the Swiss Army knife that represents the denominator on your capability, and that denominator is not guaranteed to grow in the depth that most organizations need to operate in. And so world models are great, they're super exciting, you want them, and you want the opportunity to specialize those models and fine-tune them. You want to be able to build your own models, you want to be able to take existing models and continue to train them, you want to be able to layer in your existing data using retrieval augmentation, you want to be able to adjust the alignment and style and tone of these models in interesting ways, you want to be able to quantize those models if you want to run them at lower cost or in different environments. So there's all sorts of value in optionality and all sorts of reasons why you might choose a different model, and so that is a really good example of where an and, of having different models, is a really good opportunity for customers. And you must have good insight into where the next level of models is going to go, I mean, being so close with Anthropic, ear to the ground. There's a lot of expectation that the next set, the GPT-5s, maybe the Anthropic fours, are going to have sort of, I don't know, godlike capabilities; that's what I like to refer to it as on the show. But that's the anticipation. What is the realistic expectation for what's coming next on the model front? I think it's
a good question. I think we'll see a couple of different things. I think we'll continue to see improved reasoning capabilities: the ability to take in larger amounts of data and reason across it with very high accuracy, to answer increasingly complex questions, to apply logic to those questions. We'll continue to see improvement there, and I think that improvement will come iteratively, you know, kind of every six months, and probably much more quickly, because different model providers are on slightly different schedules. So I think those will continue to improve. I also think there is an undervalued asset in the fact that these models will continue to get better, for sure, but you also want to be able to layer in your own data in order to get the model grounded at the right level for your organization. And so the world, as we see things going forward, is that the models will continue to get better, more capable, with more reasoning capabilities, and specialization and customization of the systems built with those models will become increasingly important, and there will be a more sophisticated set of guardrails mediating what the models receive and what they generate on the outside. And so you're going to end up in a world, I think, where you're going to have a set of models which are going to continue to improve; combining those models is going to become disproportionately advantageous; you're going to have a set of data inside your organization, some of which you're going to generate, which is fresh, to fine-tune those models, and some of which many organizations already have, which they're going to use to ground the models in the reality of their business; and you're going to need a set of capabilities that allow you to bring those components together, as well as manage the generative AI applications. And it's those
capabilities that we're focused on building, across the board, at AWS. A lot of stakes have been put on what's going to happen in the next 18 months in this generative AI world. I mean, basically, from my understanding, there's billions of dollars being put into training this next set of models; everything you said definitely implies that, but it's also just that there are going to be companies that live and die based off of their next iteration of model. So what do you think is a best-case scenario, and what is a worst-case scenario, for generative AI 18 months from now? I think that there will be a set of model providers. I don't think there are going to be hundreds of world-model providers; I think there are likely to be maybe a dozen, two dozen, something of that order of magnitude. I think, you know, Anthropic will be one, Meta will be one, Amazon will be one, there'll be others, but I don't think there'll be hundreds of these providers. I think there'll be a small number of providers, and I think over time they will offer, and you see this happening already, a broader family of models which offer different opportunities for optimization. So some of those models will be: hey, the question I am asking is incredibly valuable to my organization, I want to pad it with as much context from my private repository as possible, and I want the best possible answer at any cost; that's how valuable that query, that prompt, is to me. I think there'll be a lot of that. I also think you're going to want to run a set of less capable models at much, much lower cost, and everything in between. And so my guess is that these models will not, you know, commoditize; my guess is that they will diversify increasingly over time, and that the idea that these models will become commodities, defined as, you know, you can hot-swap them and their economics are primarily driven by supply and demand,
yeah, I don't see that happening. And you can see the beginnings of that now, as providers like Anthropic are offering Claude 3 not as a single model but as a family, with the sliders on its configuration moved into slightly different positions, offering three different models within a family. I could see that becoming ten different models inside a family, with a more fine-tunable set of levers around cost and intelligence and capability and latency, those sorts of things. And so I think there'll be a larger number of models in aggregate, but the pool of providers probably won't grow much larger than a dozen or two. Okay, but what is the best-case scenario 18 months from now, and what is the worst-case scenario 18 months from now? Oh, the best-case scenario is exactly what I laid out; that is the best-case scenario. For customers, it offers the broadest possible choice, it allows them, by proxy, to address the broadest number of use cases inside their organization, and, by proxy, to derive the scale which will deliver return on investment commensurate with the value that they're investing. In that case, it doesn't seem like you're anticipating, in a best-case scenario, models that will really be able to dramatically outperform what we have today. No, I think there will be. I think that if you look at the differences between Claude 3 and Claude 3.5, the way that you measure the improvement is going to become increasingly nuanced. Okay. And so today there is, in my opinion, a misguided belief that the king of the hill will basically win, that there's going to be a single winner here. I don't think that's going to be the case, because there is so much value in addressing all of these different use cases. And so the best-performing model today also has a really great cost profile for the intelligence that it
provides; that was part of the innovation between Claude 3 and Claude 3.5. Now, over time, the intelligence will continue to go up, and there'll be different optionality within the spectrum so that customers can find that sweet spot. That's a very interesting idea. By the way, has Amazon put the full four billion into Anthropic? I know there was a promise that that was going to happen, or an upper bound. Yep, we've completed that investment. Okay, so then, worst-case scenario: let's say everything doesn't live up to expectations. You must be game-planning this out. What do we end up with in the worst-case scenario? I think the worst-case scenario has probably two pieces. Number one, and this goes back to what we were saying earlier: we've just mismatched where we're at on the S-curve, and we're actually in the top right-hand corner, and the capabilities of the core technology, the models, the ability for the models to work with data at scale, the capabilities to merge those two things responsibly together, you know, they don't mature and improve at the pace that we expect. I think that would be a disappointing outcome. I think it's pretty low probability at this point, given the trajectory that we're on, but that could be one. And the other, again going back to something we talked about earlier, is that the readiness of organizations slows down the opportunity to deliver on this technology, because they are struggling to manage the change, or they're struggling to really drive reinvention through some of those cultural biases. Yeah. And so I could imagine that playing out, and I think that's at least as large a challenge for most customers: the way in which you structure and organize and drive and deliver and measure, you know,
how exactly you're going to operationalize, from a business perspective, this new technology discovery. That's the worst place. Not every company reinvents like Amazon. This is true; we are uniquely designed for speed, which makes it an exciting place to work. Yeah. Okay, so on that note, and I think we'll bring it home with this one: Amazon AI guy, got to ask about Alexa. I know it's a different division, but maybe there is some collaboration going on. Today, everything I've heard about the limitations of Alexa has been that almost everything you do with Alexa has been sort of hard-coded, so it has, like... Alex, I'm sorry, Alex, you paused, the video feed paused, I didn't hear the tee-up there, so you may want to start again. So, almost everything I've heard about Amazon Alexa has been that the intelligence within Alexa is effectively hard-coded in there, that there are, like, hundreds or thousands of different queries that it's prepared for, and it will respond based off of a database that it pulls from. And there's been a question of whether Amazon is going to move from that style to a more large language model-powered Alexa, which would require effectively a rewrite. And so I'm curious if you think that question is grounded in fact, and what's going to happen inside the Alexa division of Amazon. Well, look, Alexa is, you know, an extremely successful personal assistant and has been well received by customers. We have hundreds of millions of Alexa endpoints out there that customers love to use. What's really interesting about the future of Alexa is that part of the success of Alexa has been that the way Alexa works is that we're very, very accurate at matching the intent of the user to actioning that intent. That may be simple things like telling a joke or getting the weather, or it could be more serious things like smart home use cases. Now, some of
those are turning lights on and off, but some of them are locking and unlocking doors, setting burglar alarms, those sorts of things. And so a really important capability of Alexa is the ability to form that mapping. That is almost entirely complementary to the kind of revolution that we're seeing with large language models today, which allow us to create a much more natural, much more fluid, much more human-sounding, much more intuitive interface to those intents. And so that's what we're working on: we're working on marrying the capability of this remarkable ability to pair an intent to an action with the large language model interfaces that have become very popular, which allows us to unlock entirely new ways for Alexa to provide assistance for our customers. And so I think it's a complementary marriage between the two technologies, and, you know, we're hard at work on that. Is there an LLM in there today? Alexa has, you know, over a dozen machine learning and AI models under the hood, including large language models. And is that going to expand, the LLM use cases within the device? Yes, part of what we're working on is the ability to take more modern LLMs that have this very natural, easy, intuitive back and forth, which is a really important part of building an assistant, and combine that, marry it, with the technical underpinnings which allow us to do this intent mapping under the hood very, very accurately. Now, what's funny, the reason it's complementary is that LLMs today are not very good at doing that intent mapping; they make mistakes, you need to be able to check them, all those sorts of things. And so, you know, LLMs are good at providing that natural language, that very intuitive interface, in ways that are better than Alexa provides today, and we want to take advantage of that. But Alexa today also provides a lot of advantages that LLMs are not good at
doing today. And so, yeah, that's part of what we're working on. And so, is it going to require a full rewrite of the stuff under the hood of these assistants? No, because we want to retain the core capability of Alexa, which is this intent-to-action mapping. Okay, time frame for that? Nothing to announce today. Matt Wood, always great to speak with you; thanks for coming on the show. Thanks, Alex. All right, everybody, thank you so much for listening. We'll be back on Friday breaking down the news, as usual. Also, Matt is about to hit the stage at AWS's New York Summit, so I'm sure you can find the news that he's going to be making shortly after this podcast hits. All right, thank you so much for listening, and we'll see you next time on Big Technology Podcast.