Amazon's Longterm AI Vision — With AWS VP Matt Wood
Channel: Alex Kantrowitz
Published at: 2024-07-10
YouTube video id: JTcL6zgEK34
Source: https://www.youtube.com/watch?v=JTcL6zgEK34
The VP of AI products at Amazon Web Services joins us to discuss what people are actually building with the technology and whether it's worth the investment. All that and more is coming up right after this.

Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond. Well, a year later, we have Matt Wood back with us today. He's the VP of AI products at Amazon Web Services. Last year we spoke at the AWS Summit in New York City all about Amazon's AI strategy, and we have a great opportunity now to talk a little bit more about where the AI field is heading a year later, looking back at what we discussed last year, but also really where the field is at the moment and where it's heading. Matt, welcome back to the show. Great to see you.

Good to see you too. Thanks for having me back. This is awesome, and congrats on the growth of the show. It's been amazing. I listen every week, so it's a pleasure to be here.

Thanks so much, that's awesome. So you'll have some context here. Let me ask you what I think is the most pressing question I have first, which is: when we spoke last year, you said, "I wouldn't be surprised if just the AI part of our cloud computing business was larger than the rest of AWS combined in a couple years." So I'm actually curious where it is today. But before we get into that, here's the sort of disconnect I have. Obviously we spoke last year, there was all this potential with AI, and we've been talking about it on the show a lot. And yet I was just speaking with a colleague of yours who referenced this Gartner study that said only 21% of AI proofs of concept, so the different programs and products within companies, actually go into production. That's a one-in-five rate, which is not great given how much effort and money it takes to get these things going. So talk a little bit about the potential and the state of AI building today, why there's that disconnect, and where we might be heading.

Yeah, happy to talk through it from my perspective. I've been very fortunate over the past year or two to talk to literally hundreds of customers in every single industry, and I have honestly not seen this level of energy and enthusiasm for any technology probably since the advent of the cloud. Most customers are investing very diligently, and they're making good progress. There is a group which is moving slightly faster than the average, which is somewhat counterintuitive, and that group is actually the regulated industries. So it's folks in financial services and insurance and healthcare and life sciences and manufacturing, and they're able to move a little bit faster in part because all the regulations they've had to comply with over the past 20 years, which probably felt at the time like a bit of a headwind, have actually driven the right set of behaviors for that group to be successful with generative AI. They have all of the governance of their data figured out. They understand the quality of their data. They understand which data can be used where, by whom, and for what purpose. They have very, very large amounts of private text data, exabytes of the stuff in some cases, market reports or clinical trial results or life insurance documents, those types of things that the models have never seen before but are really good at reading and summarizing and connecting the dots and finding disconnects. And they're just earlier in their digital transformation journey, so they've probably looked across and felt like they were sitting to the side as other areas like retail and transportation and hospitality and media went through this very aggressive digital transformation over the past 10 years or so, driven by the web and driven by mobile and a lot
of other factors, including the cloud. And these organizations are looking not just to use generative AI to catch up, but to actually leapfrog ahead, using the data that they have, which is privately held. So that's one area I think is probably a little counterintuitive. I don't think I would have guessed, even a year or two ago, that 160-year-old life insurance companies would be in the vanguard of really delivering value through generative AI. They have these very, very large document stores of 90-year-old life insurance documents which are probably going to pay out in the next decade or so. They've been scanned at some point, but no one's ever read them, and they're not sure what level of risk is associated with those documents inside their business. So they're able to use generative AI to piece that risk together and understand it more completely.

So I thought you were going to say that the companies with structured data, who have it very organized and have this partnership already, are going to be the ones that benefit the most, which makes sense. But then there are also some of the more glaring issues: change management is difficult, the models still cost too much to run, and they're not quite good enough yet. Last year, and we're going to get to it, we were talking about agents and all these other advanced use cases, and they've clearly not hit the way they were supposed to. So what do you think about these limitations? Aren't they the main things holding back the field, versus just getting their data in order?

Well, there are limitations to the technology today, and part of being successful with it is understanding those limitations at a deep level. You referenced, and I'm not familiar with the details, but you referenced 20% of prototypes going into production. Honestly, that sounds pretty good to me. Think of just the amount of experimentation happening inside organizations around generative AI, just the number of experiments being run on AWS for different companies in the regulated industries and all the other industries I mentioned. Bedrock, which is the service we make available to customers to build generative AI applications, is one of our fastest-growing services ever, and all-up AI and machine learning at AWS is already a multi-billion-dollar business in terms of ARR. So there is a lot happening, and I think that 20% is actually pretty good, because the denominator is absolutely massive. When technology shifts happen, you really do want customers to be able to innovate, to experiment really safely and really quickly with that technology, to find out what works and what doesn't. And we're dealing here with a technology which is just in its very earliest days. It's much more like a discovery than an invention. We discovered that if you build these very sophisticated mathematical models, there is emergent behavior within them that resembles reasoning, that resembles intelligence. We're applying that in some places, and some of those applications will turn out to be successful, and it is no surprise to me at all that some of those experiments turn out not to be, because if you're experimenting in the right way, a lot of experiments are going to fail. That's partly why customers turn to AWS for running the majority of these workloads: they're able to broadly democratize the way these applications are built using generative AI, they're able to validate the ones that work really quickly, and then when they find that 20% that works, they're able to take it into
production very quickly, at very large scale, with the right cost structure around it as well. So I think the 20% is a little misleading if you think the denominator is small, but that denominator is massive because there's just so much experimentation happening, and we see it on AWS and inside Amazon as well.

And look, this is where I always get tripped up, because we talk about these big things, emergent behaviors, models being able to reason, how it's a discovery, and then we talk about, okay, so what practically are they doing, and it's like, well, they're combing through insurance documents. You know, shout out to all the folks working in insurance, and I'm sure we have some listening to the show, but I'm like, man, if people in the tech field made this discovery that models are intelligent and can think for themselves, that's one side of this. But then we ask where it's applied, and it's like, well, insurance adjusters are a little bit more efficient. The market and the industry have started building around these discoveries and use cases, the reasoning, the emergent behaviors, but when you ask about the practical side, it's the most boring applications you could possibly imagine. So is that going to change?

It's interesting you say boring, because boring workloads are boring because there are so freaking many of them. They're everywhere. And so yes, I absolutely believe there will be large step-function changes in a significant number of industries that are going to drive orders-of-magnitude improvements for the organizations that work on them and society at large. One example is computational biology, and we can talk about that in more detail, but the work that's going on there in terms of using generative AI, at the likes of the Dana-Farber Cancer Institute or Genomics England or Pfizer, or the work we've done with a startup called EvolutionaryScale, to design entirely new molecules, entirely new antibodies that are manufacturable, that can go on and find new drug targets. That is a major opportunity and a step function. It's early, for sure. The company just came out of stealth and just published their paper, which is a great paper; I recommend everybody read it, just for background on what's happening in that field. But I absolutely believe there will be many different step functions forward, in multiple different industries, of that format. I also think there is a huge number, some of it long-tail, of what you'd call boring workloads that are going to be completely reimagined through the use of generative AI, and that's okay. You actually want a lot of that boring work to be automated. You want a lot of that work to be improved. You want to be able to take the boring work, which inside some organizations is seen as just a cost center, turn that on its head, and channel it into something which drives invention and growth. This is exactly what we saw with cloud computing in the early days as well. I literally could have said that sentence, and in fact I think I did, with the advent of cloud computing: there's a huge number of workloads inside many, many enterprises that can take advantage of not just the cost savings in the cloud but the agility in the cloud, and take something which is traditionally considered a cost center, building out data centers which offer no differentiated value, and turn it on its head, driving the right cost structure and the right agility to use that infrastructure for new product creation, new invention, and reimagination of all of these different products. And so what we consider boring today is going to be rechanneled, in my opinion, into much higher-leverage growth opportunities for many organizations.

And there's such a big change-management component to it as well, right? We talk about the models: there's a cost, there's a capability of the model. But one thing about trying to reimagine how boring work is done is that there are a lot of people who are used to that work. Let's say this AI can revolutionize the way they do work: what percentage of the workforce do you think is ready to take advantage of it?

It's a good question. I'm not sure I would peg it as "ready." I suspect that whilst there will be these step-function changes over the long period, in the shorter outlook it's going to feel a lot more incremental than we're probably used to. There's an old adage, a story that folks tell, that when we finally discover that there's life on another planet in another galaxy, we all have this idea that it will be a huge, society-shifting event. But in reality, I suspect there will just be lots and lots of small iterative announcements, and when the NASA press release comes out that there's life on another planet, it will seem really obvious at that point. From now to when that eventually happens, yeah, that's a really big jump, but we'll get there incrementally, not in one big shift. I think the same thing will apply here. Over the long term there will be big shifts in how we deliver products and technology and how we interact with data and information and each other, but it'll probably appear incrementally. Having patience and having a long-term view allows you to drive more of that value incrementally, allows you to experiment more, and allows you to have big goals and iterate your way to greatness. Having that long-term view, I think, is, to get back to your question, one of the most important cultural shifts that organizations will need to make. You're going to need the right teams, for sure, the right talent, the right technology, and you're going to need to partner with the right organizations. But having the ability to take a long-term view, so that you can allow those creative, inventive builders to use that technology to iterate and improve and experiment and invent, requires discipline from a leadership perspective. It requires you to set up small-blast-radius experiments, and it requires organizations to be very tolerant of failure, because when an experiment fails, you've learned something, if you've set it up right, and that learning is disproportionately valuable at this point in the technology cycle. So that cultural element you outlined is absolutely critical. I'd actually say it's more like 50% technical, 50% cultural in terms of the weighting of the elements of investment that are going to be required to be successful. So I'm not sure exactly what percentage right now is ready. If I had to put a number on it, I would say it's probably 25% to 35% in most large enterprises. But over time, if you look 3 years out, 5 years out, 10 years out, whatever it might be, with that long-term horizon, my guess is it's going to be 100%.

Yep, okay. I'm going to ask a follow-up on that, but first: you believe in aliens?

I think you have to believe in aliens if you understand just how big the universe is. It
just seems incredibly unlikely that we have hit the absolute only magical sweet spot in the whole universe to encourage carbon to animate and dance around as we humans do every day. The probability of it being limited to Earth seems very, very unlikely, although I acknowledge the paradox that if there's life out there, you know, where is it? So that's why I lean that way.

Yeah, they could also all be dead, and we might be, right now, the only living... I mean, I think there probably are some sort of life forms out there that have existed in the universe, either before us or that will come after, but to have them exist concurrently, that's the question.

I agree, yeah.

Yeah, go ahead. I just want to get back to the AI stuff. I guess we could do another show on aliens.

I would love that.

Goodness. All right, so what you're saying about patience, incrementality, 25% of organizations being ready, and replacing the boring stuff, that all sounds good, but it also makes me wonder if we're going to end up in a sort of trough of disillusionment with this technology, because there's been so much hype and so much money poured into it that are demanding almost a revolution now, and what you're describing isn't a revolution, or isn't a quick-moving revolution. It might be a slow-moving, incremental sea change, but not something that happens immediately. It's not something that the Wall Street types, for instance, will be thrilled to hear, that it's just going to take a while, because they think in quarters. So do you think there's a risk here, within the next few years, of the public perception of this technology turning a little bit because of the incremental nature of it?

I think it would be a possibility if, and it's a huge if, the technology wasn't poised to improve. If you believe that what we have today is pretty much what we're going to have to work with, with only small incremental improvements over the next three to five years, then I suspect folks will feel like the promise on this occasion hasn't been delivered on. But technology tends to follow an S-curve over time: you get to the top right-hand corner of that S-curve, you end up with the technology at its full capability, and you get decreasing improvements over time. You never really know where you are on the S-curve until you're looking backwards, so it's hard to judge where we're at. I think most people would say we're probably in that middle, high-gradient section, just because there's so much happening, so many improvements, new models and new techniques and new technologies from academia and the public and private sectors, and I have no doubt that by the time we finish this conversation there'll be another technique out there worthy of our attention. But my guess is that it's more likely we're at the bottom left-hand corner. I don't think we've hit the hockey-stick inflection point yet of what this technology is capable of. It's still very, very early. At some point we're going to hit that inflection point. It always happens with technology shifts; it can take more or less time depending on the shift and the speed of the technology, and the thing that triggers the bend in the S-curve is different in a number of different ways. If you look at the maturation of the internet itself, that hockey-stick inflection point, I think, really landed with the development of SaaS-style Web 2.0 applications, whether it was webmail or finance systems or hotel booking systems, whatever it was. The capability of having access to those types of services, and the fact that you could integrate them through APIs and do interesting things with them, meant that every new service added to the internet made all of the other services more valuable, and that's what pushes you up the S-curve in many cases. The same thing happened with the mobile transformation, where we had these remarkable new devices and these applications that more and more people and organizations invested in. They became more and more sophisticated, and over time the operating systems on which those applications ran allowed them to interoperate and interact in interesting ways, both with the operating system and with each other. So every net new application added makes all of them more useful, makes the whole system more useful. The whole device in your pocket gets better over time without you having to do anything, and that pushes you up the S-curve as well. I don't think we're at that point with generative AI yet. We have a really robust set of really interesting, really powerful models which are going to mature over time, but customers will, I'm sure, find interesting ways to combine those different models. There isn't one model to rule them all; each model has different sweet spots, and it's my expectation that most customers will invest not in building the foundation models but in fine-tuning and improving those individual models and customizing them in interesting ways for their own use cases. Those capabilities are interesting in isolation, but part of what will push us up the S-curve, and what we're seeing with customers at AWS and at Amazon, is that combining those models together, leaning into the sweet spot of each, allows you to build systems that in aggregate have a compounding effect on
intelligence. It's not additive; it's a multiplier. And so that's going to push us a little bit further up the S-curve. I think another really interesting area, and the one that's probably closest to SaaS applications and mobile apps, is what you mentioned earlier: agents. I think agents have a good chance of being the apps of the generative AI era, and as we add more of them and find ways to orchestrate multiple agents together, and there are already customers building multi-agent systems on AWS today that combine specialties, agents that can goal-seek on your behalf and collaborate or contest with each other in interesting ways, every new agent added to the system drives you up the S-curve. It makes all the other agents more useful at the same time, without you having to do anything.

But are agents an actual thing in production now?

Yeah, I think so. We have...

I'm just going to say: last year we spoke about, you made an announcement about how agent-building technology was on its way, and a full year has gone by, and I haven't seen one example of a realistic agent going out there and taking action for people.

Well, I've certainly seen some. I use some day to day.

Yeah, for sure.

I'd recommend you check out a couple of things that may be interesting to you and the audience. One is a startup company called NinjaTech; you can check them out at ninjatech.ai.
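The multi-agent orchestration being described, where a coordinator interprets a request and hands it off to specialized agents that can in turn enlist each other, can be sketched in a few lines. This is a toy illustration, not NinjaTech's or AWS's actual implementation: the agent names and the keyword-based routing are assumptions for the example, and a real system would route with a model rather than string matching.

```python
# Toy multi-agent coordinator: a router inspects the request and
# dispatches it to a specialized agent; agents may enlist each other.
from typing import Callable, Dict

def web_agent(task: str) -> str:
    # Would fetch supporting data from the web in a real system.
    return f"[web] collected sources for: {task}"

def researcher_agent(task: str) -> str:
    # A specialist that decides it needs web data and enlists the web agent.
    evidence = web_agent(task)
    return f"[research] summary of '{task}' using {evidence}"

def scheduler_agent(task: str) -> str:
    # Would read and update a calendar in a real system.
    return f"[schedule] booked: {task}"

AGENTS: Dict[str, Callable[[str], str]] = {
    "research": researcher_agent,
    "schedule": scheduler_agent,
    "web": web_agent,
}

def coordinate(request: str) -> str:
    """Route a natural-language request to the best-matching agent."""
    lowered = request.lower()
    for keyword, agent in AGENTS.items():
        if keyword in lowered:
            return agent(request)
    return f"[assistant] answered directly: {request}"

print(coordinate("research quantum error correction"))
print(coordinate("schedule lunch with Alex on Friday"))
```

The point of the sketch is the compounding effect described above: adding one more specialist to the registry makes every request that can use it, directly or via another agent, more capable.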
They have an assistive system you can interact with in natural language, as you may be familiar with, but they also have, under the hood, a set of specialized agents that can perform different tasks on your behalf. They have a researcher agent, a scheduling agent, a web agent, all sorts of different agents. Just by asking your question, they interpret it, and an agent looks at the request and says, hey, this looks like you're doing some research, let me set a problem for my research agent. That research agent runs off and does its thing, and it might say, oh, there may be some web data that will be useful, I'll set my web agent off to collect that data, and so on and so forth. It pulls all the information back together and allows you to interact with your calendar and your schedule or your email, or whatever it might be, at levels which are much more automated than you could achieve with just a standard assistive chatbot.

If one example is so useful, then why hasn't it broken out into the public yet?

I just think it's very early. Today, agent systems, or agentic systems as they're sometimes called, are still relatively early, but I think they are breaking out, to be fair. NinjaTech is seeing remarkable growth; they have hundreds of thousands of monthly active users. We've also built some really powerful and popular agents on AWS. We have an assistant for builders that we call Amazon Q, and Amazon Q allows you to generate code if you're building software. It will take a question and give you answers and guidance on how to build on AWS, and all the things you would expect. That's useful; that gets you a bump in productivity. In terms of the amount of automatically generated code that customers accept, it's usually between 35 and 50%, and it's higher on Q than on any other comparable service. But the thing that really drives productivity for developers is what we call the developer agents inside Q. With the developer agents, you don't just ask a question about what code to write, or write a comment and get the function back. You actually set a task for Q. You say, hey Q, I want to add this feature to my software. Q looks at the software across your repository, it looks at the changes you've made inside your development environment, it understands the type of change or feature you want to make, and it goes off, looks at all of that information, and makes a strategy. It doesn't just generate the code; it makes a strategy for how to add that feature. It picks which functions need to be updated, which modules need to be added, which tests need to be run, which documentation needs to be added. You get a chance to review that strategy, and at some point you can just say, hey Q, go for it, and Q will work diligently through its to-do list to create a set of software changes that you can choose to commit, which add that feature to your code. So if you can imagine a developer going from having to write or generate that code manually to having tens or dozens or, over time, hundreds of those developer agents doing the work on their behalf, you get this combinatorial explosion of productivity. We do the same thing for code transformation. If you want to move between different versions of Java, we support that today. You just say, hey, update this to be compatible with Java 17, or whatever you're running. It will make that same kind of strategy, work diligently through it, and then let you review the results, and you can choose to accept those and commit them back. That's a fixed-cost effort that most organizations have to go through: we need to move software project A from Java X to Java Y, it's going to take 10 people, it's going to take three months, and it's just a cost of doing business. We're going to have to pay that cost in people and in productivity. And this is a task that no developer really likes to do. It's toil work.

The boring stuff that we talked about.

It's boring, exactly, but it's super impactful because there's so much of it. So you move from a world where you have this fixed cost that you just have to pay, a cost center just like we were talking about earlier, to a point where that is taken off the table. It's completed automatically, and those same developers get back to doing things which are much more productive.

So you named it Q because of Q from Star Trek, not QAnon, right?

It's neither.

But what was the inspiration for the Q?

It's based on a quartermaster, the idea of a quartermaster where you get your gadgets.

Okay. You guys couldn't have picked a different letter? It's a very controversial letter these days.

Yeah, I think it'll work out okay.

Okay. We're here with Matt Wood, the VP of AI products at Amazon Web Services. On the other side of the break, we're going to talk a little bit about Amazon's products and also where the models are going next, so stay tuned. We'll be back right after this.

And we're back here on Big Technology Podcast with Matt Wood, the VP of AI products at Amazon Web Services. All right, Matt, so last year we were talking a little bit about Bedrock, which is basically a tool that Amazon Web Services customers can use to develop with AI models, and the idea you explained to me was that Amazon's play for generative AI was that people who want to develop with AI could go in and pick their own models through Bedrock.
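Concretely, picking a model per workload through Bedrock might look like the sketch below, using the boto3 `bedrock-runtime` Converse API. The model IDs and the task-to-model mapping are illustrative assumptions, not an AWS recommendation; the IDs actually available vary by region and account.

```python
# Sketch: map the model to the workload on Bedrock.
# Model IDs and the task mapping below are illustrative assumptions.

DEFAULT_MODEL = "anthropic.claude-3-5-sonnet-20240620-v1:0"

MODEL_FOR_TASK = {
    "reasoning": "anthropic.claude-3-5-sonnet-20240620-v1:0",   # heavy analysis
    "summarization": "anthropic.claude-3-haiku-20240307-v1:0",  # fast and cheap
    "low_cost": "amazon.titan-text-lite-v1",                    # cheapest tier
}

def pick_model(task: str) -> str:
    """Choose a Bedrock model ID for a given workload type."""
    return MODEL_FOR_TASK.get(task, DEFAULT_MODEL)

def ask(task: str, prompt: str) -> str:
    """Send a prompt to the model chosen for this workload.

    Requires AWS credentials with Bedrock access; boto3 is imported
    lazily so the routing above stays testable without an AWS account.
    """
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.converse(
        modelId=pick_model(task),
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]
```

Moving a workload onto a different model is then a one-line change to the mapping, which is roughly the optionality argument being made here.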
uh it could be Facebook's Lambda or Amazon's proprietary models or any host of other models and then they could build uh that way but Bedrock has not integrated open ai's uh GPT models yet or Google's Gemini models yet and I was speaking with someone in the know who was basically like look like what they're offering is not really Choice it's like one model that works well which is anthropics and they're leaving out the other state-of-the-art models which is you know open ai's GPT 40 and then Gemini and ultimately that means that the offering is limited and in some ways behind and I'm curious what you think about that argument uh I would obviously disagree that it's behind I think um you know the the interesting thing about these models is that you know they can they can be very seductive uh when you look at a a model in isolation you know you can you can read the benchmarks and you know tribes are forming around these models and all those sorts of things but uh what we see time and again with u customers Enterprises startups who are actually building with this in uh in meaningful ways is that um they have a huge number of different workloads you I work with uh with some customers and they're very generous and they send me their their road map of all the things across the company that they want to be able to apply generative AI to and it's a spreadsheet of you know 5 600 rows of all the different things that they want to do with generative Ai and you know it's it's kind of um it's kind of intuitive if you if you play that out that there isn't going to be it seems very unlikely that there's going to be a single model that's going to be the best fit for all of those different workloads you know some of those different workloads have different requirements some have requirements that are you know heavy on reasoning or heavy on the ability to be able to do analysis others need to be really good at summarization others need to be really really fast others need to be very 
low cost and so there's there this this multiplicity of use cases that have different uh operational characteristics whether it is intelligence or latency or or cost whatever it might be and customers want to be able to usually map the model to the mission they want to be able to find the right model for their use case because if you have a small number of models or just a single model available to you it ends up having to play the role of kind of a Swiss army knife and a Swiss army knife sounds great it's great in a pinch but in reality you almost never want a Swiss army knife what you actually want is a broad tool B tool belt with all the specialized Tools in there that are a perfect fit for what you're trying to do if a contractor turned up at your home to do some Renovations and all they had was a Swiss army knife I think you'd be pretty disappointed with their preparation probably pretty disappointed with with their with their work with their work quality as well them home that's right exactly same thing with AI models you want to be able to match the right model to what it is that you're trying to do so you can lean into the advantage of that model in whatever it might be now some of those models you really do want you know as much intelligence and as much reasoning capability as possible and uh on Bedrock we make available the uh the anthropic models the particularly Claude 3 and the new clae 3.5 improvements which drive you know not just great A great experience for you know uh uh High Intelligence requirements but are the best performing models out there you know Haiku Claude 3.5 Haiku outperforms all other models on the planet and so uh that's that's great and you also want models which are really really specialized for a specific task and so we're making The evolutionary scale models that I talked about earlier they're available on AWS today and we're going to bring them to bedrock later this year we have summarization models we have models which are 
specifically tuned to build agentic systems, we have models that are specifically tuned for reasoning, we have other models that are just really, really cheap, we have models that are multimodal and can handle different modalities, we have single-modality models, we have large models, we have small models. And time and time again we have seen at AWS, and this is an insight that I think maybe some other providers have not yet had, but because of our background in cloud computing we really recognize the value of optionality for customers. Every single time we have ventured into a new domain, customers have time and again told us that they value the optionality of having purpose-built solutions. So being model-agnostic is definitely a crucial aspect, basically being able to swap in any model. I look at it, I don't think swapping models is quite the same thing; I look at it more like, for each individual use case, you want to find the right model. Well, it's working well. You know, Bedrock is one of our fastest-growing services ever. We have tens of thousands of customers using it today, and it's growing like crazy. And it's really based on this observation that we carried over from our cloud computing work. When we launched EC2, which is our Elastic Compute Cloud, our compute platform on AWS, we launched with just a single compute type in a single Availability Zone. Just one, that was it, that's all you could use. But, you know, because we saw it internally at Amazon, and customers very quickly told us, one single choice was not what they needed, and so today we have over 400 different instance types. But if this choice is working so well, I want to ask you a question I've been meaning to ask for quite some time, which is that maybe it's the limitations of the models on the platform, or maybe
it's the evolution of the models. Amazon worked on, I mean, Bloomberg worked on something called BloombergGPT on AWS, and this is from Ethan Mollick, he's a professor at Wharton who studies this stuff. He says: remember BloombergGPT, which was a specially trained finance LLM drawing on all of Bloomberg's data? It made a bunch of firms decide to train their own models to reap the benefits of their special information and data. Here's what he says: you may not have seen that GPT-4, the old pre-Turbo version, with a small context window, without specialized finance training or special tools, beat it on almost all finance tasks. So I guess I'm curious, from your perspective: is it the fact that you didn't have the right models, or that the models are advancing so fast that something that could take that much effort to train could eventually be surpassed by the next evolution of model from OpenAI? Well, for context, those two models were what, 12, maybe 18 months apart, something like that, and today it looks like models have a shelf life of probably about six months if you're training on kind of open web data. And it's partly why we like working with our friends at Anthropic so much. They are committed to continual and consistent improvement of all of their different models, and, you know, they launched the... Then why would I develop a bespoke model if I could be surpassed by an off-the-shelf model? Well, again, I suspect, I don't know for sure, but I suspect that for general world-knowledge questions you actually do want a model which is trained on world knowledge; that's really useful. But that world knowledge is very, very broad, and it's not particularly deep, and most organizations operate at depth. And so there will be questions, for sure, that you can pose to multiple different models, and for larger, more modern world models I'm sure you can find
examples where they will outperform specialized models. And I am absolutely positive that the inverse is also true: you can find older, smaller, specialized models that will offer much better, higher-quality, lower-hallucination results on specific tasks at the depth that most organizations need. And so it's an and, not an or. If you follow this train of thought where there is a single model that is going to, quote unquote, win, I just think it's self-limiting, because you'll always end up with that being the Swiss Army knife that represents the denominator on your capability, and that denominator is not guaranteed to grow in the depth that most organizations need to operate in. And so world models are great, they're super exciting, you want them, and you want the opportunity to specialize those models and fine-tune them. You want to be able to build your own models, you want to be able to take existing models and continue to train them, you want to be able to layer in your existing data using retrieval augmentation, you want to be able to adjust the alignment and style and tone of these models in interesting ways, you want to be able to quantize those models if you want to run them at lower cost or in different environments. So there's all sorts of value in optionality and all sorts of reasons why you might choose a different model, and so that is a really good example of where an and, of having different models, is a really good opportunity for customers. And you must have good insight into where the next level of models is going to go, I mean, being so close with Anthropic, ear to the ground. There's a lot of expectation that the next set, the GPT-5s, maybe the Anthropic fours, are going to have sort of, I don't know, godlike capabilities; that's what I like to refer to it as on the show. But that's the anticipation. What is the realistic expectation for what's coming next on the model front? I think it's
a good question. I think we'll see a couple of different things. I think we'll continue to see improved reasoning capabilities: the ability to take in larger amounts of data and reason across it with very high accuracy, to answer increasingly complex questions, to apply logic to those questions. We'll continue to see improvement there, and I think that improvement will come iteratively, you know, kind of every six months, and probably much more quickly, because different model providers are on slightly different schedules. So I think those will continue to improve. I also think there is an undervalued asset in the fact that these models will continue to get better, for sure, but you also want to be able to layer in your own data in order to get the model grounded at the right level for your organization. And so the world, as we see things going forward, is that the models will continue to get better, more capable, with more reasoning capabilities, and specialization and customization of the systems built with those models will become increasingly important, and there will be a more sophisticated set of guardrails mediating what the models receive and what they generate on the outside. And so you're going to end up in a world, I think, where you're going to have a set of models which are going to continue to improve; combining those models is going to become disproportionately advantageous; you're going to have a set of data inside your organization, some of which you're going to generate, which is fresh, to fine-tune those models, and some of which many organizations already have, which they're going to use to ground the models in the reality of their business; and you're going to need a set of capabilities that allow you to bring those components together, as well as manage the generative AI applications. And it's those
capabilities that we're focused on building, across the board, at AWS. A lot of stakes have been put on what's going to happen in the next 18 months in this generative AI world. I mean, basically, from my understanding, there's billions of dollars being put into training this next set of models; everything you said definitely implies that, but it's also just that there are going to be companies that live and die based off of their next iteration of model. So what do you think is a best-case scenario, and what is a worst-case scenario, for generative AI 18 months from now? I think that there will be a set of model providers. I don't think there are going to be hundreds of world-model providers; I think there are likely to be maybe a dozen, two dozen, something of that order of magnitude. I think, you know, Anthropic will be one, Meta will be one, Amazon will be one, there'll be others, but I don't think there'll be hundreds of these providers. I think there'll be a small number of providers, and I think over time they will offer, and you see this happening already, a broader family of models which offer different opportunities for optimization. So some of those models will be: hey, the question I am asking is incredibly valuable to my organization, I want to pad it with as much context from my private repository as possible, and I want the best possible answer at any cost; that's how valuable that query, that prompt, is to me. I think there'll be a lot of that. I also think you're going to want to run a set of less capable models at much, much lower cost, and everything in between. And so my guess is that these models will not, you know, commoditize; my guess is that they will diversify increasingly over time, and that the idea that these models will become commodities, defined as, you know, you can hot-swap them and their economics are primarily driven by supply and demand,
yeah, I don't see that happening. And you can see the beginnings of that now, as providers like Anthropic are offering Claude 3 not as a single model but as a family, with the sliders on its configuration moved into slightly different positions, offering three different models within a family. I could see that becoming ten different models inside a family, with a more fine-tunable set of levers around cost and intelligence and capability and latency, those sorts of things. And so I think there'll be a larger number of models in aggregate, but the pool of providers probably won't grow much larger than a dozen or two. Okay, but what is the best-case scenario 18 months from now, and what is the worst-case scenario 18 months from now? Oh, the best-case scenario is exactly what I laid out; that is the best-case scenario. For customers, it offers the broadest possible choice, it allows them, by proxy, to address the broadest number of use cases inside their organization, and, by proxy, to derive the scale which will deliver return on investment commensurate with the value that they're investing. In that case, it doesn't seem like you're anticipating, in a best-case scenario, models that will really be able to dramatically outperform what we have today. No, I think there will be. I think that if you look at the differences between Claude 3 and Claude 3.5, the way that you measure the improvement is going to become increasingly nuanced. Okay. And so today there is, in my opinion, a misguided belief that the king of the hill will basically win, that there's going to be a single winner here. I don't think that's going to be the case, because there is so much value in addressing all of these different use cases. And so the best-performing model today also has a really great cost profile for the intelligence that it
provides; that was part of the innovation between Claude 3 and Claude 3.5. Now, over time, the intelligence will continue to go up, and there'll be different optionality within the spectrum so that customers can find that sweet spot. That's a very interesting idea. By the way, has Amazon put the full four billion into Anthropic? I know there was a promise that that was going to happen, or an upper bound. Yep, we've completed that investment. Okay, so then, worst-case scenario: let's say everything doesn't live up to expectations. You must be game-planning this out. What do we end up with in the worst-case scenario? I think the worst-case scenario has probably two pieces. Number one, and this goes back to what we were saying earlier: we've just mismatched where we're at on the S-curve, and we're actually in the top right-hand corner, and the capabilities of the core technology, the models, the ability for the models to work with data at scale, the capabilities to merge those two things responsibly together, you know, they don't mature and improve at the pace that we expect. I think that would be a disappointing outcome. I think it's pretty low probability at this point, given the trajectory that we're on, but that could be one. And the other, again going back to something we talked about earlier, is that the readiness of organizations slows down the opportunity to deliver on this technology, because they are struggling to manage the change, or they're struggling to really drive reinvention through some of those cultural biases. Yeah. And so I could imagine that playing out, and I think that's at least as large a challenge for most customers: the way in which you structure and organize and drive and deliver and measure, you know,
how exactly you're going to operationalize, from a business perspective, this new technology discovery. That's the worst place. Not every company reinvents like Amazon. This is true; we are uniquely designed for speed, which makes it an exciting place to work. Yeah. Okay, so on that note, and I think we'll bring it home with this one: Amazon AI guy, got to ask about Alexa. I know it's a different division, but maybe there is some collaboration going on. Today, everything I've heard about the limitations of Alexa has been that almost everything you do with Alexa has been sort of hard-coded, so it has, like... Alex, I'm sorry, Alex, you paused, the video feed paused, I didn't hear the tee-up there, so you may want to start again. So, almost everything I've heard about Amazon Alexa has been that the intelligence within Alexa is effectively hard-coded in there, that there are, like, hundreds or thousands of different queries that it's prepared for, and it will respond based off of a database that it pulls from. And there's been a question of whether Amazon is going to move from that style to a more large language model-powered Alexa, which would require effectively a rewrite. And so I'm curious if you think that question is grounded in fact, and what's going to happen inside the Alexa division of Amazon. Well, look, Alexa is, you know, an extremely successful personal assistant and has been well received by customers. We have hundreds of millions of Alexa endpoints out there that customers love to use. What's really interesting about the future of Alexa is that part of the success of Alexa has been that the way Alexa works is that we're very, very accurate at matching the intent of the user to actioning that intent. That may be simple things like telling a joke or getting the weather, or it could be more serious things like smart home use cases. Now, some of
those are turning lights on and off, but some of them are locking and unlocking doors, setting burglar alarms, those sorts of things. And so a really important capability of Alexa is the ability to form that mapping. That is almost entirely complementary to the kind of revolution that we're seeing with large language models today, which allow us to create a much more natural, much more fluid, much more human-sounding, much more intuitive interface to those intents. And so that's what we're working on: we're working on marrying the capability of this remarkable ability to pair an intent to an action with the large language model interfaces that have become very popular, which allows us to unlock entirely new ways for Alexa to provide assistance for our customers. And so I think it's a complementary marriage between the two technologies, and, you know, we're hard at work on that. Is there an LLM in there today? Alexa has, you know, over a dozen machine learning and AI models under the hood, including large language models. And is that going to expand, the LLM use cases within the device? Yes, part of what we're working on is the ability to take more modern LLMs that have this very natural, easy, intuitive back and forth, which is a really important part of building an assistant, and combine that, marry it, with the technical underpinnings which allow us to do this intent mapping under the hood very, very accurately. Now, what's funny, the reason it's complementary is that LLMs today are not very good at doing that intent mapping; they make mistakes, you need to be able to check them, all those sorts of things. And so, you know, LLMs are good at providing that natural language, that very intuitive interface, in ways that are better than Alexa provides today, and we want to take advantage of that. But Alexa today also provides a lot of advantages that LLMs are not good at
doing today. And so, yeah, that's part of what we're working on. And so, is it going to require a full rewrite of the stuff under the hood of these assistants? No, because we want to retain the core capability of Alexa, which is this intent-to-action mapping. Okay, time frame for that? Nothing to announce today. Matt Wood, always great to speak with you; thanks for coming on the show. Thanks, Alex. All right, everybody, thank you so much for listening. We'll be back on Friday breaking down the news, as usual. Also, Matt is about to hit the stage at AWS's New York Summit, so I'm sure you can find the news that he's going to be making shortly after this podcast hits. All right, thank you so much for listening, and we'll see you next time on Big Technology Podcast.