Are 95% of Businesses Really Getting No Return on AI Investment? — With Aaron Levie
Channel: Alex Kantrowitz
Published at: 2025-09-18
YouTube video id: lWJ9getD9dg
Source: https://www.youtube.com/watch?v=lWJ9getD9dg
Why are the headlines telling us that businesses are getting no return on AI investment? And are AI agents finally ready to get to work? We'll cover it all with Box CEO Aaron Levie right after this. Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation about the tech world and beyond. Today we're going to talk about AI and its application in business, whether it's actually making a difference, and whether AI agents are a real thing. And we have the perfect guest to do it, because we have Aaron Levie back with us, fresh off the Box Works AI event. Aaron, it's great to see you as always. >> Thank you, Alex. Good to be here. >> So, did I just add "AI event" to Box Works, or is it just called Box Works? >> I like you calling it an AI event. It is just called Box Works, but anytime you want to jam AI in there, we're good. >> Okay, sounds good. You had a lot of AI news; we'll get into that in a moment. But since you're talking with a lot of folks about AI applications in business, I want to run this MIT study by you and get your perspective on what's real and what's not. This is from Axios, a couple weeks ago: "MIT study on AI profits rattles tech investors. Wall Street's biggest fear was validated by a recent MIT study indicating that 95% of organizations studied get zero return on their AI investment." They studied 300 public AI initiatives, trying to suss out the no-hype reality of AI's impact on business. 95% of organizations said they found zero return, despite enterprise investment of $30 billion to $40 billion in generative AI. This is a study that everybody in the business world is talking about. Do you think there's any validity to it? You're already shaking your head. >> I'm shaking my head on about seven dimensions. We can parse each one. >> Let's do it.
>> Maybe the first one, which is maybe the funniest, is the Wall Street element. Wall Street is actually completely split on this dimension. Obviously a report like that scares them on one dimension, but there's an equal amount of frenetic Wall Street energy around the idea that AI will be so good that all software is dead. So it's this very bipolar state: where are we in AI adoption, versus AI is going to be so powerful that there won't even be software business models because everything will just be delivered by AI. And as with most things that have these extreme polarization elements, I think the reality is just way more nuanced. We are still early in the adoption curve of AI. In the early curve of all of these types of technologies, you have lots and lots of proofs of concept, you have lots of trials of different technologies, and people are trying to figure out which tool works for which use case. So by definition you're in the Wild West, where there are lots of attempts at trying these technologies with various vendors and technology stacks, and many of those projects and pilots will absolutely fail, because by definition they're pilots and we're still in the early phases. One interesting thing about the study was that they saw a significant delta between companies that tried to effectively DIY their AI stack versus companies going with really applied solutions and use cases. And this is what we tend to find in our customer base. I think there was maybe an initial theory of: AI will be relatively easy to get our arms around. We could build our own AI application. We'll do all of the vector embeddings of our data ourselves.
We'll put it into a vector database. We'll manage the security and permissions of data access ourselves. And before you know it, a company that wanted to deploy AI in a particular workflow might have 10 or 15 different pieces of software to run and manage before a single user could actually interact with AI within that organization. That's probably an architecture that's not going to work. You need purpose-built solutions that solve tailored use cases. Those can be very big use cases, like all of AI coding, but you probably don't want to be in a position where you have to bootstrap this or build it all out yourselves. And that was one of the recognitions in the survey. But I obviously wholeheartedly disagree with any of the conclusions other than: you have to get your use cases right, you have to target the most effective areas for AI, and you probably shouldn't be building this technology yourselves. And it's empirical on our end. We get to talk to customers every single day that are seeing the immediate gains. We've talked to customers whose colleagues can't actually present the real ROI savings to their board, because the board won't believe the numbers based on how good they are. They actually have to water the numbers down so they're more pragmatic and believable. >> Isn't that a terrible board? I mean, if the board can't hear the truth, get a new board. >> But the truth is so good that it sounds incredible.
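To make concrete why Levie calls the DIY architecture heavy, here is a minimal, purely illustrative sketch of just one of those 10-or-15 pieces: the retrieval step that sits in front of the model. Toy bag-of-words vectors stand in for a real embedding model, and every name here is made up; this is not Box's or the study's code, just a sketch of the pattern.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words term-frequency vector.
    A real DIY stack would call an embedding model here instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """The vector-database step: rank stored documents against the
    query and return the top k. A real deployment would still need
    permission filtering before any text reaches the model."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "invoice 1042: total due 5,000 USD, net 30 payment terms",
    "employee handbook: vacation policy and company holidays",
    "master services agreement with an auto-renewal clause",
]
print(retrieve("what are the payment terms on the invoice?", docs))
```

Even this toy version shows the point: embedding, storage, ranking, and permissioning are each a separate system to build and operate before any user sees a single AI answer, which is the gap the survey's DIY-versus-applied delta reflects.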
>> That's the situation: when the ROI is so good that you aren't going to be believed when you explain how the thing is going to work. We're seeing examples all across the board, at least for our customers. We have the benefit of a very applied use case: we take documents and unstructured data, and then we have AI agents that can operate on that data to do things like extract structured data from your documents. Give us 100,000 contracts and we'll pull out the structured data fields in those contracts. Or give us invoices and we'll pull out the key details so we can help automate a workflow. Those use cases tend to be very high ROI, because either you weren't getting that data before or it used to be very expensive to get. And AI is getting increasingly good at executing that kind of task. So there's immediate benefit to customers: you can automate workflows much more easily, and as a result you can lower the cost of operations in some areas. We tend to see a different set of outcomes from AI adoption within our customer base, but if you zoom out and think about all projects across the past couple of years, I do think you're going to get a mixed bag, just as a reality of how early we are in the space. >> Yeah. And it says internal builds fail at double the rate of external partnerships. So, spot on there: people trying to piece this together on their own versus doing it externally are having a tough time, which sort of flies in the face of some of the conventional wisdom. I think the conventional wisdom was that you wanted to build internally, maybe with open source, so you could customize to your use case. But it turns out some of the off-the-shelf stuff is actually working quite well.
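The document-extraction use case Levie describes above ("give us invoices and we'll pull out the key details") can be sketched as a tiny rules-based baseline. This is an illustration only, under the assumption of one fixed invoice layout; the field names and regexes are made up, and the whole point of his argument is that AI agents take over precisely when layouts vary too much for fixed patterns like these.

```python
import re

def extract_invoice_fields(text):
    """Pull a few structured fields out of free-form invoice text.
    A hand-rolled baseline: each field needs its own pattern, and
    any new invoice layout breaks it, which is why this task gets
    handed to an AI model in practice."""
    fields = {}
    m = re.search(r"invoice\s*(?:no\.?|#)?\s*(\d+)", text, re.I)
    if m:
        fields["invoice_number"] = m.group(1)
    m = re.search(r"total\s*(?:due)?:?\s*\$?([\d,]+\.?\d*)", text, re.I)
    if m:
        fields["total"] = float(m.group(1).replace(",", ""))
    m = re.search(r"due\s+(?:date|by):?\s*(\d{4}-\d{2}-\d{2})", text, re.I)
    if m:
        fields["due_date"] = m.group(1)
    return fields

sample = "Invoice #1042\nTotal due: $5,250.00\nDue date: 2025-10-15"
print(extract_invoice_fields(sample))
```

The ROI comparison Levie makes falls out directly: the structured dictionary this returns is either data you were not capturing at all, or data someone was re-keying by hand.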
>> Yeah. A lot of the challenge with these types of surveys, or even with talking about architectures, is that you have to separate the tech industry from the non-tech industry: the non-tech industry being the consumers of these types of technologies, and the tech industry being the builders. Open source is insanely valuable, but not in the sense that a law firm should go off and build its own AI project on top of an open-source model. That is just a recipe for disaster if we think every single company on the planet is going to build its own technology to automate its workflows. And that has actually been the case for a lot of pilots, because we've been early in the technology and you haven't had applied solutions you could just deploy. But open source is extremely valuable for a company like Box, because we're powering technology for 120,000 customers, and so we actually do have the expertise internally to leverage those kinds of capabilities. So the conclusion on the open-source dimension is just that you probably shouldn't expect every company on the planet to DIY its own AI strategy; that's a recipe for not getting the returns and gains from an AI adoption standpoint. And then the final thing I'd point out is that there really is a decent amount of change management required to get real gains from AI. This is not a panacea type of solution where you can take an existing workflow, drop AI directly into it, and all of a sudden that workflow is 3x better. You usually do have to re-engineer the work to take advantage of AI.
And the conclusion I've come to more and more recently is this: I think we had a feeling two or three years ago that AI was going to learn everything about how we work, that it would adapt to our workflows and then bring automation to them. Realistically, and increasingly, we will probably have to modify our work, hopefully incrementally but in some cases meaningfully, to fully take advantage of AI. That sounds hard on one hand, but for the companies that do it, the ROI is going to be fairly massive. If you think about AI coding as maybe the most obvious example right now where you're seeing productivity gains, the way AI-first engineers tend to work is pretty different from how you engineered two or three years ago. The engineer really becomes more of a manager. You're deploying agents to go off and work on large parts of the codebase, and then each comes back with a bunch of work that you review. So if you don't change your workflow as an engineer to take advantage of background agents, how you give them the right kinds of prompts to execute on their task, the new ways you should think about your codebase, and how you handle the specifications and rules of what the AI agent should do, you're probably not going to get a 2x or 5x gain from AI. So we will actually have to re-engineer some of our business processes to make agents effective, as opposed to thinking agents will just drop into our processes and automate everything we're doing. >> By the way, you've brought up pilots a couple of times, and I think it's important to talk about, because this study was not just pilots. It was 95% of organizations get zero return on AI investment. I think the pilot thing is interesting because it's natural that pilots are going to fail.
And in fact, some listeners have given me feedback on this, because I talk often about how only 10 to 20% of AI pilots get out the door into production. And that might be a good number, because you're obviously going to have some trial and error in the early days. >> Yeah. And to be clear, I'm using "pilots" colloquially, in the sense that we're just so early in the technology that when we talk to customers, a lot of what they have deployed so far is the equivalent of a pilot, just because of literally how... >> Organization-wide? >> Yes. Well, organization-wide is hard for one centralized survey taker to represent. That's why, again, I don't want to knock the survey. It's great, it's an interesting conversation starter. But if you actually tried to assess how each respondent is answering the question, what their way of measuring that productivity is, and whether they've actually surveyed all the end users who are just using ChatGPT in an unsanctioned way and what they're doing with it, it's not possible to capture all of that. So the survey tends to represent the centralized, more likely pilot-oriented projects, because of, again, how early we are. The word "agents" just came onto the scene less than a year ago. So we're just early in a lot of these spaces. But again, I think it's a fantastic survey because it gets a conversation going.
But I think if the takeaway was to slow down on using AI, or to do anything other than figure out what you should mitigate from a risk standpoint, then the problem is that all it's going to do is cause some companies to move even more slowly while other companies outrun them. So the risk is now on the listener to decide what to do about that survey. >> Yeah. And I can tell you one more thing I found super interesting about this study, which has been somewhat underappreciated. It says official LLM purchases cover only 40% of firms, yet 90% of employees use personal AI daily, at least among those surveyed. That is so interesting, because it means there's more personal use, and more interest among individuals than among companies, in getting this stuff into production. You obviously have a reaction here, so let's hear it. >> Well, I just think that's empirical revealed preference. You don't even have to run a survey once you know that. Why are people going off and using AI for personal productivity at that rate? Because they're getting value from it. That is now sort of baked into the baseline of how people work. It's unquestionable that if you just eliminated AI today, you would notice: wow, I actually have to go do that three hours of research that I used to be able to kick off as a deep research project and check back in on after five minutes. So it's empirical that we're choosing to use these technologies on a daily basis, because they're adding that productivity.
I would argue that what we've seen with AI thus far is barely scratching the surface of what is going to start to happen as you deploy these technologies. >> But do you think the use in business could potentially just be individuals using, let's say, ChatGPT on their own, versus scaled enterprise use of large language models? Or do you think it will be some blend in the future? You're obviously watching this happen on the other side of things. >> The future is... I think we are in the earliest phases of even the diffusion of the technology itself, of the basic use cases. Hey, when you're going to research a customer, why not get a full account plan? Instead of just saying, okay, this person works at this company, they're interested in these things, and these are the trends in that industry, why not ask an AI system to generate the full plan? That's super powerful, but also relatively basic if you think about how people work and the full scope of workflows that people do. One really interesting example of, again, how early we are: Claude this week announced a new capability that will generate files for you. Even though we're nearly three years into the ChatGPT moment, it's the first time, I believe, that an AI system can reliably generate a high-quality document in the form of a Word document or PowerPoint presentation. So we're nearly three years in, and it's the first time ever that you could generate something you would look at and say, "Oh, that looks like a good presentation." We are only at the very beginning stages, and it'll still take a couple of years.
Now imagine a technology like that begins to ripple through corporations. In the future, before you go and present whatever product you're selling to a customer, instead of spending one or two hours doing a bunch of research and making the PowerPoint file that is your presentation, you go to an AI agent and say, "I'm about to go sell to this customer; generate this presentation for me." You kick that off, and three minutes later it's done for you. This is going to show up in all of our workflows, every single day, in almost everything we're doing. Coders are getting the first lens into what the future looks like, earliest because they're wired to take advantage of these tools and AI coding has been the first breakout use case. But that same dynamic, where you go to an interface, talk to an agent, and it goes and executes multiple steps of work for you, will start to emerge within all of knowledge work over the coming years. I'm probably a pragmatist in the sense that it will not be an instant, overnight transformation of work. It will take years of change management. We just hosted our conference this week, as you noted, and it happens to be a crowd that is by definition forward-leaning, early adopters of technology, but that represents a small fraction of the total economy. It will take years before all of the banks, all of the pharma companies, all of the law firms get wired up in this AI-first way. But unequivocally it's going to happen, and there's nothing that will slow that train down. >> All right, let's talk a little bit more about this using-Claude-to-generate-documents use case.
The example you gave was using one of these to go in and sell to a client. Now, I would imagine most organizations have their PowerPoint templates with the data baked in. So even if I were to go into Claude, upload my pricing spreadsheet, my inventory spreadsheet, and a document about positioning, and say, "Make a PowerPoint based on this," I'm sure it would do a good job. But how practical is it to say this is going to be the way people do their work, versus something that might look like a party trick, given that when you actually go out to market you're going to use the documents you already have? >> Oh, the way this will actually show up, and I can't represent the exact date it will happen, is that you'll just go to Box and say, "Here's my sales presentation template, here's the new client information, please generate a PowerPoint presentation with that." You'll do that with your existing data. This is not some kind of one-off vibe-coded document; you will use your existing assets as the source material for the next document you generate. You'll go and review its work, and that'll take you three minutes, but it will have saved you an hour or two of all the time it took to do the customer research, move around all the graphics, and put the relevant information in place. That will just be done for you. Multiply that over a million people doing it per day in some sector of the economy, and that's how you get tens of millions of hours of productivity gained within the economy. >> And how are you feeling about the trustworthiness of these models?
Because you've talked a couple of times now about how you could use deep research to prepare for something, or use these models to generate a PowerPoint, and then spend a couple of minutes checking the output over. Are you at the point now where you think the outputs of these models are trustworthy enough that that's all it takes? >> I think, and this is where I get very excited about what's in the zeitgeist now, context engineering: as long as you are really good about what context you're giving the AI, and how you are effectively grounding the AI in trustworthy data with the right kinds of prompts and a high-enough-quality model, you can nearly eradicate all, or at least the vast majority, of hallucinations and accuracy issues. In our case, everything we do at Box treats your existing data as the source material for the AI agent; it's the source context that makes the agent effective. So if I take an existing PowerPoint document that's our sales presentation and say, "Modify this for a new customer," and you do that with a frontier reasoning model with some degree of thinking mode, I would posit that 99% of the time it will make only infinitesimally small errors or failures. That's just a solved problem at this point. And it is still easily worth the five-minute trade-off, against the couple of hours you save, to go and review its work. And we actually have this incredible front-row seat in watching what the future looks like with coding.
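The grounding pattern Levie describes, feeding the model trusted source documents as its context, can be sketched as a small prompt-assembly function. This is an illustrative sketch only: the function, the source names, and the prompt wording are all assumptions of this example, not Box's API or any particular vendor's prompt format.

```python
def build_grounded_prompt(task, sources, max_chars=4000):
    """Assemble a prompt grounded in trusted source documents.
    'sources' is a list of (name, text) pairs; text is truncated
    to fit a rough context budget. A real system would also rank
    and permission-filter sources before they get here."""
    context_parts, used = [], 0
    for name, text in sources:
        snippet = text[: max_chars - used]
        if not snippet:
            break  # context budget exhausted
        context_parts.append(f"[{name}]\n{snippet}")
        used += len(snippet)
    context = "\n\n".join(context_parts)
    return (
        "Answer using ONLY the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"=== SOURCES ===\n{context}\n\n=== TASK ===\n{task}"
    )

prompt = build_grounded_prompt(
    "Adapt our sales deck for the new customer, Acme Corp.",
    [("sales_deck.pptx (extracted text)", "Slide 1: Why choose us..."),
     ("acme_notes.md", "Acme is a logistics company, 2,000 employees.")],
)
print(prompt[:120])
```

The design choice this illustrates is the one Levie is making: accuracy comes less from the model alone than from controlling exactly which data the model is allowed to reason over.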
>> So go talk to the brand-new startups. I don't know if you do this, but go talk to a five-person startup that's brand new, and what's exciting is that they are working in the craziest ways I've ever seen in my entire life. I was talking to a nine-person startup the other day that estimates they're executing, at a minimum, at the size of about a 100-person company, and that was probably conservative when you do the underlying math. It's because each of their engineers now has the output capacity of five or 10 or 20 engineers' worth of work, but they are working in a completely different way. They are managers of AI agents. They spend their time writing really good specs for what they want to build, they spend real time on the design architecture of their software, and then they spend a lot of time reviewing the output of the agents. Now, not every area of knowledge work will look exactly like that. But imagine sales, imagine marketing, imagine legal work, where your role is to manage agents doing a lot of the underlying data preparation, research, and creation work, and your job is to review that work and put it together in a broader business process. That will actually be what a lot of work looks like in the future. And this idea of hallucinations or errors will be no different from the fact that I sometimes have to review other people's work and other people review mine: I make errors in the presentations I create, and somebody catches a misspelling, or sees that I changed a customer's name the wrong way, and fixes it. We will be doing that for AI agents.
>> So it's this flip of the model: where we thought AI agents were going to review our work and incrementally make us more productive, we will be the reviewers of the AI agents' work. We will be the editors, the managers, the orchestrators. And that's actually how you get the productivity gains. So I'd say: watch the AI coding space, watch what startups are doing to get leverage, and then think about that against the broader economy. >> You know, it's really interesting, Aaron, because the last time we spoke, you told me about a person you knew who was basically building a company on their own using AI coding tools. And I was in the process of writing this profile of Dario at Anthropic, which you're quoted in, and I went out and found a developer doing something quite similar, using Claude Code to build on their own. This is clearly happening, to the point where Anthropic now has to put some rate limits on. This is clearly a thing. >> Well, and this is the thing. Again, I love the MIT survey. I think it's great, it's a fun conversation topic. But the one travesty would be if people miss that what you just said is actually happening on the ground, and don't start paying attention to what that's going to mean as it ripples through corporations, and to how people should start thinking about re-engineering workflows for a world of AI agents. This happens in every single technology wave, which is actually why you have early innovators and early adopters and why you have laggards. The early adopters and innovators are going to read your Anthropic piece and see, oh, this actually is a real trend. And the laggards are going to read the MIT piece and say, "Oh, I've been vindicated." And some companies will then get those early returns at a much faster rate.
And other companies can wait. Sometimes that means your company gets disrupted, and sometimes it doesn't, because you actually have some proprietary capability as an organization. If Pfizer or Eli Lilly took a little bit longer to adopt AI as a result of being more pragmatic, that'll be totally fine. They're not going to get disrupted; they have enough market position, they have enough distribution, they can afford to wait for this technology to be more baked. But if I'm a startup right now, I'm going to use that as my advantage as much as possible, to try to run circles around a larger incumbent. And this is what creates the nice tension in the market, the creative destruction in every wave of technological change. >> Okay. I definitely want to talk more about what the definition of an agent is and how you're rolling them out at Box, and also get your reaction to GPT-5. So let's do that right after this. And we're back here on Big Technology Podcast with Box CEO Aaron Levie. Aaron, before we get into agents and GPT-5, let me start with a basic question. If this is already happening in business, where you're finding ways to get the AI to do work on its own, pull information from different data sources, and present it coherently, why do you think it's been so difficult for consumer companies, say Amazon with Alexa Plus and Apple with Apple Intelligence, to put this together as something on-device, or as a consumer product that does similar things? They've all promised it, but it's not quite there yet. >> Yeah, I think the fact that the technology can exist is different from the execution requirements to bring it to life.
We all get a front-row seat on what the frontier models can do, and you have companies that can package those up for these applied use cases. But if you're a company with tens of millions or hundreds of millions of users, and consumers with a certain expectation, there is a lot of execution gap between the frontier model and delivering it to your end customer in a way that is reliable, trustworthy, and affordable. So I think the bigger companies are all going through their own version of that motion. I'd also imagine that, given the space is moving so fast, there's probably some understandable indecision, where one day a model is on top, the next day a different model is on top, and another day yet another model breaks through. You probably want to make sure that by the time you land on a final architecture, it's the sustainable long-term architecture. So to some extent time is on your side, up to a point, because you might want to wait and see who falls out and who keeps going. But I don't think those spaces have been so utterly disrupted that the companies you just mentioned can't catch up once they land on a final architecture. We'll have to see how they execute through this. >> And so for business, it's more that there are more prescribed use cases? And with a phone, maybe if you're trying to get these proactive notifications, you're looking at a massive universe of data, whereas you're more concentrated in business. Or what's the difference? >> Well, actually, I wouldn't say there's a difference. I would say even in business, we're insanely early. We have to process how early we are.
The breakouts so far have been ChatGPT for consumers; coding agents for very wired-in engineers who are very online and paying attention to everything going on; and then early adopters across the economy. Most of the agents being deployed in the enterprise are being done by that early group. Maybe you can flash it up on screen: Geoffrey Moore came up with, or at least popularized, the idea of the technology adoption curve. It has multiple categories of where a company or a group of individuals will sit. You have the early innovators and early adopters. Then you have a chasm. Then you basically have the pragmatists and the early majority, and then you have the laggards. And we are in the early-adopter phase, the earliest phase of jumping over the chasm on some use cases. But we have to remember this chasm exists. The early adopters, the people we all hang out with and talk to all day, will try everything. We'll try the crazy goggles, we'll put magnets on our heads, we'll do the craziest things, we'll wear Google Glass. And that actually tells you almost nothing about whether the thing will jump over the chasm. You have to see what makes it to the early majority, those pragmatists who really adopt things at scale. The kinds of technologies that have clearly broken through are ChatGPT, products like Cursor, and a bunch of these next-gen research-agent type things; Perplexity has done well with that kind of early majority. But we are so early in terms of AI agents jumping over the chasm now. Some won't make it; some will. But I would say that business is not moving particularly faster than the examples you just gave.
I just think we can see lots of examples of it, but they're usually in that early adopter category. >> Right. And so the week we're talking, you at Box are releasing a number of different agents. But let me start this discussion by just asking you: what is an agent? Because it does seem like an overused term, and even myself, who's in this all the time, I don't fully have clarity on what that word actually means. >> I think we should anticipate that it's fully overused. It is now the term of art for talking to an AI system that is doing work for you. This will be the main term that we use going forward as an industry, and not just because it's a buzzword; it's actually a useful term. It's a definable object that is doing automated work for you. That could in some cases be as simple as answering a question, but I think most people in the tech industry would generally argue that it should be doing some degree of work, and looping through the AI model multiple times to do that work. So that could be everything from, very clearly, something like Claude Code or Cursor or Replit, which have agents where you give them a task like "build me a website that has these qualities" and they will go off and do weeks' worth of human work in ten minutes. That's an agent that is managing that whole process: looping through the model multiple times, keeping track of what it's doing, updating its memory in the process. So that's an agent in coding, and we're going to see that same agent architecture emerge in law, in healthcare, in finance, and in education, where you can deploy agents to go off and do work for you. And there will be a critical axis, which is how much work can the agent do before you have to intervene, modify, and repoint it in the right direction.
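The loop Levie describes (give the agent a task, call the model repeatedly, keep track of what it has done, stop when the work is finished) can be sketched in a few lines of Python. Everything here is illustrative: `call_model` is a hypothetical stand-in for any real LLM API, not a Box, Anthropic, or OpenAI interface.

```python
def call_model(task: str, memory: list[str]) -> str:
    """Placeholder for a real LLM call; returns the next step, or 'DONE'.

    A real implementation would send `task` plus `memory` to a model API
    as context. Here we fake it: finish after three steps.
    """
    return "DONE" if len(memory) >= 3 else f"step {len(memory) + 1}"


def run_agent(task: str, max_turns: int = 10) -> list[str]:
    memory: list[str] = []            # running record of work done so far
    for _ in range(max_turns):        # loop through the model multiple times
        step = call_model(task, memory)
        if step == "DONE":            # model signals the task is complete
            break
        memory.append(step)           # update memory with the latest step
    return memory


print(run_agent("build me a website"))  # → ['step 1', 'step 2', 'step 3']
```

A real agent would also execute tool calls between model turns; this sketch shows only the control loop and the memory update that Levie is pointing at.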
And a lot of that work right now can be maybe a couple minutes long, but we're seeing examples where agents could be running for tens of minutes or maybe even hours, and effectively drive better and higher-quality output. So I think that's a way to think about agents, and these are going to be very pervasive in the coming years, but 2025 is really the first year where we could even be talking about it seriously. Andrej Karpathy probably phrased it best: we shouldn't think about this as the year of agents, we should think about it as the decade of agents. That's probably the right way to think about it. >> The year of mobile became the decade of mobile. But then eventually we started using mobile. >> Yeah, but again, when did the year of mobile matter? Some people said that was in 2002, but it wasn't really realistic until 2006 and 2007, when you had the iPhone. I, and I think many other people, are actually convinced we already have our iPhone for agents. We don't need any kind of new breakthrough architecture. We have an architecture that already works as the core scaffolding for agents. So we can start the decade clock now. But it will be a full self-driving type of problem. Waymo got kicked off, I don't know, a decade or a decade and a half ago, and only this year is it accessible in suburban Silicon Valley.
So what took a decade or a decade and a half was just lots of engineering work, lots of miles on the road, lots of improving every single dimension of the accuracy and intelligence of the system. We're going to see the same thing for knowledge work. It's going to take years. The early adopters will get the early returns. The pragmatists will use it once it works without a lot of handholding, and everybody will land somewhere in the middle of that spectrum. >> Okay. And so I watched a chunk of your presentation this week, and some of the agents that you're talking about enabling companies to deploy will be things that will, for instance, take a look at an application to rent an apartment, or look at some property records and then do tasks there, or create reports looking at clinical tests and trying to pull out issues. So talk a little bit about how the process to create these works, and is this still in the demo phase or is this actually real? >> So maybe the second question first. We made a number of big announcements this week. Some of the products and capabilities that we announced are fully GA right now, so customers can already start to use them. With some of it, we gave a little bit of a crystal-ball view into the next couple of quarters of product that we're getting out there. As an example, we have an AI agent right now that any customer can go and use, which is a data extraction agent. So you can give us contracts or invoices or medical data, and we have an AI agent that works through that content, pulls out the critical data from those documents, and then lets you go and automate a workflow around that. What we announced at BoxWorks was a new capability called Box Automate.
And the idea of Box Automate is this: it's very powerful to have one-off agents that can help you review a document, or generate a proposal, or generate a sales plan for a client based on data. That's super powerful. But what's even more powerful is if I can drop many of those agents into a full business process. So what Box Automate lets you do is define your business process within Box. It could be a client onboarding workflow. It could be an M&A due diligence review process. It could be a healthcare patient review process. You define that workflow within Box Automate, which is a drag-and-drop workflow builder, and then at any point in the process, you can bring in an AI agent to do work within that process. And one thing that is very important with AI agents is that they need the right context to be effective. So our system allows you to get that context to agents from your enterprise content. Your marketing assets, your research data, your contracts, your invoices: all of that becomes very important context for agents. So Box Automate lets you build these agents on demand, or on the fly, in a workflow that leverages your existing content, and then we can start to help you automate a bunch of knowledge-work tasks around the enterprise. >> Now, a lot of the early reviews of GPT-5 said it was sort of built to do these types of things, like a foundational layer for this type of work, right? The reviews we read early on said that it just does stuff, and people have noticed that when you're in ChatGPT using GPT-5, you literally can't get an answer where it doesn't ask whether it can do something for you. So I'm actually curious what your response has been. The last time we spoke was pre-GPT-5.
So what has your feeling been about this new set of models, and it really is a set of models, and I'm curious what you make of the fact that so many people were disappointed early on. >> Well, on the disappointment, or the online zeitgeist, which interestingly has already shifted quite a bit: a lot of folks have updated their views on GPT-5, and I think Codex has come out very strong recently on the agentic coding side. We have gotten used to, and been hooked on, these incredible jumps and breakthroughs over the past year or so. If you think about it, we went from GPT-4 to GPT-4o to o1 and o3 and then GPT-4.1, and each of those, on a different axis, was actually a pretty meaningful step function. So if you had just taken GPT-4 and then jumped to GPT-5, it would have looked insanely exponential. But we got these points along the way that effectively gave us an early preview into what GPT-5 would ultimately become: a thinking model with chain of thought, with a way higher quality of coding skills, and a bunch of capabilities on critical dimensions of work. So I think it was mostly just driven by the fact that we got lots of incremental steps, or step-function steps, on the path to GPT-5, and then GPT-5 was just the culmination of a lot of those breakthroughs. So again, I think it's probably more psychological than empirical. If we had gone straight from three to four to five, it would be the most vertical axis we've ever seen. But it was really those steps along the way that maybe caused a little bit of that reaction.
In our world, you know, we test every single model on a number of evaluations where we give the model different types of enterprise data: contracts, financial documents, research materials, internal memos, those types of things. And we ask the model a series of questions about that document or data. We saw meaningful improvements from GPT-5 versus GPT-4.1, as an example, on our evals. So for us it was multiple points of improvement on a number of our key tests. And those improvements then translate into real-life improvements for customers, where all of a sudden, when you're a healthcare provider using GPT-5 on unstructured healthcare data, you're going to get better results than you got before, or when you're using it on your contracts, you're going to get better results, and so on. In a number of spaces where expert analysis was required, in healthcare or law or financial services, we saw improvements, and in a more general sense, if you needed logic or reasoning or math, it was also an improvement on those dimensions as well. >> Can I get a quick gut check from you on the economics of the AI industry right now? We are talking at a moment where, as we just discussed on the Friday show with Ranjan, OpenAI's cash burn is now going to total $115 billion through 2029, $80 billion higher than it previously expected. It's expected to make about $10 billion this year, but it just signed a $300 billion deal with Oracle that turned Oracle into a nearly $1 trillion company almost overnight and made Larry Ellison the richest person in the world, above Elon Musk. How does this make sense?
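The eval pattern Levie describes above, a fixed set of questions over an enterprise document, scored per model, can be sketched in a few lines. Everything here is invented for illustration: `fake_answer`, the questions, and the scores are hypothetical stand-ins, not Box's actual eval suite or results.

```python
# Question / expected-answer pairs drawn from one hypothetical contract.
CASES = [
    ("What is the contract term?", "24 months"),
    ("Who is the counterparty?", "Acme Corp"),
    ("What is the renewal notice period?", "60 days"),
]


def fake_answer(model: str, question: str) -> str:
    """Stand-in for a real model API call; the older model misses one case."""
    if model == "older" and "renewal" in question:
        return "30 days"
    return dict(CASES)[question]


def accuracy(model: str) -> float:
    """Fraction of questions the model answers exactly right."""
    correct = sum(fake_answer(model, q) == expected for q, expected in CASES)
    return correct / len(CASES)


print(f"older: {accuracy('older'):.2f}, newer: {accuracy('newer'):.2f}")
# prints "older: 0.67, newer: 1.00"
```

The "multiple points of improvement" Levie mentions is exactly this kind of per-test accuracy delta, aggregated across many documents and question sets.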
Well, I think it makes sense if you believe, like I do, and certainly others do (Jensen, you know, clearly Sam, I think even Elon), that this is the single biggest technology we've probably ever had access to. Think about this as sort of a third industrial revolution, where for the first time ever, we can bring automation to knowledge work. Just think about that for a second: we are bringing automation to knowledge work. Everything about the world of knowledge work was always basically limited by how fast we as humans could work. We could type into a computer, put data into a system, somebody else reads that data, it moves along in some kind of process. The speed of knowledge work was how quickly we could type or read information and then do something in the real world with that data. That was the pace knowledge work could happen at. And every field of knowledge work that we know of, healthcare experts reading medical diagnoses, life sciences experts doing research on clinical studies, lawyers trying to find facts about a case or working through intellectual property, an engineer trying to generate code and read product specifications, all of that work has always been constrained by how fast we as individuals can do that work ourselves. For the first time ever, with AI, we can bring automation to effectively all of that work. And that automation can be tuned based on just how much compute we throw at the problem, and then of course how good our data is and how effective our systems are at getting that data to the AI. So we're in a world where you can toggle compute and get different levels of automation and effective output, with work getting done at a way lower cost than what people can do.
That is the biggest breakthrough we've ever had in the economy, in the post-industrial world. And so a hundred billion dollars of loss, let's say, to get to that point of saturation where the technology is out there: that's actually a very small number when you think about the size of the economy across all of healthcare, all of law, all of life sciences, all of financial services, all of engineering. So I think that's how these technology companies are underwriting this. And the losses are a choice, to be clear. That's very obvious: they are choosing to lose that money. They're doing it for a strategic reason; at least that's their decision. The strategic reason is that this is such a valuable market to own and dominate that they would rather build up capacity, and in many cases subsidize usage, say in free consumer tiers of ChatGPT, than charge for everything at today's cost and make sure everything is profitable. That's a choice. They could decide to charge for everything; they would get less adoption today, and it would instantly be a more sustainable business. But enough people believe that the prize is big enough that it's worth doing all of the research spending, all of the data center spending, and the subsidization where necessary to drive that adoption and demand. And it's a go-big-or-go-home type of bet. Clearly very smart, very economically rational firms, individuals, and sovereign wealth funds believe that bet is worth it.
I'm probably on the side that the bet is worth it, again because of how material an economic impact this technology can have, and then we'll see how it plays out for any individual player in the space. >> Folks, you can learn more about Box's offerings at box.com. There's a video playing on the homepage right now that talks a lot more about the things that Aaron and I have discussed here today. Aaron, so great to see you. Thanks again for coming on the show. >> Thanks, Alex. >> All right, everybody. Thank you so much for watching. We'll see you next time on Big Technology Podcast.