NVIDIA VP Rev Lebaredian Talks Plan To Build AI That Understands The Real World
Channel: Alex Kantrowitz
Published at: 2025-02-05
YouTube video id: n_A1Nf7mjjA
Source: https://www.youtube.com/watch?v=n_A1Nf7mjjA
Let's talk about Nvidia's push to build AI that understands the real world, with technology that can influence the future of robotics, labor, cars, Hollywood, and more. We're joined by the company's VP of Omniverse and simulation technology right after this.

Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation about the tech world and beyond. Today we're joined by Rev Lebaredian, the vice president of Omniverse and simulation technology at Nvidia, for a fascinating conversation about what may well be the next stage of AI progress: the pursuit of world models that provide common sense to AIs. Rev, I'm so happy to see you here. We spent some time at your headquarters a couple of months back, and I'm really glad you're here today so I can introduce you to the Big Technology audience. Welcome to the show.

Thank you for having me.

All right, before we jump into world models: obviously we're having this conversation in the wake of the DeepSeek revolution, or whatever you want to call it, and everyone is talking about Nvidia. You're in a quiet period, so we're not going to go into financials, but I can and do want to ask you about the technology side of this, specifically about Jevons paradox. I keep hearing "Nvidia, Jevons paradox; Jevons paradox, Nvidia." What is Jevons paradox, and what do you think about it?

My understanding of Jevons paradox is that it's an economic principle: as you reduce the cost of running something, you create more demand for it, because it unlocks more uses of that technology once it becomes more economically feasible to use it. I think it really does apply in this case, in the same way it applies to almost every other important computing innovation over the last 40 or 50 years, or at least as long as I've been alive. At Nvidia's inception in 1993, the company very carefully selected its very first computing problem to address, in order to create the conditions by which we could keep innovating and keep growing that market. That was the problem of computer graphics, and particularly rendering, generating these images. The reason we selected it is that it's an endless problem: no matter how much compute you throw at it, no matter how much innovation we throw at it, you always want more. Throughout my time at Nvidia, which is now 23 years, I've heard many times, "Well, graphics are good enough. Rendering is good enough. Soon Nvidia's big GPUs and more computing power won't be necessary; graphics will just get consumed by SoCs or integrated into another chip as integrated graphics and disappear." But that never happened, because the fundamental problem of simulating the physics of light and matter is endless. We see this in almost every important computing domain, and AI is one of them. Can we really say we've now reached the point where our computers are intelligent enough, where the intelligence we create is good enough, and so it's just going to shrink and we won't have any use for more compute power? I don't think so. I think intelligence is probably the most endless of all computing problems. If we can throw more compute at the problem, we can make more intelligence and do it better and better. So making AI more efficient will just increase its economic value in the many applications we want to apply it to, and increase demand.
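As a toy illustration of the principle Rev describes, here is a minimal Python sketch assuming a constant-elasticity demand curve. The numbers and the elasticity value are invented for illustration, not drawn from the interview: the point is that when demand is elastic enough, cutting the unit cost of compute raises both the quantity demanded and the total spend.

```python
# Toy Jevons paradox: when the unit cost of a resource falls and demand is
# sufficiently elastic, total consumption (and total spend) rises rather
# than falls. All numbers here are made up for illustration.

def demand(unit_cost, elasticity=1.5, scale=1000.0):
    """Constant-elasticity demand curve: quantity ~ cost^(-elasticity)."""
    return scale * unit_cost ** (-elasticity)

for cost in [1.0, 0.5, 0.25]:  # unit cost of compute drops 4x overall
    q = demand(cost)
    print(f"unit cost {cost:>5.2f} -> quantity {q:>8.1f} -> total spend {cost * q:>8.1f}")

# With elasticity > 1, halving the cost more than doubles the quantity
# demanded, so total spend on the resource goes up as it gets cheaper.
```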
Can we talk about the progression of AI models becoming more efficient? I know it's a hot topic right now, but it does seem to me that over the past couple of years we've definitely seen models become more and more efficient. What can you tell us about this, just on the large language model front? The efficiency gains we've seen over time with them.

This isn't new. It's been happening for the past 10 or 12 years, essentially since we first discovered deep learning on our GPUs with AlexNet. If you look at the computational curve, what our GPUs can do in terms of tensor operations, the kind of math AI needs, over the last 10 years we've had essentially a million-x performance increase. And that increase isn't just from the raw hardware; it also comes through many layers of software and algorithms. We're getting these speedups continuously, at a very rapid, exponential rate, by compounding improvements at all the different layers at which this computing happens: the fundamental hardware, the chips themselves, the systems level, networking, system software, algorithms, frameworks, and so on. What we've seen with DeepSeek is a great advancement on that same curve we've been on for a decade now.
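To make the compounding point concrete, here is a back-of-the-envelope sketch. The per-layer improvement factors below are assumptions invented for illustration, not NVIDIA figures; the takeaway is that modest yearly gains multiplied across layers reach a million-x over a decade.

```python
# A "million-x in a decade" doesn't come from one layer; it compounds across
# hardware, systems, and algorithms. Per-layer factors are assumed examples.

hardware_per_year   = 1.5  # raw chip throughput improvement (assumed)
systems_per_year    = 1.4  # networking / system software (assumed)
algorithms_per_year = 1.9  # numerics, kernels, model-level tricks (assumed)

yearly = hardware_per_year * systems_per_year * algorithms_per_year
decade = yearly ** 10
print(f"combined yearly gain: {yearly:.2f}x, over 10 years: {decade:,.0f}x")
# combined yearly gain: 3.99x, over 10 years: ~1,000,000x.
# Equivalently, a million-x over 10 years is 10**(6/10) = ~3.98x per year.
```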
Okay. And 23 years at Nvidia: I'm going to save a question about that for later in the interview, because I'm very curious what your experience has been being at Nvidia for so long, especially given that the company's technology, at least from the outside, was viewed as in favor, then people questioned it, then it was back in favor, and people questioned it again. Obviously we see what's going on now; maybe we're living through a mini cycle at this point. So I'm very curious about your experience, but I want to talk about the technology first. Let me bring you into a conversation we had here on the show with Yann LeCun, Meta's chief AI scientist, right after ChatGPT came out. One of the things Yann did was say, "Go ask ChatGPT what happens if you let go of a piece of paper with your left hand." I typed it in, and it gave a very convincing answer that was completely wrong, because with text you don't have common sense about physics. Try as you might to teach a model physics with text, you can't; there's just not enough literature describing what happens when you drop a paper from your hand, and therefore the models are limited. Yann's point was basically: if you want to get to truly intelligent machines, you need to build something into the AI that teaches common sense, that teaches physics, and you need to look beyond words to do that. So now I turn it over to you, Rev, because I think a big initiative within Nvidia right now is to build a picture of the world to teach AI models the common sense Yann said was lacking. I have some follow-ups, but first I want to hear a little about what you're doing and whether your efforts are geared toward solving the problem Yann brought up.

What Yann said is absolutely true, and it makes intuitive sense, right? If an AI has only been trained on words, on text that we've digitized, how can it possibly know about concepts from our physical world, like what the color red really is, or what it means to hear sound, or what it means to feel touch? It can't know those things, because it never experienced them. When we train a model, what we're essentially doing is providing life experience to that model, and it's discerning patterns from all the experience we give it. What was really amazing about GPT and the advancements with LLMs, starting with the transformer, is that we could take a really complex set of rules that humans had no way of defining directly in a clear and robust manner, the rules of language, and pull that out of a corpus of data. We took all of this text, all these books, whatever information we could scrape from the internet, and somehow this model figured out the patterns of language across many different languages. And because it understands the fundamental rules of language, it can do some amazing things: generate new text, restyle text you give it, translate from one form to another, from one language to another. But it lacks any information about our world other than what's been described in those words. So the next step in AI is to take the same fundamental technology, this machine where we feed it life experience and it figures out the patterns and rules, and feed it actual data about our physical world and how it works, so it can apply that same learning to the rules of physics instead of the rules of grammar and language. It's going to understand how the physical world around us works. Our thesis is that of all the AIs we're going to create in the future, the most valuable ones will be those that can interact with our physical world, the world we experience around us, the world made of atoms. Today, the AIs we're creating are largely about our world of knowledge and information, ones and zeros, things you can easily represent inside a computer in the digital world. But if we can apply the same AI technology to the physical world around us, then essentially we unlock robotics. We can have agents with this intelligence, even superintelligence in specific tasks, do amazing things in the world around us. If you look at global markets, at all the commerce happening in the world, at GDP, the world of knowledge and information technology is somewhere between two and five trillion dollars a year. But everything else, transportation, manufacturing, supply chain, warehousing and logistics, creating drugs, all the stuff in the physical world, is about a hundred trillion dollars. So applying this kind of AI to the physical world is going to bring far more value to us.
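As a toy illustration of Rev's earlier point that an LLM "pulls the rules of language out of a corpus of data" rather than being given hand-written rules, here is a minimal character-level bigram model. Real LLMs do this with transformers at vastly larger scale; the corpus and names below are invented for illustration.

```python
# A character-level bigram model: from raw text alone, learn which character
# tends to follow which. No grammar rules are programmed in; the "rules"
# emerge from counting patterns in the data.
from collections import Counter, defaultdict

corpus = "the paper falls to the floor when you let go of the paper"
counts = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1  # tally: how often does character b follow character a?

def most_likely_next(ch):
    """Return the most frequent successor of ch observed in the corpus."""
    return counts[ch].most_common(1)[0][0]

print(most_likely_next("t"))  # 'h', learned from the corpus, not programmed
```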
So it's interesting: it's not just about inputting that real-world knowledge into LLMs so they can get the question about dropping the paper correct. You're also working on building the foundation for robots to go out into our world and operate within it.

Yes. It's not inputting it in the same way we do for these text models. We're not just going to describe with words what happens when you drop a piece of paper. We're going to give these models other senses during the learning process. They'll watch videos of paper dropping. We can also give them more accurate, specific information in the 3D realm, because we can simulate these physical worlds inside a computer today. We have physics simulations of worlds. We can pull ground-truth data about the position, orientation, and state of things inside that 3D world and use it as another mode of input into these models. What we'll end up with is a world foundation model trained on many different modes of data, essentially different senses. It can see, it can hear, it can touch and feel, and do many of the things we can do, or things other animals can do, or even things no creature can do, because we can provide it with sensors that don't exist in the natural world. From that, it can decipher the actual combined rules of the world. And this encoding of the knowledge of how the physical world works can then be the basis for us to build agents in the real world, to build the brains of these agents, otherwise known as physical robots.

Right. And this is your recently announced Cosmos project. Talk a little about what Cosmos is. Obviously it's a world foundation model, but how long have you been building it, what types of companies and developers might use it, and what might they use it for?

We've been working toward Cosmos for probably about 10 years. We envisioned that this new technology that had formed with deep learning was eventually going to be the critical technology necessary to create robot brains, and that that is ultimately what's going to unlock this incredible amount of value. So we started working toward it a long time ago. We realized early on that the big problem we were going to have is that in order to train such a model, to train a robot brain to understand the physical world and work within it, we're going to have to give it experience, the data that represents the physical world, and capturing that data from the real world is not an easy thing to do. It's very expensive and in some cases very dangerous. Take self-driving cars, which are a type of robot: a robot that can autonomously figure out how to get from point A to point B by controlling a physical being, a car, by braking and accelerating and steering. How are we going to ensure that a self-driving car really understands, when a child runs into the street as it's barreling along, that it should stop? And how can we be sure it's actually going to do that without actually doing that in the real world? We don't want to go capture data of a child running across the street. Well, we can do that by simulating it inside a computer. And so we realized this early on.
So we set about applying all the technologies we'd been working on up to that point, computer graphics, video games and video game engines, physics inside these worlds, to create a system for world simulation that was physically accurate, so we could then train these AIs. We call that operating system, if you will, Omniverse. It's a system for creating these physics simulations, which we then used to train AIs that we could test in that same simulation before putting them out in the real world. We use it for self-driving cars and other robots out there. So building Cosmos actually starts with simulating the world, and we've been building that stack and those computers for quite a while. Once the transformer model was introduced and we started seeing the amazing things large language models can do, and the ChatGPT moment came, we understood that this had essentially unlocked the one thing we needed to really push forward in robotics: the ability to have this kind of general intelligence about a really complex set of rules. So we set about building what is Cosmos today essentially a few years ago, using all the technology we had built before for simulation and AI training. And Cosmos is actually a few things. It's a collection of open-weight models that we made freely available. Along with them, we provide essentially all the tooling and pipelines you need to create a new world foundation model. So we give you the world foundation models we've started training, which are world-class, especially for the purposes of building physical AI. We also have what's called a tokenizer, which is an AI itself, also world-class, and a critical element of building world foundation models. And then we have curation pipelines. The data you select and curate to feed into the training of your world foundation model is critical, and just selecting the right data requires a lot of AI in and of itself. So we released all of this and put it out in the open so the whole community can join us in building physical AI.
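To give a feel for what a tokenizer contributes here, the toy sketch below compresses raw video into a much shorter sequence of discrete tokens that a transformer can model. The real Cosmos tokenizer is a learned neural codec; this patchify-and-quantize version is only an illustration of the compression idea, and every name in it is made up.

```python
# Toy video "tokenizer": turn raw pixels into a short grid of discrete tokens.
import numpy as np

def toy_tokenize(video, patch=8, levels=256):
    """video: (T, H, W) grayscale in [0, 1] -> (T, H//patch, W//patch) int tokens."""
    T, H, W = video.shape
    # Average over non-overlapping patch x patch blocks, then quantize.
    patches = video.reshape(T, H // patch, patch, W // patch, patch).mean(axis=(2, 4))
    return (patches * (levels - 1)).astype(np.int64)

video = np.random.rand(16, 64, 64)              # 16 frames of 64x64 pixels
tokens = toy_tokenize(video)
print(video.size, "->", tokens.size, "tokens")  # 65536 -> 1024, a 64x reduction
```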
So who's going to use it? Is it going to be robotics developers? Is it going to be somebody building, let's say, an LLM-based application who just wants it to be a little smarter? Both?

It will be all of them. We feel that, as an industry, the world is right at the beginning of this physical AI revolution, and no one company, no one organization, is going to be able to build everything we need. So we're building it out in the open, to encourage others to come build on top of what we've built and build it with us. This is going to be essentially anybody with an application that involves the physical world. Robotics companies are definitely part of that, and robotics in the very general sense: self-driving car companies, robo-taxi companies, as well as companies building robots for our factories and warehouses. Anybody who wants to make intelligent robots that have perception and operate autonomously in the real world wants this. But it's not only about robots as we usually think of them, as agents that move around. We have sensors that we're placing in our spaces, in our cities, in urban environments, inside buildings. These sensors need to understand what's happening in that world, maybe for security reasons, for coordinating other robots, or for managing the climate and energy efficiency of our buildings and data centers. So there are many applications of physical AI that are broader than what you imagine when you say "a robotics application." There are going to be thousands and thousands of companies building these physical AIs, and this is just the beginning.

Now, you mentioned that the transformer model was an important development on this path, and that was obviously the thing underpinning a lot of the real innovation we've seen with large language models. Can the real-world AI learn from the knowledge base that has been turned into these text AI models? If you have a model that's trying to understand the world with common sense, does it take text as an input?

It takes all of it as input.

How does it work with text, then? It's very interesting, because when we talk about the progression toward general intelligence, being able to read something and then intuit what it means in a physical space is a kind of amazing capability, don't you think?

Yeah. The way I think about it, and I think this is right, is that these AIs learn the same way we do. When you're brought into this world, you don't know who is mommy and who is daddy. You don't even know how to see yet; you don't have depth perception, you can't see color or understand what it is. You don't know language. But you learn by being bombarded with all of this information simultaneously through many different senses. When your mommy looks at you and says, "I'm mommy," pointing, you're getting multiple modes of information coming at you, including, essentially, text arriving in audio form. Then eventually you learn how to read, because a teacher points at letters and then words and sounds them out, so you build an association between the information you already understand, like mommy, and the letters that mean that thing. AIs learn the same way. When we train them, if we give them all of these modes of information associated with each other at the same time, they'll associate them together. That's how image generators work today. When you generate an image using a text prompt, the reason it can produce an image of a red ball in a grass field on an overcast day is that during training there was an association between some text and the images fed in. The model knew during training that those words were related to that image, so it can gather that understanding from the association. What we're trying to do with world foundation models is take this to the next level by giving the model more modes of information and richer information, but part of that will still include text. We'll feed in the text along with the video and other ground-truth information from the physical state of the world.
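Below is a minimal sketch of the association mechanism Rev describes, in the spirit of CLIP-style contrastive training: matching text and image embeddings get pulled together while mismatched pairs get pushed apart. This is a generic illustration, not Cosmos's actual training objective, and the random tensors stand in for real encoder outputs.

```python
# CLIP-style contrastive association between paired modalities.
import torch
import torch.nn.functional as F

def contrastive_loss(text_emb, image_emb, temperature=0.07):
    """Pull matching (text, image) pairs together, push mismatches apart."""
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.T / temperature  # pairwise similarities
    targets = torch.arange(len(logits))            # i-th text matches i-th image
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

# A batch of 8 paired embeddings (e.g., captions and their images).
loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```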
Yeah. So this is going to be a multi-part question, and I apologize, but I don't really know another way to ask it. What are the other modes of information you're feeding in there? And do you really need to go through this simulation process? It all sounds like a worthwhile endeavor to me, and I'm sure it is. But I also look at video models today, and something that's really surprised me about the video generation models is that they seem to have an understanding of physics. An image generator's output isn't moving, so it just needs to know that, say, the guy sits on the chair. But in video you can see people walking through a field and watch the grass move, and that means those models inherently have some concept of how physics works, I think. I'm going to run it by you, because you're the expert here. Yann is coming back on the show in a couple of weeks, so maybe this is just on my mind because I'm gearing up and thinking about our last conversation, but I'll ask him to weigh in on your answers. The thing he always talked about is that a human mind is able to see infinite possibilities and accept them; it doesn't break us. If you hold up a pencil, you know it's going to fall, and you know it could fall in infinite ways, but it's still going to fall. For an AI trained on particular scenarios, it's very difficult to understand that the pencil might fall in infinite ways when asked to generate it. Yet the video generators have been doing a very good job of showing that they understand that. So, to reiterate: what different modes of information are you using, and why do we need this broader simulation environment, this Cosmos tool, if we're getting such good results from video generation already?

All very good questions. First off, we use many modes. The primary one for training Cosmos, though, is video, just like the video generation models. Along with that there's text. We also feed in extra information and labels that we can gather from the data, particularly when we generate the data synthetically. If you use a simulator to generate the videos, you have perfect information about everything going on in every pixel of that video. We know how far away the object in each pixel is; we know the depth; we know what the object in each pixel is; you can segment all of that out. Traditionally, for perception training for autonomous vehicles, we've used humans to go through and label that information across hours and hours of collected video, and it's inaccurate and incomplete. From simulation, we can get perfect information about the actual videos themselves.
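Here is a sketch of the kind of per-frame ground truth a simulator can emit "for free" alongside the rendered video, versus what human labelers would have to annotate by hand. The field names and values are illustrative assumptions, not Omniverse's actual schema.

```python
# Illustrative record of one synthetic training frame with perfect labels.
from dataclasses import dataclass
import numpy as np

@dataclass
class SyntheticFrame:
    rgb: np.ndarray           # (H, W, 3) rendered image
    depth: np.ndarray         # (H, W) exact distance to camera, per pixel
    segmentation: np.ndarray  # (H, W) integer object id, per pixel
    poses: dict               # object id -> (position xyz, orientation quaternion)
    caption: str              # optional text description for multimodal training

H, W = 480, 640
frame = SyntheticFrame(
    rgb=np.zeros((H, W, 3), dtype=np.uint8),
    depth=np.full((H, W), 10.0, dtype=np.float32),
    segmentation=np.zeros((H, W), dtype=np.int32),
    poses={1: ((0.0, 1.2, 5.0), (0.0, 0.0, 0.0, 1.0))},
    caption="a sheet of paper falling onto a concrete floor",
)
print(frame.depth.mean(), frame.segmentation.max())
```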
Now, that being said, your question about these video models seeming to really know physics, and know it well: I think it is pretty amazing how much physics they do know, and it's kind of surprising that we're here at this point. Had you asked me five years ago whether we'd be able to generate videos with this much physical plausibility at this stage, I wasn't sure, because I had continually been wrong in the years prior. I didn't expect to see image classification in my lifetime, until we saw it with AlexNet; I would have bet against it. So we're pretty far along. That being said, there are a lot of flaws in the physics we see. One of the basic ones is object permanence: if you direct the video to point the camera away and come back, objects that were there at the beginning of the video are no longer there, or they're different. That is such a fundamental violation of the laws of physics that it's hard to say these models currently understand physics well. And there's a whole bunch of other things. My life's work has been primarily computer graphics, and specifically rendering. 3D rendering is essentially a physics simulation: the simulation of how light interacts with matter and eventually reaches a sensor of some sort. We simulate what a camera would do in a 3D world and what image it would gather from that world.
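Rendering as physics, in miniature: the sketch below applies Lambert's cosine law, where the light a diffuse surface reflects toward a camera scales with the cosine of the angle between the surface normal and the light direction. Real renderers simulate far more (shadows, reflections, sensor response), but this is the kernel of the idea; the function and values are invented for illustration.

```python
# Minimal diffuse (Lambertian) shading of one surface point.
import numpy as np

def lambert_shade(normal, light_dir, albedo, light_intensity=1.0):
    """Radiance leaving a perfectly diffuse surface point."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    cos_theta = max(np.dot(n, l), 0.0)  # no light arrives from behind the surface
    return albedo * light_intensity * cos_theta

# A red ball's surface point, lit from 45 degrees above the horizon.
print(lambert_shade(np.array([0.0, 1.0, 0.0]),
                    np.array([1.0, 1.0, 0.0]),
                    albedo=np.array([0.8, 0.1, 0.1])))
```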
When I look at a lot of these generated videos, I see tons and tons of flaws, because when we do those simulations and that rendering, we're attuned to seeing when shadows are wrong, when reflections are wrong, these sorts of things. To the untrained eye it looks plausible, it looks correct, but I think people can still feel that something is wrong when it's AI-generated, in the same way that for decades now, since we introduced computer graphics to visual effects in movies, you don't know what it is, but if the rendering's not great, it just feels CG, it feels wrong. We still have that kind of uncanny-valley thing going on. All that being said, I think we're going to get better and better rapidly. The models today have an amazing amount of knowledge about the physical world, but they're at maybe 5 or 10 percent of what they should understand. We need to get them to 90 or 95 percent.

Right. Yeah, I just saw a video of a tidal wave hitting some island, and it looked super realistic. Of course it was on Instagram, because that's all Instagram is right now, AI-generated video, and it took me a second, and it's more frequently taking me a minute, to realize, oh, that's AI-generated. Sometimes I have to look in the comments and trust the wisdom of the crowds on that front.

You might not be the best judge of it, though. Humans aren't particularly good at knowing whether physics is really accurate. This is why movie directors can take such license with physics when they do explosions and all kinds of other fun stuff, like tidal waves.

Yeah. Well, it's interesting. Some comedian made this joke: Neil deGrasse Tyson likes to come out after movies like Gravity and talk about how they're scientifically incorrect, and the comedian said, yeah, well, how about the fact that George Clooney and Sandra Bullock are the astronauts? That didn't bother you at all? But it is interesting that we can watch these movies and fully believe, at least in the moment, that they're real. We can allow ourselves to lose ourselves in the moment.

Exactly. And just be like, yep, I'm in this story. I feel emotion right now watching George Clooney in a spaceship, even though I know he's no astronaut. And I think for that purpose (I worked on movies before I was at Nvidia; that's what I did, computer graphics for visual effects) that is a perfectly legitimate use of the technology. It's just that that level of simulation is not sufficient for building physical AI that will be the underpinnings, the fundamental components, of a robot brain. I don't want my self-driving car, or my robot operating heavy machinery in a factory, to be trained on physics that doesn't match the real world. Even if it looks right to us, if it's not right, then the robot is not going to behave correctly, and that's dangerous. So it's a different purpose. That's why what we're doing with Cosmos is really a different class of AI than video generators. You can use it to generate videos, but the purpose is different. It's not about generating beautiful or interesting imagery as art; it's about simulating the physical world, using AI to create the simulation.

Rev, I want to ask you one more follow-up question, not about the flaws but about the video generators' ability to get things right, and then we'll move on from this topic. It's just surprising and interesting for me to hear you and Demis Hassabis, the CEO of Google DeepMind, who was just on the show, talk about how these video generators have been surprisingly good at understanding physics, when Yann, in our previous conversations, was effectively saying it's very difficult for AI to solve these problems. I won't say they've solved it, but everybody's surprised they've gotten to this point. What is your best understanding of how they've gotten, though flawed, this good?

This is the trillion-dollar question, I guess. We've been betting for years now that if we just throw more compute and more data at the problem, these scaling laws are going to give us a level of intelligence that's really, really meaningful, step-function changes in capabilities. There's no way for us to know for sure; it's very hard to predict. It feels like we're on an exponential curve, but which part of the exponential curve we're on, we can't tell, so we don't know how fast it's going to happen. Honestly, I've been surprised at how well these transformer models have been able to extract the laws of physics by this point in time. At this point, I believe that in a few years we're going to reach a level of physics understanding in our AIs that's going to unlock the majority of the applications where we need to apply them in robotics.

Let me ask you one more question about this, then we're going to take a break and talk about some of the societal implications of putting robotics, let's say, in the workforce and in all different areas of our lives. There's definitely a sizable portion of the population, maybe not our listeners, that would be surprised to hear that Nvidia itself is building these world foundation models and releasing weights to help others build on top of them.
The perception from someone on the outside, I think, is: hey, isn't Nvidia just the company that makes those chips? What do you say to that, Rev?

Well, yeah, that's been the perception. It's been the perception since I started at Nvidia 23 years ago, and it's never been true that we just build chips. Chips are a very, very important part of what we do; they're the foundation we build on. But when I joined the company, there were about a thousand employees, and the grand majority of them were engineers. Just like today: the majority of our employees are engineers, and the majority of those engineers are software engineers. I myself am a software engineer; I wouldn't know the first thing about making a chip. Our form of computing, accelerated computing, the form of computing we invented, is a full-stack problem. It's not just a chip that we throw over the fence and leave to others to figure out how to make use of. It doesn't work unless we have these layers of software, and those layers of software have to have algorithms that are harmonized with the architecture of our chips and our systems. So in these new markets we enter, what Jensen calls zero-billion-dollar industries, we have to go invent these new things top to bottom, because they don't exist yet and nobody else is likely to do it. So we build a lot of software, and these days we build a lot of AI, because that's what's necessary to build the computers that power all of this. We did this with LLMs early on. Many years ago we trained what was at the time the largest LLM in terms of parameter count; it was called Megatron. Because we did that, we were able to tune our chips, computers, system software, frameworks, and pipelines to do these large-scale things, and we put all of that software out there, where it was then used to create all the LLMs we enjoy today. Had we not done that, I don't think we would have had ChatGPT. This is essentially the same thing: we're creating a new market, a new capability that doesn't exist. We see this as an endeavor that's greater than Nvidia; we need many others to participate in it. But there are some things we're uniquely positioned to contribute, given our scale and our particular expertise. So we're going to go do that, and then we're going to make it freely available to others so they can build on it.

Yeah, for those wondering why Nvidia has such a hold on the market right now, I think you just heard the answer. I do want to take a break, and then I want to talk about the implications for society when we have, let's say, humanoid robots doing labor in the part of the economy that we simply haven't really put AI into yet, and what it means when that's many more trillions of dollars than knowledge work. We'll do that right after this.

And we're back here on Big Technology Podcast with Rev Lebaredian, the vice president of Omniverse and simulation technology at Nvidia. Rev, I want to ask the question that has obviously been bouncing around my mind since we started talking about the fact that you're going to enable robotics to, I don't know, take over. Is "take over" the right word?
Take over a lot of what we currently do in the workforce. What do you think the labor implications are here? If you've spent your entire life working at a certain manual task, and the next thing you know someone uses the Cosmos platform, or your new project for humanoid robots, I think it's called GR00T. What is it called?

GR00T. That's our project for building and training humanoid robot brains.

All right. So some company uses GR00T to put humanoid labor into, let's say, a factory, or even as a care robot, and I'm a nurse, and all of a sudden some GR00T-built robot is helping take care of the elderly. What are the labor implications of that?

Well, first and foremost, I think we need to understand that this is a really hard problem. It's not like overnight we're going to have robots replace everything humans do everywhere. It's a very, very difficult problem. We're just now at an inflection point where we finally see a line of sight to building the technology needed to unlock the possibility of these kinds of general-purpose robots: we can now build a general-purpose robot brain. Twenty years ago that wasn't true. We could have built the physical robot, the actual body, but it would have been useless, because we couldn't give it a brain that would let it operate in the world in a general-purpose manner; we couldn't interact with it or program it in a useful way to do anything. That's what's been unlocked here. I talk to a lot of CEOs and executives at companies in the industrial sector, manufacturing, warehousing, retail, and across all of these companies, in every geography, there's a recurring theme: a demographic problem the whole world is facing. We don't have as many young people who want to do the jobs that the older people retiring now have been doing. Go look around an automotive factory in Detroit or in Germany: most of the factory workers are aging, and they're quickly retiring. The biggest concern of the CEOs I'm talking to is that all the knowledge those workers have about operating those factories and working in them is going to be lost. Young people don't want to come do these jobs. We have to solve that problem if we're going to maintain our economy, not just grow it, and produce the same amount of things. We don't have enough workers. We've been seeing it in transportation: there aren't enough truck drivers in the world to deliver all the stuff moving through our supply chains. We can't hire enough of them, and every year fewer young people want to do that job. So we need self-driving trucks and self-driving cars to solve that problem. I think before we talk about replacing jobs humans want to do, we should first talk about using these robots to fill the gap being left by humans who don't want to do those jobs anymore.

Right, and there could be specialization. Take nursing, for example.
The nurse who injects me with a vaccine or puts medication in my IV, maybe we keep that human for a while; they make mistakes too, but I'd feel a lot more comfortable if that were a human. The nurse who takes me for a walk down the hall after I've gotten a knee replacement, that could be a robot. It might even be better as a robot.

We'll see how this plays out. We believe the first place we're going to see general-purpose robots like humanoids really take off is in the industrial sector, for two reasons. One, the demand is great there, because we have the shortage of workers. And two, it makes more sense to adopt them in spaces where a company just decides to put them in, and warehouses and factories are mostly unseen. I think the last place we'll see humanoids show up is in our homes, in your kitchen.

Don't tell Jeff Bezos that.

Well, they will show up there, and I think it's going to be uneven, geographically uneven. They'll probably show up in a kitchen in somebody's home in Japan before they show up in a kitchen in somebody's home in Munich, in Germany, and I think that's a cultural thing. You know, I personally don't even want another human in my kitchen. I like being in my kitchen and preparing things myself; my wife and I are always in each other's space there, so we get kind of annoyed. Having a humanoid robot would be kind of weird; I wouldn't even want to hire somebody else to do that. That's a personal decision. I think jobs like caring for our elderly, and health care, are very human professions. A lot of the care isn't really about the physical task; it's about the emotional connection with another human. For that, I don't think robots are going to take it away from us anytime soon.

Well, the question is whether we have enough care professionals to take those jobs. That's the one that really seems in danger.

What's likely to happen is a combination. The care professionals we do have will do the things that require EQ, that require empathy, that require really understanding the other human you're taking care of. And they can instruct the robots around them to assist with all the more mundane things, like cleaning, and maybe giving the shots and IVs. I don't know.

How far away is that future, Rev? How long do you think?

You know, I wouldn't venture to guess about that kind of interaction in a hospital or care situation quite yet. I believe it's going to happen in the industrial sector first, and I believe that within a few years we're going to see humanoid robots widely used in the most advanced manufacturing and warehousing.

Wild. Okay, I want to ask you about Hollywood before we go. I have this question rattling around my mind: are we going to see movies that are computer generated but look real? We have computer-generated movies now with CGI, but they all look pretty CGI-y.

They don't all look CGI-y. Some of them look pretty amazing.

But I'm curious: do you think Hollywood is going to move to a place where it's super real and just simulated? You go ahead.

Absolutely.
I mean, was it a year or two ago when the last Planet of the Apes came out? I went to see it with my wife. Now, my wife and I have been together since I worked at Disney in the mid-'90s on visual effects and rendering. I had a startup company doing rendering, and she was part of it. So she has a good eye; she's been around computer graphics and rendering for decades now. When we went to see Planet of the Apes, even though obviously those apes were not real, at one point she turned to me and said, "That's all CG, right?" She couldn't quite believe it. What Weta did there is amazing. It's indistinguishable from real life, except for the fact that the apes were talking. Other than that, it's indistinguishable. The problem is that doing that level of CG in the traditional way requires an incredible amount of artistry and skill that only a few studios in the world can deliver, with the teams they have and the pipelines they've built, and it's incredibly expensive to produce. With generative AI, and particularly with world foundation models, once we get to the point where they really understand the depths of the physics needed to produce something like Planet of the Apes, of course studios are going to use those technologies to produce the same images, because it's going to be a lot faster and a lot less expensive to do the same things. It's already starting to happen.

Rev, I know we're getting close to time. Do I have time for two more questions?

Absolutely.

Okay. The more I think about robotics, the more I think about what the applications in war might be. I know you can't think of every permutation when you're developing the foundational technology, but we are living in a world where war is becoming much more roboticized, and it's sort of remarkable that we have some wars going on where people are still fighting in trenches. So I'm curious whether you've given any thought to how robotics might be applied in warfare, and whether there's a way to prevent some of the bad uses that might come about because of it.

You know, I'm not really an expert in warfare, so I don't feel I'm the best person to talk about how it might be used or not. But I can say this: this isn't the first time a new technology has been introduced that is so powerful that we can imagine not only great uses that are beneficial to people, but also really scary, devastating consequences, particularly in warfare. And somehow we've managed not to have that kind of devastation. In general, the world has gotten better and better, more peaceful and safer, despite what it might feel like today. By almost any measure, we lose fewer lives to wars and these sorts of tragedies than ever before in mankind's history. The big one everybody always talks about, of course, is nuclear technology. I was a little kid in the '80s, near the height of the Cold War, at the end of it, and I remember thinking every day that it might happen, that ICBMs might arrive in Los Angeles at any point.
And it hasn't happened, because somehow there was a general understanding by everyone, collectively, that this would be so bad for everyone that we put systems in place, even though there was intense rivalry, even enmity, between the Soviet Union and the US. We somehow figured out that we should create a system that prevents that sort of thing. We've done the same with biological weapons and chemical weapons; largely, they haven't been used, even though the technology has existed. I think that's a good indicator of how we should deal with this new, powerful technology of AI, and a reason to be optimistic that it's possible to have this technology without it being so devastating. We can set up rules and conventions that say that even though it's possible to use AI in a certain way, we shouldn't, and we should all agree on that. And for anybody who skirts the line, there should be ramifications that disincentivize them from using it that way.

Yeah, I hope you're right on that. It seems like something we're going to deal with as a society more and more as this stuff becomes more advanced. All right, last one for you. You've been at Nvidia, as we've mentioned a couple of times, 23 years; I already teased this. The technology's been in favor; it's been out of favor. You're at the top of the world right now, even though there was some hiccup last week, but whatever; it doesn't seem like it's going to be a long-term issue. What is one insight you can draw from your time at Nvidia about the way the technology world works?

Well, first I can tell you how Nvidia works and the reason I'm here. I've been here for 23 years, and this will be the last job I ever have; I'm positive of it. When I joined Nvidia, that wasn't the plan. I thought I'd be here one year, two years max. Now it's been 23 years. When I hit my 20-year mark, Jensen, at our next company meeting, rattled off a bunch of stats on how long various groups had been here, how many people had been there for a year, two years, and so on. When he got to 20 years, there were more than 650 people.

Wow.

Now, earlier I said that when I joined the company there were about a thousand people. So this means most of the people who were there when I started at Nvidia were still there after 20 years. I wasn't as special as I thought I was when I hit my 20-year mark. This is actually a very strange thing about Nvidia: we have people who have been here a long time and haven't left. That's unusual for most companies, but particularly for Silicon Valley tech companies, where people move around a lot. And I believe the reason we've stayed through all our trials and tribulations is that, fundamentally, what Jensen has built here is a company where people come to do their life's work. We really mean it; you feel it when you're here. This is about more than making some money or having a job. You come here to do great work, to do your life's work. So the idea of leaving just feels painful to me, and I think it does to many others.
That, I think, is what's actually behind why, despite the fact that Nvidia has had its ups and downs, we've endured. You can look at our stock chart going back to the mid-2000s. We introduced CUDA in 2006, and that was a really important thing, and we stuck to it. The analysts, nobody, wanted us to keep sticking to it, but we kept investing in it, and our stock price took a huge hit and stayed flat or dropping for a long time. And then it finally happened: AI was born on our GPU. That's what we were waiting for. We went all in on that, and we've had ups and downs since then. We'll continue to have ups and downs, but I think the trend is still going to be up and to the right, because this is an amazing place where the people who want to do their life's work, the best people in the world at what we do, come here and stay here.

Yeah. Well, Rev, look, it's always such a pleasure to speak with you. I really enjoyed our time together at headquarters; it was a really fun day, and we did some cool demos, and I appreciate that. I'm just thrilled to get a chance to speak with you about this technology today. It's fascinating, cutting-edge technology that obviously brings up a lot of questions, some of which we got to today. I'm sure we could have talked for three hours, and I hope to keep the conversation going. Thanks for coming on the show.

Thank you for inviting me, and I hope we do talk for three hours one day. That'll be great.

All right, everybody, thank you for listening. Ranjan and I will be back to break down the news on Friday. There's a lot of news this week, with OpenAI's Deep Research coming out. I just paid $200 a month for ChatGPT, which is a lot more than I ever thought I would pay, but that's where we are today. So we're going to talk about that and more on Friday. Thanks for listening, and we'll see you next time on Big Technology Podcast.