Why We Don’t Need More Data Centers - Dr. Jasper Zhang, Hyperbolic
Channel: aiDotEngineer
Published at: 2025-08-01
YouTube video id: M6Vbaig1TsM
Source: https://www.youtube.com/watch?v=M6Vbaig1TsM
Nice meeting you guys. Great to be here. I'm here to present Hyperbolic, which is an AI cloud for developers. My topic is why we don't need more data centers. It's a very eye-catching title, but what I want to clarify is that I still think building data centers is important; it's just that building data centers alone can't solve the problem.

Before we get started, let me introduce myself. I'm Jasper, the CEO and co-founder of Hyperbolic. I did my math PhD at UC Berkeley and finished it in two years, which makes me the fastest person in the history of Berkeley. I also won a few gold medals. After that I worked at Citadel Securities, using AI and machine learning to predict the market and execute strategies. So I've always had a passion for making things very efficient and for helping you save money, because everyone knows that compute is one of the biggest costs for your companies or your startups. If you want to rent 1,000 GPUs, it will usually cost you millions of dollars per year. We think this problem should be solved not just by building more data centers but by building a GPU marketplace.

So let's get started with the problem we're facing. Everyone knows that AI is going to integrate with everything in the future, and every company will be an AI company. So the demand for GPUs, as well as data centers, is exploding. According to McKinsey, by 2030 we'll need 4x more data centers, built in a quarter of the time we took to build what we have. But what if I tell you that you don't actually need that many data centers, and that you need another solution?

We can break down the demand first. Right now the current data center capacity is 55 gigawatts. In the median scenario, we're going to see a 22% annual growth rate in demand, so in 2030 we're going to need 219 gigawatts.
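
The demand figures above are a compound-growth calculation. The sketch below reproduces it approximately. The talk does not state the baseline year for the 55 GW figure, so the seven-year horizon (2023 to 2030) is an assumption, and `projected_demand_gw` is an illustrative helper, not something from the talk.

```python
def projected_demand_gw(base_gw: float, annual_growth: float, years: int) -> float:
    """Compound a starting capacity forward at a fixed annual growth rate."""
    return base_gw * (1.0 + annual_growth) ** years


# Figures quoted in the talk: 55 GW of capacity, 22% annual growth (median scenario).
BASE_GW = 55.0
GROWTH = 0.22

# ASSUMPTION: the 55 GW baseline is seven years before 2030 (the talk gives no year).
YEARS = 7

demand_2030 = projected_demand_gw(BASE_GW, GROWTH, YEARS)
print(f"Projected 2030 demand: {demand_2030:.0f} GW")
```

Under that assumption the result lands at roughly 220 GW, in line with the 219 GW quoted; the small gap is consistent with rounding in the 22% growth rate.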
However, there are a lot of challenges in building data centers. First, everyone knows Stargate. The first Stargate data center takes more than a billion dollars to build. It's also very slow to connect a data center to the electrical grid. For example, right now the waitlist is about seven years: you need to wait seven years to connect a 100-megawatt facility to the electrical grid in Northern Virginia. Data centers also consume a lot of energy. Currently we're spending 4% of total US electricity consumption on just GPUs and data centers. And it's not very environmentally sustainable. If you look at the number, the annual CO2 emissions are crazy. And even if we deliver all the data centers on time, there's still a data center supply deficit of more than 15 gigawatts in the US alone by 2030. So it means that just building data centers can't solve the problem.

On the other hand, we think GPU utilization is actually pretty low. According to Deloitte, GPUs sit idle 80% of the time for enterprises and companies. According to SemiAnalysis, there are 100-plus GPU clouds. So we can see how fragmented the space is. A lot of you need GPUs but can't find them, or you're going to pay an extremely high price. On the other hand, there are a lot of GPUs sitting idle in data centers or in different clouds.

So naturally, the solution we think we should build is a GPU marketplace, or an aggregation layer that aggregates different data centers and GPU providers to solve the problem for GPU users. It doesn't necessarily need to be Hyperbolic, but I'll use Hyperbolic as the example here. So I can share what we're trying to solve. We're building this global orchestration layer.
We invented a software called HyperdOS, which is short for Hyperbolic Distributed Operating System. Basically it's like Kubernetes software. As long as a cluster installs our software, within five minutes that data center becomes a cluster in our network. On the other side, users can rent GPUs in whatever way they want: spot instances, on-demand, long-term reservations, or hosting models on top.

We see several benefits. One, we solve the matching problem for compute. Second, GPUs become commodities: you don't need to spend a lot of time waiting for a data center, you just buy them on the marketplace. Third, you have different options.

We did some math modeling. I don't have time to put the math in the slides, but this is our conclusion: we can cut the cost by 50 to 75%. We're running a beta version of our marketplace right now, and our GPU cost for an H100 is 99 cents per hour. If you look at Google, for example, their on-demand GPU is about $11. Then there's Lambda; they're at about $2 or $3. But on average, by aggregating more supply and having a uniform distribution channel, you can drastically reduce the price. The theory behind that is queueing theory, basically M/M/c theory. Next time, if you watch my talk, I'll share more of the math behind it.

You can also save the time you spend vetting suppliers. How many people here are founders, or need to acquire GPUs? Yeah. So are you frustrated when you're trying to talk to... how many suppliers are you talking to?
If you've talked to more than five, raise your hands. Are you frustrated when you have five sales calls and try to figure out what state the GPUs are in? Yeah. Good. Yeah, that's great. So basically, with this uniform platform, founders, startups, or companies no longer need to vet different data centers. They just pick the one that has a high rating or the best price. We're also going to benchmark the performance of the GPUs.

All right. Oh, sorry. Somehow the graph didn't show. Give me one sec.

So we can think about a use case example. Let's say you're a startup and you want 1,000 GPUs at the beginning. Usually you'd just reserve these 1,000 GPUs for a year, thinking: I might need these GPUs for training, and later on I'll want to do inference. So you run some training jobs, and after three months you realize that you have a better idea from running those experiments, and now you need 1,000 more GPUs just for a month. Then, at month six, you finish your training job and realize that you now need only 500 GPUs for hosting your model, but you still have 500 GPUs sitting idle.

In the Hyperbolic case, you can say: I'll rent 1,000 GPUs for a year at the beginning; then in month three, I rent an extra 1,000 GPUs for just a month; and in month six, I release my idle GPUs on Hyperbolic and sell them to other people who need them. But on a traditional cloud, you need to rent 1,000 GPUs at the beginning, and then in month three you usually need to rent the extra 1,000 GPUs for a year.
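
The two rental schedules just described can be compared in GPU-months. This is a rough sketch under stated assumptions, not the speaker's model: the 730-hour month, the $2.50 traditional hourly price (the talk only says some providers charge "$2 or $3"), and the idea that resold GPUs recover their full rental price are all assumptions. Only the $0.99 marketplace price and the GPU counts come from the talk.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12


def rental_cost(gpu_months: float, price_per_gpu_hour: float) -> float:
    """Total cost in dollars for a number of GPU-months at an hourly price."""
    return gpu_months * HOURS_PER_MONTH * price_per_gpu_hour


# Traditional cloud: the initial 1,000 GPUs and the extra 1,000 GPUs
# are each reserved for a full year.
traditional_gpu_months = 1_000 * 12 + 1_000 * 12

# Marketplace: 1,000 GPUs for a year, 1,000 extra for one month,
# minus 500 idle GPUs resold for the last six months.
# ASSUMPTION: resale recovers the full rental price of the idle GPUs.
marketplace_gpu_months = 1_000 * 12 + 1_000 * 1 - 500 * 6

TRADITIONAL_PRICE = 2.50  # ASSUMPTION: hypothetical hourly price, not from the talk
MARKETPLACE_PRICE = 0.99  # H100 hourly price quoted in the talk

traditional_cost = rental_cost(traditional_gpu_months, TRADITIONAL_PRICE)
marketplace_cost = rental_cost(marketplace_gpu_months, MARKETPLACE_PRICE)

print(f"Traditional: ${traditional_cost / 1e6:.1f}M")
print(f"Marketplace: ${marketplace_cost / 1e6:.1f}M")
print(f"Ratio: {traditional_cost / marketplace_cost:.1f}x")
```

With these inputs the traditional schedule costs $43.8 million and the marketplace schedule about $7.2 million, a ratio of roughly 6x. The speaker's own figures may rest on different prices or resale terms, which the talk does not spell out.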
If you calculate the cost and compare, and also think about the price difference, you can reduce the cost from 43.8 million to 6.9 million. So it's about a 6x saving. You also help other people get cheaper GPUs, because you release those idle GPUs to them.

That's how we think we're going to increase productivity. People only think about the saving, but that's not the whole story for GPUs. By the scaling law, we know that the more compute you spend, the better your model will be. So it's not just about cutting your cost by 6x; it's more that with the same budget you increase your productivity by 6x. Imagine how many startups used to rely only on OpenAI and Anthropic, those closed AI models. Now suddenly their money becomes more valuable, and they can rent as many GPUs as they want for training.

The next step we think the GPU marketplace will evolve into is an all-in-one platform for different AI workloads, because what people really want is not just GPUs; they want to run their different AI jobs. You'll have AI inference, both online and offline, and you'll also have training jobs.

So here are some takeaways. First, we don't think we should focus just on building data centers; we also need smarter allocation of resources. Second, we can reduce your cost by building a GPU marketplace. Lastly, I think just focusing on building data centers is not very sustainable. We're consuming a lot of energy and taking a lot of land. We'd be better off reusing and recycling idle compute by selling it to others.
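
The talk attributes the benefit of pooled supply to M/M/c queueing theory without showing the math. As a textbook illustration only (not Hyperbolic's actual model), the sketch below uses the standard Erlang C formula to compare ten isolated single-GPU providers with one pooled ten-GPU queue at the same total load. The arrival and service rates are made-up example values.

```python
from math import factorial


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving job has to wait in an M/M/c queue."""
    if offered_load >= servers:
        raise ValueError("unstable queue: offered load must be below server count")
    top = (offered_load ** servers / factorial(servers)) * (
        servers / (servers - offered_load)
    )
    bottom = sum(offered_load ** k / factorial(k) for k in range(servers)) + top
    return top / bottom


def mean_wait(servers: int, arrival_rate: float, service_rate: float) -> float:
    """Mean time a job spends waiting in the queue (same time unit as the rates)."""
    offered_load = arrival_rate / service_rate
    return erlang_c(servers, offered_load) / (servers * service_rate - arrival_rate)


# ASSUMPTION: illustrative rates. Each GPU finishes 1 job/hour on average.
# Fragmented: each provider has 1 GPU and receives 0.8 jobs/hour.
fragmented_wait = mean_wait(servers=1, arrival_rate=0.8, service_rate=1.0)

# Pooled: one marketplace queue in front of all 10 GPUs, 8 jobs/hour in total.
pooled_wait = mean_wait(servers=10, arrival_rate=8.0, service_rate=1.0)

print(f"Fragmented mean wait: {fragmented_wait:.2f} h")
print(f"Pooled mean wait:     {pooled_wait:.2f} h")
```

At identical utilization, the pooled queue waits far less than the isolated ones, because an idle GPU anywhere in the pool can take the next job. That is the general effect an aggregation layer relies on; how it translates into price depends on details the talk does not give.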
So if you're interested in trying it out, you can come to our website. The left QR code is the current product that we have, which is a marketplace, but we're also launching our business cloud and enterprise cloud, which give you production-ready GPUs with 99.5% reliability. All right. Thanks.

Awesome. So I'm curious. Can you tell us more about the Hyperbolic OS? How exactly does that turn... because I know a lot of the time you have a data center plus a set of GPUs. How does it actually work to connect that to Hyperbolic itself?

Yeah. So basically HyperdOS is like a Kubernetes agent. You just install it in your cluster, as long as you have Kubernetes. Most data centers have Kubernetes, but even on your MacBook or your PC you can install MicroK8s to become a Kubernetes-ready machine. We have some in-house terminology: we call our Hyperbolic server the monarch, and then we have different barons, so it's like a feudal model. Different barons own different compute. Every time a user wants to rent a GPU, they talk to our monarch server, the monarch server sends a request to the baron, and the baron provisions the machines and sets up the SSH instance for customers to access. Yeah.
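
The monarch and baron flow described in that answer can be sketched as a toy in-memory model. Everything here is hypothetical: the class and method names, the cheapest-first selection policy, and the placeholder SSH string are illustrative assumptions and do not reflect HyperdOS's real interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Baron:
    """Hypothetical stand-in for a cluster that owns compute."""
    name: str
    free_gpus: int
    price_per_gpu_hour: float

    def provision(self, gpus: int) -> str:
        """Reserve GPUs and return a placeholder SSH endpoint."""
        if gpus > self.free_gpus:
            raise RuntimeError(f"{self.name} has only {self.free_gpus} free GPUs")
        self.free_gpus -= gpus
        # Placeholder address; a real system would return actual credentials.
        return f"ssh user@{self.name}.example.invalid"


@dataclass
class Monarch:
    """Hypothetical stand-in for the central server that routes requests."""
    barons: list[Baron] = field(default_factory=list)

    def register(self, baron: Baron) -> None:
        self.barons.append(baron)

    def rent(self, gpus: int) -> str:
        """Forward a rental request to one baron with enough capacity."""
        candidates = [b for b in self.barons if b.free_gpus >= gpus]
        if not candidates:
            raise LookupError("no baron has enough free GPUs")
        # ASSUMPTION: pick the cheapest baron; the talk does not state the policy.
        cheapest = min(candidates, key=lambda b: b.price_per_gpu_hour)
        return cheapest.provision(gpus)


monarch = Monarch()
monarch.register(Baron("baron-a", free_gpus=8, price_per_gpu_hour=1.20))
monarch.register(Baron("baron-b", free_gpus=16, price_per_gpu_hour=0.99))

endpoint = monarch.rent(8)
print(endpoint)
```

The point of the split is that the user only ever talks to one server, while each cluster keeps control of its own machines and does its own provisioning.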