Why We Don’t Need More Data Centers - Dr. Jasper Zhang, Hyperbolic

Channel: aiDotEngineer

Published at: 2025-08-01

YouTube video id: M6Vbaig1TsM

Source: https://www.youtube.com/watch?v=M6Vbaig1TsM

[Music]
Nice meeting you guys. Great to be here. I'm here to present Hyperbolic, which is an AI cloud for developers. My topic is why we don't need more data centers. It's a very eye-catching title, but what I want to clarify is that I still think building data centers is important; it's just that building data centers alone can't solve the problem.
Before we get started, let me introduce myself. I'm Jasper, the CEO and co-founder of Hyperbolic. I did my math PhD at UC Berkeley and finished it in two years, which makes me the fastest person in the history of Berkeley. I also won a few gold medals. After that I worked at Citadel Securities, using AI and machine learning to predict the market and execute strategies. So I have always had a passion for making things very efficient and helping you save money, because everyone knows that compute is one of the biggest costs for your companies or startups.
Usually, if you want to rent 1,000 GPUs, it will cost you millions of dollars per year. We think this problem should be solved not just by building more data centers but by building a GPU marketplace. So let's get started with the problem that we're facing. Everyone knows that AI is going to integrate with everything in the future and every company will be an AI company, so the demand for GPUs as well as data centers is exploding. According to McKinsey, by 2030 we'll need 4x more data centers, built in one quarter of the time it takes at today's building speed.
But what if I tell you that you don't actually need that many data centers, that you need another solution? We can break down the demand first. Right now the current capacity of data centers is 55 gigawatts. In the median scenario we're going to see a 22% annual growth rate in demand, so in 2030 we're going to need 219 gigawatts.
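
As a quick check on the arithmetic in this passage, the sketch below compounds 55 GW at 22% per year. The talk does not state the base year for the 55 GW figure, so the seven-year horizon used here is an assumption, chosen only because it lands close to the quoted 219 GW. The function names are illustrative.

```python
import math


def project_capacity_gw(current_gw: float, annual_growth: float, years: int) -> float:
    """Compound current_gw forward at annual_growth for the given number of years."""
    return current_gw * (1.0 + annual_growth) ** years


def years_to_reach(current_gw: float, target_gw: float, annual_growth: float) -> float:
    """Solve current_gw * (1 + g)^t = target_gw for t."""
    return math.log(target_gw / current_gw) / math.log(1.0 + annual_growth)


if __name__ == "__main__":
    # Assumption: a 7-year horizon; the talk gives only the start and end figures.
    print(f"55 GW after 7 years at 22%: {project_capacity_gw(55, 0.22, 7):.1f} GW")
    print(f"Years for 55 GW to reach 219 GW: {years_to_reach(55, 219, 0.22):.2f}")
```

Under that assumption the compounded figure comes out slightly above 219 GW, which suggests the quoted numbers are at least mutually consistent over roughly seven years of growth.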
However, there are a lot of challenges in building data centers. First, everyone knows Stargate: the first Stargate data center takes more than a billion dollars to build. It's also very slow to connect a data center to the electrical grid. For example, right now the waitlist is seven years, so you need to wait seven years to connect a 100-megawatt facility to the electrical grid in Northern Virginia.
Data centers also consume a lot of energy. Currently we're spending 4% of the total electricity consumption in the US on just GPUs and data centers. And it's not very environmentally sustainable: if you look at the number, that's a crazy amount of CO2 emissions annually.
And even if we deliver all the planned data centers on time, there's still a data center supply deficit of more than 15 gigawatts in the US alone by 2030. So it means that just building data centers can't solve the problem.
On the other hand, GPU utilization is actually pretty low. According to Deloitte, GPUs sit idle 80% of the time at enterprises and companies. According to SemiAnalysis, there are 100-plus GPU clouds, so we can see how fragmented the space is. A lot of you need GPUs but can't find them, or you end up paying an extremely high price; meanwhile, a lot of GPUs sit idle in data centers or in different clouds. So naturally, the solution we think we should build is a GPU marketplace, an aggregation layer that brings together different data centers and GPU providers to solve the problem for GPU users. It doesn't necessarily need to be Hyperbolic; I'm just using Hyperbolic as an example here.
So I can share what we're trying to solve. We're building this global orchestration layer. We invented a software called Hyper-dOS, which is short for Hyperbolic Distributed Operating System. Basically, it's Kubernetes-style software: any cluster that installs our software becomes, within five minutes, a cluster in our network. On the other side, users can rent GPUs in whichever way they want: spot instances, on demand, long-term reservation, or hosting models on top.
We see several benefits. First, we solve the matching problem of compute. Second, GPUs become commodities, so you don't need to spend a long time waiting for a data center; you just buy them on the marketplace. And third, you have different options.
We did some math modeling. I don't have time to put the math on the slides, but this is our conclusion: we can reduce the cost by 50 to 75%. We're running a beta version of our marketplace right now, and our GPU cost for an H100 is 99 cents per hour. If you look at Google, for example, their on-demand GPU is about $11, and at Lambda it's $2 or $3. By aggregating more supply and having a uniform distribution channel, you can drastically reduce the price.
The theory behind that is queueing theory, basically the M/M/c model. Next time, if you watch another talk of mine, I will probably share more of the math behind that.
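
The talk names M/M/c queueing theory but does not show the math. As a minimal sketch of one plausible reading, the code below implements the textbook Erlang C formula for an M/M/c queue and compares ten separate 10-GPU pools against one pooled 100-GPU queue at the same 80% load. This is not Hyperbolic's model; the pool sizes, the load, and the function names are assumptions made for illustration.

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving job has to wait in an M/M/c queue.

    offered_load is arrival_rate / service_rate (in Erlangs) and must be
    strictly below the number of servers for the queue to be stable.
    """
    if offered_load >= servers:
        raise ValueError("unstable queue: offered load must be below server count")
    # Accumulate sum_{k=0}^{c-1} a^k / k! iteratively to avoid huge factorials.
    term = 1.0   # a^0 / 0!
    total = 1.0
    for k in range(1, servers):
        term *= offered_load / k
        total += term
    # a^c / c!, scaled by 1 / (1 - rho), is the "all servers busy" tail term.
    last = term * offered_load / servers
    rho = offered_load / servers
    tail = last / (1.0 - rho)
    return tail / (total + tail)


def mean_wait(servers: int, arrival_rate: float, service_rate: float) -> float:
    """Mean time a job spends queued (excluding service) in an M/M/c queue."""
    offered_load = arrival_rate / service_rate
    return erlang_c(servers, offered_load) / (servers * service_rate - arrival_rate)


if __name__ == "__main__":
    # Assumed scenario: each GPU is one server, every pool runs at 80% load.
    print(f"P(wait), isolated 10-GPU pool: {erlang_c(10, 8.0):.3f}")
    print(f"P(wait), pooled 100-GPU queue: {erlang_c(100, 80.0):.3f}")
```

At equal utilization the pooled queue makes jobs wait far less often than the fragmented pools, which is the standard pooling effect an aggregation layer would be trying to capture.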
You also save the time you'd spend vetting suppliers. How many people here are founders or need to acquire GPUs?
Yeah. So how many suppliers are you talking to? If you have talked to more than five, raise your hands. Are you frustrated when you have to sit through five sales calls and try to figure out which data center's GPUs are good? Yeah.
Yeah, that's great. So basically, with a uniform platform like this, founders, startups, and companies no longer need to vet different data centers. They just pick the one that has a high rating or the best price. We're also going to benchmark the performance of the GPUs.
All right. Oh, sorry. Somehow the graph didn't show. Give me one sec.
So we can think about an example use case. Let's say you are a startup and you want 1,000 GPUs at the beginning. Usually you will just reserve these 1,000 GPUs for a year: you think, I might need these GPUs for training, and later on I want to do inference. You run some training jobs, and after three months you realize that, having run those experiments, you now have a better idea and need 1,000 more GPUs just for a month. Then at month six you finish your training job and realize that you now need only 500 GPUs for hosting your model, but you still have 500 GPUs sitting idle.
In the Hyperbolic case, you can say: I will rent 1,000 GPUs for a year at the beginning; in month three, I rent an extra 1,000 GPUs for just a month; and in month six, I release my idle GPUs on Hyperbolic and sell them to other people who need them. But on a traditional cloud, you rent 1,000 GPUs at the beginning, and then in month three you usually need to rent the extra 1,000 GPUs for a full year. If you calculate and compare the costs, and also take the price difference into account, you can reduce the cost from 43.8 million to 6.9 million. So it's like a 6x saving.
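
The comparison can be roughly reconstructed with the sketch below. The inputs are assumptions, not figures from the speaker's model: $2.50 per GPU-hour for the traditional cloud (the midpoint of the "$2 or $3" quoted earlier), $0.99 per GPU-hour on the marketplace, 8,760 hours per year, a month taken as one twelfth of a year, and 500 idle GPUs resold at $0.99 with full occupancy for the last six months. With those inputs the traditional side comes to $43.8 million, matching the talk, while the marketplace side comes to about $7.2 million, close to but not the same as the quoted $6.9 million, so the speaker's model evidently uses somewhat different inputs.

```python
HOURS_PER_YEAR = 8760
HOURS_PER_MONTH = HOURS_PER_YEAR / 12  # 730 hours; a simplifying assumption


def gpu_cost(gpus: int, months: float, price_per_hour: float) -> float:
    """Cost of holding `gpus` GPUs for `months` months at a flat hourly price."""
    return gpus * months * HOURS_PER_MONTH * price_per_hour


def traditional_cost(price_per_hour: float = 2.50) -> float:
    """Two year-long reservations: the base 1,000 GPUs plus the 1,000-GPU burst."""
    base = gpu_cost(1000, 12, price_per_hour)
    burst = gpu_cost(1000, 12, price_per_hour)  # one-month need, year-long commitment
    return base + burst


def marketplace_cost(price_per_hour: float = 0.99, resale_price: float = 0.99) -> float:
    """Year-long base, one-month burst, minus income from reselling idle GPUs."""
    base = gpu_cost(1000, 12, price_per_hour)
    burst = gpu_cost(1000, 1, price_per_hour)
    resale_income = gpu_cost(500, 6, resale_price)  # assumes full resale occupancy
    return base + burst - resale_income


if __name__ == "__main__":
    trad, market = traditional_cost(), marketplace_cost()
    print(f"traditional: ${trad:,.0f}")
    print(f"marketplace: ${market:,.0f}")
    print(f"ratio:       {trad / market:.2f}x")
```

Most of the gap comes from two effects: the lower hourly price, and paying for the burst capacity for one month instead of twelve.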
And you help other people get cheaper GPUs too, because you can release those idle GPUs to them. This is how we think we're going to increase productivity. People tend to think only about the saving, but that's not the whole picture for GPUs. By scaling laws, we know that the more compute you spend, the better your model will be. So it's not just about cutting your cost by 6x; it's more that with the same budget you increase your productivity by 6x. Imagine how many startups used to have to rely on OpenAI and Anthropic, those closed AI models. Now suddenly their money becomes more valuable, and they can rent as many GPUs as they want for training.
The next step that we think a GPU marketplace will usually evolve into is an all-in-one platform for different AI workloads, because what people really want is not just GPUs; they want to run their different AI jobs. You will have AI inference, both online and offline, and you will also have training jobs.
So here are some takeaways. First, we don't think we should focus only on building data centers; we also need smarter allocation of resources. Second, we can reduce your cost by building a GPU marketplace. And lastly, focusing only on building data centers is not very sustainable: it consumes a lot of energy and takes a lot of land. We would do better to reuse and recycle idle compute by selling it to others. If you're interested in trying it out, you can come to our website. The left QR code is for our current product, which is the marketplace, and we're also launching our business cloud and enterprise cloud, which give you production-ready GPUs with 99.5% reliability. All right. Thanks.
Awesome. So I'm curious: can you tell us more about the Hyperbolic OS? I know a lot of times you have a data center plus a set of GPUs. How does it actually work to connect it to Hyperbolic itself?
Yeah. So basically, Hyper-dOS is like a Kubernetes agent. You just install it in your cluster, as long as you have Kubernetes. Most data centers have Kubernetes, but even on your MacBook or your PC you can install MicroK8s to become a Kubernetes-ready machine. We have in-house terminology: we call our Hyperbolic server the monarch, and then we have different barons, so it's like a feudal model. Different barons own different compute. Every time a user wants to rent GPUs, they talk to our monarch server, the monarch server sends a request to a baron, and the baron provisions the machines and sets up the SSH instance for the customer to access.
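
The monarch and baron flow described in this answer can be sketched as a toy in-memory model. Everything here is hypothetical: the class and method names, the cheapest-first selection policy, and the `example.invalid` hostnames are illustration only and do not reflect the real Hyper-dOS API or its scheduling logic.

```python
from dataclasses import dataclass, field


@dataclass
class Baron:
    """A compute owner (hypothetical model of one cluster in the network)."""
    name: str
    free_gpus: int
    price_per_gpu_hour: float

    def provision(self, gpus: int, user: str) -> str:
        """Reserve GPUs and return a placeholder SSH command for the customer."""
        if gpus > self.free_gpus:
            raise RuntimeError(f"{self.name} has only {self.free_gpus} free GPUs")
        self.free_gpus -= gpus
        return f"ssh {user}@{self.name}.example.invalid"


@dataclass
class Monarch:
    """Central server that receives rental requests and forwards them to a baron."""
    barons: list[Baron] = field(default_factory=list)

    def register(self, baron: Baron) -> None:
        self.barons.append(baron)

    def rent(self, gpus: int, user: str) -> str:
        # Assumed policy: choose the cheapest baron with enough free capacity.
        candidates = [b for b in self.barons if b.free_gpus >= gpus]
        if not candidates:
            raise LookupError("no baron has enough free GPUs")
        cheapest = min(candidates, key=lambda b: b.price_per_gpu_hour)
        return cheapest.provision(gpus, user)


if __name__ == "__main__":
    monarch = Monarch()
    monarch.register(Baron("dc-a", free_gpus=8, price_per_gpu_hour=1.20))
    monarch.register(Baron("dc-b", free_gpus=4, price_per_gpu_hour=0.99))
    print(monarch.rent(4, "alice"))
    print(monarch.rent(4, "bob"))
```

In this toy version the second request falls through to the more expensive baron once the cheaper one is full, which mirrors the idea that users see one marketplace while capacity stays with individual owners.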