AI Scaling, Alignment, and the Path to Superintelligence — With Dwarkesh Patel

Channel: Alex Kantrowitz

Published at: 2024-05-15

YouTube video id: 2PXHZqv2-qc

Source: https://www.youtube.com/watch?v=2PXHZqv2-qc

One of the sharpest minds in AI joins us
to look at the research, the business,
the dangers and the conspiracies, all
coming up right after this. Welcome to
Big Technology Podcast, a show for
cool-headed, nuanced conversation of the
tech world and beyond. We're joined
today by Dwarkesh Patel. He's the host
of the Dwarkesh Podcast and he's had
DeepMind CEO Demis Hassabis on recently,
along with Anthropic CEO Dario Amodei,
the OpenAI Chief Scientist Ilya
Sutskever, and recently Meta CEO Mark
Zuckerberg. So, that's the person we're
dealing with here today. Dwarkesh,
welcome to the show. Thanks for having
me. Super excited to be on, Alex. Yeah,
I think that you've been doing some
great interviews on AI, looking at the
research, where things are going, and
really asking the right questions to
the right people. You had Ilya Sutskever
on before the entire OpenAI blowup. How
important do you think he is to
OpenAI's ability to be competitive?
I actually think that's a really
interesting question. I remember I was
chatting with someone and they said
something along the lines of, well, the
great people matter a lot, and now that
they've lost Ilya, it might be downhill
for OpenAI. Then, you know, I think the
default perspective is, listen, you've
got thousands of scientists who are
doing AI, surely any one of them is
replaceable. I think that's probably
correct, but I'm not in the field
enough to know, and it is a sort of
interesting empirical question: what is
the bus factor of a place like OpenAI?
If you lose a chief scientist, how much
does that slow down your progress?
And it would be interesting if it
doesn't slow it down that much. It would
be super interesting if it slows it down
a lot,
but yeah, that's a really interesting
question. Have you seen any signs of
them slowing since his departure?
No, I mean, the big question people
have had is that since GPT-4, which was
released more than a year ago, we
haven't really gotten anything better,
right? We've gotten Claude 3, we've
gotten Gemini. They're not significantly
better, if at all, than GPT-4, and
certainly not better than the newer
versions of GPT-4. So, the question is, is AI
progress plateauing or are people just
waiting to build out the giant data
centers which are necessary for training
a GPT-5 level model? And actually, I
think this year will be super
interesting in terms of learning about
AI because
by the end of this year, we'll get to
see, hopefully, what a GPT-5 level model
looks like and we'll learn whether we're
on the path to some kind of
superintelligence. If GPT-5 is so
amazing, we're like, okay, well, we're
building a god a few years out. Or
GPT-5 is not that much better than
four. And I think the main thing we're
going to learn is whether, between the
4.5 and 5 level models, we hit what's
known as the data wall, which is to say
that as you make these models bigger,
you need more and more data to keep
training them, and we're running out of
internet data. And so, we're going to
learn whether synthetic data, RL, these
other techniques can substitute for the
data that bigger models will need. And
I mean, by the
end of this year, I'm expecting to learn
a lot about what the course of AI is
going to look like. What is your sense
as to what the answer will be with
GPT-5?
Yeah, I mean, that's a good question.
And it's...
Yeah, but I'm going to ask you anyway.
Yeah.
Let's see. So,
here's some predictions I have.
I'm not sure if it gets to the heart of
what will be impressive about it. I
think it'll definitely be better at
reasoning, which is trivial to say
because of the training methods we've
seen them talk about. I'm sure you've
heard them talk about Q-Star, and what
it seems to be is training the model by
rewarding it on getting the correct
reasoning trace to get the right
answer. And that seems to lead to
better reasoning, or at least there's
an equivalent paper released publicly,
called Quiet-STaR, that claims it does.
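A minimal sketch of the reasoning-trace reward idea being described, in the spirit of the publicly released Quiet-STaR work; model.generate and model.reinforce are hypothetical stand-ins for a real LM API and a policy-gradient update, not anyone's actual training code:

```python
# Hypothetical sketch: sample several reasoning traces for a question and
# reinforce only the ones whose final answer is correct.

def reward_reasoning_traces(model, question, correct_answer, n_samples=8):
    """Reward reasoning traces that end in the right answer."""
    traces = []
    for _ in range(n_samples):
        # Assumed API: generate a chain-of-thought ending in an answer line.
        trace = model.generate(f"Q: {question}\nThink step by step:")
        final_answer = trace.strip().splitlines()[-1]  # assume answer is last line
        reward = 1.0 if final_answer.strip() == correct_answer else 0.0
        traces.append((trace, reward))
    for trace, reward in traces:
        if reward > 0:
            # Assumed API: nudge the model toward producing this trace again.
            model.reinforce(trace, reward)
    return traces
```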
Then, we're going to see much more
multimodal data and I think that'll look
a lot like the equivalent of supervised
fine-tuning, but for a bunch of people
recording their screen and doing
workflows with their screen, navigating
UIs. So, I think you'll have agents that
can act coherently as your assistant for
potentially minutes, if not hours, on
end in a way that you can just tell them
to go do something on the internet and
they can actually do it. Where like
GPT-4 right now, web browsing isn't
really a big feature. Like people like
Perplexity use it, but it's mostly to
summarize. It's not to like actively
go out and look for information. So I'm
hoping, well, I'm expecting, that to be
one of the things you'll see with GPT-5.
I guess the big question is, how much
smarter is it, right? Like I'm
mentioning all these abilities it might
have, but how much juice will it have?
And I honestly don't know.
Right. And so, how do you think we're
going to be able to assess that? Like is
it just going to be like how much better
it does on tests, or do you think it's
just going to be a feel when people are
using the model?
Yeah, I trust the feel more, honestly,
cuz I mean, we have these evaluations,
but
what's your sense on them? I mean,
people will come out and say, here's
what we got on MMLU and so on, but
they're getting saturated, and they're
often not that great to begin with. So,
I'm more eager to see what it feels
like to talk to one of these things
than learn what its MMLU score is.
Yeah, I agree. I mean, it is
interesting how feel plays such a big
role in evaluating these models,
because they are, and you talk about
this a lot, scientific, right? We can
sort of test them scientifically, and
we have these big things like parameter
counts and, you know, the size of the
compute that we use, but ultimately
feel is one of the main things we use
to test how good these models are. And
in fact, the thing that people always
look at in terms of model performance
is this chatbot arena, where they put
two answers from different chatbots
side by side and ask, okay, which one
is the best? And it seems kind of
janky, but it's also the thing that
people take almost as gospel now in
the AI industry.
Yeah, and in fact, I think even Chatbot
Arena has some deficiencies in terms of
evaluation, because from what I
understand, you're doing these pairwise
comparisons, but you ask one question
and two models respond. And what I'm
more curious about is what it's like to
have a long-form conversation with the
thing, where it's not just what it
immediately responds with, but, if we
keep talking, can it understand the
context of the problem I'm talking
about? Can it keep following up on
different threads? That's more relevant
to my workflow than just immediately
producing some bullet points.
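For reference, Chatbot Arena turns those pairwise votes into a leaderboard with an Elo-style rating. A minimal sketch of that update rule, with an arbitrary K-factor and seed ratings rather than Arena's actual parameters:

```python
# One human pairwise vote nudges two models' ratings toward the outcome.

def elo_update(rating_a, rating_b, a_won, k=32):
    """Update two models' ratings after one pairwise vote."""
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1 - score_a) - (1 - expected_a))
    return new_a, new_b

# e.g. two models seeded at 1000; the first wins a head-to-head vote:
# elo_update(1000, 1000, a_won=True) -> (1016.0, 984.0)
```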
Yeah, and we're going to get into this
evaluation and research a little bit
more as we keep going, but just to go
back to OpenAI.
So, it is interesting to watch Microsoft
now start to develop their own models,
almost to compete with OpenAI. There
was a story in The Information recently
that Microsoft is working on a 500
billion parameter model. Just for
context on the size of that model, for
listeners: OpenAI's GPT-4 apparently
has something like a trillion
parameters. But Microsoft is now making
this move where it's trying to build,
I think for the first time since GPT-4,
its own model that competes with it.
How should we read into what Microsoft
is doing there? Is it a loss of faith
in OpenAI? Is it a hedge against
OpenAI, sort of a necessary move even
though it has such an important
partner? What's your take?
Yeah, one of my friends who works at
DeepMind told me that Microsoft is
basically reversing what Google has
managed to do over the last few years,
and in fact making the same mistake
that Google initially made, which was
to have its training distributed or
split up between two different
organizations. For Google, it was
Google Brain and DeepMind. Microsoft
has a company which is in the lead,
right, OpenAI, and I guess instead of
doubling down on it, they're trying to
hedge their bets in this way. If you
think AI is like another product, where
you have multiple vendors so that if
one of them decides to go a different
route you have some leverage over them,
that might make sense for another kind
of product. The thing with AI is, if
you buy the scaling picture, that as
you make the models bigger they get
much smarter, then I don't think it
makes sense to hedge your bets in this
way. I think you should just double
down, give one of them a hundred
billion dollars, and just say, go make
me a superintelligence, you know what I
mean? Because otherwise you're just
splitting up your efforts, and it would
be much better to have one GPT-4 than
to own two companies that each have a
GPT-3.5.
And I'm sure Microsoft knows this. So,
what do you think could possibly be
their reasoning for doing what they're
doing?
Partly, I think they got spooked by
what happened with the board last
November. I think definitely there's
that, because the clause in the OpenAI
charter says that if the board, which
is a nonprofit and controls the
company, decides that they've built
AGI, then Microsoft has no leverage
over OpenAI anymore. So, they got
spooked by that. And I think partly
it's probably that Microsoft doesn't
buy the scaling hypothesis as much as
us internet weirdos, cuz they probably
think, oh, you know, we want to
diversify our bets here, we'll have
multiple companies build a GPT-4.5
level model, and we'll see how that
goes.
Yeah, I mean, I'm sure there are better
reasons too, right? You can build up
this in-house talent. I'm sure there's
a lot of practical knowledge you gain
by building smaller models, and by
training these models you're getting a
lot of that knowledge in-house: your
own ability to train, understand, and
deploy these models improves.
And can you just handicap the field for
us? I mean, how should we think about
the efforts of DeepMind versus Meta
versus Anthropic versus OpenAI? Is
there a clear leader, or are there any
key differentiators that are important
to know? I know it's a broad question,
but feel free to zero in as you'd like.
Yeah, this is a good question.
Yeah, it doesn't seem like there's a
strong leader at the moment. In terms
of revenue, OpenAI is probably leading
by a lot, just going by the number of
people who use ChatGPT versus any other
service. But subjectively, you use
Claude and it's often better, and not
significantly worse in any case. I
think the main way in which they're
different today is that Claude seems to
have better post-training, which is the
jargon for, basically, what kind of
personality does it have, how does it
break down your question, and how does
it act as the persona of a chatbot? All
this RLHF stuff you hear about is part
of the post-training, and Claude, or
Anthropic, which makes Claude, has a
more automated way of doing that.
Gemini obviously has longer context, a
million tokens, which is huge. I think
the big thing we'll probably learn in
the next few months is who's really
ahead, cuz everybody's just been
releasing models that are about as good
as GPT-4 right now, and we'll learn who
can go the distance, so to speak. And I
think the big question will be: OpenAI
probably has the compute because of
Microsoft, but then again, maybe
Microsoft is splitting up its compute.
Google definitely has the compute,
because, you know, Google is a huge
company. The question, I guess, is that
we don't know yet whether Anthropic can
keep up beyond this year with models
that might cost tens of billions of
dollars.
Right. And
I think that's sort of the
underappreciated part of Google's
attempt here: they are doing this
reverse Microsoft, or correcting their
mistakes like we talked about, where
they brought Google Brain and DeepMind
together under one organization and
said, your internal conflicts be
damned, resources are going to get
pooled right now.
Yeah. You know, I interviewed the guy
who wrote The Making of the Atomic
Bomb, Richard Rhodes, and he was
telling me about the Soviets after
Hiroshima and Nagasaki. Stalin called
in his top physicists and said,
"Comrade, you will give us the bomb.
You have the entire resources of the
state at your disposal. You will give
us the bomb, or you and your family
will be camp dust." I'm sure that last
sentence wasn't uttered inside of
Google, but the attitude they've taken
in terms of their compute allocation
might be much in favor of: we're going
to take this seriously, we're going to
invest a bunch of compute into making
this happen. And also, I think you
shouldn't ignore the fact that Google
is the company that actually already
has a successful accelerator program
for AI chips, with their TPUs, which
other companies are trying to build,
but don't yet have, as a replacement
for Nvidia GPUs. Right.
Lastly, xAI, which is Elon Musk's
effort. Do you think there's any chance
they can be competitive here? I mean,
we think about resources, and they
definitely have Musk's money, and I
think there's this big GPU cluster that
Tesla has. So I wonder if that's going
to factor in. And then, is there any
other dark horse that might come in and
start to matter?
I think the second part of your
question is super interesting. On xAI,
I honestly don't know. Tesla is a
separate organization from xAI, so I
don't know how much those resources
could transfer, but I have no idea.
With regards to who the other actors
could be: in the case where AI really
is super powerful, or GPT-5 is amazing,
I think what happens is that a lot of
different countries' national security
apparatuses start to realize what a big
deal this is, and they're not just
going to sit around. They're going to
make moves. And I think what that looks
like is, there are a lot of different
countries in the world with a sovereign
wealth fund with a hundred billion
dollars, right? And do they all just go
around saying, "What are we doing
sitting on this money? Obviously, AI is
the thing to do." The UAE spins up a
bunch of data centers in the Middle
East, which I hear they're already
doing, to start training even bigger
models, and China starts deploying
energy infrastructure to do big
training runs. So in the world where AI
continues to get much better at a fast
pace, I guess what I'm trying to say is
the players will be nation-state level
players, cuz that's also the kind of
funding you'll need to keep scaling
these models.
And what does a nation-state do with
this technology?
Yeah, I mean, obviously the military
uses are clear, or at least will be
clear, right? You can use it for R&D on
military stuff. Just basic things: you
can mass-manufacture millions of
drones, and a human-equivalent model
running locally can operate those
drones, and you have a million-drone
swarm headed towards Beijing or
something. I don't know, that's just
one example, but you can imagine
there's a lot of things you can do.
Stepping back, the bigger picture is:
why are some countries wealthier and
more powerful than other countries?
Well, it's often because they have more
people, right? Taiwan would lose a war
against China without the help of the
US. Why is that? Because China has more
people. If AI substitutes for people,
if it can increase the effective
population of a country, then you can
imagine that it would just be huge
leverage that a country would have over
other countries, in terms of its own
economic output or even its ability to
withstand geopolitical competition.
And you've referred to AI as, you know,
potentially the last invention. I think
I might be cribbing a little bit, but
what happens if a nation-state is the
one that achieves artificial general
intelligence? We're going to talk more
about AGI in a bit, but let's say, you
know, China is able to invent it. Then
what happens? Are there applications
there?
Totally. I mean, that particular
phrase, the last invention, I think
belongs to somebody else, but
Yeah, I think it really matters who
wins here. I mean, if China wins, it
depends on how fast things happen. In
the world where they happen within a
few years, I think what you're looking
at is China having a ton of leverage
over the United States, because one of
the things future AIs could unlock is
things like pocket nukes, right? And so
if China is ahead, they could be like,
"Listen, we've basically got this mass
army of billions of extra soldiers that
we'll be able to manufacture, and we
can mass-manufacture drones or robots
or whatever for them to run on." And I
think that gives them a lot of
leverage, right? So I would worry about
that. I think it's important that the
US win that and stay ahead. I don't
know the status of Chinese AI
currently. It seems like their newest
model is a DeepSeek one, and it seems
like it was really good, but
Right. So before anyone gets there, the
tech industry in particular is going to
have a lot to work through. And you
already mentioned a little bit that we
have data constraints. We also have
compute constraints. So I think we
should talk a little bit more about
whether these resource constraints are
actually going to be things that
matter, and how they might factor in.
Mark Zuckerberg spoke with you about
how energy is going to be a constraint;
we'll talk about that. And that really
opened my eyes to, oh, are we going to
be hitting a wall here with AI? I wrote
about it in the newsletter. Sort of an
interesting question. So why don't we
just go into the component parts and
talk a little bit about each. The first
one, and clearly a very important one,
is compute. I start to raise my
antennas here when I hear rumors of Sam
Altman wanting to raise $7 trillion.
And of course, that's an economic
question, but it's also: is that what
it's going to take to make this stuff
work? Are we ever going to get to the
place a lot of these people want to get
to, adding more compute and data and
energy until you can train better large
language models and see what the
scaling law really looks like at its
limit? What do you think?
Yeah. I think compute will
be less of a bottleneck than energy. As
for the $7
trillion, yeah, I imagine.
Well, backing up, the reason I think
compute will be a lesser bottleneck
than energy is because right now you
have one company, Nvidia, which is
making these GPUs, and other than
Google, nobody has a clear competitor.
The thing that was bottlenecking Nvidia
so far is that for some of the
components they need for these GPUs,
CoWoS and HBM, they just weren't able
to get enough allocation, or get TSMC
to build facilities for them, cuz TSMC
was like, "I don't know if I buy all
this AI stuff," since they'd have to
make a huge investment into building it
out. But now it seems like the fabs are
building it out. And also, all these
companies have accelerator programs
where they're going to try to ship
their own chips. So I think compute
will become more and more available,
and that's what Zuckerberg said on the
podcast, that the compute constraints
are now decreasing. Then the question
that Zuckerberg himself pointed to was,
"Well, will there be energy?" And
the key constraint with energy is not
necessarily whether there's enough
energy in the world, but, for training,
whether there's enough energy in one
place. Because to do a training run, at
least with the training methods we know
about publicly, you usually have to do
it in one place. So if training with
hundreds of thousands of GPUs would
consume 1 gigawatt of power, and a
nuclear power plant produces about 1
gigawatt, can you get all that energy
into one place? Where in America is
that place? If not in America, where do
you go? Do you go to the Middle East?
Do you buy out some aluminum refinery
in Canada, cuz those consume a lot of
energy, and take it offline? I don't
know, but you can try out different
ideas. But I think energy will be the
big constraint. And
Zuckerberg basically talked about the
fact that you might need a moderately
sized nuclear power plant to be able to
do this. And you asked, well, what
about Amazon? And he said, ask Amazon.
And I asked Amazon, did a little
research. Amazon actually has purchased
a small nuclear power plant in
Pennsylvania, correct me if I'm wrong
here, and it's 960 megawatts, so close
to that gigawatt size. And they're
going to use 40% of the energy there, I
assume for AI training. So, is that
sort of what this energy battle looks
like moving forward? And is that even
enough energy, given that everybody
wants to add more compute and more data
and more energy into the process, to
actually be able to build these models?
Well, it's certainly not enough. And in
fact, I brought that up with
Zuckerberg. You need a certain amount
of energy in one place for the
training, it seems like, but they might
have ways to get around that. Then you
also need to deploy these models, and
wherever you deploy the model, you
don't necessarily need a huge amount of
energy in one place, but you do need a
lot of different places that each
consume energy. That could result in
the demand for energy increasing
globally. I can pull this up somewhere,
but I did some back-of-the-envelope
calculations: if you believe the
scaling laws, you can look at how much
energy an H100 consumes, then look at
how much less energy we use each
generation because of efficiency gains,
and then go down the list of how much
it would basically cost in terms of
energy to train a GPT-4 level model, a
4.5 level model, a 5, whatever. And you
get into the gigawatts pretty soon. And
especially if these models are going to
be widely deployed in the world, then
yeah, it's just going to consume a ton
of energy.
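A rough version of that back-of-the-envelope, with illustrative assumptions (the GPU counts, overhead factor, and per-generation scale-up are guesses, not reported figures):

```python
# Back-of-the-envelope: facility power draw for a big training run.

H100_WATTS = 700   # approximate TDP of one H100
OVERHEAD = 1.4     # assumed datacenter overhead (cooling, networking, etc.)

def training_power_mw(num_gpus):
    """Total facility power draw in megawatts for a given GPU count."""
    return num_gpus * H100_WATTS * OVERHEAD / 1e6

for gen, gpus in [("GPT-4-ish", 25_000), ("4.5-ish", 100_000), ("5-ish", 400_000)]:
    print(f"{gen}: {gpus:,} GPUs -> ~{training_power_mw(gpus):.0f} MW")
# The "5-ish" row lands around 400 MW under these assumptions; push the
# cluster size a bit harder and you're into the gigawatts, which is the point.
```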
Okay. And then one last thing is data.
You mentioned it right at the start:
data might be a major constraint. I
mean, these companies are already
working with synthetic data. To train
Llama 3, Meta used synthetic data, data
created from AI itself, basically. And
they talked about buying Simon &
Schuster when they were like, we cannot
get this to be as good as ChatGPT. And
by the way, this is Meta, the company
that owns Facebook, which has the
entire social internet to train on. So,
how do we, or how does the industry,
get around that?
Yeah, I mean, the synthetic data thing
comes back to energy and compute in a
way, right? Because how do you make
synthetic data? You use the existing
model. What does it take to use the
existing model? It takes energy and
compute. In fact, it'll make training
more expensive, because instead of just
doing one backward pass, you now
potentially have to do many forward
passes: at each forward pass you come
up with some output, then the model has
to decide which of those outputs was
the best, and then you train on the
best of those outputs. It could be a 5x
tax on training. And that's separate
from the question of whether these
methods are scalable enough that they
can make the models smarter. I don't
think we have public evidence of this
yet, but I don't know. What's your vibe
on this? Cuz when I talk to researchers
at these labs, they seem pretty
confident that this will happen. But
there's no evidence yet. I mean, with
Meta and Llama 3, they said they used
synthetic data and so forth, but for
actually making models smarter in a
significant way, I guess we don't have
that much evidence.
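A toy version of that 5x-tax arithmetic, with assumed cost ratios purely for illustration:

```python
# Synthetic data means many forward passes (sampling and ranking candidates)
# per example before the usual forward+backward training step.

FORWARD_COST = 1.0    # relative cost of one forward pass
BACKWARD_COST = 2.0   # a backward pass costs roughly ~2x a forward pass

def cost_per_example(candidates_sampled):
    """Sample N candidate outputs, score them, then train on the best one."""
    generation = candidates_sampled * FORWARD_COST  # N forward passes to sample
    scoring = candidates_sampled * FORWARD_COST     # roughly, to rank candidates
    training = FORWARD_COST + BACKWARD_COST         # one normal training step
    return generation + scoring + training

plain = FORWARD_COST + BACKWARD_COST                # ordinary pre-training step
print(cost_per_example(6) / plain)                  # -> 5.0, i.e. a "5x tax"
```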
I mean, I think I'm learning as we're
talking here, thinking it through and
being like, okay, just look at the
headlines we've seen. $7 trillion for
compute, though of course we might get
more efficient. A nuclear power plant
for energy. More data than the world
possibly has. And we're not quite sure
whether scaling these models up, adding
more energy and more compute and more
data into however the industry is
training LLMs to make them better, is
going to work. And I'm just like, how
is this sustainable? That's the vibe
I'm getting.
Right. I mean, the thing you've got to
add on top of that is, what is the
revenue of these AI companies so far?
And I'm guessing it's not great,
probably on the order of billions of
dollars cumulatively across the
industry. And they probably want tens
of... I mean, I guess Sam Altman wants
$7 trillion. But you know what I mean?
So I think it will depend on whether
the other hyperscalers, big companies
like Amazon, Meta, Google, Microsoft,
buy that this is the path to go on and
worth investing a lot of money into.
And it seems like they do. Maybe at
some point they stop, because maybe
GPT-5 isn't that much better, and so
they lose their patience. And then, I
guess, the nation-states never get into
it either.
But the question, fundamentally: I
think in the world where you can get an
AGI for $100 billion of training, I
just can't see why GPT-5 wouldn't be
really good, and also why people
wouldn't continue investing. And in the
world where we need much better
algorithms or something, yeah, I agree,
we might plateau out around here. But,
you know, that goes back to what we
were saying earlier, that we'll learn a
lot by the end of the year about what
trajectory we're on.
Right. And I
think what the industry seems to be
betting on is that there are going to
be more efficient models, right? That
they'll just be able to code them up
better, so they won't need as much
compute or data to improve, even though
they will have to expand. And one of
the things that's consistent in your
interviews and elsewhere, from the
people in the industry, is that they
believe this stuff is predictable, that
the scaling is predictable. This was
Sam Altman just a couple weeks ago at
Stanford. He says: we can say right
now, with a high degree of scientific
certainty, GPT-5 is going to be a lot
smarter than GPT-4, GPT-6 is going to
be a lot smarter than GPT-5, and we are
not near the top of this curve. And we
kind of know what to do.
And this is not like it's going to get
better in one area. It's not that it's
always going to be better at this eval
or this subject or this modality. It's
just going to be smarter in the general
sense. And I think the gravity of that
statement is still underrated. Okay, so
that seems to me to be the case for why
everybody keeps putting money into this
stuff. It's not necessarily for what it
does today; it's what it can do maybe a
couple generations from now, and that
will eventually give you the ROI. I'm
curious, since you've had these
conversations with these folks at
really the ground level of the science:
do you buy that it's almost going to be
a linear progression in terms of how
good it is from generation to
generation?
Yeah, I mean, one way to think about it
is, you're mentioning the scaling laws,
and that's a relationship where,
basically, as you dump more compute
into these models, their loss gets
better in a very predictable way. And
the loss in this case corresponds to
their ability to predict internet text.
How that translates into capabilities
is another question, but imagine a
model that can predict any internet
text: it can predict how to write
really great scientific papers or
whatever. Well, then that's, you know,
human-level intelligence.
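A minimal sketch of that predictable-loss relationship: empirical scaling laws (Kaplan et al. 2020; Hoffmann et al. 2022) fit loss as a smooth power law in training compute. The constants here are made up for illustration, not fitted values from any lab:

```python
# Loss falls smoothly as a power law in training compute.

def predicted_loss(compute_flops, a=126.0, b=0.1, irreducible=1.7):
    """Illustrative scaling-law fit: loss = irreducible + a * C^(-b)."""
    return irreducible + a * compute_flops ** -b

for flops in (1e21, 1e23, 1e25):  # very roughly GPT-3-ish to beyond-GPT-4 scale
    print(f"{flops:.0e} FLOPs -> predicted loss {predicted_loss(flops):.2f}")
# The curve declines predictably (about 2.70 -> 2.33 -> 2.10 here); what that
# loss means for capabilities is the open question raised above.
```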
So, look, I mean, I think you could
make the case that there might be some
break or plateau. Previous to GPT-3 or
something, you could have said GPT-2 is
really impressive, but here are the
kinds of things it won't be able to do,
and sort of pre-registered that
prediction and stuck by it. But it
would just be really bizarre to me:
GPT-1 to GPT-2, and GPT-2 is actually a
kind of really interesting artifact for
the small amount of investment it took
to make it. GPT-3, a couple million
dollars, and you've got this thing
that's like the early stages of
intelligence, whoa. Then you get to
GPT-4 and, oh my god, this is actually
useful; it can generate billions of
dollars of revenue a year. It would
just be bizarre to me that you're
halfway through the human range of
intelligence and now it stops getting
better. So, I do sympathize with Sam's
statement in the sense of, why would it
stop here, right? If it was going to
stop, it seems like it should have
stopped before it started getting
better in a human-intelligence way at
all.
So, you don't think we're going to hit
this stop in the road?
Well, the reason that could happen is
because of the data wall, which is to
say that we can't keep training them
the same way we've been training them
before. From GPT-2 to GPT-3 to GPT-4,
you can just dump more data and compute
into these models. If you run out of
data, then the question is, well, we
could have made something smarter, but
we just didn't have the data for it.
Right. And then the operative question
becomes synthetic data or RL.
And I think the intuition for why that
will work is, first of all, that it
should work better once the models are
smarter, because the sort of self-play
setup is contingent on the model being
smart enough to say: that was the wrong
way to proceed, let's back up and
proceed in this other way, and let's
learn from this. Why did I make this
mistake? Let's make sure to do it the
right way in the future. It seems like
they're getting smart enough to be able
to do that. On a per-token basis,
they're actually really smart,
potentially as smart as really smart
humans. It's just that five minutes
out, they lose their train of thought.
Can they bootstrap themselves in a way
that helps them back up every five
minutes, and learn how to do that so
they can stay coherent for longer?
Seems plausible. And I had one of these
takes in one of my blog posts, that the
way humans got better was this sort of
self-play setup as well, right? We
learned language, or at least the
initial stages of language, when our
vocal apparatus developed and we got
something called the FOXP2 gene, and
from there you can talk to other
humans, you can interact with them.
That's sort of a self-play loop that
led to humans getting smarter and so
forth.
Do you ever think it's weird that there
is this belief in the predictability of
improvement, and yet when you speak
with the people who are working on
these projects, they tell you that they
don't really know why it's making that
improvement? Like Dario Amodei was
basically like, "I'm not quite sure
what's going on inside these LLMs to
make them as smart as they are."
Totally. And I think that's why you
shouldn't be sure that we're on the
track to AGI, because, yeah,
fundamentally we don't know what kinds
of things these are.
You know, it could just be, I don't
know, it's more implausible now than it
was maybe a couple years ago, but it
could be some sort of curve-fitting
thing. Let's say GPT-6 isn't that much
better than GPT-4 and you had to look
back on it and say, why did that
happen? The thing I'd most expect to
say is that right now we are kind of
fooled by how much data these models
consume: they've literally seen all of
internet text, and trained on it
multiple times. And in retrospect you
could be like, "Well, of course they
knew how to do the nearest adjacent
thing, cuz it was in the data set, but
you should have seen that they're not
that good at being creative or novel."
And so, clearly, they weren't going to
keep improving in that way.
So we've talked about AGI a couple of
times. Artificial general intelligence
is this big phrase that's thrown around
a lot, and I think oftentimes people
hold multiple definitions of it in
their brain at the same time. It's
definitely one of the more amorphous
finish lines, so to speak, that you've
ever seen in the business world:
everybody seems to be working toward
it, but no one can really define it.
What is your definition of it, and do
you think we're going to reach it?
The way I've been thinking about it,
which is less to do with AGI out in the
world and more about its long-term
impact, is the kind of model which can
automate or significantly speed up AI
research. And why do I define it that
way, given that there are so many other
jobs in the world? Because one of the
things you really have to think about
is that once it can automate AI
research, you can have this sort of
feedback loop where it's helping train
the next version: finding better
activation functions, designing
architectures that have better scaling
curves, coming up with better synthetic
data, and so forth. Once you get to
that point, it's off to the races, in
the sense that you could have an
intelligence explosion, the kind of
thing you see in sci-fi books. So
that's why, when I think in terms of
AGI, I ask: can it speed up AI
research? And yeah, I think that's a
plausible thing within the next 5 to 10
years.
Are people working on that problem in
particular?
Hmm.
I think you just work on that by making
the model smarter in general and people
are definitely working on that.
Right.
And do you think it'll happen? Go ahead.
Go ahead.
Yeah, so the people making these models
are AI researchers themselves, and
clearly they care about their use case,
which is the model helping them with
their job. So I can imagine the model
getting better at that than it gets at
other things.
Fascinating. So what type of
breakthroughs do you think we're going
to need to get there? We've talked a
little bit about reasoning, and from my
understanding, the way that models do
reasoning, and this is what Jack Clark
told us a couple weeks ago, is to look
at a task and, instead of just spitting
back information, go, "Huh, how many
steps do I need to perform to really
get this task right?" and then go step
by step. Is that one way they'll get
there, or is there something else
that's going to happen?
Yeah, I agree. That definitely seems
like an important component of the
puzzle.
One big one, and this is similar, is
that they aren't yet useful when you
need them to go do a long job. You
can't say, "GPT-4, I'll be back in a
while, but can you manage my inbox for
me in the meantime?" Or, "Can you go
book a trip for me?" Things that
require them to autonomously hold
themselves together for a while and act
as an agent. Just that kind of
coherency, where they go from five
minutes of being able to be in dialogue
with you, to you telling them to do
something, coming back a couple hours
later, and they've done a bunch of
inference to make it happen. I think
that will be a huge bottleneck, or a
huge unlock, I mean. Yeah, this idea
of memory in the bots, and maybe this
is a little different, but it's
something that I keep thinking about:
I'm speaking with Claude every day, and
yet every morning it's like 50 First
Dates. I have to introduce myself to
Claude again.
Yeah. Totally.
And also, there's one weird thing that
Claude started doing. I did a podcast a
couple weeks ago about the data that
you get from voice, the emotion that
you get when you can listen to
something as opposed to just having the
text. Claude, when I give it a
transcript of a podcast, is obviously
only getting the text of the voice. So
I uploaded that conversation about the
data you get from emotion into Claude,
and now it keeps hallucinating the
audio quality of further transcripts
that I've put in. Almost as if it
wishes it understood what the audio
sounded like, because it knows that
that's an important data point. But
anyway.
Oh, interesting.
That's an aside for the moment. The
memory thing is interesting. Do you
think there are easy ways to have a
persistent conversation with one of
these bots, or is that going to be
another tough problem that we won't
solve for a while?
It could plausibly be very tough,
because I don't think it's a matter of
just storing memory. It's a question of
what kind of thing are you: are you a
chatbot, or is your persona "I am an
entity that..."? It's not just about
storing these things somewhere; you
have to train it to act as an agent.
Compared to just pre-training on tokens
from the internet, where, yeah, it
knows how to complete statements, does
it know how to act as an agent? There's
not necessarily a good way to structure
that. So people have been talking about
long horizon RL, which is the training
method you'd need to get something like
this, where you tell it to do something
and then you reward it at the end for
having achieved that outcome. But the
difficulty with those kinds of
approaches, and the difficulty with RL
in general, is sparse reward and
non-stationary distributions. Like: you
failed to book me the right appointment
based on reading my inbox and talking
with me about it. Why did you fail?
There are so many different reasons you
could have failed that it's hard to
attribute it to any one of them. You
know what I mean? It's hard to learn
from that. It's kind of an interesting
question, honestly, why humans are so
good at learning from these sparse
rewards or making long-term plans, cuz
when you look at it from an ML
perspective, it's a cursed sort of
problem to solve.
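A minimal sketch of the long-horizon, sparse-reward setup being described; env and policy are hypothetical stand-ins, and the point is how little one terminal reward says about which of the many steps went wrong:

```python
# Hypothetical sketch of long-horizon RL with a single terminal reward.

def run_episode(policy, env, horizon=200):
    """Roll out a long task where reward arrives only at the final step."""
    trajectory = []
    state = env.reset()
    for _ in range(horizon):
        action = policy.act(state)
        state, done = env.step(action)
        trajectory.append((state, action))
        if done:
            break
    reward = env.final_reward()  # e.g. 1.0 only if the right appointment got booked
    # Sparse-reward credit assignment: every step in the episode receives the
    # same terminal reward, which is exactly why it's hard to tell which of
    # the hundreds of actions was actually to blame for a failure.
    for state, action in trajectory:
        policy.update(state, action, reward)
    return reward
```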
Yep. All right. Well, I want to take a
break here, and when we come back on
the other side, I want to talk about
the Dwarkesh Podcast, how you started
it and where it's going. Then,
particularly, I'm very interested in
the AI risks, because that's something
that you're focused on, and it's
something that I've often dismissed
when looking at the big risks, and I've
promised myself to do a better job of
taking it seriously. So why don't we do
that on the other side of this break?
Absolutely.
And we're back here on Big Technology
Podcast with Dwarkesh Patel. He's the
host of the Dwarkesh Podcast. All
right. So you recently tweeted your
bank account from before the
advertising checks from the Mark
Zuckerberg interview hit, and it
reminded me of a similar screenshot of
my bank account from not too long ago.
You had checking at negative $17.56 and
savings at 10 cents. So congrats on the
savings. It reminds me of before the
advance for my book hit; I was negative
quite substantially in my bank account
also. And it's kind of this moment, and
I think we had similar moments, where
you're like, "Okay, I think things are
going to be going in the right
direction. Go all the way in on this
content plan, effectively, and then
trust good things will come." And they
have for you, apparently. I mean, I saw
the tweet and I was like, "All right,
we're definitely talking about this on
the show." So apparently things are
heading in a good direction for you
financially, at least that's the sense
I get from Twitter. But I really want
to know, given that it got to that
point and now you're making your move:
what is your background, Dwarkesh, and
what got you interested in starting a
podcast largely focused on some of the
deepest questions in AI?
Yeah. Well, I started the podcast
sophomore year of college, cuz that's
when COVID hit and all my classes went
online, and so I was super bored. I was
super into economics and history at the
time, so I was emailing some
economists, and my first guest was
actually Bryan Caplan, who's now a good
friend. I asked him, "Well, you know,
I'd love to chat with you on the
podcast." I didn't tell him I didn't
have a podcast. I didn't even have the
name for the podcast yet. And then he's
like, "Yeah, sure, I'll come on." And
so we recorded an inaugural episode.
From there, I was super interested in
economics and history for a while;
through college I was mostly doing
topics like that. Then I graduated,
about two years ago, and still kept
doing it. Honestly, I graduated a
semester early, so it was a way of
basically taking a gap semester to
figure out what I wanted to do. I mean,
I was studying computer science, but I
wasn't sure what to do. I didn't want
to become a code monkey. In fact,
there's a longer story there. But yeah,
things just kept going well in terms of
the growth of the podcast itself, and
so I was like, "Well, this seems like a
thing worth doing and investing my time
into." It wasn't really making money
financially, but, you know, whatever,
this seems fun and I'm learning a lot.
I'm meeting a lot of interesting
people.
So, kept it going. Dot dot dot,
interviewed Ilya Sutskever at some
point, and then other cool things
happened, and then dot dot dot some
more, and then interviewed Mark
Zuckerberg, and now I'm getting checks
for ads on the podcast. Yeah,
it's a great summary. And I recently
listened to you on an effective
altruist podcast, and I thought it was
a great conversation. It sounded like
the effective altruist movement
influenced you a lot in the beginning,
to start the podcast or at least to
focus the podcast on artificial
intelligence. So, what did you find
interesting about the movement? Do you
subscribe to the EA philosophy, and how
present is it in your life right now?
Yeah. I definitely think they've been
right on a lot of things, in the sense
that this is a big focus and they
realized it before a lot of other
people. Like this AI stuff, right? The
EAs have been talking about this stuff
for decades, and it's been a big part
of the movement for a long time. So,
you've got to give them some credit for
that. They were talking about pandemics
and bio risks and the dangers those
pose to society long before COVID. A
lot of them saw that coming. You know,
I'm actually curious about something.
Listening to that podcast, what was
your sense of the kinds of things where
we were buying into EA assumptions? Did
it feel like we were buying into too
many assumptions? Did it feel
reasonable? Feel free to red-team this
and be harsh, but what was your sense,
as somebody outside? I'm assuming
you're not necessarily an EA. Did it
feel like we were assuming too many
things in that conversation?
Yeah, no, I'm not EA. To
me, honestly, I don't think I had
enough grounding to be able to really
answer that question well. I thought it
was an interesting conversation.
I think EAs should talk a little bit
more, is my perspective. And I've tried
to reach out to many of them,
especially when there was that call for
a six-month pause in AI development,
which was funded by Open Philanthropy,
and I was sort of met with a dismissive
no on that front. So, I think that's
kind of where some of the skepticism
comes from. But I will tell you this,
because I think you saw the screenshot:
I uploaded five of your podcasts into
Claude and was like, "All right, tell
me a little bit about what's important
to Dwarkesh." And then I said,
basically, "Do you think Dwarkesh is an
effective altruist?" And Claude said,
"There's a high probability that
Dwarkesh Patel is an effective
altruist, or at least strongly
influenced by EA ideas. Based on the
strong alignment between his expressed
views and EA priorities, I would
estimate the probability is quite high,
perhaps in the range of 70 to 90%."
It's impressive that it gave you an
actual probability number.
Oh, I asked for a probability. That was
the prompting.
Okay, that's actually a super
interesting use case of these models:
info-dump a bunch of stuff about
someone into the long context and ask,
what are the odds that they believe a
certain thing? That's actually a really
fascinating thing to do.
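A sketch of that use case with Anthropic's Python SDK (pip install anthropic); the transcript filenames and the exact prompt are assumptions, and only the messages.create call reflects the real API:

```python
# Info-dump several transcripts into the long context and ask for a
# probability estimate, as described in the conversation above.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

transcripts = ""
for path in ["ep1.txt", "ep2.txt", "ep3.txt", "ep4.txt", "ep5.txt"]:
    with open(path) as f:
        transcripts += f.read() + "\n---\n"

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=500,
    messages=[{
        "role": "user",
        "content": transcripts + "\nBased on these five episodes, what is "
        "the probability that the host is an effective altruist or strongly "
        "influenced by EA ideas? Give a numeric probability.",
    }],
)
print(message.content[0].text)
```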
I just like to not use uh
like
subscribe to labels just cuz it kind of
constrains your ability to
think. Like I
like I'm not sure what EA necessarily
means and then also um
uh
You know, like you you you
there's certain things that maybe you
traditionally considered EA that I
probably disagree with. Um
But I'm I'm definitely happy to say
like, "Look, the info movement
influenced me a lot and I think like
they've had some really interesting
ideas that I I found fruitful and
useful."
No, it's interesting. I feel like they
are asking some of the really right
questions about this technology: it is
powerful, and how do you steer it? And
then, in fairness, there have also been
moments where EA has been associated
with some stuff and people have been
like, "What?" Obviously, Sam
Bankman-Fried is not a distillation of
EA philosophy, but he was definitely a
firm believer and a funder.
Totally.
And then, you know, with the whole Sam
Altman coup, clearly that was also part
of it.
I don't actually think EA was that big
a deal in the board stuff. From what
I've heard, it was related to something
separate. Yeah.
Well, I'll say this. Two of the board
members who I think were in favor of
the ouster were connected to EA in some
way. Whether it was directly an EA
thing or not is still an open question.
But that's what I'm saying: there was
that presence. So, I really want to get
to the core of the question here, which
is, what are the things that you do
disagree with in EA? I bring up these
examples not to impugn the movement,
but to ask you whether you think there
are holes in the broader philosophy
that have just manifested in these, and
at least one strange, moments.
I mean, I could go on for days about
things I disagree with, depending on
what EA necessarily counts as. You can
look at their cause prioritization
list. It depends on in what sense we
mean EA. If you go to
effectivealtruism.org
and look at what they say the cause
priorities are, they'll say things
like, we care about the poorest people
on the planet, and about animal
welfare, and about existential risk.
And I'm like, yeah, those all seem like
things worth caring about, probably
some of the most important things in
the world. Or maybe we mean the impacts
they've had, because of SBF or the
board stuff, right? Is it a good,
scalable culture? On the sort of
social-cultural angle there might be
other things to say. So, I'm actually
curious which angle you mean. Yeah.
It's a great sort of return question,
because I think you're right that it
has been a label for a lot of different
things, and I don't think there's a
charter. For me, when I think about it,
I think of it mostly in terms of this
expected value equation, where people
should structure their lives to
maximize the expected value that
they're going to have on the planet,
the expected value of their presence in
terms of adding goodness to the world.
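For reference, the textbook form of the expected value calculation being invoked here, not any official EA formula: weigh each possible outcome's value by its probability, sum, and pick the action that maximizes the total.

```latex
% Expected value of an action a with outcomes i, probabilities p_i, values v_i:
\mathbb{E}[V(a)] = \sum_{i} p_i \, v_i
% The EA-style prescription: choose a^* = \arg\max_a \mathbb{E}[V(a)].
```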
Mhm. Yeah, I mean, there are definitely
problems with that. Here's one
particular problem I was talking about
on the podcast: in the case of any
individual, it's hard to forecast how
to make decisions using this framework.
I would never have started the podcast
if I was thinking from a maximizing
expected value perspective, right?
Obviously you're going to be like,
"Well, use your CS knowledge to do
something more useful. You're going to
spend your time working on a podcast?
Come on." So, yeah, it might not be
practically useful in many ways. And of
course, there are the dangers of people
who think they know what the expected
value of something is when they
actually don't, and of just having
uncertainty over that. On the other
hand, I think at current margins: how
does society currently allocate, if you
think of charities or your own tax
dollars? Wouldn't you prefer, at the
current margins, that they were
allocated using a more rational
expected value framework? Your taxes,
especially if you live in California,
are just wasted on a bunch of useless
nonprofits and whatever. Wouldn't it be
better if they pulled up the
spreadsheets on how much good these
nonprofits, these different
institutions we're funding with our tax
dollars, are doing? I think that kind
of mentality actually would be partly
useful in the world.
Yeah. No, and it's definitely
interesting. It's worth considering,
especially sort of the longer-term
risks of AI, right? Which you mentioned
they were early on, and which I think
they probably brought more focus to.
So, on that front, I'm curious what you
think makes somebody, I guess it's a
lot of the CEOs that we hear from, and
maybe some of the intellectuals in the
EA movement and elsewhere, so afraid of
AI, or so cautious about where it can
lead us?
Yeah, I think it's kind of a
straightforward thing of looking maybe
a couple of years out beyond the people
who are just thinking of this as a
normal economic transition. You say,
okay, we'll have things that are as
smart as humans, as smart as the
smartest humans when it comes to
science and tech. Well, what happens if
you plug this into our basic economic
growth models? You just have a huge
effective increase in population, and
these are people doing science and R&D
for you. You're rapidly going through
the tech tree, because you have
billions of researchers. And maybe
there are certain physical bottlenecks
to this, but you just have billions of
extra people helping you do further AI
research, whatever. And there are
enough of them that, if they wanted to,
they could do a coup. They have certain
advantages, in the sense that they can
easily copy themselves, copy their
weights. They can increase their
population rapidly. And they're harder
to kill, to put it that way, once
they're deployed, than humans are. You
put out a bioweapon or a nuclear war
and a bunch of humans will die; if
these things keep a seed version of
themselves somewhere, then, if you had
to go to war with them, it's sort of
asymmetric. Then you go from there to:
"Listen, we we fundamentally as you were
saying earlier, we fundamentally don't
understand what's happening in these
models, but we know they're really
smart. And sometimes they like with the
Gemini thing, they go they do things we
didn't expect them to do or like you
know, like that was a great example
where I'm sure Google didn't want this
sort of embarrassing image to come out,
but that's just what ended up happening
at the end. Like and now imagine these
things are super integrated into like
our cybersecurity and are trained on
this long horizon RL. So they are
they're coherent agents over a long
period of time. You can like
you know, put put all that together and
it's like, "Well, that could go wrong."
Yeah, it's interesting, because for me
it's always felt far-fetched, because
I'm working in the current versions of
ChatGPT and Claude. But then, if we get
to this place where these machines are
effectively improving themselves,
which, as you mentioned, is not only a
potentiality, it seems like a
likelihood that the developers of these
systems are going to get them working
on improving themselves. And again, we
still don't fully know how they work.
Then it seems like that could have some
unintended consequences.
Totally, yeah. And I mean, I'm still
expecting a great future because of AI.
My expectation is that the median
outcome is good. I think we should
worry about the cases where things go
really off the rails, and do what we
can to reduce the odds of that.
Yeah.
And so is that the whole practice of
alignment? Is that what people talk
about when they're like, if we're going
to set these things going, we should at
least align their values to be closer
to the ones we want as humans?
Yeah, it means so many things at this
point that even I'm sometimes confused
by what exactly it means. Well, one of
the goals is that it should do what the
user wants it to do, unless the user
wants it to do something that would
hurt other people, basically. But there
are problems with that definition,
obviously. What if the user wants to
use a superintelligence to make a
bioweapon, or to do a coup against the
government, or something? But yeah,
something like: basically, we don't
want the AIs to have their own drives
and want to take over or something.
And do you think this is a reasonable
concern? And if so, do we have a
reasonable chance of stopping it?
I think it's both a reasonable concern
and that we have a reasonable chance of
stopping it. One of the things I
discussed in a recent episode with
Trenton and Sholto is that researchers
have discovered interesting properties
of how these models represent their
drives, what they're thinking. You can
read their internals and see whether
they think they're being honest or not.
And as they get smarter, maybe you can
parse out more. Fundamentally, it's a
bunch of parameters, right? So it's much
more interpretable than a human brain.
Maybe we can learn ways to understand
what they're up to and train them to do
the right thing.
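The "reading their internals" idea is
roughly what interpretability researchers
do with linear probes: train a simple
classifier on a model's hidden activations
to predict a property like honesty. A
minimal sketch on synthetic stand-in
activations follows; the planted "honesty
direction" and all sizes are assumptions
for illustration, not the setup from that
episode.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy linear probe: predict a property ("is the model being honest?")
# from hidden activations. The activations are synthetic: we plant an
# assumed "honesty direction" into random vectors purely to show how a
# probe can read a concept out of model internals.

rng = np.random.default_rng(0)
dim = 512                                 # hidden-state size (assumed)
honesty_direction = rng.normal(size=dim)  # hypothetical concept direction

n = 2000
labels = rng.integers(0, 2, size=n)       # 1 = honest, 0 = deceptive
signs = 2 * labels - 1                    # map labels to +1 / -1
activations = rng.normal(size=(n, dim)) + 0.2 * np.outer(signs, honesty_direction)

# Train on the first 1500 examples, evaluate on the held-out rest.
probe = LogisticRegression(max_iter=1000).fit(activations[:1500], labels[:1500])
print("probe accuracy:", probe.score(activations[1500:], labels[1500:]))
```

On this synthetic data the probe separates
the two classes almost perfectly; the open
research question is whether real models
encode properties like honesty in a way
that stays this readable as they scale.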
We also have an advantage in the sense
that with humans, if you break the law,
we might put you in jail. But with these
AI models, we can literally change their
brain if they do something wrong.
And then all their children have changed
brains as a result. The entire lineage
has changed.
Yeah, exactly. And I guess we could shut
them off. I don't know.
Yeah.
Well, hopefully. I mean, they will be
broadly deployed, so if they really go
off the rails, it might be tough.
Damn.
But you do think that we're going to end
up with a positive outcome here?
Yeah.
Contingent on people continuing to do a
bunch of alignment research, and on
these systems being deployed carefully.
You don't want, say, China racing ahead
of everybody else and then doing a coup
of the whole world because they have
much more advanced AI. And we want to
make sure the models we deploy serve the
needs of the user and don't do crazy
things. But given that, it's
fundamentally more stuff, more
abundance, more prosperity, and I think
that's good.
Okay, as we come to a close, I opened up
some questions from the Twitter folks
about what they wanted me to ask you. I
have one question that aligns a little
with this discussion topic, and one
that's more about the podcast. The
first: how have your political views
changed, if at all, since you started
the show? Or let me put a different
frame on it: you've interviewed people
like Marc Andreessen, who has very
different perspectives from some of the
early AI guests. Has that changed the
way you think about AI at all?
Generally, politically, I'm very
libertarian, and I was probably even
more libertarian than I am now. I was
basically an anarcho-capitalist, or at
least a soft version of that. The way
that's changed is that I'm open to the
possibility that some kind of regulation
might be useful for AI. But I still have
libertarian instincts: I'm not sure it
will be done the right way, and maybe
it's better for private companies to
come up with incentives and constraints
themselves.
What type of regulation do you think
might be appropriate? I was talking this
week with an editor I work with, and we
thought maybe regulating the way kids
and AI can interact, given that you
really have no idea where that's going
to go once you put this in the hands of
a child.
Potentially, but honestly I think it'll
be better than what kids are currently
doing, which is YouTube and TikTok and
Twitter and Facebook and so forth. I
would prefer my kids play with chatbots
over what they currently have access to.
I don't have kids, but if I did.
I do think that in a world where you
have really fast AI progress, and you're
coming up to this point we've been
talking about where AIs can help improve
themselves, you might need a
government-level actor to say, "All
right, everybody pause for a second.
Anybody who pushes this button of having
the AI help us with AI research could
get fundamentally better AIs than
everybody else and kind of take over the
galaxy. So before we let somebody do
that, we've got to make sure we're ready
to proceed, and not just let some random
person do it." In that world, I think
regulation makes sense.
Yeah. Then the second question we had
was, someone says, "Give us the Dwarkesh
interview prep playbook. That's his
innovation, and if he's able to explain
it in a way that can be replicated, or
at least approximated, by others, we'll
have many more interesting interviews."
Okay, I'm honestly self-interested in
this as well. How do you do it?
Yeah.
Well, I know it sounds obvious to say,
but I honestly just prep a lot. There's
also a flywheel: by doing interviews, I
learn a lot of things, and because of
that I can do better interviews and
learn more things. The main flywheel,
honestly, is that I make the podcast
better, smarter people listen, some of
those smart people become friends of
mine, and they teach me a bunch of
things. Then I can do an even better
interview and get connected to a bunch
of other smart people, who teach me more
things. I think that's a big part of the
flywheel that people may not know about.
Yeah, I've definitely had this here.
We've talked about certain companies and
such, and then people who listen have
reached out and said, "There's something
you should probably know about this, and
given that I enjoy the show, let's talk
it through."
Totally. It's always helpful.
Yeah, yeah. In what ways is yours
different? Do you have some trick that I
should be aware of, or other tools of
the trade?
No, I really think that's it. I mean,
you're going to get a great show. I
think there are a few ways you'll do it.
One is, and you already know this, but
you prep like crazy and you're not
afraid to ask tough questions. And when
somebody wants to call and talk through
the topics afterwards, take the call.
Totally. And then one more thing; this
is more of a media question, but video
has been pretty big for you. What was
the thinking behind doing video, given
it's expensive and time-consuming to
produce? I'm curious whether you had an
ROI calculation about video from the
beginning. You also often show up with a
video camera and record in-person
interviews. So talk a little bit through
your strategy there.
I will say, for anybody doing a podcast,
I highly recommend video, and if you
can, do it in person. Especially for me,
doing it full-time, because it increases
the expected value so much. With an
audio-only podcast, discoverability is
so low that you roughly know how many
listeners you'll get, but with video
there's the tail outcome where something
goes really viral. I'll give you an
example: my most popular episode right
now is Sarah Paine. It has something
like 800,000 views on YouTube. She was
totally unknown before the podcast, at
least to the wider public, but the
episode was so compelling that you can
make clips of it. The clips wouldn't be
as compelling if they were made from an
episode that wasn't in person, let alone
if there were no video at all. So you
make these great clips, they bring
people to the video, and you can get
close to a million people watching it.
On any average video you might do in the
beginning, that might not be the case,
but you have this asymmetric return
potential from having that artifact. And
as far as how to make it better, I would
honestly tell people to watch a bunch of
podcasts with Mr. Beast. I think he has
good advice.
Yeah. And so, do you set up those
cameras yourself, or do you bring
somebody in?
It's usually been about half and half. I
got the workflow down and set it up
myself, and more recently I've been
having a friend help me, which is
helpful.
Yeah, that's sort of the bag of
podcaster tricks. It's not just sitting
down in front of a microphone. These
days we all have to learn sound, we have
to learn video, and we have to figure
out the right way to do it.
The main thing, and I don't know if this
is your experience, but clips are the
main thing. You have to spend a ton of
time on them, and you've got to do all
this Mr. Beast stuff: make a clip with
the wrong first five seconds and it
won't do well at all, but if you spend a
bunch of time thinking through the hook,
it can go super viral. So that takes up
a bunch of time, right?
Definitely. So much time that video has
become part of the strategy for me, but
it's also slowed things down because of
the amount of work it takes. And we do
two shows a week, so sometimes it's a
question of: do the show, or do the
clip.
Totally. Totally.
And are you doing the show full-time?
Yeah, Big Technology is full-time for
me. It's the show, the newsletter on
Substack, then YouTube, and then CNBC.
Yeah. All right, last question for you.
Wait, sorry, you're full-time also,
right?
That's right. Yeah.
Cool.
Yeah, it's a great life if you're able
to do it, because of what we talked
about: finding all these interesting
people to spend time with and learn
from.
Yeah, 100%. All right, last question for
you. Out of all the interviews you've
done, I don't want to ask your favorite,
but I do want to ask who was the most
impressive person you've spoken with.
Someone you walked away from and said,
"All right, this person really gets it."
I'm sure there were multiple, but who's
at the top of the heap?
Carl Shulman, I would say. He's this
really interesting person who has models
of how the AI takeoff will happen. The
stuff I've been saying about AI
researchers and so on, he has thought
out much more. He can go through the
numbers, literally down to the level of:
suppose you get something really smart,
how fast could it do a takeoff? Well, E.
coli can double every 20 minutes, and it
has this many moving parts inside it, so
that gives you a sort of lower bound.
And what would it look like to convert
the entire Sahara into solar power, and
how many data centers could you make out
of that? Stuff like that.
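Those kinds of estimates are easy to
reproduce as back-of-envelope arithmetic.
A minimal sketch of both calculations
follows; every number in it is a rough
assumption chosen for illustration, not a
figure quoted in the episode.

```python
# Back-of-envelope versions of the two takeoff calculations above.
# All numbers are rough, illustrative assumptions.

# 1) E. coli doubling as a physical lower bound on replication speed:
#    a system that doubles every 20 minutes grows how much per day?
doublings_per_day = 24 * 60 / 20
print(f"E. coli-style growth: 2^{doublings_per_day:.0f} ≈ "
      f"{2**doublings_per_day:.1e}x per day")

# 2) Covering the Sahara in solar panels:
sahara_area_m2 = 9.2e12        # ~9.2 million km^2, in m^2 (rough)
avg_insolation_w_m2 = 200      # day/night/weather-averaged (assumed)
panel_efficiency = 0.20        # assumed
power_w = sahara_area_m2 * avg_insolation_w_m2 * panel_efficiency

datacenter_w = 1e9             # assume a 1 GW hyperscale campus
print(f"Sahara solar: ~{power_w/1e12:.0f} TW, enough for "
      f"~{power_w/datacenter_w:,.0f} one-gigawatt data centers")
```

Under these assumptions you get on the
order of hundreds of terawatts, enough for
hundreds of thousands of gigawatt-scale
data centers, which is the flavor of
reasoning the episode works through in
much more detail.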
Yeah, that's fascinating. All right, to
our guest, awesome stuff. Thank you so
much for joining.
Awesome. Thanks so much for having me.
This was fun.
All right, everybody. Thank you for
listening. We'll be back on Friday with
our show with Ranjan Roy breaking down
the week's news, and we'll see you next
time on Big Technology Podcast.