Can We Save The Web From AI? — With Cloudflare CEO Matthew Prince
Channel: Alex Kantrowitz
Published at: 2025-08-13
YouTube video id: iIsXs_hugao
Source: https://www.youtube.com/watch?v=iIsXs_hugao
How will the web fight back against the wave of generative AI that is ingesting all the content on the internet but not paying for it? We're joined today by Matthew Prince. He's the CEO and co-founder of Cloudflare and has been on the warpath attempting to right the ship. Matthew, great to see you. Welcome to the show. >> Thanks for having me. >> Can you start by giving us a sense as to what is happening to the web with the rise of generative AI? We've talked about it a bit on the show already, but I want to hear it from you. >> Yeah, absolutely. So the business model of the internet over the last 30 years has fundamentally been driven by search. You search for something, that generates traffic, and it takes you to content that someone has created. And then that content owner, that content creator, can drive value in one of three ways. They can sell the content itself, sell subscriptions to it; we see plenty of that these days. They can put ads up against it. Or they can just get the ego hit of knowing that somebody cares about and is reading their stuff. And that's really how the web has been built to date: chasing that traffic. What we're seeing, though, is that for the first time in history, searches across the major search engines, Google in particular, are actually on the decline. And what's replacing them is more and more people turning to AI. The difference with AI is that rather than giving you 10 blue links that you click through to find the answer, AI tries to give you the answer itself. That means people aren't going to those original sources. And if they don't go to the original sources, then you can't sell a subscription anymore, you can't put ads up against it, and you don't even know that people are actually getting value from your stuff.
And so what we're really worried about at Cloudflare is: if the incentives for creating content go away, why is anyone going to create content in a new AI-driven future? >> So talk a little bit about how many pages these AI bots or search engines have crawled in the past, how much traffic they've delivered for each crawl, and where that's gone today. >> Yeah, you know, I think the deal that Google made with the web starting 30 years ago, when Larry Page and Sergey Brin started working on the project, was basically: let us copy your content, and in exchange we'll send you traffic, which, again, you can drive value from in one of those three ways. We have very reliable data at Cloudflare going back 10 years, looking just at Google, and the metric that has stayed very consistent over time is how much Google crawls the web. They've actually crawled at a very consistent rate. Over that same 10 years, we've added two billion internet users. We were at 4 billion internet users about 10 years ago; today, we're at about 6 billion. So you'd imagine it has actually gotten easier to get traffic over that period of time. But that's not what's happened. What instead has happened is that, if you take 10 years ago as the baseline, today it's almost 10 times as hard to get a click, to get a visitor, from Google to your site. What's changed? The answer is that Google has started providing more answers directly on the page. So if you search for something like "when was Cloudflare founded," there will be an answer box at the top that says September 27th, 2010, the day that we launched, and you don't have to click any link. In fact, about 75% of queries to Google now get answered on Google itself. And what's changed in even just the last six months, and accelerated this, is that they've rolled out AI Overviews.
And we've tracked this from region to region. What we see is that as AI gives you the answer without you having to read the original content, the amount of traffic that Google is sending to these sites has gone down and down and down. And that's the good news for publishers. If Google has gotten 10 times harder to get traffic from over the last 10 years, OpenAI is a whole different beast. In OpenAI's case, it's 750 times harder to get traffic than it was from Google just 10 years ago. In the case of something like Anthropic, it's 30,000 times more difficult to get that traffic. So why is that? The answer, I think, is that people are trusting the AIs. They're reading this derivative content and they're not going back to the original source. But the problem is, if you're not reading that original source, then the original sources have no way of generating value. They can't sell subscriptions. They can't sell ads. They can't get the ego hit. And that, over time, is strangling the very incentives for creating content. That's the problem that we started to really focus on about 18 months ago. And then just today, on July 1st, we announced that we are hard blocking the AI crawlers unless they actually compensate content creators for the content that they're creating. >> Okay. And we're definitely going to get into your technological solution. So that's coming. But let's talk a little bit more about this problem. I think the number that you shared recently was that Anthropic will crawl something like 60,000 pages... >> That's correct. >> ...for one click that's sent. And OpenAI was somewhere in the, do you remember, 10,000 range? >> 1,500 pages now for every one click that they send you. >> And you know, I have to say, I'm surprised that publishers are seeing a problem now, only because these AI products are really in their infancy. >> Yeah.
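The crawl-to-click ratios discussed here reduce to simple arithmetic over server-log counts. A minimal Python sketch, using the figures quoted in the conversation as sample inputs; the function and sample names are illustrative, not Cloudflare's actual tooling:

```python
def crawl_to_click_ratio(crawls: int, clicks: int) -> float:
    """Pages crawled by a bot per referral click it sends back to the site."""
    if clicks == 0:
        return float("inf")  # crawling with zero traffic in return
    return crawls / clicks

# Figures quoted in the conversation, treated as (crawls, clicks) samples:
samples = {
    "google_10_years_ago": (2, 1),
    "google_answer_box": (6, 1),
    "google_ai_overviews": (18, 1),
    "openai": (1500, 1),
    "anthropic": (60000, 1),
}
ratios = {name: crawl_to_click_ratio(c, k) for name, (c, k) in samples.items()}

# How much harder it is to earn a click from OpenAI than from Google 10 years ago:
relative_difficulty = ratios["openai"] / ratios["google_10_years_ago"]  # 750.0
```

Dividing 1,500 crawls per click by the old Google baseline of 2 recovers the "750 times harder" figure quoted above.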
>> I mean, Anthropic's Claude isn't used by very many people at all when you think about the scope of the web. OpenAI has 500 million weekly active users, which is pretty good, but really nothing compared to the amount of traffic that you see on the web every day. And I guess Google must be the problem. So just explain why this is already showing up for publishers when this is the infancy of generative AI. >> Yeah, I think it is, but it's one of these sea changes that we can just see happening. So again, for the first time in history, searches on Google actually dropped, period over period. >> Okay. >> Google has actually reported this, and it also came out in the Apple trial, where they're seeing more of this traffic actually going to other sources. And so I agree that it is a drop, but what we're seeing is the trend; that is the direction things are heading. Even Google itself is looking more like an AI chatbot and less like a traditional search engine. And so if that's the case, I think the time for publishers to panic is now. If we wait while more and more traffic gets strangled, less and less is going to them. Again, I think that's just going to mean that, over time, we'll have more consolidation in the media industry. We'll have less and less content. We'll actually have more salacious headlines as people chase what traffic is left out there. And we need to make a change to make sure that we can continue to support publishers, because I do believe the future of the web is going to be an AI-driven future, not a search-driven future. And that AI-driven future just doesn't have the same incentives and doesn't support the same business model that the old search-driven web did. >> Okay, I'm going to poke at this a little more. >> Yeah. >> You mentioned that you can now search Google for when Cloudflare was founded and you'll get the answer.
>> That's something that Google's been doing for a long time. You could ask, like, when was Martin Luther King Jr.'s birthday? Even before generative AI, they were giving you these answers. So is it that the magnitude has changed? And if so, from the standpoint of a consumer, could this be good? I mean, it's pretty annoying to type this question into Google, when was Cloudflare founded, and then have to click through to Cloudflare's website to get the answer that Google could just surface for you. And so much of the web has sort of become, effectively, in service of Google queries, where websites don't really need to exist. >> Well, you know, absolutely, this has been happening for a while. If you look at up until six months ago: the ratio of crawls from Google to clicks 10 years ago was two crawls, one click. Six months ago, it was up to six crawls, one click. And that's all because of the answer box. What the AI Overviews, which they've rolled out over that time, have done is take it now to 18 crawls to one click. So yes, it is a situation of the frog boiling in water, but it has gotten progressively worse. And I think across the media industry, it has gotten harder and harder to survive as a publisher. And so what I worry about is, yeah, publishers were struggling at six to one; they're struggling at 18 to one. I think they're dead at the 250 or 1,500 to one that we're seeing with OpenAI, and completely dead at the 60,000 to one we're seeing with something like Anthropic. So that is the direction things are going, and that's a challenge. I think you're exactly right on the other point as well: the challenge here is that this is actually a better user experience. That's why more of the web is going to turn to AI. It is great that you can type something in and get back an actual response, as opposed to having to hunt for it yourself.
That's a better user interface, and so things absolutely are going in that direction. I'm not arguing, and I don't think anyone is arguing, that we should just get rid of AI or that we should go back to sort of 10 blue links on Google. What I am saying, though, is that the fuel that runs all of these AI systems, the reason that Google can tell you when Cloudflare was started or what Martin Luther King's birthday was, is that somebody is doing the work of that original content creation. That original content is the fuel that fuels Google; it fuels all of the AI companies. And if we strangle off the business model of those places, if we strangle off the incentives for content creators to create content, then we're actually going to end up strangling the AI systems as well, because if there's no content to train on, then the AI systems are going to be pretty stupid. So I think everybody agrees that there have to be incentives that allow content creators to continue to be compensated. The question is, what does that incentive structure look like? And that's, again, what we've been spending a lot of time trying to figure out. >> Okay, I just want to ask you one more question about crawls. >> Yeah. >> I think that sometimes, you know, in search, you would crawl to put a website, or a page from a website, into your search engine. Are these generative AI bots crawling to do something similar, just to surface the information from these pages, or is some of the crawling being done in service of training their models? Because if that's the case, it's actually not as big of a deal, because it's just being fed into training. I think the problem is taking that direct query-to-answer behavior and sort of bringing it into the search engine. So do you know: is it training, or is it just surfacing answers?
>> So I think there are two different parts of this. There's definitely training, and then there's what is closer to a search-like experience. If you're familiar with this, it would be something like RAG, retrieval-augmented generation, where you're actually getting that real-time data in order to augment the foundational model. In both cases, though, you're actually costing the content creator something. They are literally paying for that traffic; they're paying for the load that the crawlers are pulling off of them. And it is the intellectual property, the data, the content of these providers that the AI companies are using to train their models. So there's value that the AI companies are getting; if there weren't, they wouldn't be crawling, right? But there's no return of any compensation or any reward. Again, in the old days of Google, the trade-off was: let us copy your content, and in exchange we'll give you traffic. What has happened is the frog has boiled in the water, and now everyone is saying: let us copy your content, and we will give you nothing in return. And so what we're saying is simply that we need a better deal, a deal for a new AI-driven future. And that deal should say: if you are getting value from the thing that I created, then you should compensate me in some way for it. It may be tiny amounts of money, but at some scale that actually turns into something that can allow a content creator to continue to have an incentive to create content over time. If we don't do that, if we don't give content creators the incentives to create content, they'll stop creating content. >> So I think you're bringing up a key point here, which is, if people are like, well, I'm not necessarily seeing publisher content show up every time I'm on an LLM, what you are seeing sometimes is the product of publisher content that's been used for training.
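The RAG pattern mentioned above can be sketched minimally: retrieve relevant content in real time, then prepend it to the model's prompt. This toy sketch uses keyword overlap over a two-document corpus; production systems use vector embeddings and an actual LLM call, and every name and document here is illustrative:

```python
# Toy corpus standing in for crawled publisher content.
corpus = {
    "cloudflare-founding": "Cloudflare launched on September 27, 2010.",
    "ski-report": "Fresh snow expected on the upper runs tomorrow.",
}

def retrieve(query: str, docs: dict) -> str:
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs.values(), key=lambda text: len(q & set(text.lower().split())))

def build_prompt(query: str, docs: dict) -> str:
    """Augment the model's prompt with retrieved real-time content."""
    context = retrieve(query, docs)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

prompt = build_prompt("When was Cloudflare founded?", corpus)
```

The key point for the discussion above is that the retrieval step hits the publisher's server on every query, not just once at training time.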
And even if it's, like, under fair use, totally fine because it's being transformed, something crawled from Big Technology or the New York Times is now being used to, basically, because they're just trying to figure out what word comes next in the English language, give you an answer about summer camp. The publishers are actually enabling that, and every time an AI crawler hits a publisher website, they have to pay. Do you work with Wikipedia? Because they've been loud about this: the server costs that they have to pay have increased exponentially, but those aren't human visitors; they are AI bots crawling Wikipedia. Talk about that. >> So there is a real cost to just supporting this crawl. Before we even talk about intellectual property, before we talk about anything else, the content creators, the publishers, are having to bear that cost. And so at just a simple fairness level, why should they be bearing the cost in order to train these multi-billion-dollar AI companies that are out there? There should be some value which is given back. But I think it's even beyond that. I don't even think we have to get to... I mean, you used legal terms like fair use, and I think that's very much up in the air right now. We literally had two different California cases that came out on both sides of that issue: is training on content fair use or not? And I think it's going to be a coin flip where different courts say different things. I don't think there's a clear answer there. But I think it's a more fundamental thing, which is: if you're doing something to create value, you should be getting some sort of compensation for it. If somebody else is imposing a cost on you, you should be able to charge them to offset some of that cost.
And if someone's not willing to pay that, then they shouldn't be taking your content in the first place. Up until now, everyone's focused on the legal issue; the New York Times is suing, and a bunch of people are doing the same. I actually think that before we even get to the legal issue, the first step is to take the technical steps to give content creators back control over the content they're creating, and let them have the choice: do I want to give access or not? Do I want to charge for this or not? And then, done correctly, there should be a marketplace where content creators and AI companies come together, and the creator says, "Hey, I created this piece of content, and I think it's super valuable." And the AI company says, "Yeah, maybe it is, or maybe it's not, but here's what we're willing to pay." Maybe they meet the clearing price; maybe they don't. But that marketplace needs to exist, because otherwise there's no way to convey value, no way to derive value from content creation. And again, I just need to hammer this point home: if we don't give content creators an incentive to create content, they'll stop creating content. >> And it sounds like, by the way, you're not a skeptic of the AI technology. You believe that this generative AI thing is going to work. >> Yeah. Not only that, it is already clear that it's going to be the interface of the future of the web. We're going to move from what has been the dominant interface of the web's past, which was search, to the interface of the web's future, which is very much going to be AI. So I believe AI is going to get better and better and better. I actually think that, done correctly, content can be created in such a way that will make AI better, and that you can create incentives for doing that.
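The marketplace mechanic described here, where a creator's asking price either meets or misses an AI company's bid, reduces to a simple clearing rule. A toy sketch with invented listing names and prices, purely to make the idea concrete:

```python
def clears(ask: float, bid: float) -> bool:
    """A deal happens only when the buyer's bid meets the creator's ask."""
    return bid >= ask

# Invented example listings: (creator's ask, AI company's bid) per article, in dollars.
listings = {
    "original-investigation": (0.05, 0.08),  # scarce content: bid exceeds ask
    "copycat-rewrite": (0.05, 0.001),        # low-value content: no sale
}
sales = {name: clears(ask, bid) for name, (ask, bid) in listings.items()}
```

The point of the sketch is just that prices, not lawsuits, carry the signal: valuable original content clears, commodity content does not.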
What I worry about is that in order for AI to get better, you have to have original content. People have to be going out and creating that. And right now we're strangling off all of the incentives for that content creation, which not only hurts content creators, it will ultimately hurt the AI companies as well. >> So, I was speaking last night with someone who works in data labeling, data creation, for large language models. In anticipation of this conversation, I was like, you know, one day what you're doing might look almost exactly like what web publishers are doing, where you might be hiring PhDs and having them write up their information and feed that right into an LLM's training set. And there might be, let's say, historians. So if you take a world history website, the historians writing the web pages for that site, maybe one day they're going to be writing those world history articles and, instead of publishing them to the web, selling them and feeding them right into ChatGPT. Do we lose anything if the web goes away and it's just content creators selling stuff to large language models? >> Yeah, you know, I think the Black Mirror, dystopian future is not that content will stop being created and journalists will stop existing and researchers will stop existing. I think the Black Mirror future is that we actually go back to something like the time of the Medici, where we have maybe five big AI companies and they each employ a set of journalists and a set of researchers, such that they become effectively the institutions of knowledge, and they have salaries for all the academics on their staff. They probably each have different leanings: maybe one of them is the conservative AI company and one of them is the liberal AI company.
You can very much see that this has actually been the natural state of media, and the natural state of controlling information, for quite some time, and you could imagine that all of that research consolidates behind each individual AI company, and every different academic out there is basically an employee of OpenAI or Anthropic or Google or Microsoft. I think that's a pretty bad outcome, because the web has been so amazing at distributing and democratizing access to information that I think we want to preserve that incentive. And so what we're trying to do is say: what's the step a few steps before all the academics are employed by one of the AI companies? And I think the answer is, you allow the AI companies to pay for the content that is actually valuable to them, that fills in their models and makes their models better, and then you create incentives for independent journalists and independent researchers to be able to create that content to augment those AIs while still being valuable. This won't happen, but my sort of optimistic version of the future is: humans should get content for free again, because we've kind of paywalled way too much, frankly, and robots should pay a ton for it, because every time a robot ingests something, it's in service of hundreds of thousands, if not millions, of different humans. So robots should pay for that content, and we should get back to a place where humans get it for free. I think it's going to be hard for us to get there, but that's the future that I think is actually the optimal future. >> So, someone hearing you and looking at this through a critical lens might say, "Look, Matthew, publishers depending on web traffic are barking up the wrong tree." That selling eyeballs for CPM fractions has not been a good business for a long time.
In fact, we had a guest on the show recently, a journalist, who said, "If I thought that traffic was the way to go, I'd be out of business a long time ago. What you really need is an audience that will, let's say, subscribe to your newsletter, listen to your podcast, maybe come to your events." We've already moved past this business model of trading traffic for dollars, in which case this isn't an existential threat. What would you say to that? >> I mean, even then, you're still trading traffic for dollars; you're just trading it for subscription dollars, not ad dollars. That will go away as well, because what will happen is the AI company will ingest the podcast and then summarize it on their page. Why would anyone ever buy a subscription to your podcast? Why would they ever sign up for your newsletter if their AI agent can simply say, tell me everything that was relevant in this particular podcast or newsletter? >> Because there's an experience of listening that's enjoyable. People do that in some part for the entertainment and the leisure value; I think that's how they learn. >> I think the AI companies will do a very good job at creating that experience as well. >> So you think they'll just create, like, competing podcasts? >> Oh, absolutely. For sure. >> I used to think that this was such a pie-in-the-sky, lunatic idea, until I listened to NotebookLM. >> Yeah. >> And, like, we've had multiple people on my YouTube page ask, did you license your voice to NotebookLM? And I'm like, no, but the fact that you're asking is pretty concerning. >> Totally. And again, I think that's the inevitable future. We're going to want hyper-customized podcasts that are in exactly the voice we find the most reassuring, and AIs are going to create that for us.
And again, they're going to be fed by original content creators that are out there, that give them the ideas, give them what to talk about, give them the news of the day. What I think is, we have to move even past the business model of subscriptions. We've got to get to something else, where you as a content creator are being compensated for the content. The way I think about it is, every one of these LLMs is a little bit like a block of Swiss cheese. They've got a lot of stuff there, but there are big holes in it, and the content that is valuable to them is the content that actually fills in those holes in the Swiss cheese. And so what I would imagine in the future is that you're able to surface, as an AI, where the holes in the Swiss cheese are, and then allow content creators to create content that fills them in. My favorite example of this: I was in Stockholm a couple of weeks ago meeting with Daniel Ek, because there really is nobody who has done more to compensate creators at scale than Daniel. He was the founder of Spotify, and they've done an amazing job at this. And in a long conversation, he told me a story. He said, you know, one of the things that we do at Spotify is we actually take the searches that people run at Spotify, things like, I want a song with a reggae beat about how much it sucks when your sister runs away with your car. >> That has happened. >> Yeah, or whatever. And it turns out that they don't have good things to fill that in. There are content creators out there making tens of millions of dollars a year just creating content for those searches that don't have good results right now, because Spotify surfaces that list of searches where they don't have good results. I think that's actually beautiful.
I think that's actually really amazing: they are showing where there is human need for something, and then how we can create content to fill that human need and monetize it through what they're doing. I think the same opportunity exists in the AI space, where these AIs are actually able to say, I can tell you how valuable this new piece of content is for me, and you can rank it. That then allows you to create a marketplace where they can say, listen, that new piece of information is so valuable that I'm willing to pay you for it. And I think that, done correctly, that gets us to more original content creation, and to less me-too, copycat-style journalism. Same thing in research: it maybe gets us to a place where we're doing original research and getting rewarded for being more original, as opposed to being more salacious. >> Yeah, it's interesting. YouTube has a similar thing, where there's an insights or inspiration tab, and they give you the title, the description, and the thumbnail, and they're like, people are searching for this; go out and make it. >> Yeah, that's exactly right. And I think that's actually an incredibly valuable thing that's making humanity better, as opposed to, you know, yet another story that's just chasing the most salacious headline you can get. >> So you're talking about this idea where publishers might sell the ability to crawl... >> Yep. >> ...to AIs. That is also assuming that content is scarce. And so I want to run this other idea by you: if we had the same amount of content that we have today, that's a great idea. But what we're seeing now is this explosion of content creation that's made through generative AI.
Like, it's kind of funny: every time you see these suggestions that we're talking about, YouTube's making these suggestions because clearly there's traffic to be had. I'm sure there are already YouTubers today that are feeding that into ChatGPT, spitting out a script, running that through Veo 3 from Google, and then posting the videos and cashing in on traffic. So we're in the middle, I believe, of this explosion of content; actually, you probably have better data on that than my suppositions. It almost feels like a DDoS of the web, where, if the ability to create content is constrained by a human's ability to create it, then you have something to bring to these AI companies. But if human-plus-bot content starts to become the norm, there's going to be so much of it that even if you're creating high-quality stuff, it's not going to matter very much to these generative AI companies. What do you think about that? >> So first of all, I think there's the pure AI-generated content. There's lots of research that shows that training AI on AI data is sort of like that old Michael Keaton film Multiplicity, where basically every copy of something gets worse and worse and worse, and that feels like it's going to still be the case for quite some time. Might robots in the future be able to go out and do interesting reporting from the field? Might they be able to do interesting research? Sure. But today, that interesting research, that interesting original content, that interesting insight that comes from the work that right now only journalists and researchers and others can do, is still the most important thing for filling in those gaps in the Swiss cheese of the AIs. What about content that's just high-volume and low-value?
My hunch is that, if we score things correctly, it will be rewarded for exactly what it is, which is low-value content, and so it should be rewarded very minimally. I like to ski, so I live part of the year in Park City, Utah. >> You're in the right place. >> Yeah. I care enormously about the snow forecast. There is a forecaster in Utah named Evan Thayer. He writes these incredibly precise weather forecasts, where he will literally tell you it's going to snow this much on this run and this much on that run. And I actually pay for his content, because that's super valuable to me. I am going to be more willing in the future to pay for an AI that has actually licensed Evan's content from him than for an AI that doesn't have that content, because, again, that content is going to be super useful and unique and valuable to me. And so I think what will happen, as we have more AI systems out there, is that it will cause you to look for more original, creative content, and that's going to be the thing the AIs are most willing to pay for. And that, again, I think is actually a beautiful thing: instead of creating incentives to write more and more salacious headlines and chase traffic, we're creating incentives to create knowledge that fills in those places in the Swiss cheese where there might be holes. Taken in aggregate, all of the AIs are probably a pretty good representation of what human knowledge looks like. And so if we can score them and say, okay, here are the gaps in human knowledge, and here are the places we need to fill in, that actually gives a really rich place for creators to look to create content which advances human knowledge. >> So, you know, DeepMind is working on weather forecasting right now. This example that you gave of Evan Thayer, the forecaster in Utah.
Are we that far away from just telling an AI, hey, you're tapping into the DeepMind weather model; I want to ski this route today, what's happening? >> I think we're probably pretty far away from that. But again, I think Evan is always going to be better using the tools of AI plus his local knowledge to make his forecasts better. AI just becomes a tool that creative people use in order to tell stories better, get better information, and do more research. And again, I am skeptical that, in the short term at least, we're going to have real value that is created by training on purely generated content. >> Okay. So we've talked about your solution; let's dive into the technological side of it a little bit. We are a tech podcast, so we should do that. So, Cloudflare is a security company that helps websites stay up on the web despite all the threats. >> Yep. >> Let's just, at the very beginning, talk about the threats that you see to websites. Who's trying to take them down? What's happening on that front? >> Yeah. So protecting websites is part of our business, as is protecting employees as they go out across the internet. Cloudflare is fundamentally kind of a network that is built with all the performance, reliability, security, availability, and privacy guarantees that, frankly, the internet should have been built with had we all known what it was going to become. But obviously, back in the '60s, '70s, and '80s, when we were laying down all these protocols, we didn't think about those things. And so Cloudflare is basically reverse engineering the internet in order to give it those performance, availability, security, reliability, and privacy guarantees on top of what is there.
And so today, one of the main uses for Cloudflare is: you're putting a website or a web application or anything online, and you want to make sure that it's safe from different sorts of threats. And what are the threats that we see? Every day we go to war with the Chinese government, the Russian government, the North Koreans. Everyone is trying to hack into our customers, because who are our customers? Some of the largest banks in the world, some of the largest governments in the world. And they are all constantly under threat and constantly under attack from these organizations. The media companies were actually a pretty small part of our business. We had some media companies that used us, but it wasn't a big piece of it. What happened, starting really 18 months ago, is that those companies said, "Hey, I know we hired you in order to stop the Chinese hackers, but we have this new threat that's there." And frankly, my initial reaction was: publishers, they're always whining about the next new technology, what's going on? And over and over they said, just pull the data, pull the data, pull the data. And it was only when we actually saw the data, and we saw how AI companies were taking content without giving anything of value in return, that we saw they were adding enormous amounts of load and in some cases taking whole websites down because of the amount of traffic that they were sending to them. >> Right. They basically DDoSed the websites. >> DDoSed the websites, you know, not intentionally. But that was the point at which we said, listen, maybe there is something that we can do here. And at first, I think a lot of the publishers were saying, "Oh, this is so hard. There's no way we can stop it. There are these nerds, and they live in Palo Alto, and they're so smart, and what are we ever going to possibly do about it?"
And I just kept saying, "Guys, we go to war with the Chinese hackers. We can stop some nerds with a C corporation." And I think it took a while for that message to really get through. But now that it has, it's been really rewarding to see that the vast majority of the world's major publishers have said: we need to change the model, we need to be compensated for our content, and Cloudflare has the right idea in terms of the technical solution to do that. >> By the way, folks, $60 billion company, listed publicly. So it's one of the bigger cybersecurity companies on the New York Stock Exchange. But I want to ask you, okay, we're going to get into this technological solution, but what you said is interesting. Do you ever think there's a world where these AI bots ingest not just the publishers, but the banking websites as well? Are you a natural enemy to having everything go through that? Because if everything goes through ChatGPT, then these other sites that you secure might not need your services. >> I think there's still going to be some gatekeeper for how agents and other things access various services online, and I think the challenges in each of those cases are different. In the case of a bank, you might want to say: I want to have guardrails in place. I want to make sure that this is actually a customer that's accessing an account. I want to make sure that they can only conduct transactions that have been authorized by an actual human being, or something like that. Cloudflare actually provides those guardrails and makes it so that a bank can say: I want to expose my infrastructure to AI, but do it in a way which is safe and secure.
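The guardrail logic Prince describes for banks exposing infrastructure to AI agents can be sketched as a simple policy check. This is a minimal, hypothetical illustration: the request fields, action names, and rules below are invented for this sketch and are not Cloudflare's actual API.

```python
# Hypothetical sketch of agent guardrails as described above: an agent
# must act on behalf of a verified customer, and anything beyond low-risk
# reads requires explicit human approval. All names/rules are invented.

from dataclasses import dataclass
from typing import Optional

@dataclass
class AgentRequest:
    customer_id: Optional[str]   # verified account holder, if any
    action: str                  # e.g. "read_balance", "transfer"
    human_approved: bool         # explicit sign-off by a real person

READ_ONLY_ACTIONS = {"read_balance", "list_transactions"}

def allow(req: AgentRequest) -> bool:
    """Apply guardrails before letting an agent touch bank infrastructure."""
    if req.customer_id is None:
        return False                 # must act for a real, verified customer
    if req.action in READ_ONLY_ACTIONS:
        return True                  # low-risk reads are allowed
    return req.human_approved        # anything else needs a human in the loop

print(allow(AgentRequest("cust-42", "read_balance", False)))  # True
print(allow(AgentRequest("cust-42", "transfer", False)))      # False
```

The design choice mirrors the interview: the gatekeeper does not block agents outright, it distinguishes who is acting and what they are allowed to do.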
I think publishers have a different challenge. In our own case, for example, we have a whole bunch of developer documents on our website. We want those to be in AI. We want coding platforms, when someone says, "Oh, I want to use Cloudflare to build X, Y, or Z," to be able to spit that out. What we've done is we've tried to identify, with real narrow precision, which pages on the web show some indication that they are going to be monetized. Generally, that means looking at whether the page is behind a paywall or has some sort of an ad unit on it, like a banner ad. If we detect that, then we're blocking it by default. But again, there's value for AI, and we want to make sure that AI is actually getting the data that people want to have in it. So the about-us page on The New York Times probably should go into the AI system, but a brand-new article with breaking news probably should be restricted, unless the AI company is actually paying for that content. >> I guess the way I want to ask it is: if everything goes into ChatGPT, what's left for you to protect, thinking outside of the media world? >> Well, again, I think that 80% of the AI companies are customers of ours, and so we protect them as well. >> Yeah. Okay. Sounds good. Just wanted to ask that. I was curious about it. But let's talk about this. So you're going to build a technological solution that will block crawling. >> Yes. >> And so robots.txt, which is this file that you put at the root of your site if you don't want to be crawled, that wasn't working. >> Yeah. I mean, I think robots.txt has two problems. The first is some people just ignore it.
And if you ignore it, then you can still crawl all you want, and there are even some big, legitimate companies that completely ignore robots.txt. We're really good at basically being able to say: okay, here's what robots.txt says; are you actually following those rules of the road? And if the answer is yes, then robots.txt is a great solution. But in the cases where somebody is ignoring it, then we need to put in place additional technical barriers to restrict their access. And so that's exactly what we're doing. The second problem with robots.txt is it's not granular enough. Take the Googlebot, for example. Google's crawler does at least five different things. One is it checks if you have an ad on a page; it makes sure that if you're putting up an ad for a Procter & Gamble product, it's not against a pornographic site or something like that. So it does brand-safety checks. The second is it crawls to index for traditional search, the 10 blue links that are out there. The third is that it crawls to create answers that are in the answer box. The fourth is that it crawls to create answers that are in the AI Overviews, the newer thing that they've rolled out. And the fifth is that it crawls in order to ingest content to put into Gemini. >> It's a lot of crawling. >> A lot of crawling, all through one crawler. And for lots of different reasons, they don't want to split that out into various crawlers. But right now, they basically make you have a choice. They say you can either block Google entirely, in which case you can't run ads, you don't appear in search, and you don't appear in the AI Overviews or Gemini or other things.
Or they've recently added a tiny flag which basically just says: I'm not going to use this data for the Gemini piece. But you still appear in AI Overviews, you still appear in the answer box. We think there needs to be more granularity, where there is a difference between taking content and transforming it, which a license should say you can't do without my permission, versus just taking that content to do brand-safety checks or to do traditional search. And so what we've proposed, and we're working with the IETF as well as regulators on this, is extensions to robots.txt to give it that granularity. And that then allows us to further test and watch: does this robot behave in an appropriate way? If the answer is yes, then maybe it gets more permissions to do things online. If the answer is no, then we will put more restrictions and blockades in place to stop what are, again, badly behaving robots. >> So what you're going to do now is, in addition to that, put up a technological wall. >> That's right. >> No crawling. Sorry, enough of you haven't respected robots.txt. No entrance. >> That's right. We're all familiar with 404 errors when something is not found. Success on the internet is a 200 response that comes back to you. The original HTTP specification actually set out a 402 response, and that response says Payment Required. And so we're tapping into that exact original specification to say: when a robot tries to access a page where there's an intent to monetize it.
So if it's either behind a subscription or it's got ads on it, there is an ability for us to say 402 Payment Required, and then there's a negotiation. At first, that's going to be largely large publishers with large AI companies doing deals, like what Reddit has done or what The New York Times has done or what others have done, where they have licensed the content and then certain robots get access to it. But in other cases, and I think over time, that will be a dynamic process, where maybe a smaller AI company or a smaller publisher will say, hey, here's what I would charge for this content. Cloudflare will surface how valuable that content would be for that particular AI, and then the AI companies can decide: is that worth it or not? It might be a very small transaction, maybe a fraction of a penny or a few cents, or in some cases, content that is really valuable might be worth hundreds or thousands or millions of dollars. You could imagine Taylor Swift is about to release a brand-new song and the lyrics get published. How valuable is that for an app for teen girls who are lonely and want to talk about things? Probably pretty valuable, and especially valuable if you could have exclusive access to it for some window of time. And so that's the sort of thing where I think a marketplace can develop over time, where original, valuable content will get compensated and there will be a clearing price in the market once we have that scarcity that's created by that wall. >> Okay. So it's not just a blocker. It's also this marketplace where you're going to have publishers that will sell their content. So that's a way where you could have useful, effective chatbots and potentially a flourishing web. >> Exactly. And that, I think, is what we're trying to play for. Again, my utopian vision of the future is: robots should pay a lot for content, and humans should get it for free. >> Right.
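The 402 flow described above can be sketched as a simple policy function: if a bot requests a page with monetization intent and holds no license, the server answers 402 Payment Required with an asking price. This is a minimal, hypothetical illustration; the paths, prices, crawler names, and license table are invented for this sketch and are not Cloudflare's actual implementation.

```python
# Hypothetical sketch of the 402 "payment required" flow described above.
# HTTP 402 is a real, reserved status code; everything else here
# (paths, prices, crawler names) is invented for illustration.

MONETIZED_PATHS = {"/breaking-news": 0.02}   # path -> asking price in dollars
LICENSED_CRAWLERS = {"ExampleBot"}           # crawlers with a negotiated deal

def respond(path: str, user_agent: str, is_bot: bool) -> tuple:
    """Return (HTTP status, reason) for a request."""
    if not is_bot:
        return 200, "OK"                     # humans always get the content
    if path not in MONETIZED_PATHS:
        return 200, "OK"                     # no monetization intent detected
    if user_agent in LICENSED_CRAWLERS:
        return 200, "OK"                     # a deal is already in place
    price = MONETIZED_PATHS[path]
    return 402, f"Payment Required: ${price:.2f}"

print(respond("/breaking-news", "UnknownBot", is_bot=True))   # 402 path
print(respond("/about-us", "UnknownBot", is_bot=True))        # 200 path
```

The interesting property is the last branch: the 402 response carries a price, which is the hook a marketplace can negotiate against.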
And so to kick this off, on June 30th, as the day turned to July 1st, you had a party at the top of One World Trade Center, where a bunch of publishers pressed a red button to get this thing going. And that includes some very big names: Condé Nast, Time, the Associated Press, The Atlantic, Adweek, and Fortune are all going to be part of this. >> And a lot more. Frankly, there hasn't been a publisher that we've talked to who hasn't said that this is a change that needs to happen and that we're on the right path. And so across the board, not only the 20-plus percent of the web that sits behind Cloudflare already, but I think another 20 to 30% that are these major publishers are all on board. And what I think has been encouraging is that, at the same time, we've been having conversations with the largest AI companies, and all of them agree that content creators need to be compensated for their content. They all agree on that. The devil's in the details, and some of them are pushing back in various ways. But I've been really encouraged that, as we've talked to the leading AI companies, the largest technology companies in the world, they're actually leaning into this. They all recognize that content creators need to be compensated. And I think over the months to come, that's when the hard work will go down around how we actually create this marketplace in a way which is fair for all of the different providers in the ecosystem. One that treats everybody in a way that has a level playing field, still allows new entrants, doesn't just reward the largest companies with the biggest budgets, and makes sure that legacy providers like Google are treated the same as newer providers.
That's all going to be really tough, but I am incredibly encouraged by the conversations I'm having, not just with the publishers, who are all on board, but actually with the AI companies, who recognize that something needs to change. >> That's interesting that they're recognizing this, because you hear these announcements of deals, like OpenAI paying X million to the Wall Street Journal or Dow Jones to be able to include their articles, and the sense you get is that they're just kind of payoffs to not get sued. Like, Sam Altman very clearly is not happy with The New York Times pursuing OpenAI, and especially the actions that the Times is taking in its lawsuit, like forcing OpenAI to preserve their chat logs, which I think is wrong. But it is interesting. So what do you think: are we going to see an evolution from these one-off deals to this marketplace-style world? >> Well, I think we've seen this story many times before. Napster came along; it was a wild west. There were a bunch of lawsuits from the music industry targeting Napster and the like. And then along comes iTunes, which starts out at 99 cents a song but eventually evolves into something much closer to a Spotify model: a subscription and a pool of funds that then gets distributed out to all the creators. So I think we've seen this story before. And I think one of the things that's really important is that OpenAI and others are willing to pay for content and do the deals that are there. And I don't think it's right to just say they'll do a deal to avoid lawsuits. Again, when you talk to leading AI companies, they understand that people are doing the work to create content, and they need to get compensated for that content.
And if it's not going to be through subscriptions or ads or ego, it's got to be through something else. Exactly how that happens, we'll figure out. But what I know won't work is if OpenAI is paying for your content but you're giving it away for free to everyone else. >> That's not going to work. >> OpenAI eventually is like: listen, we want to support you, we want to help you out, but we can't be the suckers. We can't be the only ones paying while you're giving stuff away for free. And so scarcity is needed in order to actually have value in any kind of market. And so I think the people who have leaned into this the most heavily are the ones that have existing deals with some but not all of the AI companies, because they realize that for those deals to be valuable, for them to renew, and for them to renew for more, there has to actually be scarcity where they're getting something of value. You can't charge OpenAI but give it away for free to Anthropic. Something needs to actually restrict it and say: everyone needs to pay, everyone needs to be on a level playing field, and we'll figure out what that looks like going forward. >> Could there be some collateral damage with the solution you're implementing? For instance, I'm looking at the names of these publications: Condé Nast, Time, the AP, The Atlantic. I imagine they get a lot of traffic from search as it is today. So if you put this blocker up, does that impact their SEO, for instance? >> Yeah. So we've been very, very careful to say that traditional search today is not blocked, and even AI-driven search today isn't blocked, but you're going to see us give publishers the tools to differentiate between search indexing and derivative content. So the way I would think about this is the Google experience today. It may be that a publisher says: I still want to appear in the 10 blue links, but I don't want to be in the AI Overview or the answer box.
That requires the granularity of being able to say: okay, Google, I understand you use one bot, but we need those different uses treated separately. And again, I am hopeful, and in my conversations with Google I am increasingly hopeful, that they understand the importance of this and of giving that granularity. But if for some reason they don't, I am also 100% certain that regulators are paying a ton of attention to this, and around the world you will see them force Google to split their crawler out and announce exactly what it is doing. Again, hopefully we get to an agreement with Google way before that has to happen. But inevitably, I think Google is going to have to say: if you don't want us to use your content for derivatives, you have a way of controlling that while still appearing in search. >> Okay, a couple of big-picture questions before we leave. How much bigger is the web getting, and is the web accelerating in the size increases that we see? >> By all the measures that we can see, it's actually kind of plateaued and has flattened out in terms of content. You see fewer domains getting registered; you see fewer new websites going online. I think a lot of that has moved to individual platforms: more of it on a YouTube, more of it on a Facebook, more of it on a TikTok. And I think part of that is because those platforms have provided content creators easy monetization tools that allow them to not have to think about some of those problems. I think that in an ideal future, you would want content creators to be free from those platforms, to earn more themselves, but still have the ability to monetize that content in interesting ways. And so, again, I think there are lots of people who are working on that problem.
I actually think Google has been one of the organizations that created what was the business model of the last 30 years of the web. But the business model of the next 30 years of the web is going to be different, and we've got to think about it in a different way. It's not going to be banner ads. It's probably not going to be subscriptions. It's going to be something different. And so this is our attempt at one solution, but I doubt it will be the only one that emerges. >> Now, I'm curious what I'm going to do, because this is a one-person content operation. So, anyway. >> Well, you should certainly be charging AI to license your voice. >> Can I sign up for your product? >> For sure. Absolutely. >> Okay. I'm going to email you after this. And then, when it comes to cybersecurity, obviously you talked about how you're dealing with all these governments that would like to hack into sites across the web. Have they been able to use generative AI tools or automated coding to become more effective at what they do? >> Yeah, I mean, I think that anytime a new technology comes out, bad guys are going to use it as well as good guys. And so we have seen, and we will continue to see, some horror stories: the family that was tricked by some gang into wiring their life savings because someone that sounded like their daughter called and said, "I've been arrested in Mexico; I need to pay to get out," or other things like that. I think we're seeing a real rise, especially out of North Korea, in North Koreans posing as applicants to various jobs, and then that allows them access which they can use to do any number of nefarious things. All of that, again, assisted by AI. So that's been sort of the bad guys' side.
The good news, though, is that the good guys, folks like Cloudflare, have been using AI as well, not only to detect these things but to get smarter at detecting attacks earlier in the process. >> That's working for you. >> Yeah. At the end of the day, who wins in the AI race is whoever has access to the most data. And I just think that the good guys are always going to have access to a lot more data than the bad guys. And so far, I feel like we have made the web more secure with AI over the course of the last two and a half years and stayed way ahead of the attackers. Although, again, there are going to be horrible stories; there are going to be problems. I think it is going to be harder and harder to trust that something you're seeing online is actually real, and we'll have to turn to other, more secure ways of verifying things like identity and authentication. >> Okay, last question for you. We have 60 seconds. You mentioned you're a believer in this technology. What do the next couple of years in AI look like to you? Are we going to hit AGI anytime soon? What's the timeline you're thinking about? >> I believe today that 99 cents out of every dollar spent on AI is just being lit on fire. But that one cent that's out there is going to generate real returns. It's very hard to figure out what's just a total waste of time versus what's not.
You know, we see a lot of data about how AI systems are really being used: not so much for businesses today, since a lot of the business applications have been very tough to take on, but a lot of the time for things like loneliness and social interaction. So I would imagine that a lot more of those things are going to develop, and those will be sort of the first uses. I think the business applications are actually going to take longer, and in places where it's easier to verify the output as legitimate, adoption is going to be easier. So coding: we see that our engineers are significantly more productive using AI tools than they were before. That's not causing us to hire fewer engineers; it just means that every engineer we hire is that much more productive. We have a huge backlog of things to do, and AI is helping us do that. On the other hand, I'm still quite skeptical about the AI customer support agent or the AI lawyer. Those are much harder problems, because it's just harder to tell whether something actually worked or didn't. There's no debugger in those spaces to figure out if what the AI is creating was actually true. And so I think you're going to see huge leapfrogs in things like coding, but I think it's going to take longer for us to do things that are a little bit more difficult to verify. >> Very interesting. You're deeply optimistic about the technology, but still think 99 cents of every dollar is wasted. It's going to be very interesting to check out. Matthew Prince, great to see you. Thank you for coming on the show. >> Thanks for having me on. >> All right, everybody. Thank you for watching and listening. We'll see you next time on Big Technology Podcast.