Can We Save The Web From AI? — With Cloudflare CEO Matthew Prince
Channel: Alex Kantrowitz
Published at: 2025-08-13
YouTube video id: iIsXs_hugao
Source: https://www.youtube.com/watch?v=iIsXs_hugao
How will the web fight back against the wave of generative AI that is ingesting all the content on the internet but not paying for it? We're joined today by Matthew Prince. He's the CEO and co-founder of Cloudflare and has been on the warpath attempting to right the ship. Matthew, great to see you. Welcome to the show. >> Thanks for having me. >> Can you start by giving us a sense as to what is happening to the web with the rise of generative AI? We've talked about it a bit on the show already, but I want to hear it from you. >> Yeah, absolutely. So the business model of the internet over the last 30 years has fundamentally been driven by search. You search for something, that generates traffic, and it takes you to content that someone has created. And then that content owner, that content creator, can drive value in one of three ways. They can sell the content itself, sell subscriptions to it; we see plenty of that these days. They can put ads up against it. Or they can just get the ego hit of knowing that somebody cares about and is reading their stuff. And that's really how the web has been built to date: chasing that traffic. What we're seeing, though, is that for the first time in history, searches across the major search engines, Google in particular, are actually on the decline. And what's replacing them is more and more people turning to AI. The difference with AI is that rather than giving you 10 blue links that you click through to find the answer, AI tries to give you the answer itself. That means people aren't going to those original sources. And if they don't go to the original sources, then you can't sell a subscription anymore, you can't put ads up against it, and you don't even know that people are actually getting value from your stuff.
And so what we're really worried about at Cloudflare is: if the incentives for creating content go away, why is anyone going to create content in a new AI-driven future? >> So talk a little bit about how many pages these AI bots or search engines have crawled in the past, how much traffic they've delivered for each crawl, and where that's gone today. >> Yeah, you know, I think the deal that Google made with the web starting 30 years ago, when Larry Page and Sergey Brin started working on the project, was basically: let us copy your content, and in exchange we'll send you traffic, which, again, you can drive value from in one of those three ways. We have very reliable data at Cloudflare going back 10 years, looking just at Google, and the metric that has stayed very consistent over time is how much Google crawls the web. They've actually crawled at a very consistent rate. Over that same 10 years, we've added two billion internet users. We were at 4 billion internet users about 10 years ago; today, we're at about 6 billion. So you'd imagine it has actually gotten easier to get traffic over that period of time. But that's not what's happened. What instead has happened is that, if you take 10 years ago as the baseline, today it's almost 10 times as hard to get a click, to get a visitor, from Google to your site. What's changed? The answer is that Google has started providing more answers directly on the page. So if you search for something like "when was Cloudflare founded," there will be an answer box at the top that says September 27th, 2010, the day that we launched, and you don't have to click any link. In fact, about 75% of queries to Google now get answered on Google itself. And what's changed in even just the last six months, and accelerated this, is that they've rolled out AI Overviews.
And we've tracked this from region to region. What we see is that as AI gives you the answer without you having to read the original content, the amount of traffic that Google is sending to these sites has gone down and down and down. And that's the good news for publishers. If Google has gotten 10 times harder to get traffic from over the last 10 years, OpenAI is a whole different beast. In OpenAI's case, it's 750 times harder to get traffic than it was from Google just 10 years ago. In the case of something like Anthropic, it's 30,000 times more difficult to get that traffic. So why is that? The answer, I think, is that people are trusting the AIs. They're reading this derivative content and they're not going back to the original source. But the problem is, if you're not reading that original source, then the original sources have no way of generating value. They can't sell subscriptions. They can't sell ads. They can't get the ego hit. And that, over time, is strangling the very incentives for creating content. That's the problem that we started to really focus on about 18 months ago. And then just today, on July 1st, we announced that we are hard blocking the AI crawlers unless they actually compensate content creators for the content that they're creating. >> Okay. And we're definitely going to get into your technological solution. So that's coming. But let's talk a little bit more about this problem. I think the number that you shared recently was that Anthropic will crawl something like 60,000 pages... >> That's correct. >> ...for one click that's sent. And OpenAI was somewhere in the, do you remember, 10,000 range? >> 1,500 pages now for every one click that they send you. >> And you know, I have to say, I'm surprised that publishers are seeing a problem now, only because these AI products are really in their infancy. >> Yeah.
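The crawl-to-click ratios discussed here reduce to simple arithmetic over server-log counts. A minimal Python sketch, using the figures quoted in the conversation as sample inputs; the function and sample names are illustrative, not Cloudflare's actual tooling:

```python
def crawl_to_click_ratio(crawls: int, clicks: int) -> float:
    """Pages crawled by a bot per referral click it sends back to the site."""
    if clicks == 0:
        return float("inf")  # crawling with zero traffic in return
    return crawls / clicks

# Figures quoted in the conversation, treated as (crawls, clicks) samples:
samples = {
    "google_10_years_ago": (2, 1),
    "google_answer_box": (6, 1),
    "google_ai_overviews": (18, 1),
    "openai": (1500, 1),
    "anthropic": (60000, 1),
}
ratios = {name: crawl_to_click_ratio(c, k) for name, (c, k) in samples.items()}

# How much harder it is to earn a click from OpenAI than from Google 10 years ago:
relative_difficulty = ratios["openai"] / ratios["google_10_years_ago"]  # 750.0
```

Dividing 1,500 crawls per click by the old Google baseline of 2 recovers the "750 times harder" figure quoted above.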
>> I mean, Anthropic's Claude isn't used by very many people at all when you think about the scope of the web. OpenAI has 500 million weekly active users, which is pretty good, but really nothing compared to the amount of traffic that you see on the web every day. And I guess Google must be the problem. So just explain why this is already showing up for publishers when this is the infancy of generative AI. >> Yeah, I think it is, but it's one of these sea changes that we can just see happening. So again, for the first time in history, searches on Google actually dropped, period over period. >> Okay. >> Google has actually reported this, and it also came out in the Apple trial, where they're seeing more of this traffic actually going to other sources. And so I agree that it is a drop, but what we're seeing is the trend; that is the direction things are heading. Even Google itself is looking more like an AI chatbot and less like a traditional search engine. And so if that's the case, I think the time for publishers to panic is now. If we wait while more and more traffic gets strangled, less and less is going to them. Again, I think that's just going to mean that, over time, we'll have more consolidation in the media industry. We'll have less and less content. We'll actually have more salacious headlines as people chase what traffic is left out there. And we need to make a change to make sure that we can continue to support publishers, because I do believe the future of the web is going to be an AI-driven future, not a search-driven future. And that AI-driven future just doesn't have the same incentives and doesn't support the same business model that the old search-driven web did. >> Okay, I'm going to poke at this a little more. >> Yeah. >> You mentioned that you can now search Google for when Cloudflare was founded and you'll get the answer.
>> That's something that Google's been doing for a long time. You could ask, like, when was Martin Luther King Jr.'s birthday? Even before generative AI, they were giving you these answers. So is it that the magnitude has changed? And if so, from the standpoint of a consumer, could this be good? I mean, it's pretty annoying to type this question into Google, when was Cloudflare founded, and then have to click through to Cloudflare's website to get the answer that Google could just surface for you. And so much of the web has sort of become, effectively, in service of Google queries, where websites don't really need to exist. >> Well, you know, absolutely, this has been happening for a while. If you look at up until six months ago: the ratio of crawls from Google to clicks 10 years ago was two crawls, one click. Six months ago, it was up to six crawls, one click. And that's all because of the answer box. What the AI Overviews, which they've rolled out over that time, have done is take it now to 18 crawls to one click. So yes, it is a situation of the frog boiling in water, but it has gotten progressively worse. And I think across the media industry, it has gotten harder and harder to survive as a publisher. And so what I worry about is, yeah, publishers were struggling at six to one; they're struggling at 18 to one. I think they're dead at the 250 or 1,500 to one that we're seeing with OpenAI, and completely dead at the 60,000 to one we're seeing with something like Anthropic. So that is the direction things are going, and that's a challenge. I think you're exactly right on the other point as well: the challenge here is that this is actually a better user experience. That's why more of the web is going to turn to AI. It is great that you can type something in and get back an actual response, as opposed to having to hunt for it yourself.
That's a better user interface, and so things absolutely are going in that direction. I'm not arguing, and I don't think anyone is arguing, that we should just get rid of AI or that we should go back to sort of 10 blue links on Google. What I am saying, though, is that the fuel that runs all of these AI systems, the reason that Google can tell you when Cloudflare was started or what Martin Luther King's birthday was, is that somebody is doing the work of that original content creation. That original content is the fuel that fuels Google; it fuels all of the AI companies. And if we strangle off the business model of those places, if we strangle off the incentives for content creators to create content, then we're actually going to end up strangling the AI systems as well, because if there's no content to train on, then the AI systems are going to be pretty stupid. So I think everybody agrees that there have to be incentives that allow content creators to continue to be compensated. The question is, what does that incentive structure look like? And that's, again, what we've been spending a lot of time trying to figure out. >> Okay, I just want to ask you one more question about crawls. >> Yeah. >> I think that sometimes, you know, in search, you would crawl to put a website, or a page from a website, into your search engine. Are these generative AI bots crawling to do something similar, just to surface the information from these pages, or is some of the crawling being done in service of training their models? Because if that's the case, it's actually not as big of a deal, because it's just being fed into training. I think the problem is taking that direct query-to-answer behavior and sort of bringing it into the search engine. So do you know: is it training, or is it just surfacing answers?
>> So I think there are two different parts of this. There's definitely training, and then there's what is closer to a search-like experience. If you're familiar with this, it would be something like RAG, retrieval-augmented generation, where you're actually getting that real-time data in order to augment the foundational model. In both cases, though, you're actually costing the content creator something. They are literally paying for that traffic; they're paying for the load that the crawlers are pulling off of them. And it is the intellectual property, the data, the content of these providers that the AI companies are using to train their models. So there's value that the AI companies are getting; if there weren't, they wouldn't be crawling, right? But there's no return of any compensation or any reward. Again, in the old days of Google, the trade-off was: let us copy your content, and in exchange we'll give you traffic. What has happened is the frog has boiled in the water, and now everyone is saying: let us copy your content, and we will give you nothing in return. And so what we're saying is simply that we need a better deal, a deal for a new AI-driven future. And that deal should say: if you are getting value from the thing that I created, then you should compensate me in some way for it. It may be tiny amounts of money, but at some scale that actually turns into something that can allow a content creator to continue to have an incentive to create content over time. If we don't do that, if we don't give content creators the incentives to create content, they'll stop creating content. >> So I think you're bringing up a key point here, which is, if people are like, well, I'm not necessarily seeing publisher content show up every time I'm on an LLM, what you are seeing sometimes is the product of publisher content that's been used for training.
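The RAG pattern mentioned above can be sketched minimally: retrieve relevant content in real time, then prepend it to the model's prompt. This toy sketch uses keyword overlap over a two-document corpus; production systems use vector embeddings and an actual LLM call, and every name and document here is illustrative:

```python
# Toy corpus standing in for crawled publisher content.
corpus = {
    "cloudflare-founding": "Cloudflare launched on September 27, 2010.",
    "ski-report": "Fresh snow expected on the upper runs tomorrow.",
}

def retrieve(query: str, docs: dict) -> str:
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs.values(), key=lambda text: len(q & set(text.lower().split())))

def build_prompt(query: str, docs: dict) -> str:
    """Augment the model's prompt with retrieved real-time content."""
    context = retrieve(query, docs)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

prompt = build_prompt("When was Cloudflare founded?", corpus)
```

The key point for the discussion above is that the retrieval step hits the publisher's server on every query, not just once at training time.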
And even if it's, like, under fair use, totally fine because it's being transformed, something crawled from Big Technology or the New York Times is now being used to, basically, because they're just trying to figure out what word comes next in the English language, give you an answer about summer camp. The publishers are actually enabling that, and every time an AI crawler hits a publisher website, they have to pay. Do you work with Wikipedia? Because they've been loud about this: the server costs that they have to pay have increased exponentially, but those aren't human visitors; they are AI bots crawling Wikipedia. Talk about that. >> So there is a real cost to just supporting this crawl. Before we even talk about intellectual property, before we talk about anything else, the content creators, the publishers, are having to bear that cost. And so at just a simple fairness level, why should they be bearing the cost in order to train these multi-billion-dollar AI companies that are out there? There should be some value which is given back. But I think it's even beyond that. I don't even think we have to get to... I mean, you used legal terms like fair use, and I think that's very much up in the air right now. We literally had two different California cases that came out on both sides of that issue: is training on content fair use or not? And I think it's going to be a coin flip where different courts say different things. I don't think there's a clear answer there. But I think it's a more fundamental thing, which is: if you're doing something to create value, you should be getting some sort of compensation for it. If somebody else is imposing a cost on you, you should be able to charge them to offset some of that cost.
And if someone's not willing to pay that, then they shouldn't be taking your content in the first place. Up until now, everyone's focused on the legal issue; the New York Times is suing, and a bunch of people are doing the same. I actually think that before we even get to the legal issue, the first step is to take the technical steps to give content creators back control over the content they're creating, and let them have the choice: do I want to give access or not? Do I want to charge for this or not? And then, done correctly, there should be a marketplace where content creators and AI companies come together, and the creator says, "Hey, I created this piece of content, and I think it's super valuable." And the AI company says, "Yeah, maybe it is, or maybe it's not, but here's what we're willing to pay." Maybe they meet the clearing price; maybe they don't. But that marketplace needs to exist, because otherwise there's no way to convey value, no way to derive value from content creation. And again, I just need to hammer this point home: if we don't give content creators an incentive to create content, they'll stop creating content. >> And it sounds like, by the way, you're not a skeptic of the AI technology. You believe that this generative AI thing is going to work. >> Yeah. Not only that, it is already clear that it's going to be the interface of the future of the web. We're going to move from what has been the dominant interface of the web's past, which was search, to the interface of the web's future, which is very much going to be AI. So I believe AI is going to get better and better and better. I actually think that, done correctly, content can be created in such a way that will make AI better, and that you can create incentives for doing that.
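The marketplace mechanic described here, where a creator's asking price either meets or misses an AI company's bid, reduces to a simple clearing rule. A toy sketch with invented listing names and prices, purely to make the idea concrete:

```python
def clears(ask: float, bid: float) -> bool:
    """A deal happens only when the buyer's bid meets the creator's ask."""
    return bid >= ask

# Invented example listings: (creator's ask, AI company's bid) per article, in dollars.
listings = {
    "original-investigation": (0.05, 0.08),  # scarce content: bid exceeds ask
    "copycat-rewrite": (0.05, 0.001),        # low-value content: no sale
}
sales = {name: clears(ask, bid) for name, (ask, bid) in listings.items()}
```

The point of the sketch is just that prices, not lawsuits, carry the signal: valuable original content clears, commodity content does not.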
What I worry about is that in order for AI to get better, you have to have original content. People have to be going out and creating that. And right now we're strangling off all of the incentives for that content creation, which not only hurts content creators, it will ultimately hurt the AI companies as well. >> So, I was speaking last night with someone who works in data labeling, data creation, for large language models. In anticipation of this conversation, I was like, you know, one day what you're doing might look almost exactly like what web publishers are doing, where you might be hiring PhDs and having them write up their information and feed that right into an LLM's training set. And there might be, let's say, historians. So if you take a world history website, the historians writing the web pages for that site, maybe one day they're going to be writing those world history articles and, instead of publishing them to the web, selling them and feeding them right into ChatGPT. Do we lose anything if the web goes away and it's just content creators selling stuff to large language models? >> Yeah, you know, I think the Black Mirror, dystopian future is not that content will stop being created and journalists will stop existing and researchers will stop existing. I think the Black Mirror future is that we actually go back to something like the time of the Medici, where we have maybe five big AI companies and they each employ a set of journalists and a set of researchers, such that they become effectively the institutions of knowledge, and they have salaries for all the academics on their staff. They probably each have different leanings: maybe one of them is the conservative AI company and one of them is the liberal AI company.
You can very much see that this has actually been the natural state of media, and the natural state of controlling information, for quite some time, and you could imagine that all of that research consolidates behind each individual AI company, and every different academic out there is basically an employee of OpenAI or Anthropic or Google or Microsoft. I think that's a pretty bad outcome, because the web has been so amazing at distributing and democratizing access to information that I think we want to preserve that incentive. And so what we're trying to do is say: what's the step a few steps before all the academics are employed by one of the AI companies? And I think the answer is, you allow the AI companies to pay for the content that is actually valuable to them, that fills in their models and makes their models better, and then you create incentives for independent journalists and independent researchers to be able to create that content to augment those AIs while still being valuable. This won't happen, but my sort of optimistic version of the future is: humans should get content for free again, because we've kind of paywalled way too much, frankly, and robots should pay a ton for it, because every time a robot ingests something, it's in service of hundreds of thousands, if not millions, of different humans. So robots should pay for that content, and we should get back to a place where humans get it for free. I think it's going to be hard for us to get there, but that's the future that I think is actually the optimal future. >> So, someone hearing you and looking at this through a critical lens might say, "Look, Matthew, publishers depending on web traffic are barking up the wrong tree." That selling eyeballs for CPM fractions has not been a good business for a long time.
In fact, we had a guest on the show recently, a journalist, who said, "If I thought that traffic was the way to go, I'd be out of business a long time ago. What you really need is an audience that will, let's say, subscribe to your newsletter, listen to your podcast, maybe come to your events." We've already moved past this business model of trading traffic for dollars, in which case this isn't an existential threat. What would you say to that? >> I mean, even then, you're still trading traffic for dollars; you're just trading it for subscription dollars, not ad dollars. That will go away as well, because what will happen is the AI company will ingest the podcast and then summarize it on their page. Why would anyone ever buy a subscription to your podcast? Why would they ever sign up for your newsletter if their AI agent can simply say, tell me everything that was relevant in this particular podcast or newsletter? >> Because there's an experience of listening that's enjoyable. People do that in some part for the entertainment and the leisure value; I think that's how they learn. >> I think the AI companies will do a very good job at creating that experience as well. >> So you think they'll just create, like, competing podcasts? >> Oh, absolutely. For sure. >> I used to think that this was such a pie-in-the-sky, lunatic idea, until I listened to NotebookLM. >> Yeah. >> And, like, we've had multiple people on my YouTube page ask, did you license your voice to NotebookLM? And I'm like, no, but the fact that you're asking is pretty concerning. >> Totally. And again, I think that's the inevitable future. We're going to want hyper-customized podcasts that are in exactly the voice we find the most reassuring, and AIs are going to create that for us.
And again, they're going to be fed by original content creators that are out there, that give them the ideas, give them what to talk about, give them the news of the day. What I think is, we have to move even past the business model of subscriptions. We've got to get to something else, where you as a content creator are being compensated for the content. The way I think about it is, every one of these LLMs is a little bit like a block of Swiss cheese. They've got a lot of stuff there, but there are big holes in it, and the content that is valuable to them is the content that actually fills in those holes in the Swiss cheese. And so what I would imagine in the future is that you're able to surface, as an AI, where the holes in the Swiss cheese are, and then allow content creators to create content that fills them in. My favorite example of this: I was in Stockholm a couple of weeks ago meeting with Daniel Ek, because there really is nobody who has done more to compensate creators at scale than Daniel. He was the founder of Spotify, and they've done an amazing job at this. And in a long conversation, he told me a story. He said, you know, one of the things that we do at Spotify is we actually take the searches that people run at Spotify, things like, I want a song with a reggae beat about how much it sucks when your sister runs away with your car. >> That has happened. >> Yeah, or whatever. And it turns out that they don't have good things to fill that in. There are content creators out there making tens of millions of dollars a year just creating content for those searches that don't have good results right now, because Spotify surfaces that list of searches where they don't have good results. I think that's actually beautiful.
I think that's actually really amazing: they are showing where there is human need for something, and then how we can create content to fill that human need and monetize it through what they're doing. I think the same opportunity exists in the AI space, where these AIs are actually able to say, I can tell you how valuable this new piece of content is for me, and you can rank it. That then allows you to create a marketplace where they can say, listen, that new piece of information is so valuable that I'm willing to pay you for it. And I think that, done correctly, that gets us to more original content creation, and to less me-too, copycat-style journalism. Same thing in research: it maybe gets us to a place where we're doing original research and getting rewarded for being more original, as opposed to being more salacious. >> Yeah, it's interesting. YouTube has a similar thing, where there's an insights or inspiration tab, and they give you the title, the description, and the thumbnail, and they're like, people are searching for this; go out and make it. >> Yeah, that's exactly right. And I think that's actually an incredibly valuable thing that's making humanity better, as opposed to, you know, yet another story that's just chasing the most salacious headline you can get. >> So you're talking about this idea where publishers might sell the ability to crawl... >> Yep. >> ...to AIs. That is also assuming that content is scarce. And so I want to run this other idea by you: if we had the same amount of content that we have today, that's a great idea. But what we're seeing now is this explosion of content creation that's made through generative AI.
Like, it's kind of funny: every time you see these suggestions that we're talking about, YouTube's making these suggestions because clearly there's traffic to be had. I'm sure there are already YouTubers today that are feeding that into ChatGPT, spitting out a script, running that through Veo 3 from Google, and then posting the videos and cashing in on traffic. So we're in the middle, I believe, of this explosion of content; actually, you probably have better data on that than my suppositions. It almost feels like a DDoS of the web, where, if the ability to create content is constrained by a human's ability to create it, then you have something to bring to these AI companies. But if human-plus-bot content starts to become the norm, there's going to be so much of it that even if you're creating high-quality stuff, it's not going to matter very much to these generative AI companies. What do you think about that? >> So first of all, I think there's the pure AI-generated content. There's lots of research that shows that training AI on AI data is sort of like that old Michael Keaton film Multiplicity, where basically every copy of something gets worse and worse and worse, and that feels like it's going to still be the case for quite some time. Might robots in the future be able to go out and do interesting reporting from the field? Might they be able to do interesting research? Sure. But today, that interesting research, that interesting original content, that interesting insight that comes from the work that right now only journalists and researchers and others can do, is still the most important thing for filling in those gaps in the Swiss cheese of the AIs. What about content that's just high-volume and low-value?
My hunch is that, if we score things correctly, it will be rewarded for exactly what it is, which is low-value content, and so it should be rewarded very minimally. I like to ski, so I live part of the year in Park City, Utah. >> You're in the right place. >> Yeah. I care enormously about the snow forecast. There is a forecaster in Utah named Evan Thayer. He writes these incredibly precise weather forecasts, where he will literally tell you it's going to snow this much on this run and this much on that run. And I actually pay for his content, because that's super valuable to me. I am going to be more willing in the future to pay for an AI that has actually licensed Evan's content from him than for an AI that doesn't have that content, because, again, that content is going to be super useful and unique and valuable to me. And so I think what will happen, as we have more AI systems out there, is that it will cause you to look for more original, creative content, and that's going to be the thing the AIs are most willing to pay for. And that, again, I think is actually a beautiful thing: instead of creating incentives to write more and more salacious headlines and chase traffic, we're creating incentives to create knowledge that fills in those places in the Swiss cheese where there might be holes. Taken in aggregate, all of the AIs are probably a pretty good representation of what human knowledge looks like. And so if we can score them and say, okay, here are the gaps in human knowledge, and here are the places we need to fill in, that actually gives a really rich place for creators to look to create content which advances human knowledge. >> So, you know, DeepMind is working on weather forecasting right now. This example that you gave of Evan Thayer, the forecaster in Utah.
Are we that far away from just telling an AI, hey, you're tapping into the DeepMind weather model; I want to ski this route today, what's happening? >> I think we're probably pretty far away from that. But again, I think Evan is always going to be better using the tools of AI plus his local knowledge to make his forecasts better. AI just becomes a tool that creative people use in order to tell stories better, get better information, and do more research. And again, I am skeptical that, in the short term at least, we're going to have real value that is created by training on purely generated content. >> Okay. So we've talked about your solution; let's dive into the technological side of it a little bit. We are a tech podcast, so we should do that. So, Cloudflare is a security company that helps websites stay up on the web despite all the threats. >> Yep. >> Let's just, at the very beginning, talk about the threats that you see to websites. Who's trying to take them down? What's happening on that front? >> Yeah. So protecting websites is part of our business, as is protecting employees as they go out across the internet. Cloudflare is fundamentally kind of a network that is built with all the performance, reliability, security, availability, and privacy guarantees that, frankly, the internet should have been built with had we all known what it was going to become. But obviously, back in the '60s, '70s, and '80s, when we were laying down all these protocols, we didn't think about those things. And so Cloudflare is basically reverse engineering the internet in order to give it those performance, availability, security, reliability, and privacy guarantees on top of what is there.
And so today, one of the main uses for Cloudflare is: you're putting a website or a web application or anything online, and you want to make sure that it's safe from different sorts of threats. And what are the threats that we see? Every day we go to war with the Chinese government, the Russian government, the North Koreans. Everyone is trying to hack into our customers, because who are our customers? Some of the largest banks in the world, some of the largest governments in the world. And they are all constantly under threat and constantly under attack from these organizations. The media companies were actually a pretty small part of our business. We had some media companies that used us, but it wasn't a big piece of it. What happened, starting really 18 months ago, is that those companies said, "Hey, I know we hired you in order to stop the Chinese hackers, but we have this new threat that's there." And frankly, my initial reaction was: publishers, they're always whining about the next new technology, what's going on? And over and over they said, just pull the data, pull the data, pull the data. And it was only when we actually saw the data, and we saw how AI companies were taking content without giving anything of value in return, that we saw they were adding enormous amounts of load and in some cases taking whole websites down because of the amount of traffic that they were sending to them. >> Right. They basically DDoSed the websites. >> DDoSed the websites, you know, not intentionally. But that was the point at which we said, listen, maybe there is something that we can do here. And at first, I think a lot of the publishers were saying, "Oh, this is so hard. There's no way we can stop it. There are these nerds, and they live in Palo Alto, and they're so smart, and what are we ever going to possibly do about it?"
And I just kept saying, "Guys, we go to war with the Chinese hackers. We can stop some nerds with a C corporation." And I think it took a while for that message to really get through. But now that it has, it's been really rewarding to see that the vast majority of the world's major publishers have said: we need to change the model, we need to be compensated for our content, and Cloudflare has the right idea in terms of the technical solution to do that. >> By the way, folks, $60 billion company, listed publicly. So it's one of the bigger cybersecurity companies on the New York Stock Exchange. But I want to ask you, okay, we're going to get into this technological solution, but what you said is interesting. Do you ever think there's a world where these AI bots ingest not just the publishers, but the banking websites as well? Are you a natural enemy to having everything go through that? Because if everything goes through ChatGPT, then these other sites that you secure might not need your services. >> I think there's still going to be some gatekeeper for how agents and other things access various services online, and I think the challenges in each of those cases are different. In the case of a bank, you might want to say: I want to have guardrails in place. I want to make sure that this is actually a customer that's accessing an account. I want to make sure that they can only conduct transactions that have been authorized by an actual human being, or something like that. Cloudflare actually provides those guardrails and makes it so that a bank can say: I want to expose my infrastructure to AI, but do it in a way which is safe and secure.
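The guardrail logic Prince describes for banks exposing infrastructure to AI agents can be sketched as a simple policy check. This is a minimal, hypothetical illustration: the request fields, action names, and rules below are invented for this sketch and are not Cloudflare's actual API.

```python
# Hypothetical sketch of agent guardrails as described above: an agent
# must act on behalf of a verified customer, and anything beyond low-risk
# reads requires explicit human approval. All names/rules are invented.

from dataclasses import dataclass
from typing import Optional

@dataclass
class AgentRequest:
    customer_id: Optional[str]   # verified account holder, if any
    action: str                  # e.g. "read_balance", "transfer"
    human_approved: bool         # explicit sign-off by a real person

READ_ONLY_ACTIONS = {"read_balance", "list_transactions"}

def allow(req: AgentRequest) -> bool:
    """Apply guardrails before letting an agent touch bank infrastructure."""
    if req.customer_id is None:
        return False                 # must act for a real, verified customer
    if req.action in READ_ONLY_ACTIONS:
        return True                  # low-risk reads are allowed
    return req.human_approved        # anything else needs a human in the loop

print(allow(AgentRequest("cust-42", "read_balance", False)))  # True
print(allow(AgentRequest("cust-42", "transfer", False)))      # False
```

The design choice mirrors the interview: the gatekeeper does not block agents outright, it distinguishes who is acting and what they are allowed to do.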
I think publishers have a different challenge. In our own case, for example, we have a whole bunch of developer documents on our website. We want those to be in AI. We want coding platforms, when someone says, "Oh, I want to use Cloudflare to build X, Y, or Z," to be able to spit that out. What we've done is we've tried to identify, with real narrow precision, which pages on the web show some indication that they are going to be monetized. Generally, that means looking at whether the page is behind a paywall or has some sort of an ad unit on it, like a banner ad. If we detect that, then we're blocking it by default. But again, there's value for AI, and we want to make sure that AI is actually getting the data that people want to have in it. So the about-us page on The New York Times probably should go into the AI system, but a brand-new article with breaking news probably should be restricted, unless the AI company is actually paying for that content. >> I guess the way I want to ask it is: if everything goes into ChatGPT, what's left for you to protect, thinking outside of the media world? >> Well, again, I think that 80% of the AI companies are customers of ours, and so we protect them as well. >> Yeah. Okay. Sounds good. Just wanted to ask that. I was curious about it. But let's talk about this. So you're going to build a technological solution that will block crawling. >> Yes. >> And so robots.txt, which is this file that you put at the root of your site if you don't want to be crawled, that wasn't working. >> Yeah. I mean, I think robots.txt has two problems. The first is some people just ignore it.
And if you ignore it, then you can still crawl all you want, and there are even some big, legitimate companies that completely ignore robots.txt. We're really good at basically being able to say: okay, here's what robots.txt says; are you actually following those rules of the road? And if the answer is yes, then robots.txt is a great solution. But in the cases where somebody is ignoring it, then we need to put in place additional technical barriers to restrict their access. And so that's exactly what we're doing. The second problem with robots.txt is it's not granular enough. Take the Googlebot, for example. Google's crawler does at least five different things. One is it checks if you have an ad on a page; it makes sure that if you're putting up an ad for a Procter & Gamble product, it's not against a pornographic site or something like that. So it does brand-safety checks. The second is it crawls to index for traditional search, the 10 blue links that are out there. The third is that it crawls to create answers that are in the answer box. The fourth is that it crawls to create answers that are in the AI Overviews, the newer thing that they've rolled out. And the fifth is that it crawls in order to ingest content to put into Gemini. >> It's a lot of crawling. >> A lot of crawling, all through one crawler. And for lots of different reasons, they don't want to split that out into various crawlers. But right now, they basically make you have a choice. They say you can either block Google entirely, in which case you can't run ads, you don't appear in search, and you don't appear in the AI Overviews or Gemini or other things.
Or they've recently added a tiny flag which basically just says: I'm not going to use this data for the Gemini piece. But you still appear in AI Overviews, you still appear in the answer box. We think there needs to be more granularity, where there is a difference between taking content and transforming it, which a license should say you can't do without my permission, versus just taking that content to do brand-safety checks or to do traditional search. And so what we've proposed, and we're working with the IETF as well as regulators on this, is extensions to robots.txt to give it that granularity. And that then allows us to further test and watch: does this robot behave in an appropriate way? If the answer is yes, then maybe it gets more permissions to do things online. If the answer is no, then we will put more restrictions and blockades in place to stop what are, again, badly behaving robots. >> So what you're going to do now is, in addition to that, put up a technological wall. >> That's right. >> No crawling. Sorry, enough of you haven't respected robots.txt. No entrance. >> That's right. We're all familiar with 404 errors when something is not found. Success on the internet is a 200 response that comes back to you. The original HTTP specification actually set out a 402 response, and that response says Payment Required. And so we're tapping into that exact original specification to say: when a robot tries to access a page where there's an intent to monetize it.
So if it's either behind a subscription or it's got ads on it, there is an ability for us to say 402 Payment Required, and then there's a negotiation. At first, that's going to be largely large publishers with large AI companies doing deals, like what Reddit has done or what The New York Times has done or what others have done, where they have licensed the content and then certain robots get access to it. But in other cases, and I think over time, that will be a dynamic process, where maybe a smaller AI company or a smaller publisher will say, hey, here's what I would charge for this content. Cloudflare will surface how valuable that content would be for that particular AI, and then the AI companies can decide: is that worth it or not? It might be a very small transaction, maybe a fraction of a penny or a few cents, or in some cases, content that is really valuable might be worth hundreds or thousands or millions of dollars. You could imagine Taylor Swift is about to release a brand-new song and the lyrics get published. How valuable is that for an app for teen girls who are lonely and want to talk about things? Probably pretty valuable, and especially valuable if you could have exclusive access to it for some window of time. And so that's the sort of thing where I think a marketplace can develop over time, where original, valuable content will get compensated and there will be a clearing price in the market once we have that scarcity that's created by that wall. >> Okay. So it's not just a blocker. It's also this marketplace where you're going to have publishers that will sell their content. So that's a way where you could have useful, effective chatbots and potentially a flourishing web. >> Exactly. And that, I think, is what we're trying to play for. Again, my utopian vision of the future is: robots should pay a lot for content, and humans should get it for free. >> Right.
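The 402 flow described above can be sketched as a simple policy function: if a bot requests a page with monetization intent and holds no license, the server answers 402 Payment Required with an asking price. This is a minimal, hypothetical illustration; the paths, prices, crawler names, and license table are invented for this sketch and are not Cloudflare's actual implementation.

```python
# Hypothetical sketch of the 402 "payment required" flow described above.
# HTTP 402 is a real, reserved status code; everything else here
# (paths, prices, crawler names) is invented for illustration.

MONETIZED_PATHS = {"/breaking-news": 0.02}   # path -> asking price in dollars
LICENSED_CRAWLERS = {"ExampleBot"}           # crawlers with a negotiated deal

def respond(path: str, user_agent: str, is_bot: bool) -> tuple:
    """Return (HTTP status, reason) for a request."""
    if not is_bot:
        return 200, "OK"                     # humans always get the content
    if path not in MONETIZED_PATHS:
        return 200, "OK"                     # no monetization intent detected
    if user_agent in LICENSED_CRAWLERS:
        return 200, "OK"                     # a deal is already in place
    price = MONETIZED_PATHS[path]
    return 402, f"Payment Required: ${price:.2f}"

print(respond("/breaking-news", "UnknownBot", is_bot=True))   # 402 path
print(respond("/about-us", "UnknownBot", is_bot=True))        # 200 path
```

The interesting property is the last branch: the 402 response carries a price, which is the hook a marketplace can negotiate against.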
And so to kick this off, on June 30th, as the day turned to July 1st, you had a party at the top of One World Trade Center, where a bunch of publishers pressed a red button to get this thing going. And that includes some very big names: Condé Nast, Time, the Associated Press, The Atlantic, Adweek, and Fortune are all going to be part of this. >> And a lot more. Frankly, there hasn't been a publisher that we've talked to who hasn't said that this is a change that needs to happen and that we're on the right path. And so across the board, not only the 20-plus percent of the web that sits behind Cloudflare already, but I think another 20 to 30% that are these major publishers are all on board. And what I think has been encouraging is that, at the same time, we've been having conversations with the largest AI companies, and all of them agree that content creators need to be compensated for their content. They all agree on that. The devil's in the details, and some of them are pushing back in various ways. But I've been really encouraged that, as we've talked to the leading AI companies, the largest technology companies in the world, they're actually leaning into this. They all recognize that content creators need to be compensated. And I think over the months to come, that's when the hard work will go down around how we actually create this marketplace in a way which is fair for all of the different providers in the ecosystem. One that treats everybody in a way that has a level playing field, still allows new entrants, doesn't just reward the largest companies with the biggest budgets, and makes sure that legacy providers like Google are treated the same as newer providers.
That's all going to be really tough, but I am incredibly encouraged by the conversations I'm having, not just with the publishers, who are all on board, but actually with the AI companies, who recognize that something needs to change. >> That's interesting that they're recognizing this, because you hear these announcements of deals, like OpenAI paying X million to the Wall Street Journal or Dow Jones to be able to include their articles, and the sense you get is that they're just kind of payoffs to not get sued. Like, Sam Altman very clearly is not happy with The New York Times pursuing OpenAI, and especially the actions that the Times is taking in its lawsuit, like forcing OpenAI to preserve their chat logs, which I think is wrong. But it is interesting. So what do you think: are we going to see an evolution from these one-off deals to this marketplace-style world? >> Well, I think we've seen this story many times before. Napster came along; it was a wild west. There were a bunch of lawsuits from the music industry targeting Napster and the like. And then along comes iTunes, which starts out at 99 cents a song but eventually evolves into something much closer to a Spotify model: a subscription and a pool of funds that then gets distributed out to all the creators. So I think we've seen this story before. And I think one of the things that's really important is that OpenAI and others are willing to pay for content and do the deals that are there. And I don't think it's right to just say they'll do a deal to avoid lawsuits. Again, when you talk to leading AI companies, they understand that people are doing the work to create content, and they need to get compensated for that content.
And if it's not going to be through subscriptions or ads or ego, it's got to be through something else. Exactly how that happens, we'll figure out. But what I know won't work is if OpenAI is paying for your content but you're giving it away for free to everyone else. >> That's not going to work. >> OpenAI eventually is like: listen, we want to support you, we want to help you out, but we can't be the suckers. We can't be the only ones paying while you're giving stuff away for free. And so scarcity is needed in order to actually have value in any kind of market. And so I think the people who have leaned into this the most heavily are the ones that have existing deals with some but not all of the AI companies, because they realize that for those deals to be valuable, for them to renew, and for them to renew for more, there has to actually be scarcity where they're getting something of value. You can't charge OpenAI but give it away for free to Anthropic. Something needs to actually restrict it and say: everyone needs to pay, everyone needs to be on a level playing field, and we'll figure out what that looks like going forward. >> Could there be some collateral damage with the solution you're implementing? For instance, I'm looking at the names of these publications: Condé Nast, Time, the AP, The Atlantic. I imagine they get a lot of traffic from search as it is today. So if you put this blocker up, does that impact their SEO, for instance? >> Yeah. So we've been very, very careful to say that traditional search today is not blocked, and even AI-driven search today isn't blocked, but you're going to see us give publishers the tools to differentiate between search indexing and derivative content. So the way I would think about this is the Google experience today. It may be that a publisher says: I still want to appear in the 10 blue links, but I don't want to be in the AI Overview or the answer box.
That requires the granularity of being able to say: okay, Google, I understand you use one bot, but we need those different uses treated separately. And again, I am hopeful, and in my conversations with Google I am increasingly hopeful, that they understand the importance of this and of giving that granularity. But if for some reason they don't, I am also 100% certain that regulators are paying a ton of attention to this, and around the world you will see them force Google to split their crawler out and announce exactly what it is doing. Again, hopefully we get to an agreement with Google way before that has to happen. But inevitably, I think Google is going to have to say: if you don't want us to use your content for derivatives, you have a way of controlling that while still appearing in search. >> Okay, a couple of big-picture questions before we leave. How much bigger is the web getting, and is the web accelerating in the size increases that we see? >> By all the measures that we can see, it's actually kind of plateaued and has flattened out in terms of content. You see fewer domains getting registered; you see fewer new websites going online. I think a lot of that has moved to individual platforms: more of it on a YouTube, more of it on a Facebook, more of it on a TikTok. And I think part of that is because those platforms have provided content creators easy monetization tools that allow them to not have to think about some of those problems. I think that in an ideal future, you would want content creators to be free from those platforms, to earn more themselves, but still have the ability to monetize that content in interesting ways. And so, again, I think there are lots of people who are working on that problem.
I actually think Google has been one of the organizations that created what was the business model of the last 30 years of the web. But the business model of the next 30 years of the web is going to be different, and we've got to think about it in a different way. It's not going to be banner ads. It's probably not going to be subscriptions. It's going to be something different. And so this is our attempt at one solution, but I doubt it will be the only one that emerges. >> Now, I'm curious what I'm going to do, because this is a one-person content operation. So, anyway. >> Well, you should certainly be charging AI to license your voice. >> Can I sign up for your product? >> For sure. Absolutely. >> Okay. I'm going to email you after this. And then, when it comes to cybersecurity, obviously you talked about how you're dealing with all these governments that would like to hack into sites across the web. Have they been able to use generative AI tools or automated coding to become more effective at what they do? >> Yeah, I mean, I think that anytime a new technology comes out, bad guys are going to use it as well as good guys. And so we have seen, and we will continue to see, some horror stories: the family that was tricked by some gang into wiring their life savings because someone that sounded like their daughter called and said, "I've been arrested in Mexico; I need to pay to get out," or other things like that. I think we're seeing a real rise, especially out of North Korea, in North Koreans posing as applicants to various jobs, and then that allows them access which they can use to do any number of nefarious things. All of that, again, assisted by AI. So that's been sort of the bad guys' side.
The good news, though, is that the good guys, folks like Cloudflare, have been using AI as well, not only to detect these things but to get smarter at detecting attacks earlier in the process. >> That's working for you. >> Yeah. At the end of the day, who wins in the AI race is whoever has access to the most data. And I just think that the good guys are always going to have access to a lot more data than the bad guys. And so far, I feel like we have made the web more secure with AI over the course of the last two and a half years and stayed way ahead of the attackers. Although, again, there are going to be horrible stories; there are going to be problems. I think it is going to be harder and harder to trust that something you're seeing online is actually real, and we'll have to turn to other, more secure ways of verifying things like identity and authentication. >> Okay, last question for you. We have 60 seconds. You mentioned you're a believer in this technology. What do the next couple of years in AI look like to you? Are we going to hit AGI anytime soon? What's the timeline you're thinking about? >> I believe today that 99 cents out of every dollar spent on AI is just being lit on fire. But that one cent that's out there is going to generate real returns. It's very hard to figure out what's just a total waste of time versus what's not.
You know, we see a lot of data about how AI systems are really being used: not so much for businesses today, since a lot of the business applications have been very tough to take on, but a lot of the time for things like loneliness and social interaction. So I would imagine that a lot more of those things are going to develop, and those will be sort of the first uses. I think the business applications are actually going to take longer, and in places where it's easier to verify the output as legitimate, adoption is going to be easier. So coding: we see that our engineers are significantly more productive using AI tools than they were before. That's not causing us to hire fewer engineers; it just means that every engineer we hire is that much more productive. We have a huge backlog of things to do, and AI is helping us do that. On the other hand, I'm still quite skeptical about the AI customer support agent or the AI lawyer. Those are much harder problems, because it's just harder to tell whether something actually worked or didn't. There's no debugger in those spaces to figure out if what the AI is creating was actually true. And so I think you're going to see huge leapfrogs in things like coding, but I think it's going to take longer for us to do things that are a little bit more difficult to verify. >> Very interesting. You're deeply optimistic about the technology, but still think 99 cents of every dollar is wasted. It's going to be very interesting to check out. Matthew Prince, great to see you. Thank you for coming on the show. >> Thanks for having me on. >> All right, everybody. Thank you for watching and listening. We'll see you next time on Big Technology Podcast.