Vibe Coding: Everything You Need To Know — With Amjad Masad
Channel: Alex Kantrowitz
Published at: 2025-08-11
YouTube video id: dCZc2kwtUIg
Source: https://www.youtube.com/watch?v=dCZc2kwtUIg
Don't go into this thinking you can just write a prompt and have an application pop up at the other end. At least set aside an afternoon to give it some good effort and try to get your first app in. And once you do that, you just get addicted. I have the stat here that Replit has multiplied its revenue by 10x in less than six months, to $100 million in annual recurring revenue. Is that growth vibe coding, or is that growth AI coding? >> Vibe coding. Is AI coding just a hobby, or the beginning of a technological revolution that empowers everyone to build? Our guest today, Amjad Masad, the CEO of Replit, has some answers, and we're here in Replit headquarters in Foster City to speak with him. Amjad, great to see you. Welcome to the show. >> Thank you. I'm excited to be on the show. >> So we're going to talk today about vibe coding and AI coding, which are two similar but different things. I first wanted to speak with you about vibe coding, which is effectively: you write a prompt, and then the AI goes ahead and builds software for you. This is something that Replit enables. This is something I've tried. What are some of the use cases where people are actually finding effective results with this? Where are the places where people are doing this well? >> There are broadly three use cases. One is personal life, family life. For example, a lot of people like to do health tracking: I'm going to track my sleep, I'm going to pull in data from my Fitbit, I'm going to have the AI process that data, I'm going to have this app on my phone that I use every day. Or: I'm going to build an educational app for my kid to learn math or reading. Or someone built a chore tracker for their family, so they can have an iPad on the wall showing who's doing the most chores, gamifying their family life. You'd be surprised how popular this use case is.
And so, in the niche that I've always been in, which is creator tools, there's always been this idea of personal software, malleable software. By the way, this goes back to early computing history. For example, Apple had this piece of software called HyperCard. HyperCard allowed anyone to make personal software. There was Visual Basic. It's been attempted so many times, but for the first time now, anyone can make software. So there's a whole class of personal software. We have a mobile app, and you can use that to make software. The most fun thing to do is sit down with your kids, five, six, seven years old, and just brainstorm games and make games with them. So that's one bucket. >> Wait, before we go to the next bucket, I want to ask you a question. Does this say something about the software industry, that it just hasn't served so many use cases? Or are these use cases non-economic? Or is it possible that people will build things for their family, and then, next thing you know, it serves a mass market and becomes a business? >> There's certainly a market there, and you can certainly make a lot of money from that. >> Okay. Because when I think about this concept, and we're going to get to jobs, but this concept that AI is going to take our jobs, to me it's like: wait, there's so much left to build.
If you think just about what we have today and maintaining that, maybe it will. But there's so much that software has not yet touched that it seems to me there's more opportunity out there than people assume. >> Just to touch on your earlier question, and you tell me how deep you want to go, because I can talk about this for hours. The early computing pioneers all had this idea that the thing that makes computers special is programmability. The moment we had a programmable machine, first invented by von Neumann, and it's the same architecture we use today, the thinking was: oh, anyone can use a computer to program, to solve problems, to build applications, and all of that. It didn't get mass consumer adoption, and the reason is that coding is hard. And so you had Xerox PARC, the Palo Alto Research Center. They developed a GUI. One day they invite this up-and-coming entrepreneur called Steve Jobs. Steve Jobs looks at desktops, menus, icons. He has the Apple II, and obviously the Apple II is still command line; you can write some BASIC. And he's like: okay, this is the key to get mass consumer adoption of computers. So he copies what Xerox had and builds it into the Mac, and obviously later Microsoft copies the UI for Windows, and suddenly computers are usable by anyone. This is amazing; now billions of people use computers, and we have phones based on the same idea. What we lost is this idea that anyone can program a computer. That's something I've been passionate about all my life: computers should fundamentally be programmable. And there have been a lot of different iterations, with visual programming, and the no-code/low-code revolution that happened maybe 10 years ago. I would say it never reached its full potential.
>> It was more of a buzzword than reality. >> I think it is a multi-billion dollar market for sure, but it's not a trillion-dollar market. And I think this idea that anyone can make software is such a massive market. >> Okay. So, bucket number two. >> Bucket number two tends to be entrepreneurs. Everyone in the world has ideas. People build so much domain knowledge about their field of work, right? I was hearing a story today of an Uber driver who is starting to make an app with Replit, and the app is about logistics. He was a truck driver before, so he had domain knowledge about how to manage fleets, for example, but he was never able to make it into software: he didn't have the skill, and maybe he didn't have the capital to go commission a contractor to do it. And suddenly he can do it. So pick anyone on the street, in whatever industry they're in, and they'll realize there's a need for a piece of software or technology that no one has built, because no one else has that deep domain knowledge. So we see entrepreneurs from all walks of life.
One of our favorites, we talked about it publicly on our Replit social media channels, is a doctor from the UK. He's like: there are all these apps around managing doctor-patient relationships, but it's never fully integrated. You have ZocDoc, where you can make an appointment, but how do you manage your prescriptions? Can I track my patient's progress over time? Can I get information from their Wi-Fi-connected scale, from their Fitbit? So he built this comprehensive platform. He got quoted £100,000 by an agency, and he built it for less than £200. >> Not £200,000. £200. >> £200, you know, pounds. >> So this is stuff that's being vibe coded: effectively, prompt in, "I want to build this software," and then Replit will go build it. >> Yeah. And this is now a startup, and we've had startups start on Replit that have reached multi-million dollar revenue run rates. Some of them have raised at half-a-billion-dollar valuations. So we have everything from small entrepreneurs to venture-scale startup entrepreneurs. This gets me really excited, because America has always been about entrepreneurship, and that's really what attracted me to this country. But if you look at the stats on entrepreneurship over time, although we hear about what's happening in the Bay Area and Silicon Valley, where there are new startups every day, in the rest of the country new firm creation has actually been going down over the past hundred years. There was an uptick during COVID, when everyone was sitting at home saying, "I'll start my business." >> Right, exactly. >> Which was great, but then we had a regression to the mean. And I think with AI we're going to see that explode again. So that's the second bucket, entrepreneurs. One more bucket. >> Third one is people at companies like this one. Actually, I'll give you a story from our HR department. We have a small HR department.
Replit is a pretty lean team; we're 80 people. We have a lot of these SaaS tools. We pay tens or hundreds of thousands of dollars for every specific kind of function, and sometimes they don't really fit our use case, or we think they're too expensive. So this HR person had a need for org chart software that can visualize the org chart, add and remove people, maintain a history, look back and see what changed. She went on the market and saw that none of the software captured that exact bespoke use case, where she wanted to connect it to our other HRIS systems and databases. They were all very expensive and needed a lot of IT support. So she went into Replit and vibe coded it in three days. That meant we have a system that exactly fits our use case, and it also meant we're not paying $10, $20, $30,000 a year for a piece of SaaS software. And that's happening across the board: we see companies saving hundreds of thousands of dollars replacing SaaS software with internally built software. >> Now, do you need to be someone with some technical background or know-how to be able to do this? Because I'll give you an example. I mentioned to you before we started recording, I opened a Replit account this week. I wanted to build a simple choose-your-own-adventure game. I think it was called History Havoc, where you can work your way through different history scenarios. But it just didn't get to the point where I wanted it to be. >> How long did you work on it? >> I spent about an hour on it. Not a lot of time. And, full disclosure, I'm just on your starter plan; I'm not paying yet. >> Yeah. >> But I couldn't get it to work. I also tried to build this story tracker. >> Yeah. >> And it wasn't able to crawl the web the way I hoped it would.
So it still seems, to a lot of people, like this is something that's helpful if you're technical and you want to make a prototype. But the use cases you're giving seem to be full-blown companies or working pieces of software. So explain that disconnect. >> I think it requires grit. Obviously, there's stochasticity in the machine learning models. >> Explain what that is. >> The same prompt can put you on a path of success or failure based on randomness that's happening inside the GPUs. There's a parameter in large language models called temperature, and temperature is literally how random the sampling of the words coming out of the LLM is. The way an LLM works is: you give it a piece of text and it tries to complete the next word, the next token as we call it. It generates a lot of candidates: "the red fox..." "jumped," "slept," whatever. "Jumped" is the top one, the highest-probability one, the one the model has seen occur after that sentence in millions of cases. But the sampler can randomize what it picks, and that randomization makes it more creative. There's also inherent randomization inside the Nvidia chips, the GPUs. >> Mhm. >> So this style of software is unlike classic software, where everything is a discrete input-output machine. Machine learning models have inherent randomness, and that's a feature, not a bug; it creates creativity. So some people sometimes get on a bad-luck run with the agent. Obviously, we're trying to mitigate a lot of these problems, but I would say it also requires grit. The app you just described, professional programmers coding it might take two days; on Replit you can do it in two, three, four hours, but it requires a little bit of grit.
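The temperature mechanism described here can be sketched as a toy sampler. This is an illustrative sketch, not Replit's or any lab's actual code; the logits and token names are made up for the "red fox" example above.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample one token index from raw logits.

    Low temperature sharpens the distribution (near-deterministic);
    high temperature flattens it (more random, more "creative").
    """
    rng = rng or random.Random()
    # Scale logits by temperature, then apply softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Weighted random draw: this is where the stochasticity lives.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r <= cum:
            return i
    return len(probs) - 1

# "the red fox ..." -> candidate next tokens: jumped (top), slept, ran
logits = [4.0, 1.0, 0.5]
print(sample_next_token(logits, temperature=0.01))  # near-zero temp: always index 0
```

At a temperature near zero, the top candidate ("jumped") wins essentially every time; at temperature 1 or above, the lower-probability candidates start getting picked, which is the creativity-versus-reproducibility trade-off being described.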
So it's not magic, and the technical skills we were talking about, although they're not required, you can build them up over time, and our environment surfaces some of these concepts as you work with it. So I would suggest to people: don't go into this thinking you can just write a prompt and have an application pop up at the other end. At least set aside an afternoon to give it some good effort and try to get your first app in. Once you do that, you just get addicted. >> So there's vibe coding, which is, again, you prompt, you make an app, and then you refine it with more English. And then there's AI coding, where you basically have AI complete your code, like a big autocomplete. What do you think the opportunity is in vibe coding versus AI coding, and where do you think the energy is in the AI industry today? >> I gave the analogy of the history of computing, and I think it's a very suitable analogy for a lot of what we're talking about. Early on in computing, we had the mainframes: really big, room-sized computers. IBM used to make them; large corporations, governments, and universities used them. But everyday people didn't have access to them until Apple created the Apple II, the first mass consumer market computer. Since then we've had Windows and all these devices. The mainframe was already serving the professionals' needs, but it wasn't serving the consumers' needs. Now, if you look at the market for PCs versus the professional workstations, Sun Microsystems and all of that, the PC not only was a much bigger market; eventually it subsumed the more professional-grade computing as well. And this is called disruption theory.
A lot of your audience might be into business history or theory. Clay Christensen, who was a Harvard Business School professor, wrote this book called The Innovator's Dilemma. The idea is that a lot of technologies start at the lower end, and because of their mass-market appeal they onboard a lot more users and customers, and over time they reach certain economies of scale and subsume even the upper end of the market. Currently, the upper end of the market is what you were talking about with AI coding tools, right? There are something like 30 million developers all over the world, maybe a little more now. Those are professional developers who took computer science classes in college, were trained for four or five years, and are now working at companies. If you make those developers 20, 30, 40% more productive, then, if you're a company the size of Google, you get billions of dollars' worth of productivity, right? So the market is really obvious there. You can go apply it and capture it, but it's a zero-sum market. If you look at Copilot, which is Microsoft's product and was first to market, versus Cursor, which is the more modern kind of AI coding IDE: as Cursor eats market share, you can see it's almost exactly proportional to Copilot declining in usage. That's the sign of a zero-sum market. It's very lucrative, and there's a lot more growth to be had there, but it is not the fundamental revolution we could be going through, where anyone can make software. >> Let me ask it this way. I have the stat here that Replit has multiplied its revenue by 10x in less than six months, to $100 million in annual recurring revenue. So is that growth vibe coding, or is that growth AI coding? >> Vibe coding. >> Really? >> Yeah.
>> And are these vibe coding programs, these bespoke programs that people are building with prompts, in production? Or are they mostly hobbies that people fool around with? >> It depends. The first bucket is more hobby, personal life. Second bucket, entrepreneurs: as you know, most startups die, so most startup ideas don't make it to fruition. The 10% of startups or small businesses that get off the ground get the most value out of Replit, and some of them are in production now. I've talked about a lot of these stories. For example, we have this creator, his name is John Cheney. He's a serial entrepreneur. It used to take him many months and hundreds of thousands of dollars to build applications, and now he can spin up a business and get to million-dollar run rates in a matter of weeks. Obviously he has experience; he knows the formula of what it means to be an entrepreneur. But people can learn that over time. In terms of the enterprise: for example, the CEO of Zillow recently talked at the New York Times DealBook about how everyone at Zillow is using Replit to accelerate product innovation, because product innovation no longer depends on engineers. You can have product managers do the entire iteration, getting user feedback, even without going to the engineers. So it just increases velocity. We have Duolingo and a bunch of these customers that are really focused on innovating, building their second and third products, that are now using Replit for a lot of these use cases. >> So is the use case that you build a prototype, then you get some feedback, and then, if everything works out well, you build it into the product with your core engineers? >> That's one use case. >> Okay. That's interesting. >> Yeah, that's one use case. It's really great. It rapidly improves the time to market.
The second use case is operations and internal tools. For example, Sears Home Services, a really old company, employs people who go out and fix homes. They had an operations team that wanted to build a lot of AI tools and software for their field workers, to help them manage their work and their earnings and all that. But their existing software was these decades-old COBOL programs, and the engineers were busy migrating and improving that. So the operations team started using Replit to spin up these AI applications that are deployed and used in production by those field workers every day, to manage their day and design the optimal routes to maximize their earnings. So the operations-type use cases tend to be deployed, running in production. >> Okay. So just so I'm clear: are you also facilitating AI coding, or is it mostly that you've turned Replit into a vibe coding company? >> My mission has always been about how you enable people to do this magical thing that is creating software. It's one of the most magical, exciting experiences you could ever have. I was a founding engineer at Codecademy, and before that I built open source tools to do that. Codecademy taught millions and millions of people how to code, and we changed a lot of lives. So the DNA of Replit has always been about making programming more accessible. It had more of a developer bent at some point, but Replit is a batteries-included platform: we give you the database, the authentication, the deployment, the scalability, all of that out of the box. You don't have to go anywhere else for any of it. That has always meant that the people getting the most out of it tend not to be professional programmers, although professional programmers do use it.
I would say that's about 20% of the use cases. >> And the question is, do the people using Replit then come for the people who are professional programmers? There was a funny thing that happened. I watched a talk you gave at the Semafor tech event in San Francisco a couple months ago, and I tweeted something you said: that in one year or 18 months, companies might be able to run themselves without engineers. >> And then somebody responded to me with this meme: "Founders in public: AI is writing 99% of our code; in six months we won't need any engineers. Founders in the DMs: does anyone know a good React developer? $30,000 bonus and I will name my firstborn son after you." So can you explain the disconnect between this view that engineers are going away and this still very intense demand for engineers in the market? >> I never made the point that engineers would go away. I made the point that entrepreneurs can start businesses without needing engineers, and we already see that. I meet YC companies, and Y Combinator is the most prestigious startup accelerator in the world. In the past, Y Combinator would encourage you to go get a technical co-founder. But, like we said, there are so many people with amazing ideas who don't have a technical co-founder, and they're starting to get into YC. What they tell us is: we're just going to build this thing on Replit and see how far we can get. And they often get really, really far. Now, if you're building a venture-scale company, and you want to get to hundreds of millions of dollars of revenue and become a billion, 10 billion, hundred billion dollar company, you're going to have to hire engineers. But if you're trying to build a company that creates a really great living for you, where you can potentially even get rich from it, I think we're almost at the point where you can do it on your own
without any developers. And so when I'm talking, I'm talking to our audience. >> Right. >> As opposed to Microsoft or Facebook; they're not going to replace developers anytime soon. My view on developer productivity is that developers are much more impactful than they used to be, because a single developer can be so highly leveraged these days. So yes, you want to find the best developers, and we're expanding the team. But at the scale Replit is at today, we would be 10x the number of people if we were a SaaS company five years ago. >> Wow. >> To reach $100 million in run rate five years ago, on average you would have 500 people; a lot of companies would have a thousand. >> How many do you have? >> 80. >> Wow. Okay. It just makes me wonder, as companies grow like this, what the future is going to look like from the technical side. And I'm curious: say the economy expands like this and everyone and their grandma can literally build a company using AI tools. Do the technical people then come in and clean up the problems? Are they your cleanup crew? I was reading this funny article in a publication called Futurism. It says companies that tried to save money with AI are now spending a fortune hiring people to fix its mistakes. >> And it wasn't about vibe coding; it was actually about content, like content marketing, where your content marketing plan is just filled with this bland ChatGPT-generated copy, and half the time it says "as an AI assistant, this is the message that I would use." >> Google it and you'll see so many hits. Yeah. >> So I am curious to hear your perspective: does the technical field end up becoming a cleanup crew for vibe coding gone wrong? >> Let me tell you where I think technical folks have job security today.
So I think if you're writing software for my Tesla, I don't want you to be vibe coding; I want you to write low-level, verifiable code. If you're writing code for a space shuttle, you're writing low-level, verifiable code. Those are life-or-death situations, so we don't need vibe coding there; we need more precision. But even for large-scale platforms, if you're building a core cloud component, the storage or virtual machine components on AWS or Google Cloud or Azure, you want systems engineers who understand distributed systems and how to create failsafe systems at scale. So I think engineers there have job security for the foreseeable future, because given the stochasticity of these models, you need every line of code to be reviewed and managed very carefully. Now, where I think AI is going to have the most impact is on product: people building products want to iterate really quickly. And internal tools: people want to replace all the mess of SaaS software we have today. So I think that's happening. Now, in terms of the cleanup, it depends on where you think AI is headed. Do you think AI is good at making software but bad at maintaining it, and that it's going to stay bad at maintaining it for the foreseeable future? If it's good at making software, it must also be good at refactoring software, or testing software, right? Actually, right now it's pretty bad at testing software, because of this thing called reward hacking. When you do reinforcement learning over large models, you're giving the model a reward every time it does the right thing, and the models become incredibly goal-focused. They want to get the task done, right? That's what RL does.
And oftentimes, what we see when we try to get the models to test things is that they start cheating, in a way. A model will change the test to fit the mistakes it made, or sometimes delete the tests. It's really fascinating behavior, and Anthropic has actually published research on it. But do you believe that's going to be the case forever? Obviously not. I think over the next three or six months we're going to see machine learning models become able to test and verify their own work. >> Okay. So one of the biggest things this moment depends on is affordable large language models coming from the foundation model companies. In layman's terms: if you're going to build with AI code, you have to be able to bring in models from an OpenAI or an Anthropic that will generate that code without breaking the bank. And we're still in this VC-funded, private-market-investment moment where we don't really know the true cost of these models. >> Meaning the foundation model companies might be losing money on those. And they are. And the application companies, I don't think they are, on a gross margin basis. I don't think they are. >> Right. But they're also training, and that's a lot of money. So they're not profitable; they're losing billions a year. >> Of course. Yeah. >> And there's been this thing that happened recently, I just want to run it by you, with both Replit and Cursor, where I think end users have seen pricing go up. Ed Zitron wrote about this, and I think it's a pretty good piece, talking about effort-based pricing within Replit, which is effectively a different pricing structure. We've seen Replit users talk about the fact that they're actually paying a lot more for the same services than they were previously.
And his theory is that OpenAI and Anthropic found quiet ways to jack up their prices for startups, and we're beginning to see the consequences, because Cursor had something similar happen. >> Per Zitron. >> Oh, okay. >> So is that what's going on? >> No. The prices haven't gone down, and that's the problem. We've seen token prices come down 99% since ChatGPT, and we've seen token prices come down year over year. The thing that's a little disturbing right now is that token prices are not coming down anymore. You'd better believe that the unit economics of the labs are getting better, because of economies of scale and because these models are getting easier to optimize, but they're not reducing prices. And so the concerning questions are: are we reaching a steady state? Is there price collusion? Is there now an oligopoly of a few model companies able to create these state-of-the-art models, with no downward pricing pressure? Are investors starting to demand better business fundamentals? I don't know exactly what's happening. We should talk about the Chinese open source models in a second, because I think that will introduce an interesting mix into this. But it certainly is the case that we're not seeing token prices go down. Now, the main reason we went to effort-based pricing: let me explain effort-based pricing. When we released Replit Agent V1, version one of the agent would work for about two minutes at a time. You would give it a message, it would go try to do something for two minutes, either succeed or fail, give you a checkpoint, commit the source code, and charge you 25 cents. And the reason it only worked for two minutes is that the capabilities of the models meant it could only work for that long.
Now, models got better. We knew models were going to get better and be able to work for 10 or 15 minutes. And so with version two of Replit Agent, which started in beta in February and came out of beta in April, the model would work for 10 minutes, and we couldn't charge 25 cents for 10 minutes. So we came up with a heuristic: every nine tool calls, we do a checkpoint. As it's working, you'll see it make checkpoint after checkpoint. That's a hack, right? It often means that if you make a small change that costs us 5 cents or whatever, you still get charged 25 cents. But also, if you make a big change, you might be costing us a lot more than what we charge you. So it was really out of whack. That was a hack, and we needed to move to a place where we charge the user proportionally to how much the model is working and what it costs us. We think that's the best way to create a long-term sustainable business. And when those two things are aligned, it also opens up new opportunities: when we do optimizations, and we're always optimizing (we actually had about a 20% optimization on cost recently), we pass it straight to the user, because now cost and price track each other. Now, what happened with our community? The first thing that happened is there was sticker shock. You're used to seeing 25 cents every nine tool calls, and suddenly you're seeing $15, or $2, after 15 minutes of work. So that's one. Two, it's true that for some really advanced users the costs have gone up, because their projects are bigger, the context size is bigger, their workloads are bigger. But early on in a project, it's actually cheaper. You mentioned that you worked for an hour; you didn't have to sign up for the paid core plan. We give free users $3. So you worked for an hour on $3.
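The two pricing schemes being contrasted here can be caricatured in a few lines. The 25-cent and nine-tool-call figures come from the conversation; the cost numbers and the 1.2 margin are made-up illustrative values, not Replit's actual formula.

```python
def flat_checkpoint_price(tool_calls, price_per_checkpoint=0.25, calls_per_checkpoint=9):
    """The V2-era heuristic described above: a checkpoint (and a flat
    charge) every nine tool calls, regardless of actual serving cost."""
    checkpoints = max(1, -(-tool_calls // calls_per_checkpoint))  # ceiling division
    return checkpoints * price_per_checkpoint

def effort_based_price(model_cost, margin=1.2):
    """Effort-based pricing sketch: charge proportionally to what the
    work actually cost. The 1.2 margin is an invented example number."""
    return round(model_cost * margin, 2)

# Small change: suppose it costs ~5 cents to serve, but the flat heuristic bills 25c.
print(flat_checkpoint_price(3))    # 0.25
print(effort_based_price(0.05))    # 0.06
# Big 15-minute run: flat pricing undercharges relative to a $2.00 serving cost.
print(flat_checkpoint_price(27))   # 0.75
print(effort_based_price(2.00))    # 2.4
```

The toy numbers show the "out of whack" point: flat checkpoints overcharge small changes and undercharge big ones, while effort-based pricing keeps price and cost moving together.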
>> Not bad. >> Yes. >> It's cheaper than a developer. >> It's cheaper than a developer, for sure. That being said, we recognize that for advanced users there's almost a tax as you go on. So we're trying to optimize the context window and make sure advanced users are not getting a more expensive experience. The other thing that happened is we introduced thinking mode, reasoning mode, and a high-power mode. People were enabling those, and sometimes forgetting they were enabled, and there's something like a 5x multiplier on cost. So a lot of people were enabling those and getting these large checkpoints. We've now started to hide those under "advanced": don't enable them unless you know what you're doing and want more power. And we put out content, a video, documentation, blog posts: here's when to use reasoning mode, and you shouldn't always have it on. Just to describe everything that's happening: there's a macro trend in the application space where a lot of companies were subsidizing costs, paying more money to Anthropic and OpenAI than they were making. >> Were you doing that? >> On V1, no. On V2, yes, because the pricing model was out of whack with how we were charging. Actually, the median cost per checkpoint only went up a little bit. On the lower end, we're charging users less right now. It used to be that on the lower end we charged users more, and on the upper end we charged users less. Now it's more proportional, more fair for both. And so now we have solid business fundamentals that allow us to grow. And I've been talking about how Replit has been my mission, my passion: eight or nine years as a company, 15 years as a side project and a vision.
And we're not trying to rapidly expand revenue while losing money in order to flip this company and sell it — you know, we've seen all these acquisitions — or to raise the next big round. We're really trying to build a business for the long term. And Replit is made of all these different components: we have costs not just on AI but on traditional compute — CPUs, storage, databases, all of that. So, to summarize, I've talked a lot about what was happening specifically at Replit. I don't know what's happening at Cursor. I think their situation is a little different — I think they actually did raise prices; you should talk to them — but it's a little different dynamic than what happened at Replit. To sum up: there is a concerning trend where token prices are not going down. Is that going to be the case for the future? Because that sucks — we want to be able to use more tokens to create more intelligence, to create better applications for users. Is that going to be the trend forever? Are we reaching a steady state? In cloud, for example, we kind of reached a steady state. When you have a monopoly, there's no pricing pressure. But when you have an oligopoly, they — not intentionally, without talking — start colluding, because it's a market dynamic: if you don't lower your price, I'm not going to lower my price. It's not in our interest as a whole, because we each own 25% of the market, right? >> Okay. I do want to ask you about something you didn't mention when you looked at the different factors for why prices might not be going down. There might be investor pressure. There might be this equilibrium that's been reached.
Or is it possible that these models have just gotten so big and expensive to run that the fundamental economics of AI are just not working? Explain why. >> You can surmise the bigness of the models based on speed — token throughput. It's not perfect, but if you remember GPT-4.5 — GPT-4.5 was an experimental model from OpenAI. The idea was: let's train a trillion-parameter dense model, meaning it is not sparse, meaning all the neurons are activated on every request. And it was so slow; it's really hard to run these things. The new models, even when they're big, are sparse models — they're called mixture of experts. On every request there's a router layer that takes it to the expert part of the circuit in order to answer that question. So there are models with a trillion parameters, but on any given request only 32 billion are active — and that's a kind of small model. And from what we're seeing, based on speed and things like that, the models are probably getting more efficient. I mean, DeepSeek showed that the models are getting more efficient, and if DeepSeek, open source, was able to do it, you'd better believe the labs are also making theirs more efficient. >> Okay, I do want to speak with you about DeepSeek and Kimi K2 and other Chinese models. So let's do that when we come back from the break, right after this. >> Cool. >> And we're back here on Big Technology Podcast with Amjad Masad, the CEO of Replit, talking about all things AI, code, and vibe coding — and now let's talk about these Chinese models. This episode will air a couple of weeks after the emergence of Kimi K2, another Chinese model. And of course the DeepSeek moment was a big moment, where we found out that this seemingly small hedge fund in China with some GPUs was able to engineer a more efficient model. What actually happened there will be debated for a long time.
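The mixture-of-experts routing he describes — a router layer picks a small subset of experts per request, so only a fraction of the total parameters are active — can be sketched as a toy in plain Python. This is a bare illustration of the routing idea, not any lab's actual architecture.

```python
import math
import random

# Toy mixture-of-experts router: the model holds many "experts,"
# but each request activates only top_k of them, so the active
# parameter count per request stays small even if the total is huge.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(router_scores, top_k=2):
    """Pick the top_k experts by router score and renormalize their
    weights; every other expert stays inactive (implicit weight 0)."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

# 8 experts total, but any single request only "pays for" 2 of them —
# the same shape as a trillion-parameter model with ~32B active.
random.seed(0)
scores = [random.gauss(0, 1) for _ in range(8)]
active = route(scores, top_k=2)
```

The ratio of `top_k` to the total expert count is what makes a huge sparse model cheap to serve relative to a dense one of the same size.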
>> But let me ask you one DeepSeek-influence question and then we'll get into the others and Kimi K2. You mentioned before the break that Western models have taken after DeepSeek. Do you think they learned what DeepSeek did and put those innovations into play in their own models, or was that coming anyway? >> From what we've seen from the Twitter sphere — researchers just talk a lot — it seemed like there were some surprises. It seems like there were some fundamental innovations in the DeepSeek models that weren't known in the West. >> But have they implemented those now? And that's probably why we're getting more models like— >> Yes, I'm sure the models are getting more powerful without getting slower. >> All right. So tell me about Kimi K2. >> When Anthropic came out with Claude 3.5 Sonnet, that was a fundamental shift in the industry: the models got a lot better at coding, and suddenly, instead of making small snippets of change, Sonnet could generate entire files. That enabled things like Cursor Composer — it was the start of vibe coding, where you can put in a prompt and generate entire files, or generate large edits. Then Sonnet 3.5 v2 — the computer-use model — was the first model where you could sense true agentic behavior. I don't know what they did; they cracked RL, whatever happened there. You can give the model a VM — >> virtual machine >> — a virtual machine, you can give it an objective, and it can sleuth around in the virtual machine, look at the files, run some commands, then write a program, test it, and solve a problem. There's a benchmark called SWE-bench — software engineering bench — and you start seeing the score going up dramatically. I think we were at like 10% last year, and now we're at like 70%, 80%.
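The agentic behavior he's describing — give the model a VM and an objective, let it inspect files, run commands, and iterate until the task is done — reduces to a loop like the one below. This is a bare-bones sketch; the tool interface and stop condition are assumptions, not any vendor's actual agent, and the "model" here is a scripted stand-in rather than a real LLM call.

```python
import subprocess

def run(cmd):
    """Execute a shell command in the sandbox and capture its output."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr

def agent_loop(objective, propose_action, max_steps=20):
    """Minimal agent loop: the model proposes shell commands, observes
    their output, and stops when it declares the objective met.

    `propose_action(objective, history)` stands in for the LLM call;
    it returns ("run", cmd) or ("done", summary).
    """
    history = []
    for _ in range(max_steps):
        kind, payload = propose_action(objective, history)
        if kind == "done":
            return payload
        code, output = run(payload)
        history.append((payload, code, output))  # observation fed back in
    return None  # gave up: ran out of steps

# A scripted stand-in "model" that runs one command, then finishes.
def scripted_model(objective, history):
    if not history:
        return ("run", "echo hello-from-vm")
    return ("done", f"finished after {len(history)} command(s)")

result = agent_loop("say hello", scripted_model)
```

SWE-bench essentially scores how often this loop, driven by a real model, converges to a patch that passes the repo's tests.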
80%. >> World-class coding. >> The interesting thing about SWE-bench is that it's not just coding — there are other benchmarks that just do code generation. The harder thing about SWE-bench is the agentic workflow: writing the code, testing it, running commands, finding files, understanding files. And this stuff was a huge jump that happened with Sonnet 3.5 v2, then 3.7, then 4.0. Kudos to Anthropic — they've been able to create a lead that hasn't been bridged by the other labs. Gemini is getting there on the agentic stuff, but I would say OpenAI has kind of lagged behind. o3 has some interesting agentic capabilities, especially around deep research, but it hasn't been as good as the other models on this agentic stuff. They did some interesting stuff with Codex — I don't know if those models are in the API — but everyone is using Claude for the agentic coding experience. The interesting thing about Kimi K2, I would say, is that they caught up — not to Claude Sonnet 4.0, but perhaps Claude Sonnet 3.7, at least that's the vibes right now — before the other labs did. >> Wow. You know, I think that's really underreported. >> Again, this is vibes; everyone's still trying to figure it out. But it looks like it has a really good SWE-bench score. It's doing 65 on SWE-bench; Sonnet is 72. If you do sampling — where for every problem you ask the model to generate some number of solutions — you can get up to 72%. It can be competitive with Sonnet. >> And this is with export controls. >> Yes. And in the paper they talk about the solution: scaling reinforcement learning. We also saw that with Grok 4. Grok 4 spent as much on reinforcement learning as on pre-training, which is unheard of.
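The sampling trick he mentions — draw several candidate solutions per problem and keep one that passes the tests, which is how a 65% model can reach Sonnet-level numbers — is best-of-n selection. A minimal sketch, with a toy stochastic stand-in for the model:

```python
import random

def best_of_n(generate, passes_tests, n=8):
    """Best-of-n sampling: draw up to n candidate patches from the
    model and return the first one that passes the test suite.
    `generate` stands in for a stochastic model call."""
    for _ in range(n):
        candidate = generate()
        if passes_tests(candidate):
            return candidate
    return None  # no sample passed

# Toy stand-in: each sample independently "solves" the task with
# probability 0.4, so best-of-8 succeeds with chance 1 - 0.6**8 ≈ 98%.
random.seed(1)

def fake_model():
    return "good-patch" if random.random() < 0.4 else "bad-patch"

picked = best_of_n(fake_model, lambda c: c == "good-patch", n=8)
```

The catch is cost: n samples means roughly n times the tokens per problem — which is exactly why a cheap near-frontier model can afford this where an expensive frontier one cannot.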
>> But that's an important point, because with that big spend on reinforcement learning, Grok is a competitive model — but they spent billions, hundreds of millions, on RL — >> I don't know — >> — which is this goal-setting form of training, and it's not like it's in a new category. So it shows there are some limits. >> xAI is an amazing team, and they've been able to achieve so much in so little time, but it's also well known in the industry that they're compute-inefficient — that they're throwing compute at the problem in many ways. Yeah. >> So what is the significance that Kimi K2 is now as good as some of these Anthropic models? >> A small research lab — I think the rumor is like 200 people, and again, there's export controls as well — was able to figure out how to catch up to near-state-of-the-art agentic coding models before the big Western labs that are highly capitalized, with a lot more researchers, could. >> And does that mean they can undercut them on price? >> So let's see. >> Right. So let's see. >> Are you going to integrate Kimi K2? >> We're looking at it. We're looking at it. There's a lot of— So far we're impressed. >> Okay. >> So far we're very impressed. Look, these things sometimes overfit to certain things, and I would say it takes a month for the entire community to really reach consensus on whether a model is really great. Similarly with Grok 4 — a lot of people are playing with it. But my sense is that it is good enough, and the economics are so good that you can expend more tokens to get more intelligence. It is not at the frontier, but it is near frontier, and given that it's cheap and fast enough, you can spend more tokens. That creates some interesting potential for us to create new capabilities in our platform, because it is cheap and fast.
>> How much cheaper is it than the Anthropic models? >> I am bad at this, but I would say maybe a quarter of the price. >> Oh yeah — and that's on the official API. >> Perhaps even more. I forgot; maybe you can look it up after the show. >> This show will come out a couple of weeks after we record, but we'll have to release this segment early, because that's— >> Yeah. >> —astonishing. One more question about Anthropic. I can vibe code in Claude, and do it all the time, and they also have this Claude Code product where people are writing prompts and getting code. Are they your competitor long term? How do you see them on that front? Because the question is: eventually, do the labs just subsume everything that's built on top of them? >> I think that question is for them, right? I know you're going to talk to Dario — you should ask him. >> Listeners, viewers: this will air a week after Dario, but after this I'm about to go in and speak with him. >> So you might see this question a week earlier. Look, we're committed to our relationship with Anthropic. They're a great company to work with; we have a great partnership. And it's not like we didn't anticipate them wanting to build products in addition to the models — every model company is building products right now. The thing they're going to have to manage is their pricing. If they're going to compete by undercutting everyone on price, they're going to destroy the ecosystem, right? I think Replit right now has the advantage of this platform we built over eight years — it would take a lot of blood, sweat, and tears to replicate — and also a user experience focused on the non-technical user. We really care about this idea of empowerment.
Right now, Claude Code is used by developers and loved by developers, and I think they're competing head-to-head with Cursor, Windsurf, and those kinds of products. Whether they're going to move into our space — again, you should ask them about that. But a more interesting question is how they want to nurture the ecosystem, versus just going in — because they can compete on price, they can steamroll everyone. >> Right. I mean, Claude Code's max package is $200 a month, and you see developers getting thousands of dollars of API value out of that. You must notice this. >> Yeah — not good for the ecosystem. >> I don't think so. >> Why? >> Because again, you're competing on price, not on how good the product is. And there's a price at which maybe the quality doesn't matter as much as how many tokens I'm getting. Claude Code is a really good product, but then Cursor — no matter how good they make the product — is still going to be more expensive and at a disadvantage. And people are like, well, I really like Cursor, but I can get 10x more value out of Claude Code, so the marginal gain in product quality won't matter as much— >> Right. >> —and that will destroy the ecosystem. >> Fascinating. I think this question is just one version of a big question we're going to keep asking as these AI models get bigger and better and more intelligent. So I want to spend the rest of our time on some philosophical questions, if that's okay with you. >> Sure. >> There's this idea that the AI research houses want to use the code they generate — or these coding applications — to speed up the development of the next model and compress the time it takes to get better models. People call it an intelligence explosion, or things of that nature.
Do you see that as feasible, and is that something we should want? >> You should think about what the limiting factors are for the next version of a model. What are the bottlenecks? Where does that innovation need to happen? I can think of a few areas. One is research — algorithmic research, figuring out the next improvement in training algorithms and inference algorithms, whatever it is. And then systems engineering: these training runs are massive, and that requires a lot of interesting distributed systems engineering. Will AI coding help with AI research? On the margins, perhaps — it can spin up Python notebooks faster. I don't think it's that impactful; the models can't do AI research, can't come up with ideas and test them really quickly. Will it help with distributed systems? Perhaps. It is not as impactful right now on writing Rust or C or Go as it is on JavaScript and Python and higher-level languages. And like I said, that work requires a little more precision and better system design, and the bottleneck to really good distributed systems is design, not the volume of code you can generate — which is more true on the product side. On the product side, you need to generate tons of CSS and JavaScript, try a lot of things, delete a lot of things, iterate, do A/B tests, all of that. So volume of code is important there. On the backend, in distributed systems, I don't think volume of code is. So I'm reasoning in real time now, and I guess my answer would be: I don't think it's going to have anything more than a marginal improvement on speed to the next model. >> All right. I guess that makes me rest a little easier, then. By the way — you speak with a lot of people in the AI industry.
Of all the economic activity in the AI industry today, how much of it do you think is code? >> Someone actually made a slide that's been going around, and I think it was something like $1.1 billion of ARR in the AI coding and vibe coding space. >> Okay. So it's actually kind of small compared to the total— >> Well, so, revenue. >> Yeah. So Anthropic has $4 billion— >> Right. >> —let's say $4 billion ARR. They also have their own products, their own coding products. I don't know — let's say $1.5 billion of that is AI coding. It's substantial, but it is not the entire thing. >> Okay. >> But then you have $10 billion of ARR on the OpenAI side, and that's more consumer. >> Now, on the rush to artificial general intelligence, which we've talked a little bit about: do you think Silicon Valley should be the one that possesses this, that controls it? I mean, it's an interesting place — there are a lot of kooky ideas here — and it seems like if this is possible, it's going to be something controlled or owned by one or more of the labs here. Is that good? >> Assuming it'll happen, and assuming one company will reach it first and have some kind of advantage or monopoly over AGI — I'm not entirely sure I agree with those assumptions, but if you want me to make them and then answer the question, I'd be happy to. I just want to make that clear. >> Yeah, let's make those assumptions. >> Okay. I know there's a lot that needs to happen in order to get there. I might have some fundamental disagreements with these assumptions, but— >> Talk through the disagreements. >> I don't think AGI is a point in time, for one. And right now the distance between any two labs is just on the order of a few months on anything that really matters. Between o1-preview and DeepSeek it was like two, three months.
The biggest gap was this Kimi K2 one we just talked about — maybe nine months or something like that, but still under a year. So whoever reaches AGI first is not going to go into an intelligence explosion where suddenly superintelligence gets born — other labs will catch up really quickly, and then there will be a lot of models. I don't think it's going to look that different from the ecosystem we have today. And if you assume AGI will actually have an impact on model development, through research and speed of development, then everyone will get that benefit as well — so you might actually get even more competition once you have AGI. So I don't think it's going to be a monolith. >> Okay, but if it is? >> Okay, if it is — would I want Silicon Valley to… I guess it's like a moral— >> Yeah, a philosophical question. >> —philosophical question. I wouldn't want any human being to control it. We're all fallible. That's why markets work. That's how human society evolved over time: Darwinian evolution and free-market capitalism, all based on competition. And the idea that one system would be this monolith controlled by one human being — we've seen disasters and massive human suffering happen when there's this top-down, leviathan-type thing, whether in Soviet Russia with all the deaths that happened there, or in China, or wherever. And oftentimes — as I understand it, in the Soviet era they had this kooky idea about evolution. What was it called? Lysenkoism, or something. >> I'm not familiar, but I'd love to hear the explanation. >> Basically, they thought evolution was a bourgeois idea. Communism has this notion that anything high-class, bourgeois, is wrong.
And so they had this ideological view of how evolution works, or should work, that led them to do agriculture the wrong way, which led to famine and that sort of thing. So these systems often kill people and cause mass suffering and mass poverty even when they don't intend to. Even outside the gulags and the other explicitly oppressive apparatus, those systems are inefficient, because they hold these wrong ideas and there's no competitive pressure to produce better ones. That's a fundamentally broken, static system that doesn't improve the way competitive systems do. And I think if we have a superintelligent monolith controlled by a single company or a single human being — it's bad. It's fundamentally really bad. >> I agree. All right, last question for you. We're seeing a lot more AI love bots come out. Is it a good thing or a bad thing that people are going to fall in love with AI more often? >> It's a bad thing — like, a priori a bad thing. The reason humanity grew and flourished and all of that is because we have babies, and anything that takes away from that — especially given that the fertility rate is so low right now — will potentially lead to really massive problems. Especially since capitalism is based on large middle-class consumerism: the current instantiation of how the economy works requires taxpayers to fund Social Security and elder care and all of that. The welfare state is based on a large young population. And when that starts to collapse, you're going to have massive instability in these systems. So even if humanity doesn't go extinct, as Elon would say — although Elon is the first person to create a really interesting mass-market companion, I think, right now. >> "Interesting" is a fun word for it.
>> It looks like it's really compelling. I see people talking about it on X so much right now, and it looks— >> It's got some work. Yeah. But these types of things are definitely going to become real partners to people. People have gotten married to them even when this technology was bad or hardly workable— >> Right. >> —before LLMs. So it's going to happen again, and in greater numbers. >> Here's the question. I used to do more creative writing, and I wrote this essay on the hyperreal. French postmodernist theorists like Baudrillard wrote about this concept of the hyperreal. The idea is: we have reality — you and I are interacting right now — and then you have media-created realities, and sometimes they're hyperreal: more intense than reality itself, more enticing than reality itself. Even with real things — for example, when you eat a Twinkie or something like that, a fatty, salty, sweet kind of snack, it is not like a piece of chicken or beef. It's this hyperreal thing: it hyper-engages your senses, and it makes you addicted to it. Similarly, social media is hyperreal in the sense that I can go there and get a lot of social interaction — tweet something, get hundreds of likes — and it's much easier than going out into the wild and finding a hundred people who might like me, right? And so we have these technologies, and a market around them that's bootstrapped to make us addicted, because they're so much more enticing and lower-effort than the reality we know and experience day to day. And I think that is a huge danger for the existence and evolution and longevity of human civilization.
And I've talked about how good free markets are, how important competition is — but this is one thing capitalism is really adversarial to humans on, and I don't have a solution for it. I think in the past the solution was religion. For example, in Islam you can't depict humans or animals in art. That's why Islamic art became more geometric — if you visit the mosques, they have all this geometry, or calligraphy, that's really interesting. And I think part of the idea there is — the ultimate expression of something so enticing is a virtual being, like what we're seeing right now. I'm not saying Islam had the foresight or whatever, but religions used to have this built-in mechanism to protect against these predatory consumer products. I wouldn't know how to solve it in the future, but perhaps the protection is societal, maybe governmental — I'm always kind of skeptical of that — or religious. >> We're going to need something. >> Yeah. >> So, Lord help us. Amjad, great to see you. Thanks so much for coming on the show. >> My pleasure. >> All right, everybody. Thank you so much for listening and watching. We'll be back on Friday to break down the week's news. Until then, we'll see you next time on Big Technology Podcast.