CI/CD Is Dead, Agents Need Continuous Compute and Computers — Hugo Santos and Madison Faulkner
Channel: aiDotEngineer
Published at: 2026-05-13
YouTube video id: VktrqzQgytY
Source: https://www.youtube.com/watch?v=VktrqzQgytY
[music] >> All right. Can you all hear me? Great. There we go. All right, well, we're only 35 to 50 minutes late, but thank you for sticking around. We're going to talk about why CI/CD is dead, and we're going to propose that continuous compute is going to be the next thing. Maybe. All right. So, quick introduction. We're going to have two speakers; one's getting mic'd up. My name's Madison, and I'm a partner at NEA investing in technology. I focus on infra and dev tools, and I was formerly a Meta AI researcher, where I led data and AI teams. I got really frustrated by the state of infrastructure, so I jumped into venture to do something about it from the top down. And I'm also going to introduce my partner here, Hugo Santos. He's the CEO of Namespace, which is building high-performance compute infrastructure, and at this point, what we believe is going to eclipse the new CI/CD wave. He also formerly led microservices at Google. Yeah, great to be here with you folks. So we're going to talk about why agentic software is breaking traditional CI/CD. Obviously, we're not going to get through all of this today, but the point is: on the left side, what started off in agentic software was really monolithic agents; we were using the LLM as one engine. Now we're moving to the right side, which is microservices with agents, and that's how we really need to think about software development in an agentic world. So the life cycle is very fragmented. This is quite a mess, right? We've brought together all these traditional CI/CD systems, build, test, deploy, DevOps, but we also now have new IDEs, we have autonomous agentic engineering solutions, and then we have our traditional DevOps in the middle, which we believe is really going to see innovation in the next year. So let's explain why we think it's dead. First, how do CI/CD pipelines work today?
Well, we all know human developers are currently submitting one, maybe a couple of diffs when they're writing them themselves. Those PRs then take your colleagues a bunch of time to review. Then you have to go through GitHub Actions and run build, test, and deploy steps. And finally, you're addressing those failed test cases and maybe iterating on the diff. In that scenario, it was really just one or two a week. So now, how do we think about this at agent scale? You've got agents using the exact same systems, but they have N PRs, maybe across N repos. It still takes a similar amount of time to verify, unless you're using review bots, which gets a little crazy. And then we correct those failed cases just like we did in the past scenario. So what ends up happening with a human? Pretty predictable, and you've got local caches, which are often warm. With an agent, this starts to get really complicated. You have thousands of short-lived branches, all trying to pull the same codebase in a few different directions. You get to a point where merging all these different versions together is really impossible. And that is where we start to have a huge problem. So let's look at it in real time: GitHub activity has gotten absolutely crazy. The white line here is the actual number of commits in the last couple of months, and then the number of lines added versus deleted. I mean, this is just an unbelievable spike. So how do we start replacing CI/CD? Well, the starting point should be acceleration. Right now, I know a lot of you are struggling with very slow build, test, and deploy times in your CI/CD solutions. This is a very common problem. But where we're headed is first speeding that up by layering over the existing GitHub Actions and the other underlying infrastructure for CI/CD. So the cache really becomes the orchestration layer in this scenario.
And this is really critical to do through hardware and software co-design. So what does this start to look like, and how does it start to eclipse previous CI/CD? First, we have our intake, which requires ingress shaping and rate limiting. Then we move to our cache, and this is the next big step: how do we think about orchestrating and making sure we're routing to the right infrastructure? From there we can even move into agentic identity for software and think about retries at scale. And if you don't believe me, let's ask the experts. Mitchell Hashimoto, one of the coolest devrels at scale and the former founder of HashiCorp, wrote exactly what he would do to fix GitHub today. A lot of this has to do with even shutting down Copilot, and with how you actually evolve GitHub to be, first, built for the cloud era, but second, really enabling inference at scale. And we've got a number of other data points on the left-hand side: we need to be able to serve AI and agentic users first, or we die. >> [laughter] >> And to think about friendly code-storage solutions that may also help. So there's a lot of frustration around existing CI/CD, but this is really just the starting point. We've only just started to see agentic software take over. So now I'm going to pass it to Hugo to talk more about what a real solution can look like. Yeah, so I'm fortunate, and my team and I spend a lot of time with companies today that are going from what traditional CI/CD looks like to what we think it's going to look like in the future. And to give a little bit of a hint: it's agents all the way down. We work with companies like Fal and Zed and Ramp and many others that are really at the forefront of everything around development.
And you probably recognize yourselves somewhere between these two pictures. Up to about six months ago, humans were writing all the code, very slowly, some of them actually fairly quickly, but in hindsight fairly slowly. We packaged all these changes in PRs. We did validation as part of those PRs. And behind the scenes, the machines were a little bit slow, but all of that was hidden behind the human latency. Many of you might already be seeing a bit of what's happening today, where code generation is very cheap, work is much more continuous, and that kind of forces the evaluation to go into the inner loop. What you might not realize is that up to this point, you as a human, you are the agent. You have a goal in mind: here's what I'm trying to accomplish. And then you go through all these phases. Okay, I submit a pull request. On the pull request, your team says, well, you didn't quite follow the right format, so go back to the beginning. You're in a loop. Now your changes are in the PR, the tests are running, they fail, you need to go and change something in the code: you're back in the loop. A human reviewer comes back and says, well, you know, you didn't quite use the right API, please go and change it: you're back in the loop. And then when you go to merge your code and you're finally done, the merge queue says, well, another colleague managed to get some code in ahead of you, and you have to go back in the loop. At human scale, this opportunity to merge, the time from when you're working on the code until the code goes into the repository, can be large, because there are only so many changes happening at the same time. But as you accelerate, this opportunity to merge becomes really, really important, because the rate of change increases dramatically.
So we talked a little bit about how the PR is kind of used as the unit of work. And that was really designed for human review: it expects a bit of delayed feedback, and it expects these discrete handoffs where you send it over to the reviewer and then it comes back. CI matters because it's validating the work that you're doing. It's answering things like: are you introducing a regression? Are you compiling and building your code from a well-known source? Are there other changes going on that would conflict with this change? Is this change allowed? All of that is part of this automated validation process. Human reviewers are overwhelmed; you've heard this many times, I don't have to repeat it. And the interesting thing is that the act of merging is starting to look a lot like a high-performance database problem, where you have serialization and a single ledger: every single change needs to go in, and you have to lock the database in order to be able to commit. The time you can afford to hold the lock when there are humans is large, but when there are machines, it's short. So the time to merge really matters. We need a new architecture. This is already how our team is working today, and how we see a lot of the companies at the forefront working today already. There are no PRs. We start with intent and plan: this is what we want to achieve, and we codify it. That's the spec. Someone writes it down. It might be in a Linear ticket. It might be on Slack. It's somewhere; somewhere you have written down the goal, what you are trying to achieve. That goes into a loop, and this loop is a typical agent harness. It might be Claude Code; we're big Amp fans, so in our case it's often Amp. It might be Cursor. It might be Factory.
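The "merging as a serialized ledger" analogy can be made concrete with a toy model: every change must validate against the current head while holding a single lock, so total throughput is bounded by how long validation holds that lock, which is exactly why time to merge matters. This is a minimal sketch with illustrative names, not any real merge-queue implementation:

```python
import threading

class MergeQueue:
    """Toy model of a merge queue as a serialized ledger.

    Every change must validate against the current head while holding
    the lock, so throughput is bounded by validation time under lock,
    the "time to merge" the talk describes. Names are illustrative.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.ledger = []  # committed changes, in serial order
        self.head = 0     # monotonically increasing "commit id"

    def try_merge(self, change, validate):
        # The ledger is a single serialization point: only one change
        # can be validated-against-head and committed at a time.
        with self._lock:
            if not validate(change, self.head):
                return False  # failed against current head; caller loops again
            self.head += 1
            self.ledger.append((self.head, change))
            return True

# Example: two changes serialize through the same lock.
q = MergeQueue()
always_ok = lambda change, head: True
q.try_merge("feature-a", always_ok)
q.try_merge("feature-b", always_ok)
print(q.ledger)  # [(1, 'feature-a'), (2, 'feature-b')]
```

With humans, validation under the lock can take minutes and nobody notices; with thousands of agent-generated changes, that critical section becomes the bottleneck for the whole system.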
You go into a loop, and here the agent will check out your code and start moving toward implementing your plan. Very importantly, it already makes use of some of these invariants: it checks out a well-known commit, so it doesn't just start from anything, for example. Then, what is internal validation? The agent uses the assets that exist in the repository to validate that the change is correct. So it builds it. It tests it. Then it comes back and tells you as a human, "Hey, I just finished. Does it look good? Should I change something else?" And you say yes, or you say continue. Continue is probably the word we use the most nowadays. And then it just goes back and continues through the plan. Eventually you're done, you go into the merge queue, and it goes into the ledger. So your Git repository is kind of like a ledger. This is fast, but it's not fast enough, because in this external validation you still have a human in the loop. So where do we think we're moving? And this is in the span of weeks to months, not years. It's a world where generating code becomes much faster. It's already fast, but inference will only get faster. Internal validation, so running your builds and tests, needs to be extremely fast as well. You cannot go and spend 15 minutes running your tests, or 45 minutes, or any number of minutes, because you are delaying the whole loop. And external validation no longer has humans: we have other agents evaluating the changes. So you may have a security-focused LLM; you may have an API-conformance-focused LLM that is providing feedback within the loop, which your main harness then incorporates back into the code. And it's doing this very quickly.
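The loop described here, check out a well-known commit, generate, validate internally with the repo's own builds and tests, validate externally with reviewer agents instead of a human, then merge, can be sketched as a harness skeleton. Every callable below is an illustrative stand-in (this is not the API of any real harness like Amp or Claude Code):

```python
def agent_loop(plan, checkout, generate, build_and_test, reviewers, merge):
    """Sketch of the talk's inner loop; all callables are hypothetical
    stand-ins for a real agent harness."""
    workspace = checkout()  # invariant: always start from a well-known commit
    while True:
        change = generate(plan, workspace)        # code-generation step
        if not build_and_test(change):            # internal validation
            plan = plan + " (fix failing tests)"  # feed the failure back in
            continue
        # External validation: reviewer agents (e.g. a security-focused
        # LLM) return None when satisfied, or feedback to incorporate.
        feedback = [r(change) for r in reviewers]
        if any(f is not None for f in feedback):
            plan = plan + " " + "; ".join(f for f in feedback if f)
            continue
        return merge(change)                      # hand off to the merge queue

# Example run with trivial stand-ins: the first attempt fails internal
# validation, the second passes and merges.
calls = {"gen": 0}
def generate(plan, workspace):
    calls["gen"] += 1
    return f"change-{calls['gen']}"

result = agent_loop(
    plan="add endpoint",
    checkout=lambda: "main@a1b2c3",
    generate=generate,
    build_and_test=lambda c: c != "change-1",  # first attempt fails
    reviewers=[lambda c: None],                # reviewer agent has no objections
    merge=lambda c: ("merged", c),
)
print(result)  # ('merged', 'change-2')
```

The key property is that validation is inside the loop on every iteration, not a separate downstream CI phase, which is why build and test latency directly gates the whole cycle.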
And in order to do this very quickly, it actually needs to be running in a stateful environment. Memory is important; state is important. Because if you're starting things from scratch all the time, you're just going to delay things even further. So the statefulness of this loop is really important. You're also getting signals from the world from time to time: the plan changed, or someone else got a change in. So the harness is also adapting its intent and plan, which then creates a new loop. And then when you're done, because there are so many changes going on, and the team hasn't yet, as humans, accepted this change, you don't go directly into the repository. You go into a pre-queue, which we're starting to call a pre-merge: a queue of changes that are done. They would have been merged if the process of merging were fast enough. But the reality is that you will have so many of these running in parallel, operating on the same parts of the codebase, that you need a process that reconciles them so that you can have serializability, so that you can actually guarantee that all the changes go back into your ledger, into your repository. And that's the point where you get external approval. That's where the human comes in, looking not at the code, but at: this was the intent, and this was the result. And the result might be a video of the feature working. It might be the output of the security-focused LLM on this particular change. And it's not one commit or one PR; it might actually be multiple of them. So you may even have multiple agents independently working on features that go into this pre-merge queue and get semantically grouped into something that you as a human can manage, because there are going to be way too many.
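The pre-merge idea, grouping many finished changes by the intent they serve so a human approves intent plus evidence rather than individual diffs, can be sketched as a small reconciliation step. The field names (`intent`, `result`) and the grouping key are assumptions for illustration, not a real protocol:

```python
from collections import defaultdict

def reconcile(pre_merge_queue):
    """Toy pre-merge reconciliation: semantically group finished changes
    by intent, producing one approval item per intent that bundles the
    evidence (test results, demo videos, reviewer-LLM output) for a
    human to sign off on. Field names are illustrative assumptions."""
    groups = defaultdict(list)
    for change in pre_merge_queue:
        groups[change["intent"]].append(change)
    return [
        {
            "intent": intent,
            "evidence": [c["result"] for c in changes],
            "changes": len(changes),
        }
        for intent, changes in groups.items()
    ]

# Example: three finished changes collapse into two approval items.
queue = [
    {"intent": "add billing API", "result": "tests green"},
    {"intent": "add billing API", "result": "security LLM: no findings"},
    {"intent": "fix login bug", "result": "repro video attached"},
]
print(reconcile(queue))
```

A real system would also have to rebase and re-validate each group against the moving head before committing it to the ledger; this sketch only shows the semantic-grouping half.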
We already see that today within our team, where our volume of what we would have called PRs in the past is four times as big as before. It's impossible for a human reviewer to look at every single one. And if we think a little bit further into the future, if this process is extremely quick, one thing that may end up happening is that you step into the multiverse, where the starting point where the intent and plan get applied is not the tip of the ledger, not the latest commit of your repository, because that is moving. There are many candidates. So the agents may actually be working from multiple commits at the same time to address the same plan. And in order to get there, this inner loop needs to be extremely quick. And it adds up in terms of capacity: resource usage will also blow up because of all the candidates that you're going to be exploring at the same time. This is the world that we think we're moving toward. We're obsessed with performance and efficiency, so we're spending a lot of energy finding ways to maintain efficiency within this loop. And part of it is: don't do work that is not necessary. Don't start things from scratch all the time. Have agents work a lot more like we did as engineers on our own workstations, where work was much more incremental. That's kind of the world we're moving toward. Did CI go away? Well, CI still matters, but it has shifted, because principles like "does the code actually work?" are no longer a separate phase; they're just part of this loop. Every single iteration goes through validation now, and it's still enforcing those invariants as well.
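The "multiverse" point, applying the same plan against several candidate base commits because the head is moving too fast to pin down, can be sketched as a fan-out over candidates, keeping every result that validates. This is a hypothetical illustration of the idea (sequential here for clarity, where a real system would run candidates in parallel, which is exactly why resource usage blows up):

```python
def explore_candidates(bases, apply_plan, validate):
    """Toy 'multiverse' exploration: apply one plan against several
    candidate base commits and keep every result that validates.
    All callables and names are illustrative stand-ins."""
    survivors = []
    for base in bases:
        candidate = apply_plan(base)  # one speculative world per base
        if validate(candidate):       # the fast inner loop's validation
            survivors.append(candidate)
    return survivors

# Example: the plan lands cleanly on two of three moving head candidates.
heads = ["head@t0", "head@t1", "head@t2"]
result = explore_candidates(
    heads,
    apply_plan=lambda base: (base, "plan-applied"),
    validate=lambda c: c[0] != "head@t0",  # pretend t0 conflicts
)
print(result)  # [('head@t1', 'plan-applied'), ('head@t2', 'plan-applied')]
```

Since every candidate pays the full build-and-test cost, the capacity bill grows linearly with the number of worlds explored, which is the efficiency pressure the talk describes.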
Like, for compliance reasons, for example, you still want to have guarantees that you're starting from a well-known checkout, that you don't have someone in your company who came in and added code that was never vetted with you building on top of it. So those invariants still need to be enforced, but they're enforced on a continuous basis. Coordination moves away from CI: CI no longer has to gate different changes and make sure different tests are passing in order for changes to be committed. That becomes part of the overall loop. And governance is still important, but it also gets lifted much more into the harness, and into how the harness coerces the change toward following everything that your team has codified within these processes. And that's it. This is where we believe the world is moving. If you're interested in this topic, we at Namespace spend a lot of time thinking about it, as do other folks in the industry. It's a crazy world, and we need to be ready for it. Thank you. And yeah, let's go for lunch. >> [applause] [music]