Collaborative AI Engineering: One Dev, Two Dozen Agents, Zero Alignment — Maggie Appleton, GitHub
Channel: aiDotEngineer
Published at: 2026-04-26
YouTube video id: ClWD8OEYgp8
Source: https://www.youtube.com/watch?v=ClWD8OEYgp8
[music] >> Okay, we all good? Right. Uh so yes, this talk uh is called uh one developer, two dozen agents, zero alignment. Uh this is the case for why we need collaborative AI engineering. So first, a very quick intro. I'm Maggie, I work uh at GitHub as a staff research engineer. Uh at least that's my title. I'm actually a designer back when that was like a separate thing to engineer. Um and next is the Labs team within GitHub. So we work on kind of more experimental, risky bets than the rest of the organization. We like to call it the Department of [ __ ] Around and Find Out. Um and like everyone else, we are of course trying to shape new developer agentic tools. So, I think this is what many people think peak developer productivity looks like right now, right? This is like a wall of terminal-based coding agents all running in parallel on one person's machine. I like to call this the one man, two dozen Clods theory of the future. Uh so the promise that we're given here is that one person with a fleet of agents will do the work of an entire team of developers. The main problem with this stream is it assumes that software is made by one person. All of these tools are single-player interfaces and they focus on scaling up the work of the individual. But there is limited value in scaling up one individual. Because software is not made by one person in a vacuum. It is a team sport and everyone building it needs to agree on what they're building and why. Believing individual productivity leads to great software is nine maybe nine women make a baby in one month logic. Uh more individual output doesn't solve problems that require communication and coordination. It makes them worse. An implementation is rapidly becoming a solved problem, right? Probably everyone here believes that. Uh writing code is now fast, it's getting cheaper, and quality is going up and to the right. The hard question is no longer how to build it, it's should we build it? Agreeing on what to build is the new bottleneck. So everyone on your team needs to be involved in asking, are we making the right thing? Are we spending our energy in the right place and how do we have the most impact? When production is cheap, opportunity cost becomes the real cost. You can't build everything and whatever you pick comes at the cost of everything else. Anyone who ships software on a team knows that this isn't a new problem. Um alignment has always been a bottleneck, but agents have made the cost of not being aligned as a team much, much higher. What makes it worse is that all our coordination tools are still from another era. So GitHub, Slack, Jira, Linear, and the like are as they currently stand are not designed for the agentic development world. We are funneling masses of agentic outputs into platforms that were built for an outdated way of building software. Um here I know like I work at GitHub, so that might sound heretical for me to say. Um but I promise it's not controversial. There are very few people internally who believe that the PR and the issue are the future of software development and there are lots of us inside the machine trying to explore what comes next. So this is how the development process used to look, right? We had a planning phase, a building phase, and a review phase. And we had all of these touchpoints of alignment along the way. And it was slow enough that we had time for conversations in Slack and Zoom meetings, comments on issues and draft PRs so you could discuss the details. And everyone could give their two cents and get advice from expertise across your team and seniors and catch mistakes uh and course correct if things were going wrong. But by the time the code is reviewed and merged, the whole team had seen the work right happening and they were roughly on the same page. But that implementation window has now collapsed. And because implementation is no longer as expensive and time-consuming, we think we don't need to plan as much. So most of those early touchpoints actually disappear. And we know the review time for generated code is actually increased. So that creates more points of alignment, but they're actually on the wrong side of the implementation. The time between logging an issue and an agent opening a PR is now a couple of minutes. The code is so cheap that we don't properly stop to think before we prompt it. Unhelpfully, most coding agents also have this local plan mode that is completely unshared with other people. So you're not even your team on whether the plan it made is good before you ship it, if you even read it. And so we lose even more alignment points. This leaves the weight of all that alignment to sit on the pull request. All those checkpoints now come after the implementation at the end of the process when it's too late. And it's never what PRs were really designed to do in the first place, so they perform poorly at it. None of our current tools give teams a shared space to discuss plans, gather the right context, and work with agents as a collective. We're all experiencing the repercussions of this. Going fast without good alignment leads to wasted work. So this is like features no one asked for and that don't actually solve real problems. And receiving critical feedback after you finish something that ends up meaning you have to toss the whole thing out. And also coordination debts. This is when you get really hairy merge conflicts because agents will touch in the same files or developers even doing duplicated work because they both picked up a thing and tried to finish it in one day. Um or as we all know, we all have giant stacks of PRs to review that nobody has any context for and don't even know what's in them. So, how do we solve this? We need tools that help everyone on the team align before the agents start working, not after. That alignment needs to happen constantly alongside the implementation. Planning and building are no longer separate phases, they are now a cycle. The tools of the future need to bring planning, context gathering, and decision-making, and development underneath one roof. This is especially true because most of the context that you need for alignment and to build the right thing is not in the code base. It's in people's heads. The business context and the financial resources determine what the correct thing to build is, the political dynamics of who's in charge and who gets to make decisions, the product vision from leaders, the user research insight from designers, and the organization's history and what you've built before. These all matter immensely when you're deciding what the right thing for your team to build is. And the agents can never discover this context on their own. You need a way to get humans to share it early and naturally without adding process and overhead. So all of this has been very clear to us on the next team. Um and we've been building a new research prototype that explores how we might solve some of these problems. It's called ACE, stands for Agent Collaboration Environment. Uh it's not a primetime product yet. So like if it looks pretty rough around the edges, it's because it is. Um we're about to go into technical preview and we're going to use it to test it with a few thousand people. Um then we're going to learn how people collaborate in it and iterate from there. So here we are in ACE. It probably looks pretty familiar. We're not reinventing any more wheels than we have to. Uh it looks a bit like Slack, GitHub, Copilot, and a bunch of cloud computers had a baby. So we have our sessions list here on the left and sessions are where you do work, right? It's a multiplayer chat, it's like a Slack channel. I have team teammates in here and I can talk to them about the work we're doing, but I also have my coding agents in here. Each session is more than a chat channel though. It is also backed by a micro VM. So a sandboxed computer in the cloud on its own Git branch. The changes we make in each session are isolated, so we can work on parallel tasks and instantly switch between them. If I want to tap one of my teammates on the shoulder and get their thoughts on a feature I'm building, nobody has to stash their Git changes and like pull down a new branch or like wrestle with local work trees. I just jump into their session and I see what they're doing in a click. And this includes their entire prompting history with the agent, so I have the context about how they got to the current outputs. Just like a local machine, I can run terminal commands in this session. Here I'm going to run bun install and bun dev to get my current project running. I'm going to see in a minute uh my live preview in the browser on the side is going to pop up when I open the port. Um the demo project we have here is a calm version of Hacker News, so it only shows you top three stories from the last uh top stories from the last three months, which is a bit more chill than every day. Um and I'm going to ask the agent to change the color theme to purple here and you'll see in a second it instantly appears in my preview. Right, it's just running the code. Uh and the agent has also made an automatic commit for me with a nice commit message and I can open the diffs and see the diffs, all kind of standard things you would expect from code agents. So let's say we want to do some real work. Uh I have my teammates here in this session with me. Uh and I'm going to ask ACE to add some additional color themes to my app. We're going to um pick which model we want to use and obviously it's Opus 4.6. And then ACE is going to get started. Um we also have this handy summary block in the top right-hand corner. Uh this keeps me up to date with the latest changes in this session, whether they're from me or someone else, which means I can switch between lots of people's sessions that are running in parallel and always stay oriented about what's happening so that you don't get overwhelmed with the amount of noise and activity. But the more important thing is I want to talk to my teammates, right? I want to discuss what changes we're making. So I can ask them what they think of the current changes. They can spin up the dev server themselves because remember we're all working on the same computer in the cloud. This is no problem. We can all see the same preview. We can all write terminal commands and see the shared outputs. No one is going to say this doesn't work on my machine. So my teammates, Nate and Dawn, they're jumping in here. They've taken some screenshots. They're suggesting some alternative features, asking questions. And now what we're about to see is that Nate is going to ask the ACE agent to make changes in a minute. Where is Nate? There's Nate. So, he said Ace let's add a teal theme, too. Um so, I actually kicked off this session, but Nate is now prompting the agent. So, this is truly multiplayer. Both of us are sharing this coding session. Uh the agent can also read our whole conversation. That is all input to the prompt. So, we can talk about things up ahead and just say at Ace, do it. They'll go do it. This kind of accessible Slack-like interface means that um access to a coding agent is um brings everyone in who's creating software. So, not just developers, but designers and PMs and customer support people can all be in the same conversation seeing what's happening in real time as a feature gets built. Because if you're thinking, you know, like why wouldn't we just use Slack for this? I think it's because Slack is never going to become a fully featured software development tool unless they sincerely pivot from their current business. Um so, it's never going to have the right primitives and I really doubt it's going to add them. You know, diffs and terminal commands and that sort of thing is not Slack's business. Um we wanted Ace because it's explicitly designed for software development, but it's much more welcoming to other team members than your terminal. Anyway, we're back to shipping our changes here. We like how this looks, so we're going to create a PR. Um because eventually all this code does have to go back to GitHub, right? So, we create this PR from directly inside Ace. Uh we give it a minute and it's going to show us the preview of the PR. And then we can click a link that goes to it over here. In a second, Edon's going to click. There we go. So, there's our PR. All works, right? This is backwards compatible. It has a link back to the Ace session within the description. Like people don't all have to be in Ace to use this. You could have a few members of your team in Ace and the rest stay on whatever else they're using. And sometimes, you know, you still need to touch code. Like I do a lot of front end and agents are [ __ ] at CSS. They never do what I want. So, we can of course open our project in VS Code here um and we have real-time multiplayer editing because again, this is just a micro VM cloud computer. Everyone's on the same computer. I can close my laptop on this and work can continue. Um my session doesn't die. My teammates can keep prompting Ace um and making progress. We don't have a mobile interface yet, but we're building it. But this micro VM architecture means that that will work seamlessly. Like I don't have to use my phone to somehow SSH into a terminal on my computer. Computer doesn't need to be alive and I don't need to go buy a Mac Mini to keep things available. I just talk to my always-on agent in the cloud. For bigger, more complex features, you'll of course want your agent to write a plan. That's a very standard workflow at this point. So, here we're chatting about adding uh variable time frames to our Hacker News clone app. And then I've gone ahead and asked Ace to make a plan, which he's going to do quite quickly. And so, we can go open that plan, right? And here we all are in our plan. I can see my teammates' cursors. We can collaboratively edit it together. We can decide if we like this plan, if it's any good at all, if it achieves our intent. Um my teammate Nate here is making suggestions about maybe using a drop-down for the interface instead of a segmented control. And then Edon's come in and updated the requirements so the agent knows to do that. Uh and once we're all happy with the details, we go back to the chat and we can just say at Ace, do this. And it knows what the context is. So, I'm now going to jump over to our dashboard in Ace. Uh a lot of the planning and discussion that would otherwise happen in Slack or GitHub or Linear is now happening in our Ace sessions. So, we have a lot of access to rich context on what work is underway and can helpfully summarize it for you. So, here it's Monday morning and I've been trying to remember what I left unfinished last Friday. And Ace is prompting me to keep working on some React hooks I was making as part of a big refactor, which is helpful since I have very crappy human memory after a long weekend. And from here I can start a new session or in this pick back up section, I can one click I can open the session to keep going on my unmerged PR. I can also see a list of my recently completed PRs and issues to stroke my ego and make me feel productive. And on the right here we have a team whole section. So, this summarizes what my co-workers have been up to for the last couple of days. I can see Nate has been shipping a lobby channel and David has been fixing access token issues. Um there's also a raw feed of recent issues and PRs on this repo, but I personally find the summary much more helpful. Um one of the biggest challenges of agentic development is that the speed and volume of work makes it really hard to keep up with what your co-workers are doing. They are now shipping five features a day instead of half of one. This dashboard is our first pass at trying to make agents proactive and bringing that social context to you. If all your conversations around the code are available to agents, it gives them access to a social information fabric where they can help get you oriented every morning and stay aligned with your team. They could notify you about decisions being made or pull you into a conversation where someone is about to extend a feature that you originally built. So, this is no longer a bunch of solo disconnected terminal instances on individual computers. This becomes a living, intelligent environment where everyone shares the same workspace and context. So, all of this is actually about reclaiming time, right? Before coding agents came along, none of us had enough time and energy to make our products the way we wanted to. I guarantee everyone in this room has shipped software they're not proud of. Maybe you didn't have enough time to do user research or consider design details or think through the implications of your architecture choices. Um not because you didn't want to, but because there simply wasn't enough time because implementation took up so much of that time and effort. But we've been gifted a lot of that time back. We have an opportunity to not just go faster and build a giant pile of the same crappy software, but instead to make much better software through much more rigorous critical thinking and better alignment in the planning stage. By doing more exploration, more research, and thinking through problems more deeply than we could have before. Agents allow us to scale up ourselves and our teams in a way that if done right, should lead to better quality software. I think many people are now realizing that in a world of fast, cheap software, quality becomes the new differentiator. The bar is being set much higher and craftsmanship is what set you will will will set you apart from vibe-coded slop. Um but craft still costs time and energy. It's not free. And in order to buy the time and energy you need for it, you need to do fewer things better, which requires lots of strong alignment. There are also more distractions than ever. It's very easy to prompt your way to the wrong thing or to add lots of unnecessarily unhelpful features to your product. I think the dream for me is that we end up with tools, whether it's Ace or others, that create environments where teams can think rigorously together about hard problems. Uh agentic tools should help us do higher quality work, get aligned faster, and build a few exceptional things rather than a thousand crappy ones. Thank you very much for listening. Um >> [applause] >> Uh if you do want early access to Ace, we should have it out within a couple months at the very latest. Uh this QR code will take you to a form where you put in your GitHub username and then it will give you early access as soon as it comes out. Um you can read more about the GitHub Next team and their research on githubnext.com and all my work and writing is on maggieappleton.com and I'll have the slides and notes for this up there in a day or two. Thanks. >> [music] [music]