Collaborative AI Engineering: One Dev, Two Dozen Agents, Zero Alignment — Maggie Appleton, GitHub

Channel: aiDotEngineer

Published at: 2026-04-26

YouTube video id: ClWD8OEYgp8

Source: https://www.youtube.com/watch?v=ClWD8OEYgp8

[music]
>> Okay, we all good? Right. Uh so yes,
this talk uh is called uh one developer,
two dozen agents, zero alignment. Uh
this is the case for why we need
collaborative AI engineering.
So first, a very quick intro. I'm
Maggie, I work uh at GitHub as a staff
research engineer. Uh at least that's my
title. I'm actually a designer back when
that was like a separate thing to
engineer. Um and GitHub Next is the labs team
within GitHub. So we work on kind of
more experimental, risky bets than the
rest of the organization. We like to
call it the Department of [ __ ] Around
and Find Out.
Um and like everyone else, we are of
course trying to shape new developer
agentic tools.
So, I think this is what many people
think peak developer productivity looks
like right now, right? This is like a
wall of terminal-based coding agents all
running in parallel on one person's
machine.
I like to call this the one man, two
dozen Claudes theory of the future.
Uh so the promise that we're given here
is that one person with a fleet of
agents will do the work of an entire
team of developers.
The main problem with this dream is it
assumes that software is made by one
person.
All of these tools are single-player
interfaces and they focus on scaling up
the work of the individual. But there is
limited value in scaling up one
individual.
Because software is not made by one
person in a vacuum. It is a team sport
and everyone building it needs to agree
on what they're building and why.
Believing individual productivity leads
to great software is "nine women make a
baby in one month" logic.
Uh more individual output doesn't solve
problems that require communication and
coordination. It makes them worse.
And implementation is rapidly becoming a
solved problem, right? Probably everyone
here believes that.
Uh writing code is now fast, it's
getting cheaper, and quality is going up
and to the right.
The hard question is no longer how to
build it, it's should we build it?
Agreeing on what to build is the new
bottleneck. So everyone on your team
needs to be involved in asking, are we
making the right thing? Are we spending
our energy in the right place and how do
we have the most impact?
When production is cheap, opportunity
cost becomes the real cost. You can't
build everything and whatever you pick
comes at the cost of everything else.
Anyone who ships software on a team
knows that this isn't a new problem.
Um alignment has always been a
bottleneck, but agents have made the
cost of not being aligned as a team
much, much higher.
What makes it worse is that all our
coordination tools are still from
another era.
So GitHub, Slack, Jira, Linear, and the
like, as they currently stand, are not
designed for the agentic development
world.
We are funneling masses of agentic
outputs into platforms that were built
for an outdated way of building
software.
Um here I know like I work at GitHub, so
that might sound heretical for me to
say. Um but I promise it's not
controversial. There are very few people
internally who believe that the PR and
the issue are the future of software
development and there are lots of us
inside the machine trying to explore
what comes next.
So this is how the development process
used to look, right? We had a planning
phase, a building phase, and a review
phase. And we had all of these
touchpoints of alignment along the way.
And it was slow enough that we had time
for conversations in Slack and Zoom
meetings, comments on issues and draft
PRs so you could discuss the details.
And everyone could give their two cents
and get advice from experts and seniors
across your team, and catch mistakes
uh and course correct if things were
going wrong.
But by the time the code was reviewed and
merged, the whole team had seen the work
happening and they were roughly on
the same page.
But that implementation window has now
collapsed.
And because implementation is no longer
as expensive and time-consuming, we
think we don't need to plan as much.
So most of those early touchpoints
actually disappear.
And we know the review time for
generated code has actually increased. So
that creates more points of alignment,
but they're actually on the wrong side
of the implementation.
The time between logging an issue and an
agent opening a PR is now a couple of
minutes. The code is so cheap that we
don't properly stop to think before we
prompt it.
Unhelpfully, most coding agents also
have this local plan mode that is
completely unshared with other people.
So you're not even aligning with your team on whether
the plan it made is good before you ship
it, if you even read it. And so we lose
even more alignment points.
This leaves the weight
of all that alignment to sit on the pull
request. All those checkpoints now come
after the implementation at the end of
the process when it's too late. And it's
never what PRs were really designed to
do in the first place, so they perform
poorly at it.
None of our current tools give teams a
shared space to discuss plans, gather
the right context, and work with agents
as a collective.
We're all experiencing the repercussions
of this. Going fast without good
alignment leads to wasted work. So this
is like features no one asked for and
that don't actually solve real problems.
And receiving critical feedback after
you finish something that ends up
meaning you have to toss the whole thing
out.
And also coordination debt. This is
when you get really hairy merge
conflicts because agents are touching
the same files or developers even doing
duplicated work because they both picked
up a thing and tried to finish it in one
day.
Um or as we all know, we all have giant
stacks of PRs to review that nobody has
any context for and don't even know
what's in them.
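One way to picture the merge-conflict risk described here is to check which files are being changed by more than one parallel piece of work. The sketch below is only an illustration under stated assumptions: the function name `overlapping_files`, the session names, and the file paths are all hypothetical and do not come from the talk or from any GitHub tool.

```python
from collections import defaultdict


def overlapping_files(changes: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each file touched by two or more sessions to those session names.

    `changes` maps a (hypothetical) session name to the set of file paths
    that session has modified.
    """
    touched = defaultdict(list)
    for session, files in changes.items():
        for path in files:
            touched[path].append(session)
    # Keep only the files where at least two sessions overlap.
    return {path: sorted(names) for path, names in touched.items() if len(names) > 1}


# Illustrative data only: two parallel tasks that both edit the same component.
changes = {
    "themes": {"src/theme.css", "src/App.tsx"},
    "timeframes": {"src/App.tsx", "src/api.ts"},
}
conflicts = overlapping_files(changes)
```

A check like this only flags where coordination is needed. It does not resolve anything, which is the point being made: the fix is a conversation before the work starts.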
So, how do we solve this?
We need tools that help everyone on the
team align before the agents start
working, not after.
That alignment needs to happen
constantly alongside the implementation.
Planning and building are no longer
separate phases, they are now a cycle.
The tools of the future need to bring
planning, context gathering,
decision-making, and development
under one roof.
This is especially true because most of
the context that you need for alignment
and to build the right thing is not in
the code base. It's in people's heads.
The business context and the financial
resources that determine what the correct
thing to build is, the political
dynamics of who's in charge and who gets
to make decisions, the product vision
from leaders, the user research insight
from designers, and the organization's
history and what you've built before.
These all matter immensely when you're
deciding what the right thing for your
team to build is. And the agents can
never discover this context on their
own. You need a way to get humans to
share it early and naturally without
adding process and overhead.
So all of this has been very clear to us
on the Next team.
Um and we've been building a new
research prototype that explores how we
might solve some of these problems.
It's called ACE, stands for Agent
Collaboration Environment. Uh it's not a
primetime product yet. So like if it
looks pretty rough around the edges,
it's because it is.
Um we're about to go into technical
preview and we're going to use it to
test it with a few thousand people. Um
then we're going to learn how people
collaborate in it and iterate from
there.
So here we are in ACE. It probably looks
pretty familiar. We're not reinventing
any more wheels than we have to.
Uh it looks a bit like Slack, GitHub
Copilot, and a bunch of cloud computers
had a baby.
So we have our sessions list here on the
left and sessions are where you do work,
right? It's a multiplayer chat, it's
like a Slack channel.
I have teammates in here and I can
talk to them about the work we're doing,
but I also have my coding agents in
here.
Each session is more than a chat channel
though. It is also backed by a micro VM.
So a sandboxed computer in the cloud on
its own Git branch.
The changes we make in each session are
isolated, so we can work on parallel
tasks and instantly switch between them.
If I want to tap one of my teammates on
the shoulder and get their thoughts on a
feature I'm building, nobody has to
stash their Git changes and like pull
down a new branch or like wrestle with
local worktrees. I just jump into their
session and I see what they're doing in
a click.
And this includes their entire prompting
history with the agent, so I have the
context about how they got to the
current outputs.
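The session idea described here (a shared chat, backed by a sandboxed machine on its own Git branch, that a teammate can join and immediately see the full history) can be sketched as a small data model. This is a rough illustration only: the `Session` class, its fields, and the branch and VM naming are assumptions made for the example, not ACE's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical model: one chat channel, one sandbox, one Git branch."""
    name: str
    branch: str   # each session works on its own branch (naming is made up)
    vm_id: str    # identifier of the backing micro VM (made up)
    members: set[str] = field(default_factory=set)
    history: list[tuple[str, str]] = field(default_factory=list)  # (author, text)

    def post(self, author: str, text: str) -> None:
        """Record a chat message or agent prompt in the shared history."""
        self.members.add(author)
        self.history.append((author, text))

    def join(self, person: str) -> list[tuple[str, str]]:
        """Joining needs no local checkout: the newcomer gets the whole history."""
        self.members.add(person)
        return list(self.history)


session = Session(name="Color themes", branch="session/color-themes", vm_id="vm-1")
session.post("maggie", "@ace change the color theme to purple")
seen_by_nate = session.join("nate")
```

Because the branch and the conversation live in the same object, switching tasks means switching sessions rather than stashing local changes.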
Just like a local machine, I can run
terminal commands in this session. Here
I'm going to run bun install and bun dev
to get my current project running.
I'm going to see in a minute uh my live
preview in the browser on the side is
going to pop up when I open the port. Um
the demo project we have here is a calm
version of Hacker News, so it only shows
you the top stories from the last three months,
which is a bit more chill than every
day.
Um and I'm going to ask the agent to
change the color theme to purple here
and you'll see in a second it instantly
appears in my preview. Right, it's just
running the code.
Uh and the agent has also made an
automatic commit for me with a nice
commit message and I can open the diffs
and see the diffs, all kind of standard
things you would expect from code
agents.
So let's say we want to do some real
work. Uh I have my teammates here in
this session with me.
Uh and I'm going to ask ACE to add some
additional color themes to my app.
We're going to um pick which model we
want to use and obviously it's Opus 4.6.
And then ACE is going to get started.
Um we also have this handy summary block
in the top right-hand corner.
Uh this keeps me up to date with the
latest changes in this session, whether
they're from me or someone else, which
means I can switch between lots of
people's sessions that are running in
parallel and always stay oriented about
what's happening so that you don't get
overwhelmed with the amount of noise and
activity.
But the more important thing is I want
to talk to my teammates, right? I want
to discuss what changes we're making. So
I can ask them what they think of the
current changes. They can spin up the
dev server themselves because remember
we're all working on the same computer
in the cloud. This is no problem.
We can all see the same preview. We can
all write terminal commands and see the
shared outputs. No one is going to say
this doesn't work on my machine.
So my teammates, Nate and Dawn, they're
jumping in here. They've taken some
screenshots. They're suggesting some
alternative features, asking questions.
And now what we're about to see is that
Nate is going to ask the ACE agent to
make changes
in a minute.
Where is Nate? There's Nate. So, he said
"Ace, let's add a teal theme, too."
Um
so, I actually kicked off this session,
but Nate is now prompting the agent. So,
this is truly multiplayer. Both of us
are sharing this coding session.
Uh the agent can also read our whole
conversation. That is all input to the
prompt. So, we can talk about things
ahead of time and just say "@Ace, do it."
It'll go do it.
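The behaviour described here, where the whole team conversation becomes input to the agent once someone addresses it, might look roughly like the following. This is a sketch under assumptions: the `@ace` mention syntax, the function names, and the prompt wording are invented for illustration and are not taken from ACE.

```python
from typing import Optional

Message = tuple[str, str]  # (author, text)


def mentions_agent(text: str, agent: str = "ace") -> bool:
    """True if the message addresses the agent with an @-mention."""
    return f"@{agent}" in text.lower()


def build_prompt(messages: list[Message], agent: str = "ace") -> Optional[str]:
    """Return a prompt containing the full conversation, or None if the
    latest message does not address the agent."""
    if not messages or not mentions_agent(messages[-1][1], agent):
        return None
    transcript = "\n".join(f"{author}: {text}" for author, text in messages)
    return (
        "Team conversation so far:\n"
        f"{transcript}\n\n"
        f"Carry out the latest request addressed to @{agent}."
    )


chat = [
    ("maggie", "Should we add more color themes?"),
    ("nate", "Teal would be nice."),
    ("nate", "@Ace do it"),
]
prompt = build_prompt(chat)
```

The short "do it" message carries almost no instruction on its own. The earlier discussion supplies the intent, which is why the full history is passed along.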
This kind of accessible Slack-like
interface means that access to a
coding agent brings in everyone
who's creating software. So, not just
developers, but designers and PMs and
customer support people can all be in
the same conversation seeing what's
happening in real time as a feature gets
built.
Because if you're thinking, you know,
like why wouldn't we just use Slack for
this? I think it's because Slack is
never going to become a fully featured
software development tool unless they
seriously pivot from their current
business. Um so, it's never going to
have the right primitives and I really
doubt it's going to add them. You know,
diffs and terminal commands and that
sort of thing is not Slack's business.
Um we wanted Ace because it's explicitly
designed for software development, but
it's much more welcoming to other team
members than your terminal.
Anyway, we're back to shipping our
changes here. We like how this looks, so
we're going to create a PR.
Um because eventually all this code does
have to go back to GitHub, right?
So, we create this PR from directly
inside Ace.
Uh we give it a minute and it's going to
show us the preview of the PR.
And then we can click a link that goes
to it over here.
In a second, Edon's going to click.
There we go. So, there's our PR. All
works, right? This is backwards
compatible. It has a link back to the
Ace session within the description. Like
people don't all have to be in Ace to
use this. You could have a few members
of your team in Ace and the rest stay on
whatever else they're using.
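The backward-compatible handoff described here, where the pull request description links back to the session, can be sketched as a tiny helper. The function name `pr_body`, the separator, and the example URL are hypothetical and chosen only for illustration.

```python
def pr_body(summary: str, session_url: str) -> str:
    """Append a backlink to the originating session under the PR summary."""
    return (
        f"{summary.strip()}\n\n"
        "---\n"
        f"Session with discussion and prompts: {session_url}\n"
    )


# Placeholder URL; the real link format is not given in the talk.
body = pr_body("Add teal and purple color themes\n", "https://example.com/sessions/123")
```

Reviewers who never open the session still get a normal PR, while those who want the reasoning behind it have one link to follow.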
And sometimes, you know, you still need
to touch code. Like I do a lot of front
end and agents are [ __ ] at CSS. They
never do what I want. So, we can of
course open our project in VS Code here
um and we have real-time multiplayer
editing because again, this is just a
micro VM cloud computer. Everyone's on
the same computer.
I can close my laptop on this and work
can continue. Um my session doesn't die.
My teammates can keep prompting Ace um
and making progress.
We don't have a mobile interface yet,
but we're building it. But this micro VM
architecture means that that will work
seamlessly. Like I don't have to use my
phone to somehow SSH into a terminal on
my computer.
My computer doesn't need to be alive and I
don't need to go buy a Mac Mini to keep
things available. I just talk to my
always-on agent in the cloud.
For bigger, more complex features,
you'll of course want your agent to
write a plan. That's a very standard
workflow at this point. So, here we're
chatting about adding uh variable time
frames to our Hacker News clone app. And
then I've gone ahead and asked Ace to
make a plan, which it's going to do
quite quickly.
And so, we can go open that plan, right?
And here we all are in our plan. I can
see my teammates' cursors. We can
collaboratively edit it together. We can
decide if we like this plan, if it's any
good at all, if it achieves our intent.
Um my teammate Nate here is making
suggestions about maybe using a
drop-down for the interface instead of a
segmented control. And then Edon's come
in and updated the requirements so the
agent knows to do that.
Uh and once we're all happy with the
details, we go back to the chat and we
can just say "@Ace, do this." And it
knows what the context is.
So, I'm now going to jump over to our
dashboard in Ace.
Uh a lot of the planning and discussion
that would otherwise happen in Slack or
GitHub or Linear is now happening in our
Ace sessions. So, Ace has access to a lot
of rich context on what work is
underway and can helpfully summarize it
for you.
So, here it's Monday morning and I've
been trying to remember what I left
unfinished last Friday.
And Ace is prompting me to keep working
on some React hooks I was making as part
of a big refactor, which is helpful
since I have very crappy human memory
after a long weekend.
And from here I can start a new session
or, in this "pick back up" section,
with one click I can open the session to keep
going on my unmerged PR.
I can also see a list of my recently
completed PRs and issues to stroke my
ego and make me feel productive.
And on the right here we have a team
whole section. So, this summarizes what
my co-workers have been up to for the
last couple of days. I can see Nate has
been shipping a lobby channel and David
has been fixing access token issues.
Um there's also a raw feed of recent
issues and PRs on this repo, but I
personally find the summary much more
helpful.
Um one of the biggest challenges of
agentic development is that the speed
and volume of work makes it really hard
to keep up with what your co-workers are
doing.
They are now shipping five features a
day instead of half of one.
This dashboard is our first pass at
trying to make agents proactive and
bringing that social context to you.
If all your conversations around the
code are available to agents, it gives
them access to a social information
fabric where they can help get you
oriented every morning and stay aligned
with your team.
They could notify you about decisions
being made or pull you into a
conversation where someone is about to
extend a feature that you originally
built.
So, this is no longer a bunch of solo
disconnected terminal instances on
individual computers. This becomes a
living, intelligent environment where
everyone shares the same workspace and
context.
So, all of this is actually about
reclaiming time, right? Before coding
agents came along, none of us had enough
time and energy to make our products the
way we wanted to.
I guarantee everyone in this room has
shipped software they're not proud of.
Maybe you didn't have enough time to do
user research or consider design details
or think through the implications of
your architecture choices. Um not
because you didn't want to, but because
there simply wasn't enough time because
implementation took up so much of that
time and effort.
But we've been gifted a lot of that time
back. We have an opportunity to not just
go faster and build a giant pile of the
same crappy software,
but instead to make much better software
through much more rigorous critical
thinking and better alignment in the
planning stage.
By doing more exploration, more
research, and thinking through problems
more deeply than we could have before.
Agents allow us to scale up ourselves
and our teams in a way that if done
right, should lead to better quality
software.
I think many people are now realizing
that in a world of fast, cheap software,
quality becomes the new differentiator.
The bar is being set much higher and
craftsmanship is what will
set you apart from vibe-coded slop.
Um but craft still costs time and
energy. It's not free. And in order to
buy the time and energy you need for it,
you need to do fewer things better,
which requires lots of strong alignment.
There are also more distractions than
ever. It's very easy to prompt your way
to the wrong thing or to add lots of
unnecessary, unhelpful features to your
product.
I think the dream for me is that we end
up with tools, whether it's Ace or
others, that create environments where
teams can think rigorously together
about hard problems.
Uh agentic tools should help us do
higher quality work, get aligned faster,
and build a few exceptional things
rather than a thousand crappy ones.
Thank you very much for listening. Um
>> [applause]
>> Uh if you do want early access to Ace,
we should have it out within a couple
months at the very latest. Uh this QR
code will take you to a form where you
put in your GitHub username and then it
will give you early access as soon as it
comes out. Um you can read more about
the GitHub Next team and their research
on githubnext.com and all my work and
writing is on maggieappleton.com and
I'll have the slides and notes for this
up there in a day or two.
Thanks.
>> [music]