Moving away from Agile: What's Next – Martin Harrysson & Natasha Maniar, McKinsey & Company

Channel: aiDotEngineer

Published at: 2025-12-12

YouTube video id: SZStlIhyTCY

Source: https://www.youtube.com/watch?v=SZStlIhyTCY

[music]
>> Good morning. Hello everyone. It's really great to be here. I'm Martin, and I'm here with my colleague Natasha. We're from a part of McKinsey you may not be as familiar with: a practice called Software X, where we work mostly with enterprise clients on how to build better software products, which in the past couple of years has mostly meant using AI. Our talk today focuses on the people and operating-model aspects of leveraging AI for software development. We believe those have to change quite significantly, and that's what we're excited to talk to you about.
If I take a quick step back in time and think through the major technology breakthroughs we've seen in the last few decades, they tend to come with a paradigm shift in how we develop software. I still recall, almost 20 years ago now, starting as an entry-level software engineer at a tech company. The company I was working for was just switching to Agile: we were using Kanban boards, doing stand-ups and other ceremonies. It was a massive change for the company. And now, with everything that is happening in AI, we're at the precipice of another such paradigm shift.
And if we think about some of the things happening with AI and software development that we've seen at this conference, there's no doubt that a new paradigm is upon us. So we'll talk about two things. First, how do you go from the individual productivity gains we're all seeing to scaling them to the whole team, and what changes does that imply? And then, how do you scale that across a whole organization to really get value?
I'm talking to an audience here that is using AI agents all the time, and I'm sure that if I asked you for examples, you could rattle off ten different ones: things that used to take hours or even days now take only minutes. There's no shortage of those stories, and you can go over to the expo and talk to any of the companies there about all these great use cases. It really shows that these tools work and can be really impactful.
And yet, despite seeing some of these improvements, we've done some research to gauge where our clients are at the moment. We recently surveyed about 300 companies, mostly enterprises, on what they are seeing in terms of productivity improvements. On average, they report only 5, 10, or 15% improvement overall as a company. So we're in a place where there's a disconnect between the big potential of AI and the reality.
We think this gap exists because, as companies have started implementing AI, whether coding assistants or, as you just heard, agents and more complex workflows, a set of bottlenecks has started to emerge that was not necessarily there before.
For example, as we start moving much faster in certain aspects of the work, we haven't really changed how people and team members collaborate, so collaboration isn't keeping up. We've started generating way more code, but in many companies it's still being reviewed in a pretty manual way. There's also a theme recently highlighted in a research report from Carnegie Mellon: all the new code being generated is in some cases amplifying the generation of tech debt and actually adding complexity.
These bottlenecks are not impossible to overcome, but we believe they are what's limiting many companies from seeing the real value they should be seeing.
Let me give a couple of examples to make that come to life a little more. One of the big rate limiters at the moment is how work is allocated. What we've learned over the last couple of years is that the impact from AI and agents is highly uneven. There are some tasks where it works amazingly well today and you see huge improvements, and there are others where it's not as effective. On top of that task variability, you have variability among people: some have lots of experience using these tools and know how to pick them up, and others are less experienced. So for team leads and engineering managers, it is highly non-trivial to know how to allocate work and resources well, and this is creating a lot of inefficiencies.
Another example is how work is reviewed. Agents are often given stories written in prose with pretty fuzzy acceptance criteria, which means the code that comes back is not always what was intended. And for many companies, the only mechanism to control that is manual review. So we've automated some things, but we've generated more manual review. These are some of the examples of the bottlenecks we see coming up.
And as mentioned, what that has resulted in so far is that most large companies today are stuck in a world of relatively marginal gains. They're working in ways that were developed under the constraints of the past paradigm of human development. If you go out to most companies, you see eight- to ten-person teams working in two-week sprints, all elements that were largely part of an Agile operating model. And that is putting limits on what they can achieve.
Over the past year, we've been working with lots of clients to break that model a bit and develop new ways of working: smaller teams, new roles, shorter cycles. When you do that, we see really great performance improvements, and that's what gives us a path where we see things are going to improve.
So we realized that rewiring the PDLC is not a one-size-fits-all solution. Different types of engineering functions across the enterprise, along the product life cycle, may require different operating models based on how humans and agents best collaborate. Take modernizing legacy code bases: this task requires high context, potentially the entire code base, but it also has clearly well-defined outputs. So an example operating model could look like a factory of agents, where humans provide an initial spec and a final review, with minimal intervention in between. For new features, whether greenfield or brownfield projects, the operating model may look like an iterative loop, because they may benefit from non-deterministic outputs and increased variation, with agents acting as co-creators that provide more options and faster feedback loops.
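To make that contrast concrete, here is a minimal sketch of the two patterns, ours rather than anything shown in the talk; every class and function name is a hypothetical stand-in for whatever agent framework a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Task:
    spec: str          # human-written specification
    result: str = ""   # agent-produced output

def run_agent(task: Task) -> Task:
    # Placeholder for a real agent call (e.g. an LLM coding agent).
    task.result = f"generated change for: {task.spec}"
    return task

def factory_model(specs: List[str], human_review: Callable[[Task], bool]) -> List[Task]:
    """Legacy modernization: humans write specs up front and review at the
    end; agents run the well-defined middle with minimal intervention."""
    accepted = []
    for spec in specs:
        task = run_agent(Task(spec))
        if human_review(task):  # single human checkpoint, at the end
            accepted.append(task)
    return accepted

def iterative_loop(spec: str, human_pick: Callable[[List[Task]], Task],
                   rounds: int = 3) -> Task:
    """New-feature work: agents generate several options per round and a
    human co-creator steers by picking one, exploiting output variation."""
    task = Task(spec)
    for _ in range(rounds):
        options = [run_agent(Task(task.spec)) for _ in range(3)]
        task = human_pick(options)  # human feedback inside every cycle
    return task
```

The difference that matters is where the human checkpoint sits: once at the end in the factory model, inside every cycle in the iterative loop.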
As we mentioned, we did a survey among 300 enterprises globally to understand what sets the top performers apart. We found that they are seven times more likely to have AI-native workflows, which means scaling more than four use cases across the software development life cycle rather than having point solutions for just code review or just code development. They were also six times more likely to have AI-native roles, which means having smaller pods with different skill sets and new roles.
To enable these shifts, these organizations were investing in continuous, hands-on upskilling, in impact measurement, and in incentive structures that encourage developers and PMs to adopt AI. This led to a five to six times improvement in time to market and delivery speed, as well as higher-quality and more consistent artifacts.
When we talk about AI-native workflows, we mean that these enterprises are moving away from quarterly planning to continuous planning, and the unit of work is moving from story-driven to spec-driven development, so that PMs iterate on specs with agents rather than on long PRDs.
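As a purely illustrative example (none of these fields come from the talk), a spec as the unit of work might be small and structured enough for a PM and an agent to iterate on together, unlike a long PRD:

```python
# Hypothetical spec as the unit of work: precise acceptance criteria and
# explicit non-functional requirements, small enough to iterate on quickly.
spec = {
    "feature": "export transactions as CSV",
    "acceptance_criteria": [
        "authenticated users only",
        "export completes in under 5 seconds for 10,000 rows",
        "column order matches the on-screen table",
    ],
    "non_functional": {
        "security": "no PII written to server logs",
        "observability": "emit export_started and export_finished events",
    },
    "out_of_scope": ["scheduled exports", "XLSX format"],
}
```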
On the talent side, AI-native roles essentially means moving away from the two-pizza structure to one-pizza pods of three to five individuals. Instead of having separate QA, front-end, and back-end engineers, there are more consolidated roles where product builders manage and orchestrate agents with full-stack fluency and a better understanding of the full architecture of their code base. PMs are starting to create prototypes directly in code rather than iterating on long PRDs.
And as one example we've described in our article, we studied some AI-native startups and realized that they have actually implemented all of these shifts to accelerate their outcomes. In the article, we describe how Cursor actually operates internally.
But if you're a large enterprise predicated on the Agile model, what are some steps you can take? In a recent client study with a leading international bank, we tested team-level interventions to address the bottlenecks mentioned before, mainly around the sequencing of steps within the Agile ceremonies and how to define the roles of agents and humans within the sprint cycle. So let's walk through some examples.
First, team leads would assign sprint stories using agents, based on data about team velocity and delivery history. Then they would co-create multiple prototypes and iterate with agents on acceptance criteria around security and observability needs, to produce more consistent artifacts across teams. This prevents the downstream rework mentioned before, so that developers don't have to constantly iterate with the agents during the coding process.
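As a minimal illustration of that assignment step, ours and not the bank's actual system, here is one way team velocity and workflow history could drive story suggestions; all data shapes below are assumptions.

```python
# Illustrative only: suggest a squad for each sprint story based on the
# squad's workflow focus and remaining velocity. Data shapes are hypothetical.
squads = {
    "bugfix-pod":     {"velocity": 21, "strengths": {"bugfix"}},
    "greenfield-pod": {"velocity": 34, "strengths": {"feature", "greenfield"}},
}

stories = [
    {"id": "S-101", "points": 3, "kind": "bugfix"},
    {"id": "S-102", "points": 8, "kind": "greenfield"},
]

def suggest_assignments(stories, squads):
    remaining = {name: s["velocity"] for name, s in squads.items()}
    plan = {}
    for story in sorted(stories, key=lambda s: -s["points"]):
        # Prefer squads whose workflow matches the story, then spare capacity.
        candidates = sorted(
            squads,
            key=lambda n: (story["kind"] not in squads[n]["strengths"],
                           -remaining[n]),
        )
        chosen = candidates[0]
        plan[story["id"]] = chosen
        remaining[chosen] -= story["points"]
    return plan

print(suggest_assignments(stories, squads))
# {'S-102': 'greenfield-pod', 'S-101': 'bugfix-pod'}
```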
The squads were also reorganized by workflow, so one would be focused on small bug fixes and another on greenfield development. In the background, agents would be used to look at potential cross-repository impacts, to cut down debugging time for developers.
Another example, aimed at reducing the collaboration overhead and the meetings that happen within the sprint cycle: instead of waiting for data scientist input, PMs would directly observe real-time customer feedback to reprioritize features. This led to getting through more of the backlog within the same amount of time.
We studied the impact of these interventions and found highly promising results. Not only did agent consumption increase by over 60 times, but there was also an increase in delivery speed tied directly to the business priorities for this bank. There was a 51% increase in code merges, along with an increase in efficiency.
The other aspect of this is the different roles and the talent model. One of the biggest differentiators we saw, as mentioned, was whether you have actually changed the roles involved in software development. What you are all seeing is that engineers are moving away from execution, from simply writing code, toward being orchestrators who think more about how to divide up work among agents, for example. And we also heard some examples of how the role of the product manager is changing. While this may sound pretty straightforward to many of you here who work with these tools today, that you have to change what you do, the reality is that about 70% of the companies we surveyed have not changed the roles at all. So you have this background expectation that people are going to do things differently, but the role is still defined in the same way and understood the same way as it was a couple of years ago.
But we are starting to see some companies changing this. Here is another example from another recent client. They were set up in a way that is probably pretty common for many companies: a typical two-pizza team model with the types of roles you'd be familiar with. We ran a bunch of experiments with frontrunner teams and tested new models with much smaller pods and new roles that consolidated some of the tasks previously done by different roles.
By doing that, we could create more pods, more teams, with the same number of people, while retaining the expectation that each pod performs at about the same level as before. And we see really positive results from that too, with the quality of the generated code maintained and in some cases even improved. In particular, there was a big speed-up in the output from the different teams, and you can see some of the metrics here.
Let's shift gears a little bit and go from talking about the team level to how this scales across a big organization. The reality is that many companies don't just have one or two of these teams, but often hundreds of teams, and thousands or even tens of thousands of people working in this way. And this is where one of the biggest differences we saw between those stuck at only 10% or so improvement and those seeing outsized improvements is how you manage that change.
Change management is a bit of a catch-all, often elusive term for a lot of different things, but in some ways it's not a bad way to think about it. I usually say that change management is about getting a lot of small things right. The crux of actually scaling this is often getting 20, 30, or even more things right at the same time: the way you communicate what this means, the way you incentivize people, the way you upskill them. It all has to come together.
And when it doesn't, we see what happens. This is an example from another tech company we worked with, where we were initially rolling out new AI tools that hit different parts of the product development life cycle. We rolled out the tools and there was some usage, but often it dropped off: the tools were either not used or used in very suboptimal ways. That's the jagged part you're seeing on the left-hand side here. Despite adding more users, the overall impact did not change at all. So we had to do quite a reset and effectively start over. We reset the expectations: what does this mean day-to-day if you're a developer? What does it mean for a PM? We had much more hands-on upskilling: bring-your-own-code sessions, coaches available. Those first few sprints, before this becomes a habit and is worked into the way you develop software day-to-day, are a very critical time, and that's when this matters a lot. And we put a measurement system in place, so you know what's changing and you're able to see what's improving.
Another example, just to elaborate on the point that this is about getting a lot of things right: each one individually may not seem like the biggest deal, but put together, they make a huge difference. These are some of the top interventions another client had to go through. For them, it really helped to set up code labs, for example, and to institute a new set of certifications that helped motivate and drive people to change what they do day-to-day. These things really added up to the change they needed.
But building a robust measurement system that prioritizes outcomes, not just adoption, is important not just to monitor progress but also to pinpoint issues and course-correct quickly. One surprising result from the survey was that the bottom-performing enterprises were not even measuring speed, and only 10% were measuring productivity.
Our goal is to make our clients top-performing organizations, so we've worked with them to create a holistic measurement system that captures everything from inputs all the way to impact. Inputs include the investment in coding tools and other AI tools, but also the time and resources spent on upskilling and change management.
These inputs lead to direct outputs, but a lot of organizations focus only on how the increased breadth and depth of adoption of AI tools leads to increased velocity and capacity. However, it's also important to understand how developer NPS scores change and whether developers are enjoying their craft more rather than feeling more frustrated. And it's also important to understand whether the code is becoming more secure and higher quality, but also more resilient. One proxy for resiliency that we used for our client was the mean time to resolve priority bugs.
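As a minimal sketch of that proxy, assuming bug records with opened and resolved timestamps and a priority field (all names hypothetical):

```python
from datetime import datetime, timedelta

# Hypothetical bug records; in practice these would come from a tracker.
bugs = [
    {"priority": "P1", "opened": datetime(2025, 3, 1, 9), "resolved": datetime(2025, 3, 1, 17)},
    {"priority": "P1", "opened": datetime(2025, 3, 2, 9), "resolved": datetime(2025, 3, 4, 9)},
    {"priority": "P3", "opened": datetime(2025, 3, 3, 9), "resolved": datetime(2025, 3, 20, 9)},
]

def mean_time_to_resolve(bugs, priorities=frozenset({"P0", "P1"})) -> timedelta:
    """Mean time to resolve, restricted to priority bugs."""
    durations = [b["resolved"] - b["opened"]
                 for b in bugs
                 if b["priority"] in priorities and b["resolved"] is not None]
    return sum(durations, timedelta()) / len(durations)

print(mean_time_to_resolve(bugs))  # 1 day, 4:00:00 for the records above
```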
Now if we look at economic outcomes, which are the priority for C-suite executives, they look at the time to a revenue target, the increased price differential for higher-quality features, the ability to expand the number of customers to meet future demand, and the cost reduction per pod from reduced human labor.
In aggregate, these larger economic outcomes can also help organizations understand how much is being reinvested in greenfield and brownfield development. As these tools evolve, the proxies for these metrics will also evolve, but hopefully this provides a MECE framework as an initial starting point.
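Pulling those layers together, here is a compact restatement of the framework as data; the metric names are paraphrased from the talk, and nothing here is a client's actual schema.

```python
# Paraphrased inputs-to-outcomes measurement framework from the talk.
measurement_framework = {
    "inputs": [
        "investment in coding and other AI tools",
        "time and resources for upskilling and change management",
    ],
    "outputs": {
        "adoption": ["breadth and depth of AI tool usage"],
        "speed": ["velocity", "capacity"],
        "experience": ["developer NPS"],
        "quality": ["security", "mean time to resolve priority bugs"],
    },
    "economic_outcomes": [
        "time to revenue target",
        "price differential for higher-quality features",
        "customers served versus future demand",
        "cost reduction per pod",
    ],
}
```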
So, what's next? The future, of course, is difficult to predict, let alone over the next five years. But we hope that with our vision of a new software development model, even as agents increase in intelligence and humans become more fluent in AI, this model still stands. Hopefully this model of shorter sprints and smaller teams, but a larger number of them, will set enterprises up for success in the long term.
So, we'll just leave you with some key takeaways. Start now, I would say to our clients. This is a human change, it takes time, it's a big change, and it's going to be a journey that everyone needs to go on. I think it's also important to figure out which model works for you and to set a really bold ambition. And with that, thank you so much for listening to us. We have an article here if you're interested in more of the research we've conducted. Thank you so much for having us.
>> [music]