Proactive Agents – Kath Korevec, Google Labs

Channel: aiDotEngineer

Published at: 2025-12-13

YouTube video id: v3u8xc0zLec

Source: https://www.youtube.com/watch?v=v3u8xc0zLec

[music]
>> I'm so excited to be here. I love New
York and I love meeting everybody here.
And I am Kath Korevec. I'm from Google
Labs, and I work on this little team
called Ada. I'm going to be talking
about some of the stuff that we've been
doing on this project called Jules.
So, a few months ago, the dishwasher in
our household broke. And while it was
being repaired, my husband decided that
he was going to do all the dishes. He
told me he was going to do this, but
every single night I found myself
reminding him to do the dishes. And you
can imagine that got old pretty fast.
And I realized that even though I wasn't
physically washing the dishes, I was
still carrying this mental load. I know
a lot of you can probably relate to
this. I was keeping track of whether or
not that task was done, following up,
making sure that things kept moving.
And I realized in that moment that
that's exactly where we are with
asynchronous agents today. They can
handle some of the work, but we're still
the ones as developers carrying that
mental load and monitoring them.
So, here's the truth: humans are serial
processors, not parallel ones. We can
juggle multiple goals, but we execute
them in sequence, not all at once. When
you manually kick off a task in Jules,
you are usually waiting to be able to
move on. And it's that pause, that gap
in attention, where we really lose
momentum. This is actually backed up by
science: we think we're multitaskers,
but we're actually switching between
many tasks very rapidly, and that
switching comes with a huge cost. It can
cost up to 40% of your productive time.
That's like half a day lost to switching
contexts and reloading. So, if humans
are unitaskers, what's the solution with
agents? For async agents to succeed,
developers can't be expected to babysit
them.
We've all seen that post on Twitter of
16 different Claude Code tasks running
in parallel in 16 different terminals
across three huge monitors. And when I
first saw this, I thought, God forbid
that is the DevEx of the future.
I don't want to manage work. I don't
want to manage my agents. I want to be a
coder. I want to build. So, we need
collaborators in our system that we can
trust. Agents that really understand
context, can anticipate our needs, and
know when to step in.
And I think we're finally reaching that
point with models, where they're getting
better and better at executing
end-to-end as long as they understand
what our goals are clearly. And that's
where trust really becomes the unlock.
You can trust the system to know what's
missing, to fill in the gaps, and to
keep progress moving forward while you
focus on what matters most. Essentially,
we want Jules to do the dishes without
being asked.
So, most AI developer tools today are
fundamentally reactive. You open up your
CLI or your IDE and you ask the agent to
do something and it responds. Or it
waits for you to start typing and then
it auto-completes a suggestion. And
there's a benefit to this model. It's
very efficient. It only uses compute
when you explicitly ask for it. But the
real question I'm asking myself is: is
this how I want to manage AI? If you
think about the future, imagine a world
where compute is not a limiting factor
anymore.
Instead of a single reactive assistant
waiting for instructions, you could have
dozens of small proactive agents working
with you in parallel, quietly looking
for patterns, noticing friction, and
taking on the boring tasks that you
don't want to do before you even ask.
They can do things like fixing
authentication bugs that you've been
avoiding, updating configs, flagging
potential errors, preparing migrations,
and all of this can happen in the
background, triggered off of things in
my natural workflow.
So, I really think there are four
essential ingredients that make up
proactive systems today. There's
observation: the agent has to
continually understand what is
happening, what your code changes are,
what your patterns are, what your
workflow is, etc., to get context about
your entire project. Then there's
personalization, and this one's
difficult: it has to learn how you work,
what you care about, what you tend to
ignore, what your preferences are, the
code that you absolutely don't want it
to ever touch.
Then it has to be timely as well. If it
comes in too soon, it's going to
interrupt you, and if it's too late,
then the moment is lost. And it also has
to work seamlessly across your workflow.
It has to insert itself into the spaces
where you naturally work already, in
your terminal, in your repository, in
your IDE, not forcing you to go
somewhere else to some separate
application that you forgot about. So,
bringing all these tools together, you
can imagine, is not trivial.
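The four ingredients above can be sketched as a single decision function. This is a minimal, hypothetical sketch in Python; the names (`Observation`, `DeveloperProfile`, `should_surface`) and thresholds are my own illustration, not part of any real Jules API:

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    kind: str          # e.g. "missing_test", "unused_dependency"
    path: str          # where in the project it was noticed
    confidence: float  # 0.0..1.0: how sure the agent is it can fix this

@dataclass
class DeveloperProfile:
    # Personalization: what you tend to ignore, and code it must never touch.
    ignored_kinds: set = field(default_factory=set)
    protected_paths: set = field(default_factory=set)

def should_surface(obs: Observation, profile: DeveloperProfile,
                   user_is_idle: bool) -> bool:
    """Combine the ingredients: observation (obs), personalization
    (profile), and timeliness (user_is_idle). Workflow fit is implied
    by surfacing inside the tool you already use."""
    if obs.kind in profile.ignored_kinds:
        return False
    if any(obs.path.startswith(p) for p in profile.protected_paths):
        return False
    # Only interrupt when the moment is right and confidence is high enough.
    return user_is_idle and obs.confidence >= 0.7
```

The point of the sketch is that proactivity is a gating problem: the agent may notice many things, but only a filtered, well-timed subset should ever reach you.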
>> [laughter]
>> So, you want to be able to ask your
agent to understand your workflow and
anticipate your needs, and then
intervene at exactly the right moment
without breaking your workflow.
And that's when it really starts to feel
like magic. The interesting thing is
these proactive systems are all around
us today. One of my favorite examples is
Google Nest where you put it in your
house, you install it, and then you
configure it, and then it starts to
learn your habits as you leave the
house, as you come back,
as you go to sleep, as you wake up in
the morning. And then pretty soon, you
don't have to think about climate
control in your house anymore because
it's learned what your habits are.
Another one is your own body. Your heart
rate elevates as you go for a run or
start to work out. Or your body
anticipates that you're about to fall
and reacts before you consciously think,
I'm going to put my hand out.
So, when you look at it like that,
proactivity for AI is actually not that
futuristic. It's very familiar, and it
is very human.
And that's exactly the point. What we're
building is tools that behave more like
good collaborators and less like
command line utilities.
So, we're already doing this in this
tool called Jules, which is a proactive,
asynchronous, autonomous coding agent
from Google Labs.
And we're doing this in three levels of
proactivity. Level one is where
collaboration really starts to emerge,
and this is how Jules works today. It
can detect things like missing tests,
unused dependencies, unsafe patterns,
and then it starts to automatically fix
those things as it's doing the other
tasks that you've asked it to do.
This is sort of like an attentive sous
chef in your workflow: it's keeping the
kitchen clean, the knives sharp, the
kitchen stocked, so that you can focus
on what comes next. And that's the
beginning of proactive software. At
level two, the agent becomes more
contextually aware of the entire
project. It observes how you work and
the code you write. If you're a back-end
engineer, maybe you need help with
React. If you're a designer, maybe it'll
help write the database schema. It
learns what your frameworks are, what
your deployment style is, etc. This is
the kitchen manager, the person in your
workflow keeping the rhythm and
anticipating what you need next.
And then comes level three, and this is
what we're working on pretty hard right
now going into December. I'll show you a
little bit of what we're going to be
shipping in December in a minute, but
level three is where things start to
converge around that context.
It's where the agent starts to
understand not just context, but also
consequence: how these choices are
actually affecting the users of your
products, the performance, and the
outcomes. At that level, we have Jules.
We also have an agent called Stitch,
which is a design agent, and another one
we're building called Insights, which is
a data agent, and they're all coming
together to build this collective
intelligence across your application.
Jules can see what's breaking in the
software, Stitch understands how users
are interacting with it, and Insights
connects behaviors from real-world
signals like analytics, telemetry, and
conversion rates. Together, they can
propose improvements across the
boundaries of how the whole system works,
doing things like performance fixes to
improve UX and design changes to prevent
regressions. And all of that is
organized based on live data.
So, the trick here is that the human
stays firmly in the loop. You're
observing what the agents are doing,
you're refining when you need to
intervene, and you're redirecting when
it has been misdirected. So, level three
isn't really about autonomy anymore.
It's actually about alignment to your
project: AI agents and humans
collaborating together across the full
life cycle of your project.
So, right now Jules is focused on this
code awareness piece. It understands the
environment, the frameworks, and the
project structures. And we're moving
towards more of that system awareness.
Among the things we're introducing in
Jules now, we've added something called
memory, which I'm sure a lot of you are
familiar with. It's the ability for
Jules to write its own memories; you can
edit them and interact with them, and it
builds this memory, context, and
knowledge of your project as you work
with it. We've added a critic agent,
which works adversarially with Jules to
make sure that the code is high quality,
but also does a full code review. And
we've added verification, where Jules
will write a Playwright script, take a
screenshot, and put that back into the
trajectory for you to validate. We're
also adding a to-do bot that will look
through your code and your repository,
pick up on anything where you said, this
is a to-do I want to get to in the
future, and start to proactively work on
those things with that context. We're
also adding things like best practices,
where Jules will understand best
practices and start to suggest those,
and also environment setup: we have an
environment agent that we use internally
for running evals, and we're extending
that externally to better understand how
your environments work and set those up
for you. And we're also adding something
called just-in-time context. It's like a
Jules cheat sheet: if it's doing
something very specific and gets stuck,
it can just immediately look at that
cheat sheet instead of reaching out to
you.
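As a rough illustration of what the first step of a to-do bot like the one described might look like, here is a hedged sketch that scans a repository for TODO comments. The pattern, file filter, and output shape are my assumptions, not the actual Jules implementation:

```python
import re
from pathlib import Path

# Matches Python-style comments like "# TODO: add retries" or "# todo fix this".
TODO_PATTERN = re.compile(r"#\s*TODO[:\s]\s*(.+)", re.IGNORECASE)

def collect_todos(repo_root: str) -> list[dict]:
    """Walk a repository and collect TODO comments with their locations,
    giving a proactive agent a work queue to pick tasks from."""
    todos = []
    for path in Path(repo_root).rglob("*.py"):
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, 1):
            m = TODO_PATTERN.search(line)
            if m:
                todos.append({"file": str(path), "line": lineno,
                              "text": m.group(1).strip()})
    return todos
```

A real system would go further, attaching surrounding code as context for each item, but the core idea is the same: turn comments you left for your future self into tasks the agent can pick up unprompted.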
So this is all moving Jules very close
to being that proactive teammate, not
just a reactive assistant. Okay, so this
morning I was talking to my team back in
San Francisco and I was thinking, "Okay,
I'm going to do a live demo." But the
live demo gods did not align with me
this morning. We still have CLs that are
being pushed to staging right now. So
I'm going to walk you through a little
bit of this, and if you know Jud, he's
going to be talking tomorrow, I think.
We're going to affectionately try to fix
Jud's code here. So this is a view of
proactivity, and this is Jules, where
you prompt it. The first thing that
happens when you configure and enable
proactivity is that Jules will index
your entire code base. It'll index your
directory and start looking for things
that it can do, and that will show up on
the screen. So right here we're looking
at this repository, ADK Python. It's
indexed the repository, and it's found a
bunch of to-dos. It's found a bunch of
best practices that it can update, and
it's giving me some signal about what
it's finding. You can see the signal is
high confidence, medium confidence, and
low. So it's actually telling me what it
thinks it can achieve based on what's in
my code and what it wants to do. It has
high confidence in green, medium in
purple, and low in yellow, way down at
the bottom.
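That high/medium/low signal could come from bucketing a numeric confidence score. A small illustrative sketch; the thresholds and function names here are my assumptions, not the actual Jules logic:

```python
def confidence_bucket(score: float) -> str:
    """Map a 0..1 confidence score to the three signal levels shown in the UI."""
    if score >= 0.8:
        return "high"    # shown in green
    if score >= 0.5:
        return "medium"  # shown in purple
    return "low"         # shown in yellow

def group_tasks(tasks: list[tuple[str, float]]) -> dict[str, list[str]]:
    """Group proposed tasks by confidence bucket, highest-scoring first,
    so the surest wins surface at the top of the list."""
    groups: dict[str, list[str]] = {"high": [], "medium": [], "low": []}
    for name, score in sorted(tasks, key=lambda t: t[1], reverse=True):
        groups[confidence_bucket(score)].append(name)
    return groups
```

The design choice this reflects is the one described in the demo: rather than asking you to evaluate each suggestion cold, the agent pre-sorts its own proposals by how likely it thinks it is to succeed.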
>> And so I can go through this and
manually click these and say I want to
start these, so I don't have to think
about the prompt. I don't have to look
at the code. It's less cognitive load.
We're working on something to start
these automatically, so that's coming in
the future, but I can also delete these.
I can say, "Hey, this one isn't for me.
It isn't good."
Once it gets started on a task, I can
drill into it and see a little bit more.
I can peek into the code that it's
suggesting it work on. I can find the
location of that code, and it also gives
me some rationale about why it wants to
work on that code, what it's doing, etc.
So it's giving me a lot more context and
helping me trust that it knows what to
do here.
Okay, so that's proactivity. That's
coming in December, and hopefully we'll
be able to give that to everybody here.
We're very excited about it. And I want
to tell you a little story about
something my husband and I were working
on, just to kind of wrap things up. We
tinker a bunch with hardware, and we
live on this slow street in the middle
of San Francisco, in the Haight-Ashbury
district. On Halloween we get a lot of
people walking by our house, so we were
trying to take advantage of that with
our Halloween decorations.
And so we built this 6-foot animatronic
head that sits in front of our house,
this old Victorian house. He sculpted it
out of foam, epoxy, and fiberglass. Our
kids lovingly called this the bald head.
And if you've ever seen Pee-wee Herman
from the '80s, it's based off of the
head from Pee-wee's Big Adventure.
So while my husband was doing this, I
was spending my time working with Jules
on updating the firmware, controlling
the stepper motors, and working on the
LEDs and the sensors.
For me, the fun part is really getting
creative with what the LEDs are doing.
So I wanted to focus on that, the LED
animations. But I ended up spending most
of my time actually fixing bugs,
swapping libraries, and doing things
like that. So what I would do is: I'd
prompt Jules, I'd wait 10 minutes, and
then I'd repeat. And I found that
process very tedious.
What I actually wanted was for Jules to
do the research. I wanted it to handle
the ugly parts, researching how to fix a
bug, doing the debugging itself. I
wanted it to do this so that I could
focus on the creative parts. I wanted
the eyes to move and follow people as
they walked down the street, and to have
lasers coming out of its eyes and stuff.
I mentioned it was Halloween. It was
very scary.
But I couldn't really do as much of
that, and I ended up actually not
shipping as much as I wanted to with
this animatronic bald head.
And so it's that gap that we actually
want to close with Jules. It's the space
between tool friction and creative
freedom that we're trying to unlock with
these kinds of proactive agents.
So here's what I really want you to take
away. I give this advice to the folks on
the Jules team a lot: the products we
build today won't be the products we
have in the future. I think a lot of us
know that, but in reality I want
everybody in this room, and everyone
building and working with AI, to be able
to take those big steps. The patterns we
rely on today, your IDEs, even the code,
how we think about the code itself,
might not exist a year from now. Might
not exist six months from now.
And that's the exciting part for me. We
get to invent the future right now. We
get to describe and decide how software
is made and built, all the people in
this room. So my challenge to you is to
not be afraid to question the old ways
of how you're building software, because
really the future is coming faster than
any of us know. It's probably already
here, and the cool thing is we get to
build it together.
Thank you.