Hacking Subagents Into Codex CLI — Brian John, Betterup

Channel: aiDotEngineer

Published at: 2025-11-24

YouTube video id: 5eJqXtevlXg

Source: https://www.youtube.com/watch?v=5eJqXtevlXg

Hi everybody, my name is Brian John and
I'm excited today to talk to you about
hacking sub agents into the Codex CLI. So
who am I? I'm a principal fullstack
engineer. My current focus at work is AI
enablement for R&D. So think helping our
R&D team members get their work done
faster and with higher quality using AI.
The company I work for is BetterUp. It's
an awesome place to work. We've been
using AI since the very beginning. I've
been there for over eight years now,
which is longer than any place I've ever
worked before. And our mission is to
help people everywhere live their lives
with better purpose, clarity, and
passion. If that sounds interesting to
you, and you want to work on cool stuff
with LLMs,
please hit me up. I'll add my contact
info in the last slide.
So, why would we want to hack sub agents
into Codex CLI?
Well, I've been using Claude Code as my
daily driver since the very beginning.
It's a great tool. It's got tons of
bells and whistles. It's got great
models,
and I use sub agents all the time. But I
don't want to be locked in to one tool,
and I really don't want to be locked in
to one model family.
I wanted to be able to use other tools,
particularly Codex CLI, because the
models look really good and I want to be
able to still use sub agents with them so
that I can use my workflows
with other tools.
Context management. So, as you all know,
sub agents are amazing for context
management. The main agent can give a
problem to a sub agent. It can go off,
do its work, use its tokens, and pass
just the answer back to the main agent.
And all the context that got used up by
the sub agent doesn't end up in your main
context window, which is incredible.
And I don't think I have to say any more
about this one. We've all seen this way
too many times and it gets annoying.
And I have to give credit where credit
is due. This talk by Dex Horthy changed
the way that I work with AI.
The workflows he proposes here I found
to be really effective especially in
working with large code bases. I'd
recommend you check out this talk. He's
also talking at AI Engineer Code this
year and I recommend you check out that
one too because I'm sure it's going to
be great.
All right, so let's talk about design.
At the end of the day, a sub agent is
really simple. It's just another
instance of the main agent. So our
design can also be really simple.
In this case, we're going to have our
parent Codex session.
We're going to have it run a script.
It's just going to be a wrapper script
that takes care of
figuring out what agent to run.
It's going to build the prompt, etc.
It's going to kick off codex exec. So
that child Codex is going to run as the
sub agent. It's going to respond to the
prompt. It's going to do its work and
it's going to write its answer into a
file, and then our wrapper script is
going to read that file and it's going
to print that result to standard out and
give it back to the parent Codex
session.
Pretty straightforward.
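
The flow described above can be sketched roughly as follows. This is a minimal sketch, not the code from the talk's repository. The function names and the answer file name are hypothetical, and the `--sandbox` and `--output-last-message` flags are assumptions about how the child's answer ends up in a file (the talk does not say), so check them against `codex exec --help`.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def build_child_command(prompt: str, output_file: Path) -> list[str]:
    """Build the child `codex exec` invocation (the flags are assumptions)."""
    return [
        "codex", "exec",
        "--sandbox", "workspace-write",
        "--output-last-message", str(output_file),
        prompt,
    ]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the child process and fail loudly on error."""
    subprocess.run(list(cmd), check=True)


def run_subagent(
    prompt: str,
    workdir: Path,
    runner: Callable[[Sequence[str]], None] = _run,
) -> str:
    """Run the child agent, then read its answer back from a file."""
    output_file = workdir / "subagent_answer.txt"  # hypothetical file name
    runner(build_child_command(prompt, output_file))
    # Only this text goes back to the parent session, not the child's context.
    return output_file.read_text(encoding="utf-8").strip()
```

A wrapper script would then call `print(run_subagent(prompt, Path.cwd()))` so the parent session sees only the final answer on standard out. The `runner` parameter exists just so the flow can be exercised without launching Codex.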
Well, this is simple, so it should be
easy, right? Well, that's what I thought
too. And I started to get all these
errors from Codex when I tried it.
Codex's sandbox really seems to not want
to let you do this. Now, you can of
course run it with dangerously skip
permissions or whatever. I don't do
that,
but to get it to work with the normal
set of permissions actually was really,
really hard, and I banged my head against
the wall a long time trying to get this
to work.
So, figuring out the minimum required
permissions is probably the hardest part
about this: getting the combination just
right. On the parent, you need at least
a sandbox of workspace-write to be able
to run the codex command. You can
always run that dangerously-whatever
command if you want. Again, I
don't really do that. The child process
is a little bit trickier. The sandbox
prevents its access to the OpenAI
credentials in your home directory, since
that's outside of the workspace.
You need at least sandbox workspace-write
again so that it can write the
file that the wrapper script is going
to read, and you need to disable this
thing called the rollout recorder,
which is like a logging thing, because
the parent sandbox, again,
prevents any subcommands from
accessing the file system
outside of the workspace.
All right, before we go any further, I
have to give a quick note about
security.
Meta recently wrote a great paper called
the Agents Rule of Two that I think
explains this really, really well. And
what it says is there's three things you
need to care about with your agent when
it comes to security: whether it's
processing untrustworthy input, whether
it has access to sensitive systems or
private data, and whether it can change
state or communicate externally.
In our case, we're not processing
untrustworthy inputs.
We do have access to sensitive systems
or private data because we're probably
working with a proprietary codebase.
And it can change state and it also can
communicate externally. Now the
state that it can change is really kind
of dependent on your system. In my case,
it's really not very high risk and the
communication it does externally is just
to OpenAI's API endpoint. So again,
not a major risk, I would say. So that
puts us in the lower risk category. But
importantly,
lower risk does not mean no risk. So
your mileage may vary here. you need to
make your own determination on if this
is something you feel comfortable with.
With that, let's move forward.
All right. So, to get Codex to be able
to use sub agents with this wrapper
script and everything, we have to tell
it how to run them. So in our AGENTS.md
we're going to have just a little bit of
information that tells Codex, "hey,
when I say use the whatever sub agent, go
and actually run this script
with these commands,"
and that's how you do it.
Also, we have to tell it when to run sub
agents. So that would be, you know, when
the user asks, or just when you think
it would be helpful. Then we want to tell it what
sub agents are available and what they
do.
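
As a rough illustration of those three pieces (how to run, when to run, what is available), the sketch below generates such a section as text. The wording, the file paths, and the command are all hypothetical; the actual AGENTS.md in the talk's repository may look quite different.

```python
def render_agents_section(command: str, agents: dict[str, str]) -> str:
    """Render a hypothetical AGENTS.md section describing the sub agents."""
    lines = [
        "## Sub agents",
        "",
        "When to use: when the user asks for a sub agent, "
        "or when delegating a task would help.",
        "",
        "How to run:",
        "1. Write the agent name to `.subagent/agent_name.txt`.",
        "2. Write the user's query to `.subagent/query.txt`.",
        f"3. Run exactly this command: `{command}`",
        "",
        "Available sub agents:",
    ]
    # One bullet per agent: its name and a one-line description.
    lines += [f"- `{name}`: {description}" for name, description in agents.items()]
    return "\n".join(lines) + "\n"


section = render_agents_section(
    "python run_subagent.py",  # hypothetical wrapper command
    {
        "word-counter": "counts the words in a piece of text",
        "file-writer": "writes given text to a file",
    },
)
```

Printing `section` gives a block of text that could be pasted into AGENTS.md and adjusted by hand.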
All right, with that, let's do a quick
demo.
I've put together a really quick and
small proof of concept repository. It's
open source. You can go and take a look
at it yourself. I'll have the URL at the
end of the talk. Let's just take a look
at what's in here.
So, first of all, let's take a look at
our agents.
I just created a couple of toy agents
here. Let's go take a look at how
they're defined.
You can see here each agent has a name.
It also has a reasoning effort. So,
depending on what kind of work it's
doing, you can give it a light, a medium,
or a high reasoning effort,
whatever you think is appropriate. Then
you just give it, you know, the prompt
for the agent. So very similar to
how Claude Code sub agents work. In
this case, it's just counting words. You
know, this other one is a file writer
agent. Just going to take some text and
put it in a file. Don't need much
reasoning for that.
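
One way to model an agent definition like the ones shown is sketched below. It assumes three fields (name, reasoning effort, prompt) as described; the class name, the allowed effort values, the prompts, and the mapping to a `-c model_reasoning_effort=...` override are assumptions, not the repository's actual format.

```python
from dataclasses import dataclass

# Assumed set of effort levels; check which values your Codex version accepts.
ALLOWED_EFFORTS = ("low", "medium", "high")


@dataclass(frozen=True)
class AgentDefinition:
    """A sub agent: a name, a reasoning effort, and a prompt."""

    name: str
    reasoning_effort: str
    prompt: str

    def __post_init__(self) -> None:
        if self.reasoning_effort not in ALLOWED_EFFORTS:
            raise ValueError(f"unknown reasoning effort: {self.reasoning_effort!r}")

    def config_override(self) -> list[str]:
        """Arguments asking the child Codex for this effort (assumed config key)."""
        return ["-c", f'model_reasoning_effort="{self.reasoning_effort}"']


# Two toy agents in the spirit of the demo; the prompts are made up.
AGENTS = {
    agent.name: agent
    for agent in (
        AgentDefinition("word-counter", "medium",
                        "Count the words in the given text and reply with the number."),
        AgentDefinition("file-writer", "low",
                        "Write the given text to the requested file."),
    )
}
```

Validating the effort level up front means a typo in a definition fails before a child process is ever started.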
All right. So now let's look at our
wrapper script.
It's really small, only 72 lines.
It basically just takes in the inputs.
It's going to call this agent executor
Python class, which I'll show in just a
minute. Also very small, and it's going
to return the agent's output to
standard out so that the main agent can
see it. Let's look at that agent
executor class.
Not going to go through this whole
thing. Again, it's pretty small.
basically just kicks off the child sub
agent with the proper permissions and
with the right reasoning effort
and it disables the rollout recorder all
that kind of stuff just does all that
for you. So pretty handy. One thing that
I think I didn't cover when we looked at
AGENTS.md,
and it's kind of important, is this
part. So when we're telling Codex how
to invoke the sub agent, we're going to
have it write the agent name to a file.
We're going to have it write the user's
query to a file and then we're going to
have it run this command.
You know, another alternative to this
would be to actually pass the agent name
and the query as command arguments. The
reason why we don't want to do that is
because of Codex's permissioning system.
As long as the command looks exactly the
same, you only have to grant permission
once. But if you have different
arguments to the command, you have to
approve it every time. So it gets really
annoying if you have to approve every
time that Codex wants to call a sub
agent.
So in this case, we make the command
look exactly the same. Codex is just
going to run it.
Now, if you run again with dangerously
skip permissions or whatever, you don't
have to worry about this.
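
The file-based handoff can be sketched like this. It is only an illustration of the idea that the command line never changes; the directory, file names, and command are hypothetical and are not taken from the repository.

```python
from pathlib import Path

REQUEST_DIR = Path(".subagent")                # hypothetical location
AGENT_FILE = "agent_name.txt"                  # hypothetical file name
QUERY_FILE = "query.txt"                       # hypothetical file name
FIXED_COMMAND = ["python", "run_subagent.py"]  # hypothetical; identical on every call


def write_request(agent_name: str, query: str, request_dir: Path = REQUEST_DIR) -> None:
    """What the parent does: put the varying inputs in files, not in arguments."""
    request_dir.mkdir(parents=True, exist_ok=True)
    (request_dir / AGENT_FILE).write_text(agent_name + "\n", encoding="utf-8")
    (request_dir / QUERY_FILE).write_text(query, encoding="utf-8")


def read_request(request_dir: Path = REQUEST_DIR) -> tuple[str, str]:
    """What the wrapper does: recover the inputs from the same files."""
    agent_name = (request_dir / AGENT_FILE).read_text(encoding="utf-8").strip()
    query = (request_dir / QUERY_FILE).read_text(encoding="utf-8")
    if not agent_name:
        raise ValueError("agent name file is empty")
    return agent_name, query
```

Because everything that varies lives in the two files, the approved command is the same for every sub agent and every query, so a one-time approval covers all later calls.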
But all right, let's go on. Oh, then
we've also got this wrapper script
around Codex. So, let's take a look at
that real quick. Super simple. What
it does is it takes the Codex home
files from your home directory. It's
going to sync them into a subdirectory
so it has access to them, and it's going
to set CODEX_HOME to that directory.
And it's just going to launch Codex. In
this case, I'm launching in full auto
mode, which is just like shorthand for
workspace-write plus, I think, approval
on request or something like that. I
can't remember which one. But pretty
straightforward. Not much going on here.
Really not much code.
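
A launcher along these lines might look like the sketch below. It assumes the Codex home directory is `~/.codex`, that a plain directory copy is an acceptable way to sync it, and that `codex --full-auto` is the launch command; the `.codex-home` subdirectory name and the function names are hypothetical.

```python
import os
import shutil
import subprocess
from pathlib import Path


def sync_codex_home(source: Path, workspace: Path) -> Path:
    """Copy the Codex home files into the workspace so the sandbox can reach them."""
    target = workspace / ".codex-home"  # hypothetical subdirectory name
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


def launch_env(codex_home: Path) -> dict[str, str]:
    """Environment for the parent session, with CODEX_HOME pointing at the copy."""
    env = dict(os.environ)
    env["CODEX_HOME"] = str(codex_home)
    return env


def launch(workspace: Path) -> int:
    """Sync, set CODEX_HOME, and start Codex (not called in this sketch)."""
    home_copy = sync_codex_home(Path.home() / ".codex", workspace)
    completed = subprocess.run(
        ["codex", "--full-auto"], cwd=workspace, env=launch_env(home_copy)
    )
    return completed.returncode
```

Since the copy contains credentials, it is worth keeping that subdirectory out of version control.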
All right, let's go ahead and launch
this.
Okay,
now let's just give it just a quick
query.
I'm going to tell it to use its word
counter sub agent. Have it go off and do
that.
You're going to see it
figure out that it needs to run this
agent exec. It's going to go ahead and
put the name of the agent in a file.
It's going to put the query in a file. Then
it's going to ask me for permission to
run it. And it's really important here
that I say "yes, and don't ask again" for
this command. That way it's not going to
ask me every time it has to run a sub
agent.
You'll notice that it's running
everything in serial here. Codex does
not have the ability to run things
asynchronously like Claude does. So, this
is slower. And Codex in general, if
you've used it, I think you'll find it's
slower overall than Claude Code. But
I think that's really kind of
intentional. It seems like Codex is really
kind of meant to be more of a
hands-off, unattended type of tool,
whereas Claude Code is meant to be more
iterative, and so, you know, I
think that's actually okay. I found this
okay for me the way that I've used
Codex. All right, so we can see we got
that result back printed to standard out
here, and then
Codex just gave us back the answer. So,
let's just do one more with this file
writer sub agent.
Again, it's going to do the same thing.
It's going to write that agent name into
a file. It's going to write
the query into a file. Then, it's going
to call that same command. It will not
ask for permissions this time.
Oh, and we're using the timeout 600 here
because some of these agents can
actually take a long time to run. If
you're having it do a big task that's
going to have it look across a whole
codebase and you have a large codebase,
it can take up to 10 minutes. I've
actually seen them take longer, up to 20
minutes sometimes. So, you might even
want a longer timeout here. This is what
I set for this example. In this case,
this is a pretty easy one. So, it only
took about 40 seconds. All right. So, it
wrote the file. Just go ahead and verify
that.
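
The demo's command is prefixed with `timeout 600`. A rough Python equivalent of bounding a sub agent's run time is sketched below; the function name and the message returned on timeout are hypothetical, and the demo itself may simply rely on the shell's `timeout` command.

```python
import subprocess
from typing import Sequence


def run_with_timeout(cmd: Sequence[str], timeout_seconds: float = 600.0) -> str:
    """Run a command, returning its stdout or a short note if it ran too long."""
    try:
        completed = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left behind.
        return f"sub agent timed out after {timeout_seconds:g} seconds"
    return completed.stdout
```

For tasks that scan a large codebase, the default here would need raising, for example to 1200 seconds to cover the 20-minute runs mentioned above.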
All right.
All right, that's all I have. You can
find the code at that URL.
You can find BetterUp at betterup.com.
If you have any questions for me, you
can use my email address or you can DM
me on X. I don't post anything on X, so
really no reason to follow me, but go
ahead if you want. And I hope this was
helpful for you. And again, if BetterUp
sounds like an interesting place to you,
please hit me up. Have a great day.