Hacking Subagents Into Codex CLI — Brian John, Betterup
Channel: aiDotEngineer
Published at: 2025-11-24
YouTube video id: 5eJqXtevlXg
Source: https://www.youtube.com/watch?v=5eJqXtevlXg
Hi everybody, my name is Brian John, and I'm excited today to talk to you about hacking subagents into the Codex CLI.

So who am I? I'm a principal full-stack engineer. My current focus at work is AI enablement for R&D. So think helping our R&D team members get their work done faster and with higher quality using AI. The company I work for is BetterUp. It's an awesome place to work. We've been using AI since the very beginning. I've been there for over eight years now, which is longer than any place I've ever worked before. And our mission is to help people everywhere live their lives with better purpose, clarity, and passion. If that sounds interesting to you, and you want to work on cool stuff with LLMs, please hit me up. I'll add my contact info in the last slide.

So, why would we want to hack subagents into Codex CLI? Well, I've been using Claude Code as my daily driver since the very beginning. It's a great tool. It's got tons of bells and whistles. It's got great models, and I use subagents all the time. But I don't want to be locked in to one tool, and I really don't want to be locked in to one model family. I wanted to be able to use other tools, particularly Codex CLI, because the models look really good, and I want to still be able to use subagents with them so that I can use my workflows with other tools.

Context management. So, as you all know, subagents are amazing for context management. The main agent can give a problem to a subagent. It can go off, do its work, use its tokens, and pass just the answer back to the main agent. And all that context that got used up by the subagent doesn't end up in your main context window, which is incredible.

And I don't think I have to say any more about this one. We've all seen this way too many times and it gets annoying.

And I have to give credit where credit is due. This talk by Dex Horthy changed the way that I work with AI.
The workflows he proposes here I found to be really effective, especially in working with large codebases. I'd recommend you check out this talk. He's also talking at AI Engineer Code this year, and I recommend you check out that one too, because I'm sure it's going to be great.

All right, so let's talk about design. At the end of the day, a subagent is really simple. It's just another instance of the main agent. So our design can also be really simple. In this case, we're going to have our parent Codex session. We're going to have it run a script. It's just going to be a wrapper script that takes care of figuring out what agent to run, building the prompt, etc. It's going to kick off codex exec. So that child Codex is going to run as the subagent. It's going to respond to the prompt. It's going to do its work, and it's going to write its answer into a file. Then our wrapper script is going to read that file, print that result to standard out, and give it back to the parent Codex session. Pretty straightforward.

Well, this is simple, so it should be easy, right? That's what I thought too. And I started to get all these errors from Codex when I tried it. Codex's sandbox really seems to not want to let you do this. Now, you can of course run it with dangerously skip permissions or whatever. I don't do that, but getting it to work with the normal set of permissions was actually really, really hard, and I banged my head against the wall for a long time trying to get this to work.

So, figuring out the minimum required permissions is probably the hardest part about this: getting the combination just right. On the parent, you need at least a sandbox of workspace-write to be able to run the codex command. You can always run that dangerously-whatever-whatever command if you want. Again, I don't really do that. The child process is a little bit trickier.
The sandbox prevents its access to the OpenAI credentials in your home directory, since that's outside of the workspace. You need at least a sandbox of workspace-write again, so that it can write the file that the wrapper script is going to read. And you need to disable this thing called the rollout recorder, which is like a logging thing, because the parent sandbox, again, prevents file system access for any subcommands outside of the workspace.

All right, before we go any further, I have to give a quick note about security. Meta recently wrote a great paper called the Agents Rule of Two that I think explains this really, really well. And what it says is there are three things you need to care about with your agent when it comes to security: whether it's processing untrustworthy input, whether it has access to sensitive systems or private data, and whether it can change state or communicate externally. In our case, we're not processing untrustworthy inputs. We do have access to sensitive systems or private data, because we're probably working with a proprietary codebase. And it can change state, and it can also communicate externally. Now, the state that it can change is really dependent on your system. In my case, it's really not very high risk, and the communication it does externally is just to OpenAI's API endpoint. So again, not a major risk, I would say. So that puts us in the lower risk category. But importantly, lower risk does not mean no risk. So your mileage may vary here. You need to make your own determination on whether this is something you feel comfortable with. With that, let's move forward.

All right. So, to get Codex to be able to use subagents with this wrapper script and everything, we have to tell it how to run them.
So in our AGENTS.md, we're going to have just a little bit of information that tells Codex: hey, when I say use the whatever subagent, go and actually run this script with these commands, and that's how you do it. We also have to tell it when to run subagents. So that would be when the user asks, or just when you think it's helpful. Then we want to tell it what subagents are available and what they do.

All right, with that, let's do a quick demo. I've put together a really quick and small proof-of-concept repository. It's open source. You can go and take a look at it yourself. I'll have the URL at the end of the talk. Let's just take a look at what's in here.

So, first of all, let's take a look at our agents. I just created a couple of toy agents here. Let's take a look at how they're defined. You can see here each agent has a name. It also has a reasoning effort. So, depending on what kind of work it's doing, you can give it a light, a medium, or a high reasoning effort, whatever you think is appropriate. Then you just give it the prompt for the agent. So it's very similar to how Claude Code subagents work. In this case, it's just counting words. This other one is a file writer agent. It's just going to take some text and put it in a file. You don't need much reasoning for that.

All right. So now let's look at our wrapper script. It's really small, only 72 lines. It basically just takes in the inputs. It's going to call this agent executor Python class, which I'll show in just a minute. That's also very small, and it's going to return the agent's output to standard out so that the main agent can see it. Let's look at that agent executor class. I'm not going to go through this whole thing. Again, it's pretty small.
It basically just kicks off the child subagent with the proper permissions and with the right reasoning effort, and it disables the rollout recorder, all that kind of stuff. It just does all that for you. So, pretty handy.

One thing that I didn't cover when we looked at AGENTS.md, and it's kind of important, is this part. When we're telling Codex how to invoke the subagent, we're going to have it write the agent name to a file. We're going to have it write the user's query to a file, and then we're going to have it run this command. Another alternative would be to pass the agent name and the query as command arguments. The reason we don't want to do that is Codex's permissioning system. As long as the command looks exactly the same, you only have to grant permission once. But if you have different arguments to the command, you have to approve it every time. So it gets really annoying if you have to approve every time Codex wants to call a subagent. So in this case, we make the command look exactly the same, and Codex is just going to run it. Now, again, if you run with dangerously skip permissions or whatever, you don't have to worry about this.

But all right, let's go on. Oh, then we've also got this wrapper script around Codex. So, let's take a look at that real quick. Super simple. What it does is take the Codex home files from your home directory. It's going to sync them into a subdirectory so that Codex has access to them, and it's going to set CODEX_HOME to that directory. And then it's just going to launch Codex. In this case, I'm launching in full auto mode, which is just shorthand for workspace-write plus, I think, approval on request or something like that. I can't remember which one. But it's pretty straightforward. Not much going on here. Really not much code.

All right, let's go ahead and launch this. Okay, now let's just give it a quick query.
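
The launcher just described (sync the Codex home files into the workspace, point CODEX_HOME at the copy, start Codex in full auto mode) could look roughly like the sketch below. This is not the code from the repository: the `.codex-home` subdirectory name and both function names are made up for illustration, and the exact launch flags should be checked against `codex --help` for your installed version.

```python
from __future__ import annotations

import os
import shutil
from pathlib import Path


def sync_codex_home(source_home: Path, workspace: Path, subdir: str = ".codex-home") -> Path:
    """Copy the Codex home files into a subdirectory of the workspace.

    The subdirectory name is a hypothetical choice for this sketch.
    """
    target = workspace / subdir
    # dirs_exist_ok=True lets repeated launches refresh the copy in place.
    shutil.copytree(source_home, target, dirs_exist_ok=True)
    return target


def build_launch(source_home: Path, workspace: Path) -> tuple[list[str], dict[str, str]]:
    """Return the command and environment for launching the parent session."""
    target = sync_codex_home(source_home, workspace)
    env = dict(os.environ)
    # Child processes inherit this, so they find credentials inside the workspace.
    env["CODEX_HOME"] = str(target)
    # Assumed invocation of full auto mode; verify the flag for your version.
    command = ["codex", "--full-auto"]
    return command, env


# Usage (not run here): pass the result to subprocess.run, for example
#   command, env = build_launch(Path.home() / ".codex", Path.cwd())
#   subprocess.run(command, env=env, cwd=Path.cwd())
```

Because the copy contains credentials, you would probably want that subdirectory excluded from version control.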
I'm going to tell it to use its word counter subagent. Have it go off and do that. You're going to see it figure out that it needs to run this agent exec. It's going to go ahead and put the name of the agent in a file. It's going to put the query in a file. Then it's going to ask me for permission to run it. And it's really important here that I say yes, and don't ask again for this command. That way it's not going to ask me every time it has to run a subagent.

You'll notice that it's running everything in serial here. Codex does not have the ability to run things asynchronously like Claude does. So, this is slower. And Codex in general, if you've used it, I think you'll find is slower overall than Claude Code. But I think that's really kind of intentional. It seems like Codex is meant to be more of a hands-off, unattended type of tool, whereas Claude Code is meant to be more iterative. And so I think that's actually okay. I've found this okay for me, the way that I've used Codex.

All right, so we can see we got that result back, printed to standard out here, and then Codex just gave us back the answer.

So, let's just do one more with this file writer subagent. Again, it's going to do the same thing. It's going to write that agent name into a file. It's going to write the query into a file. Then it's going to call that same command. It will not ask for permission this time. Oh, and we're using the timeout 600 here because some of these agents can actually take a long time to run. If you're having it do a big task that's going to look across a whole codebase, and you have a large codebase, it can take up to 10 minutes. I've actually seen them take longer, up to 20 minutes sometimes. So, you might even want a longer timeout here. This is what I set for this example. In this case, this is a pretty easy one, so it only took about 40 seconds. All right. So, it wrote the file. Just go ahead and verify that. All right.
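
To make the flow from the demo concrete, here is a minimal sketch of a runner that takes its inputs from fixed files, so the command the parent runs never changes, and that applies the 600 second timeout. It is an illustration only, not the repository's script: the three file names, the `run_subagent` function, and the prompt wording are all invented here, and the real wrapper also looks up the agent's own prompt and reasoning effort, which this sketch leaves out.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import Callable, Sequence

# Hypothetical file names. The parent writes the first two, the child writes the third.
AGENT_FILE = "agent_name.txt"
QUERY_FILE = "agent_query.txt"
OUTPUT_FILE = "agent_output.txt"


def run_subagent(
    workdir: Path,
    child_command: Sequence[str],
    timeout_s: float = 600,
    runner: Callable[..., object] = subprocess.run,
) -> str:
    """Run one subagent and return the text it wrote to the output file."""
    agent = (workdir / AGENT_FILE).read_text().strip()
    query = (workdir / QUERY_FILE).read_text().strip()
    output_path = workdir / OUTPUT_FILE
    # Remove any earlier answer so a failed run cannot return stale output.
    output_path.unlink(missing_ok=True)

    prompt = (
        f"You are the '{agent}' subagent. "
        f"Write your final answer to {OUTPUT_FILE}.\n\n{query}"
    )
    try:
        # The prompt is appended as the last argument of the child command.
        runner([*child_command, prompt], cwd=workdir, timeout=timeout_s, check=True)
    except subprocess.TimeoutExpired:
        return f"subagent '{agent}' timed out after {timeout_s} seconds"
    except subprocess.CalledProcessError as err:
        return f"subagent '{agent}' failed with exit code {err.returncode}"

    if not output_path.exists():
        return f"subagent '{agent}' finished without writing {OUTPUT_FILE}"
    return output_path.read_text()
```

With the real CLI, `child_command` might be something like `["codex", "exec", "--sandbox", "workspace-write"]`, but treat that as an assumption and confirm the flags with `codex exec --help`. A script wrapping this function would print the returned string to standard out for the parent session to read.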
All right, that's all I have. You can find the code at that URL. You can find BetterUp at betterup.com. If you have any questions for me, you can use my email address or you can DM me on X. I don't post anything on X, so there's really no reason to follow me, but go ahead if you want. And I hope this was helpful for you. And again, if BetterUp sounds like an interesting place to you, please hit me up. Have a great day.