Developing Taste in Coding Agents: Applied Meta Neuro-Symbolic RL — Ahmad Awais, CommandCode
Channel: aiDotEngineer
Published at: 2025-11-24
YouTube video id: kWOQS3XPZ10
Source: https://www.youtube.com/watch?v=kWOQS3XPZ10
Well, hello there. Today I am really excited to both launch and share what we have been working on for over a year now. It's called Command Code, a coding agent with taste. So, who am I? I am Ahmad, creator of Command Code, CEO and founder of Langbase. I've been around the block for, I don't know, 20 years, building one thing after another. I've written hundreds of open-source packages with millions of downloads. Maybe you like my Shades of Purple code theme; I love the color purple. I'm an engineer at the end of the day. I write a lot of code, and I've been building in the LLM space for about five years now. One of the first tools I actually built was a coding agent. I even got to contribute to the NASA Mars Ingenuity helicopter mission, so my code lives on Mars. So when I'm writing code, no matter what LLM or coding agent I'm using, I want it to learn from me. I want it to learn how I am editing its code, to understand my preferences, and to continuously adapt to that preference set, the invisible architecture of choices that I have. That is what I'm excited to demo today.
The story actually begins in 2020, when Greg Brockman gives me access to GPT-3. This is three years before ChatGPT and a year before GitHub Copilot, and I tell him that I want to build something with GPT-3 that suggests the next line of code. So let's jump into a demo right away, and then I'll explain how we ended up here. On the left here you see Claude Code, and on the right is Command Code, which is what we are building. As you can see, it is continuously learning; "taste is on," as we call it. I've been building a lot of CLIs; if you know anything about me, you know that I'm all about automation, and I have built a lot of CLIs over the course of my career. So let's build a CLI. Command actually knows how I built a CLI yesterday, and before that; it understands my preferences for building a CLI.
So let's give both of them the same prompt: make me a CLI that can tell the date in ISO format. Look at what is happening here. One of the first things Command does is pick up on my taste file, and I'm going to share a bit more about that shortly. Let's let both of these coding agents run, and you can see what Command is doing: it's using tsup, it's using TypeScript, it's building an ASCII-art banner, and it's going to pnpm link this particular CLI as well. These are all the things I care about. Claude has done something really good, and it's very fast, but this is not what I wanted; it's basically a console.log of this or that. When I build a CLI, I don't want a CLI like this. I want something like: please use TypeScript, and I want tsup. What else? I want Commander, because I like to have more control over my CLIs. And I want a lowercase version flag, -v, because Commander does this uppercase -V thing by default. I have so many preferences here, and by this time Command has already done what I wanted it to do.
How about we actually jump into the code and see what it has done? Let's open this up in VS Code. This is what Command did for me: it is using tsup, it is using TypeScript, and it knows that I prefer pnpm, which I completely forgot to tell Claude. If we go into this particular CLI, you can see what it is doing: it is using -v for version, and it is not hardcoding a package version in here. One more thing it should have picked up is that I want all of these commands in a separate directory called commands, and there you go, the date command is here. So when I grow this CLI into, say, a tell-me-human-date command or whatnot, it is going to put all of those commands there, which makes it very easy to test. I wonder if it is also using Vitest? There you go, because I prefer Vitest for writing tests. And it is using version 0.0.1; I like to start there instead of 1.0.0. That is probably not what Claude was doing on the other side. If I open the same CLI that Claude built for me, you will see 1.0.0, and again it's not using Vitest; practically every preference I have, it is probably not going to follow. And everything is in one file, which I don't want. Claude is an amazing model and it knows what to do, and with Command we are also using Claude right now, but I have to steer Claude so much that I feel it should just be learning from me. By the way, it is quite transparent: we have a Command Code folder in here, and inside it there's a taste file. If you go inside that, there's a CLI taste that it has picked up, and these are all my preferences. I can assure you, none of this is written by me.
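The generated CLI itself isn't reproduced in the transcript, but the preferences described in the demo (TypeScript, a separate commands/ module per subcommand, a lowercase -v flag, version starting at 0.0.1) can be sketched minimally. All names here are illustrative, not the actual generated code; the real project uses Commander, which is swapped out for a tiny hand-rolled argument handler so the sketch is self-contained:

```typescript
// Sketch of the demoed CLI's shape. In the real project this would live in
// commands/date.ts, one file per subcommand, so new commands stay isolated
// and easy to test with Vitest.
function dateCommand(): string {
  // ISO 8601 timestamp, e.g. "2025-11-24T10:30:00.000Z"
  return new Date().toISOString();
}

// index.ts: a minimal dispatcher standing in for Commander.
// Note the lowercase -v for version (Commander defaults to uppercase -V),
// and the version is passed in rather than hardcoded in the command logic.
function run(argv: string[], version = "0.0.1"): string {
  if (argv.includes("-v") || argv.includes("--version")) {
    return version;
  }
  if (argv[0] === "date") {
    return dateCommand();
  }
  return "usage: datecli date | -v";
}

console.log(run(["date"]));
```

With Commander itself, the lowercase flag preference is a one-liner, roughly `program.version("0.0.1", "-v, --version")`, which is exactly the kind of detail a rules file tends to forget and a learned taste file remembers.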
So Command Code is continuously learning from me, and it is creating a lot of these taste files. This is not a spec, and this is not a skill; it's my intuition, built into a meta neuro-symbolic model, an architecture that is more deterministic and is more like a remix of my preferences. It figures out what I want when I'm building and writing with AI. So let's take a step back: why and how did we get here? We are going to publish a paper about this as well, and I'm going to share where we are, why this matters, and what the architecture behind all of this is. Again, I started in 2020. The first thing I built was a coding agent, and that led to so many things. I ended up building Langbase, and we raised $5 million from amazing people; in fact, the founder of GitHub led our round, and founders of all these amazing companies supported our mission. The problem we were trying to fix was memory, and this memory was not plain RAG: it was a serverless RAG store which could reason over your data, reason over how to help you, and continuously learn. And we saw a lot of things. I think this is the biggest problem in AI: the best thing AI has learned from humans is that humans are lazy, and that is what AI is. AI is lazy by default. It's very sloppy. If you ask for a horse on a staircase banister, this is kind of what you get, and then you have to prompt it again and again and again to get to the left side of the slide. This is sort of what you saw me do with Claude when I was trying to build that CLI. To fix this problem, we launched a bunch of primitives: threads, workflows, memory, what have you.
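The contents of the taste file from the demo aren't shown in the transcript. Purely as a hypothetical sketch, based only on the preferences Ahmad lists out loud, a learned CLI taste entry might read something like this (the real format may differ entirely):

```markdown
# CLI taste (learned from edits, not hand-written)

- Language: TypeScript, bundled with tsup
- Package manager: pnpm (and pnpm link after scaffolding)
- Argument parsing: Commander, with a lowercase -v version flag
- Layout: one file per subcommand under commands/
- Tests: Vitest
- Initial version: 0.0.1, never 1.0.0
```

The point of the demo is that none of these lines were typed by the user; they were inferred from how he repeatedly edited AI-generated code.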
Our hope was that people would start building amazing agents, and then we saw real scale: something like 700 terabytes and 1.2 billion agent runs a month. But we saw another problem. We studied it, and you can go to stateofaiagents.com and read all of our research into how people were building agents; it's all public, by the way. We figured out that even agents were very sloppy. I use AI for everything except when I am writing, because every time I build an agent to write, or use an LLM to write something, this sort of slop is what I get back: "We have a collaborative dev tool, can you write me a fun headline for it?" and what I'd get back is "power of synergistic teamwork" or whatnot. This is my friend; I actually saw him do this, and he was like, "Oh god, no. Please fix it." And it got even worse. To fix this, we tried this with Command. We launched it as Chai and rebranded it to Command.new in the last five months. This was an agent of agents: you would give it a prompt like "this is the kind of agent I want to build," and it would provision and create all of the infrastructure for you. I shared a talk about it as well. In five months, we have seen 150,000 agents vibe-coded with it. But there was still something missing.
Vibe coding, I think, is better than slop, but it's not better than the rules and choices that I have made, that I have built my career around. So we started to fix this problem again, and this is my five years of learning: by default, AI is sloppy. That is the default setting of almost every LLM; they're trying to be correct, and to be correct as soon as possible, which I don't think really works for code. Then we get this vibe-coding thing where somebody does the context engineering for you (everybody has a different name for it, but behind the scenes it's context engineering, memory, and a bunch of prompts), and most of the time you don't really have a lot of control over it. To get that control back, a lot of developers start writing these rules files: CLAUDE.md, AGENTS.md. But rules are never enough. I often joke that our justice system shows this: our rules are not enough, so we have to bring in a human lawyer, a human judge, and a jury of humans to figure out what to do in a particular situation. So I feel there should be something that is learning rules from us, learning our taste in writing code, and that is why I've put this word, taste, here. I think this should be something that is acquiring our taste. So: Command Code, a coding agent with taste, or, if I'm bold enough to say it, a coding agent with an acquired taste. It learns what your taste in writing code is. This is sort of what it looks like. I know this might be a very silly and bad example; I didn't want to put a lot of text here. But when I look at this AI-generated code, I'm like, no, no, no. This is not good. I want JavaScript object parameters any time there are more than two parameters.
But AI won't listen to me; LLMs won't know this preference of mine. So when I ask for "make me a sum.js function" (again, a very dumbed-down example), Claude Code won't do what I want it to do, while Command just naturally knows this is what I prefer, because it has seen me go and edit AI code and fix it this way. We saw the same thing happen when I asked both agents to build a date CLI: Claude basically started with "here's a console.log," and I had to tell it no, I want pnpm, I want TypeScript, and all of that, whereas Command just knows that I prefer Commander and all of those things I demoed earlier in this talk. So, to sum it up: when programmers talk about good code, they're not talking about code that is correct. They're talking about this invisible architecture of choices that they have made throughout their career to make their code readable, maintainable, humane, and more like, you know, them. And that is what is stopping me from generating a lot of code. My mission is: what if I could do a lot more in one day? What if I could have a thousand pull requests merged to main, with my review time going down by 90 or 99 percent, because the coding agent was doing what I wanted it to do, not just picking up some sloppy code from 2015 Stack Overflow and slapping it onto every request I have? I don't have time to teach it all the rules. I can either write code, or I can teach it to write code.
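The object-parameters preference from the slide isn't spelled out in the transcript, but the rule as stated ("any time there are more than two parameters, use an object") looks roughly like this minimal before/after sketch, with illustrative names:

```typescript
// What the LLM tends to generate: positional parameters,
// easy to misorder once there are three or more of them.
function sumPositional(a: number, b: number, c: number): number {
  return a + b + c;
}

// The taste-preferred version: a single destructured options object,
// so call sites are self-documenting and order-independent.
function sum({ a, b, c }: { a: number; b: number; c: number }): number {
  return a + b + c;
}

console.log(sum({ a: 1, b: 2, c: 3 })); // 6
console.log(sum({ c: 3, a: 1, b: 2 })); // 6, order no longer matters
```

A rules file could state this too; the talk's claim is that Command Code infers it from watching the user repeatedly rewrite positional signatures into object ones.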
I cannot be the one telling it what to do when I'm using Next.js versus Hono; even though both of those create API route files, what is the difference between this project and that project? It should just learn that: in this situation, this is the confidence level it has around the conflicts that arise between different rules and different projects. I don't think I can do that by hand. And this excites the hell out of me. This invisible architecture of choices that every programmer is making is what we are trying to build here: a meta neuro-symbolic reasoning space with reinforcement learning. This is a very dumbed-down version, a formula for how we have set this objective. If you don't know, a neuro-symbolic architecture is a more deterministic and explainable architecture than transformers; transformers are generative and very probabilistic. So what we are trying to do here: Claude and GPT are really good, and you can use whatever LLM with Command Code, but that LLM will be combined with your taste, which is built upon this meta neuro-symbolic space. You can think of it like a remix of your choices. And we have a KL-divergence loop here, as you can see: if you do end up doing something wrong, we want the LLM to correct you as well. It's this continuous-learning tool that learns from both your explicit and your implicit feedback, and then it creates that neuro-symbolic space to enforce the invisible logic around your choices, the architecture that is in your head: oh yeah, when I'm building a TypeScript project, this is the kind of thing I do.
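The formula on the slide isn't reproduced in the transcript. As a hedged sketch only: a "KL-divergence loop" that combines a base LLM with a learned preference signal is, in spirit, the standard KL-regularized objective from preference-based RL, which would look something like

```latex
\max_{\pi}\;
\mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi(\cdot \mid x)}
\big[\, r_{\text{taste}}(x, y) \,\big]
\;-\;
\beta \, D_{\mathrm{KL}}\!\big( \pi(\cdot \mid x) \,\big\|\, \pi_{\text{ref}}(\cdot \mid x) \big)
```

where $\pi_{\text{ref}}$ is the unmodified base model (Claude, GPT, or whichever LLM you bring), $r_{\text{taste}}$ scores a generation against the learned preference set, and $\beta$ controls how far taste can pull outputs away from the base model. This is an assumption about the shape of the objective, not the actual formula from the forthcoming paper.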
Your brain can never really translate that into a rules file; otherwise you wouldn't be writing code, you'd be writing a lot of rules files. Then, at the end, to use the neural part, the LLM part, we have reflective context engineering, which is self-aware, continuously learning and adapting: "oh, this guy used to use meow for writing CLIs, and I don't know what happened, but two months ago he switched to Commander." I'm talking about myself, by the way; this literally happened. And it will automatically update my rules, its learning from me, my taste: now Ahmad prefers Commander over meow. I don't need to go and teach it. I should be writing code at, I don't know, god speed, and it should be learning all of this from me. Over time, we believe this will turn into a kind of intuition that Command Code has and that you can share with your team. Our mission is to build a huge ecosystem around this. Imagine if you really like a developer out there whose React code is amazing. I love what Tanner is doing at TanStack. What if I could have Tanner's taste when I'm writing React code? You can do that with Command Code. And one of the things I have been using it a lot for: my design engineer has much better design skills than I do.
Whenever I'm writing any kind of front-end code, I actually borrow the design-engineer taste I have, with all those margins and paddings and amazing tiny details in his taste that I no longer need to care about. My coding agent puts that LLM and that meta neuro-symbolic design taste alongside my request: "build me a modal that does this," and it does it with my design engineer's taste, which is unbelievable. So this is where we are today. Today we are launching Command Code; feel free to go to commandcode.ai and check it out. This is the very beginning of all of it. I think large language models have captured the world's text, everything out there, all of Stack Overflow and whatnot, and I believe what we are building with taste models is the world's intuition, and its intentions: what do you intend to do, and how do you generally do it? What are the patterns, what is your taste? That taste, combined with your preferred LLM, is, I think, the next frontier of coding. Taste, I totally believe, is going to really speed up how we write code. It will create those neuro-symbolic guardrails, that invisible architecture of choices, that you have as a team, as a project, as a famous library, or maybe as an enterprise that cares about doing things in a particular way. That is the kind of thing you would be able to build taste around and share, as an open-source taste or just with your team. For example, if you go sign up: again, this is very, very new, and this is potentially what it will look like. We've already moved away from sharing all of this, and we are figuring out (I would love your help here) what the right mix is for having all of this meta-learning be part of your projects.
Right now it ends up as, what should I say, a transparent markdown file, but it could exist in any form. It's a meta neuro-symbolic space in a model that is continuously learning your preferences, and we can dump that learning in any particular form. Right now, this is potentially what it looks like: you should be able to npx taste, install my CLI taste, and then use Command Code, and the CLI that you build will be very close to how I would build that CLI, using your favorite LLMs. So yeah, that's pretty much it. As you can see, I am pretty excited. The biggest gains we have seen internally at Langbase: we have probably 10x'd the amount of code we are merging into our main branch (we joke about "disagree and commit," but compared to before, the amount of code merging to main has increased 10x), and I'm feeling a lot more confident when I'm reviewing code, so our review time for pull requests has gone down significantly. I can't wait to see what everybody out there builds with it. Again, we're very excited; we want LLMs to continuously learn from our taste in writing code, and I would love to see what you build with Command Code. That's pretty much it. Feel free to reach out, maybe send me a tweet, or a post, or whatever we call it these days. This is me. Thanks for having me. Ciao, peace.