Developing Taste in Coding Agents: Applied Meta Neuro-Symbolic RL — Ahmad Awais, CommandCode

Channel: aiDotEngineer

Published at: 2025-11-24

YouTube video id: kWOQS3XPZ10

Source: https://www.youtube.com/watch?v=kWOQS3XPZ10

Well, hello there. Today I am really excited to both launch and share with you what we have been working on for over a year now. It's called Command Code, a coding agent with taste.

So, who am I? I'm Ahmad, creator of Command Code and CEO and founder of Langbase. I've been around the block for about 20 years, building one thing after another. I've written hundreds of open source packages with millions of downloads. Maybe you like my Shades of Purple code theme; I love the color purple. I'm an engineer at the end of the day. I write a lot of code, and I've been building in the LLM space for about five years now.
One of the first tools I ended up building was a coding agent, and at the end of the day I'm very technical. I got to contribute to the NASA Mars Ingenuity helicopter mission; my code lives on Mars. So when I'm writing code, no matter what LLM or coding agent I'm using, I want it to learn from me. I want it to learn how I edit its code, to understand my preferences, and to continuously adapt to that preference set, that invisible architecture of choices that I have. That is what I'm excited to demo today.

The story actually begins in 2020, when Greg Brockman gives me access to GPT-3. This is three years before ChatGPT and a year before GitHub Copilot, and one of the first things I tell him is that I want to build something with GPT-3 that suggests the next line of code. So let's jump into a demo right away, look at what this actually looks like, and then I'll explain how we ended up here.
On the left here you see Claude Code, and on the right is Command Code, which is what we are building. As you can see, it is continuously learning; "taste is on," as we call it. If you know anything about me, you know I'm all about automation, and I have been building a lot of CLIs over the course of my career. Command here actually knows how I built a CLI yesterday, and before that; it understands my preferences for building a CLI. So let's give both agents the same prompt: make me a CLI that can tell the date in ISO format.
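The prompt itself is trivial to satisfy. With no knowledge of anyone's preferences, an agent can answer it with one line, roughly like this (a hypothetical sketch; the names are illustrative, not either agent's actual output):

```typescript
// The literal ask, with zero taste applied: today's date in ISO 8601 format.
// Hypothetical sketch, not Command Code or Claude Code output.
export function isoDate(now: Date = new Date()): string {
  return now.toISOString(); // e.g. "1970-01-01T00:00:00.000Z" for the epoch
}

console.log(isoDate());
```

Everything beyond this one-liner, which framework, which flags, how the project is laid out, is taste, and that is exactly where the two agents end up differing in the demo.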
Look at what is happening here. One of the first things that happens is that Command picks up on my taste file (I'm going to share a bit more about that), and I'll enable all these settings so both coding agents have the same setup. You can see what Command is doing: it's using tsup, it's using TypeScript, it's building an ASCII art banner, and it's going to npm link this particular CLI as well. These are all things I care about.

Claude has done something really good, and it's very fast, but this is not what I wanted; it's a console.log of this or that. When I build a CLI, I don't want a CLI like this, so I have to steer it: please use TypeScript, and I want tsup. What else? I want Commander, because I like to have more control over my CLIs. And I want a lowercase version flag, -v, because I know Commander does that capital -V thing by default. I have so many preferences here, and by this time Command has already done what I wanted it to do. How about we jump into the code and see what it has actually done? Let's open this up in VS Code.
This is what Command did for me. It is using tsup, it is using TypeScript, and it knows that I prefer pnpm, which I completely forgot to tell Claude. If we go into this particular CLI, you can see what it is doing: it is using -v for the version, and it is not hard-coding a package version in here. One more thing it should have picked up is that I want all of my commands in a separate directory called commands, and there you go: the date command is in there. So when I grow this CLI with a "tell me the human date" command or whatnot, it is going to put all of those commands there, which makes it very easy to test. I wonder if it is also using Vitest. There you go, because I prefer Vitest for writing a lot of tests. And it is using version 0.0.1; I like to start there instead of 1.0.0.
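Taken together, those preferences describe one concrete shape. Here is a minimal sketch of that shape (my hypothetical reconstruction, not Command Code's real output). The stated preference is Commander; Node's built-in parseArgs stands in here so the sketch runs without installing anything:

```typescript
// Hypothetical sketch of the preferences described above: TypeScript,
// version 0.0.1, a lowercase -v flag, one file per subcommand under commands/.
// Commander is the talk's actual preference; node:util's parseArgs is a
// dependency-free stand-in.
import { parseArgs } from "node:util";

const VERSION = "0.0.1"; // start at 0.0.1, not 1.0.0

// In the layout described, this would live in commands/date.ts.
export function dateCommand(): string {
  return new Date().toISOString();
}

export function run(argv: string[]): string {
  const { values, positionals } = parseArgs({
    args: argv,
    // lowercase -v, unlike Commander's default capital -V
    options: { version: { type: "boolean", short: "v" } },
    strict: false, // tolerate unknown flags in this sketch
  });
  if (values.version) return VERSION;
  if (positionals[0] === "date") return dateCommand();
  return "usage: datecli [date] [-v]";
}

console.log(run(process.argv.slice(2)));
```

With Commander itself the same shape holds: `program.version("0.0.1", "-v, --version")` overrides the default capital -V, and each file under commands/ registers one subcommand.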
That is probably not what Claude was doing on the other side. If I open the same CLI that Claude built for me, you will see it is at 1.0.0, and again it is not using Vitest; it is probably not going to follow every single preference that I have. And everything is in one file, which I don't want. Claude is an amazing model, and it knows what to do; Command is also using Claude under the hood right now. But I have to steer it so much that I feel like it should be learning from me.

By the way, it is quite transparent. If you look here, we have a command code folder, and inside it there is a taste file with the CLI taste it has picked up. These are all my preferences, and I can assure you, none of this was written by me. Command Code is continuously learning from me and creating these taste files. This is not a spec. This is not skills. It's my intuition built into a meta neuro-symbolic model, an architecture that is more deterministic and that figures out a remix of my preferences: this is what I want when I'm writing code with AI.

So let's take a step back and look at why and how we got here. We are going to publish a paper about this as well, and I'm going to share a little more about where we are, how we think about it, why this matters, and what the architecture behind all of this is. So again: I started in 2020.
The first thing I built was a coding agent, and that led to so many things. I ended up building Langbase, and we raised $5 million from all these amazing people; in fact, the founder of GitHub led our round, and founders of other amazing companies supported our mission. The problem we were trying to fix was memory, and this memory was not plain RAG; it was a serverless RAG store which can reason over your data, reason over how to help you, and continuously learn.

And we saw a lot of things. I think this is the biggest problem in AI: the best thing AI has learned from humans is that humans are lazy, and that is what AI is. AI is lazy by default. It's very sloppy. If you ask for a horse on a staircase banister, this is kind of what you get, and then you have to prompt it again and again and again to get to the left side of things. This is sort of what you saw me do with Claude when I was trying to build that CLI.

To fix this problem, we launched a bunch of primitives: threads, workflows, memory, what have you. Our hope was that people would start building amazing agents. And then we saw major scale, something like 700 terabytes and 1.2 billion agent runs a month. But we saw another problem. We studied it, and you can go to stateofaiagents.com and read all of our research into how people were building agents. This is all public, by the way.
And we figured out that even agents were very sloppy. I use AI for everything except when I am writing, because every time I build an agent to write, or use an LLM to write something, I get this sort of slop back. We have a collaborative dev tool; can you write me a fun headline for it? What I'd get back is "the power of synergistic teamwork" or whatnot. This is my friend, by the way; I actually saw him do this, and he was like, "Oh god, no. Please fix it." And it got even worse.

To fix this, we tried Command. We launched it as Chai and rebranded to Command.new in the last five months. This was an agent of agents: you would give it a prompt like "this is the kind of agent I want to build," and it would provision and create all of the infrastructure for you. I shared a talk about it as well. In five months, we have seen 150,000 agents vibe coded with it. But there was still something missing. Vibe coding, I think, is better than slop, but it's not better than the rules and choices that I have made, that I have built my career around.
So we started to fix this problem again, and this is my five years of learning. By default, AI is sloppy; that is the default setting of almost every LLM. They're trying to be correct, and to be correct as soon as possible, and I think that doesn't really work for code. Then we get this vibe coding thing, where somebody does the context engineering for you. Everybody has a different name for it, but behind the scenes it's context engineering, memory, and a bunch of prompts, and most of the time you don't really have a lot of control over it. To get that control, a lot of developers start writing rules files, like claude.md or agents.md. But rules are never enough. I often joke that our justice system sucks because our rules are not enough, so we have to bring in a human lawyer, a human judge, and a jury of humans to figure out what to do in a particular situation. So I feel there should be something that is learning rules from us, learning our taste in writing code, and that is why I put the word taste here. This should be something that is acquiring our taste.

So: Command Code, a coding agent with taste, or, if I'm bold enough to say it, a coding agent with an acquired taste. It learns your taste in writing code. This is sort of what it looks like. I know this might be a silly example (I didn't want to put a lot of text here), but when I look at this AI-generated code, I'm like: no, no, no, this is not good. I want JavaScript object parameters anytime there are more than two parameters. But AI won't listen to me; LLMs won't know this preference of mine. So when I ask for a sum.js function (again, a very dumbed-down example), Claude Code won't do what I want it to do, while Command just naturally knows this is what I prefer, because it has seen me go and edit AI code and fix it this way. Similarly, we saw this happen when I asked for the date CLI: Claude basically started with a console.log, and I had to tell it no, I want pnpm, I want TypeScript, and all of that fun stuff, whereas Command just knows that I prefer Commander and all of those things I demoed earlier in this talk.
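The preference in that sum.js example can be written down concretely. A hypothetical before/after, reconstructing the preference as described (not actual Command Code output):

```typescript
// The default an LLM tends to produce: positional parameters.
export function sumPositional(a: number, b: number, c: number): number {
  return a + b + c;
}

// The stated preference: once there are more than two parameters, take a
// single object parameter so every call site names its arguments.
export function sum({ a, b, c }: { a: number; b: number; c: number }): number {
  return a + b + c;
}

// Call sites show the difference:
// sumPositional(1, 2, 3)     which argument is which?
// sum({ a: 1, b: 2, c: 3 })  self-documenting at the call site
```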
So to sum it up: I think when programmers talk about good code, they're not talking about code that is merely correct. They're talking about this invisible architecture of choices they have made throughout their careers to make their code readable, maintainable, humane, and more like them. And that, I think, is what is stopping me from writing a lot more code. My mission is: what if I could do a lot more in one day? What if I could have a thousand pull requests merged to main, with my review time down by 90% or 99%, because the coding agent was doing what I wanted, not just picking up some sloppy code from a 2015 Stack Overflow answer and slapping it onto every request I have? And I don't have time to teach it all the rules. I can either write code or I can teach it to write code. I cannot be the one telling it the difference between the API route files in my Next.js project and the ones in some other project; it should just learn that, in this situation, this is the confidence level it has around the conflicts that arise from different rules in different projects. I don't think I can do that by hand. This excites the hell out of me: this invisible architecture of choices that every programmer is making is what we are trying to build here, a meta neuro-symbolic reasoning space with reinforcement learning.

This is a very dumbed-down version, a formula of how we have set this objective. If you don't know, a neuro-symbolic architecture is more deterministic and explainable than transformers; transformers are generative and very probabilistic. So what we are trying to do here: I think Claude and GPT are really good, and you can use whatever LLM you want with Command Code, but that LLM will be combined with your taste, which is built on top of this meta neuro-symbolic space. You can think of it as a remix of your choices in a Petri net. And we have a KL divergence loop here, as you can see: if you do end up doing something wrong, we want the LLM to correct you as well. It's this continuous learning tool that learns from both your explicit and your implicit feedback, and then it creates that neuro-symbolic space to enforce the invisible logic around your choices, the architecture that is in your head: oh yeah, when I'm building a TypeScript project, this is the kind of thing I do. Your brain can never really translate all of that into a rules file; otherwise you wouldn't be writing code, you'd be writing a lot of rules files.

Then, to use the neural part, the LLM part, we have reflective context engineering, which is self-aware and continuously learning and adapting: oh, this guy used to use meow for writing CLIs, and I don't know what happened, but two months ago he switched to Commander. I'm talking about myself, by the way; this literally happened. And it will automatically update its learning of my taste: Ahmad now prefers Commander over meow. I don't need to go and teach it. I should be writing code at, I don't know, god speed, and it should be learning all of this from me. Over time, we believe this turns into a skill of intuition that Command Code has and that you can share with your team.

Our mission is to build a huge ecosystem around this. Imagine you really like a developer out there whose React code is amazing. I love what Tanner is doing with TanStack; what if I could have Tanner's taste when I'm writing React code? You can do that with Command Code. One of the things I've been using it a lot for: my design engineer has much better design skills than I do, so whenever I'm writing any kind of front-end code, I actually borrow the design engineer taste I have, with all those margins and paddings and amazing tiny details in his taste that I no longer need to care about. My coding agent puts the LLM and that meta neuro-symbolic design taste alongside my request, like "build me a modal that does this," and it does it with my design engineer's taste, which is unbelievable.
So this is where we are today. Today we are launching Command Code; feel free to go to commandcode.ai and check it out. This is the very beginning of all of it. I think large language models have captured the world's text, everything out there, all of Stack Overflow and whatnot, and I believe what we are building with taste models is the world's intuition, and its intentions: what do you intend to do, how do you generally do it, what are the patterns, what is your taste? That taste, with your preferred LLM, is I think the next frontier of coding. Taste, I totally believe, is going to really speed up how we write code and create those neuro-symbolic guardrails, that invisible architecture of choices you have as a team, as a project, as a famous library, or maybe as an enterprise that cares about doing things in a particular way. That is the kind of thing you'll be able to build a taste around and share, as an open source taste or just with your team.

For example, if you go sign up: again, this is very new, and this is potentially what it will look like. We've already moved away from sharing all of this, and we are figuring out, and I would love your help figuring out, what the right mix is for having all of this meta-learning be part of your projects. Right now it ends up as more of a, what should I say, transparent markdown file, but it could exist in any form; it's a meta neuro-symbolic space in a model that is continuously learning your preferences, and we can dump that learning in any particular format. Right now, this is potentially what it looks like: you should be able to npx taste, install my CLI taste, and then the CLI you build with Command Code will be very close to how I would build that CLI, using your favorite LLMs.

So yeah, that's pretty much it. As you can see, I am pretty excited. Our biggest gains internally at Langbase: we have probably 10xed the amount of code we are merging into our main branch (we joke about it: "disagree and commit" to main, and the amount of that happening has increased 10x), and I'm feeling a lot more confident when I'm reviewing a lot of code; our review time for any kind of coding pull request has gone down significantly. I can't wait to see what everybody out there builds with it. Again, we're very excited; we want LLMs to continuously learn from our taste in writing code, and I would love to see what you build with Command Code. That's pretty much it. Feel free to reach out, maybe send me a tweet, or a post, or whatever we call it these days. I would love to see what everyone builds. This is me. Thanks for having me. Ciao, peace.