Mergeable by default: Building the context engine to save time and tokens — Peter Werry, Unblocked

Channel: aiDotEngineer

Published at: 2026-05-03

YouTube video id: 5ID22ACI7IM

Source: https://www.youtube.com/watch?v=5ID22ACI7IM

All right, thanks everyone. Sorry about the wait. This is going to be a bit of a strange session, because there's a workshop component to it, so everyone will be coding on their laptops. I'm Peter, and this is my colleague Brandon. We're going to break this session into two parts. One is a talk about what context engines are useful for, how you might go about building one, and what to think about. Then we'll launch into the second part.

Quick agenda: we're going to talk about three myths that are circulating right now about context engines, and then I'll go over a few lessons we learned along the way building one of these things. And finally, we'll build a social graph. This is a component that is super useful in a context engine. First, just a show of hands: does everyone know what I mean by "context engine," or does anyone want clarification on that?
>> Okay. So, in the world of AI agents, when you start coding, the agents are basically at ground zero. They have no context about your code, your organization, nothing. So typically the first thing they do is rip around your codebase, based on the task you give them, to gain some background understanding before they start. Context engineering is the art of supplying all the context that you need, and most importantly none of the context that you don't need, in a highly optimized way, so that when the agent starts to run, it executes the task in a streamlined way that's in line with your organization's best practices and expectations. Okay, we'll get to this after.
So, not long ago, as in four years ago or less, you were the context engine. When your agent needed something, you would prompt it: you'd grab the issue ticket and hand it all the information it needed to start its task. And in many cases, even when it was ripping around gathering background context, it got to the end of its task and got it wrong. In fact, many times it did, and you'd have to reset it and re-guide it toward the solution you were thinking of. Or, if it completely missed the mark, you'd have to say, "No, not the JavaScript, dummy. It's the Python source code I want you to look at."
So let's remember how you built context in an organization, taking AI out of the picture for a sec. Pretend that, pre-AI, you just joined an organization, and let's remember how we built it up. Over time, you would accumulate this kind of context through experience, right? You'd start a job, maybe do some code spelunking to figure out how things work, maybe latch on to a mentor. Eventually you'd experience real things, like incidents and outages. Those are the painful things that stick with you, the battle scars, and that is what constitutes organizational context: the learnings along the way, the "why did we do things the way we did."

And now you're good at your job, because after all of that painful experience, you know what questions to ask. You know where to look when an incident happens. And this is the goal. This is what we want for our AI agents.
So I'm going to lift this adoption curve from Vimath, and I may have butchered his last name, so sorry, Vim, if you see this. Let's start at the beginning, about four years ago in 2022. Everyone remembers fancy autocomplete, right? Back in those days, AI context windows were pretty limited. I'm not sure everyone even remembers this, but it was around 8k tokens, and that's not a ton. So tokens were highly optimized, and agentic IDEs like Cursor focused just on the code surrounding the spot you wanted to autocomplete. Basically, they took some code before and some code after, put it into a model, and said: this user is working on this piece of code, what's the most likely next thing? And that's what got printed out. It got progressively better than that as language servers were integrated: you could pull the callers of a piece of source code into context, and then the LLMs were really good at completing code.
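As an editor's illustration of that before-and-after trick, here is a minimal fill-in-the-middle sketch. The sentinel tokens and the complete() call are hypothetical stand-ins for whatever model API you use:

```python
def build_fim_prompt(source: str, cursor: int, window: int = 2000) -> str:
    """Take a window of code before and after the cursor and ask the
    model for the most likely text in between (fill-in-the-middle)."""
    prefix = source[max(0, cursor - window):cursor]   # code before the cursor
    suffix = source[cursor:cursor + window]           # code after the cursor
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# with open("app.py") as f:
#     completion = complete(build_fim_prompt(f.read(), cursor=1234))
```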
So at those levels, you were the context engine. And in many circumstances, this is kind of where most people are: parallel agents hooked up with MCP and skills. Just out of curiosity, has anyone gone beyond curated context into the last few degrees of agentic freedom, shall we say, where you have background agents running in the cloud doing stuff in YOLO mode? Is anyone experimenting with that? Okay, cool. That's very cool. That's bleeding edge. But let's take a moment to recognize that today's bleeding edge is yesterday's news in six months. The puck (I'm Canadian, so I'm going to say this) is going down the line toward background agents for sure.
And one of the things we run into right now is this: we're becoming the bottleneck as humans. I'm not sure if people have tried managing parallel agents and working on several tasks at once, but everyone's starting to feel this cognitive disconnect, because you're context switching all the time and it's really, really painful. It is very difficult to move from the mode where you're the human managing context into background-agents mode unless you have some kind of context engine that knows how your code operates and how your organization works, and that understands the motivations for historical changes.
So, Andrej, he nailed it: systems are intelligent. We're reaching the exponential on intelligence for code pretty soon. Everyone's seen the release about Mythos, even though we all haven't had a chance to really try it out yet. The promise is that, from a code-intelligence perspective, this thing is pretty much close to perfect. But now the bottleneck is context.
Of course, without context, and I'm just going to re-emphasize this point, you'll probably end up in doom loops. Does everyone know what a doom loop is? A doom loop is when you're struggling with the agent: it's not quite doing what you want, and you have to keep iterating on it. The worst-case scenario is you run the thing in YOLO mode, it finishes the entire task, and it's completely wrong, and you have to go back and correct various stages. When you have a context engine, you can get there faster. The problem is that access doesn't equal understanding.
So we have customers at various parts of this journey. And one of the interesting things we've noted is that people feel they understand their organization best. So, when it comes to feeding the right context to these agents, people will try to build some semblance of what a context engine actually is. They'll maybe build a RAG system, or some way to feed organizational data to an agent. Unfortunately, though, access doesn't mean understanding. You could wire up a bunch of MCP servers, and the agent still isn't going to be able to understand the relationships between all that data, how it got there, and why it is the way it is. And then there's another problem, which I'll talk about a little later, called satisfaction of search. Just remember that term; I'll come back to it.
So I just wanted to show you this. This was something we actually implemented, and we did it in two parts. One was without any context engine, but wired up to a bunch of MCP servers. It did a pretty good job, but when it reached the end, it missed the fact that we had some legacy code that depended on the old method of setting the thinking size on Anthropic's models. They have adaptive thinking now, but it used to be that you had to supply a token budget, and that's how you increased the size of the thinking window. We had some code that depended on this, and there were reasons for that which the agent didn't understand or see, so it basically clobbered all that code. But when we added the context engine, it saw all those reasons and implemented it the right way: it made the appropriate changes in the right places and included backwards compatibility for the code that was using the old method.
Okay, so now for the myths.

Myth one: naive RAG over my docs is a context engine. If you implement, say, vector search, or just a couple of search methods, you're going to run into a few issues. One is the satisfaction-of-search problem, where the agent will search like crazy, consume your tokens, and in the worst case reach compaction without ever finding the endgame. There are a few other things you need, too. You need personalization when you build a retrieval system, because if you just RAG all your data, especially in a very large organization, there are going to be conflicts in the data that you have to resolve. It won't be focused on the task you're trying to perform; it might pull in code from other parts of your organization, especially if you have a really big org with tons of different repos, and it's just going to create a huge mess. So you need some element of personalization.
Myth two: just connect a bunch of MCPs and I'm done. Nope, definitely not. That's the scenario that really puts an emphasis on the satisfaction-of-search point, and I'll explain that in a sec.
And finally, myth three: a bigger context window will solve this. Way back, when the models were starting to get big, people were really excited about a million tokens in your context window. The first model to try this, I think it might have been Claude, actually. Was it Claude?
>> I think it was
>> or OpenAI. Okay. Gemini.
>> Gemini. Yes. Sorry. So yeah, Gemini was the first model to try this, and it was really good at finding the needle in the haystack. You could feed a huge document to it, and as long as you knew what you were looking for ahead of time, it could find it. But it wasn't good at all at reasoning across different data sources, understanding the real meaning behind a problem, and then recommending the appropriate solutions. None of that was possible. Obviously, things have gotten much better since. Now, the problem is that most organizations have more than a million tokens' worth of context, so trying to fit all of it into the context window isn't going to work anyway. And let's project out to the future and imagine you could fit 10 million, 50 million tokens: at the current rate of memory consumption just to operate the models, that's not going to be possible for a really long time. Even if it were, and you fit all that context into your context window, you're still going to run into problems with understanding what's true and what's false, and how to select the right information.
Okay, so now I'm going to come back to this second point: satisfaction of search. This is a term that actually comes out of the medical field, from radiology. The idea is that when techs are looking at X-rays for the cause of symptoms, they might find something on the X-ray that explains those symptoms, and then they stop. That's a dangerous thing medically, because there might be other indicators, for things like cancer, that get missed. So satisfaction of search is a real problem in radiology, and there are lots of protocols to prevent stopping as soon as you find the first thing. This is what happens with agents when they search around in, say, Notion, your code, or Confluence: they'll stumble across what looks like the thing they're looking for, stop, and proceed. But the real golden nuggets of information might be in a different place that the agent wouldn't think to look, like a past Slack conversation or an incident report.
So here's the classic iceberg meme. Code that compiles: that's the baseline. Does the agent produce code that compiles? But everything that's actually important happens underneath: understanding the user's original intent, what the team rejected in the past, what was tried before but failed. How are you going to surface that kind of content just by looking at docs and code? You need to understand it somehow. And even worse, it's sometimes hard to know when things were deleted, in the absence of information. So you need the history leading up to decisions as well.
So this is why we think you need a context engine. A context engine understands who you are, what team you work on, who you work with, who the experts are in your organization, and what decisions led up to the current iteration of your codebase. It's able to resolve conflicts: a true-or-false type situation, what's true and what's not. Sometimes that truthiness is a gray area, right? So the context engine also needs to understand when to tell the agent that it wasn't able to resolve a conflict, and then learn from additional user input.

This third point is super important, of course, in any large organization or enterprise. There are often repositories that not everybody can access, secret projects, that sort of thing. So it's really important that you flow the access controls up. I'll give you an example everyone will appreciate, which is Slack. Our context engine integrates with Slack or Microsoft Teams, and private channels are really highly sensitive, right? You could be discussing HR information, or maybe something you just really don't want everyone else to see. So when Unblocked answers questions, it will use private-channel information, but only if the person asking the question has access to it. And those answers are not public; they're private to you.

And then finally, of course, delivering the right context at the right time. This is about token efficiency. It's about getting to the answer as quickly as possible.
So here's a high-level overview of how a context engine might work. On the left, we've got the data-source inputs: things like planning tools, docs, conversations, code, PRs, basically anything that's relevant to getting work done at the engineering level. On the right side, we have the outputs. This can all flow to coding agents via MCP or CLI tools. You can custom-build apps through the API. We have a code review component that plugs right into your SCM and provides code reviews, and of course there are integrations with social messaging apps.
These are the six broad requirements that we think are important. There's actually much more than this, but these are the high-level things. So again, unified system context: this is about building relationships between data. But it's more than just recognizing when one piece of data is related to another. For example, in Slack you might have conversations about PRs. That's an easy linkage, because people post links back and forth. What's less easy is understanding the reasons decisions were made, or your organization's best practices. To understand that, you have to go a little deeper: do things like distill historical pull-request comments down to their core essence, and when you see repeated patterns, pull those patterns together and store them as, quote-unquote, memories. Then, when someone is working on a similar piece of code, you can load those memories, and the agent can see them and go, "Oh yeah, right, this is the way this organization does this particular thing." A small sketch of that distillation step follows below.
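A minimal sketch of that pattern, assuming a hypothetical llm() helper that wraps whatever model you use; the prompt and the repetition threshold are illustrative, not Unblocked's actual pipeline:

```python
from collections import Counter

def distill_review_memories(pr_comments: list[str], llm) -> list[str]:
    """Distill historical PR review comments to their core essence and
    keep the ones that recur often enough to count as a convention."""
    essences = [llm("Summarize the general rule behind this review "
                    f"comment in one short sentence: {c}")
                for c in pr_comments]
    # Real systems would cluster near-duplicate essences; exact-string
    # counting keeps the sketch simple.
    counts = Counter(essences)
    return [essence for essence, n in counts.items() if n >= 3]
```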
Conflict resolution: super important. We initially took a kind of naive approach to this, and based it just on recency, so we would bias toward newer stuff. Unfortunately, in the fullness of all your context, recency is not enough. Often you have people writing documents or chatting in their messaging platforms, and they might be saying things that aren't completely aligned with how the system actually works. So then we started to bias toward code. We had recency, and we figured the main branch is definitely your source of truth. But not always, because sometimes what's important is what happens next, not the way the system currently works. When you're working on a task, what you really want is for the agent to understand where you're going, not necessarily where you've been. Where you've been helps it understand what not to do; where you're going helps it understand what you should do. So, in the Slack case, looking at the conversations that your organization's experts are having is more important than understanding what every random engineer is talking about. One way to combine these signals is sketched below.
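An editor's sketch of a scorer that combines those biases; the weights and fields are illustrative, not Unblocked's actual formula:

```python
import time

def trust_score(item: dict, expert_authors: set[str]) -> float:
    """Rank conflicting pieces of context. Recency alone is not enough,
    so also weight the source type and whether an expert wrote it."""
    age_days = (time.time() - item["timestamp"]) / 86400
    recency = 1.0 / (1.0 + age_days / 30)              # decays over ~a month
    source = {"code": 1.0, "doc": 0.6, "chat": 0.4}[item["kind"]]
    expert = 1.5 if item["author"] in expert_authors else 1.0
    return recency * source * expert

# When the top candidates score within a small margin of each other,
# surface the conflict to the user instead of silently picking one.
```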
Targeted retrieval and personal relevance are very related, so I'll talk about them together briefly. Again, when you're pulling context in, it's important that you're only pulling in context for the relevant task at hand, and probably context relevant to you. So here's a technique that's kind of interesting: you can work out which repos a person works on most by the number of PR contributions they submit. Then, if you're doing vector retrieval, you can do a deep retrieval on those focused repositories and a wider retrieval on the rest of the source code, and bias the selection toward the focused repositories, because that's more likely where someone is going to be working and spending their time. A sketch of that split follows below.
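Here is a minimal version of that deep/wide split, assuming a hypothetical vector_search(repos, query, k) helper that returns (score, chunk) pairs:

```python
from collections import Counter

def personalized_retrieval(query, user_prs, all_repos, vector_search):
    """Deep retrieval on the user's focused repos, wide retrieval on
    everything else, with results biased toward the focused set."""
    by_repo = Counter(pr["repo"] for pr in user_prs)
    focused = [repo for repo, _ in by_repo.most_common(3)]
    rest = [repo for repo in all_repos if repo not in focused]
    deep = vector_search(focused, query, k=20)  # many chunks, focused repos
    wide = vector_search(rest, query, k=5)      # a few from everywhere else
    # Boost focused-repo hits so they win ties in the final ranking.
    scored = [(score * 1.5, chunk) for score, chunk in deep] + list(wide)
    return [chunk for _, chunk in sorted(scored, key=lambda s: -s[0])][:20]
```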
And then, we've talked about data governance already, so I don't think I need to go over that again. Super important, though.
This was just a little experiment we ran with a larger task. I fully admit that some of these numbers are a bit wonky; this is basically Claude outputting numbers, so don't trust them. Trust the vibe of the thing, not necessarily the numbers. Basically, what it's saying is that when we started without the context engine active, the agent really missed the mark on a lot of stuff, just because it didn't understand how the existing implementation really worked, why it was the way it was, and what was tried before and failed. So it made a lot of those same mistakes. With the context engine turned on, obviously, it nailed it. The key numbers, though, are the time and the tokens it took. Without the context engine, it took two and a half hours to finish this task, with 21 million tokens, which is a lot of tokens. With the context engine, it took only 25 minutes and 10 million tokens. It's a pretty dramatic difference.
Okay, so the hard lessons. These are just samples, by the way, but they're ones we thought were kind of interesting. First of all, initially we optimized for access, not understanding. Our first premise was: if we just wire up a bunch of tools and provide a knowledge graph, the agent will be able to traverse the knowledge graph, execute a bunch of retrieval-specific tools for particular integrations, and figure everything out. That does not work. You'll have to go a little bit deeper than that.

The second one is that we hid conflicts instead of surfacing them. By hiding conflicts, I don't mean that we just ignored them. What we did instead was try to resolve those conflicts using those naive strategies, and we didn't surface the conflicts we weren't able to resolve. This was a really good learning: a context engine (I mean, we'll get there eventually, probably) can't always tell what the truth elements are, and when it can't, you should surface that and learn from it. That's the key thing.
And then finally, I think a lot of folks have tried this, and it's a really bad idea. When a context engine supplies an answer, do not cache the answer and try to serve that same answer up again for a similar question. The reason is fairly obvious in retrospect: everything changes constantly. Code changes, docs change, the reasons for things change. So this just doesn't work. The other thing is, if you try to use the previous answers as context for new answers, you regress toward a mean. If the model is misbehaving or doing something bad, and you continuously bring that back into context, you're obviously going to pollute the context.
And this is what happens.
Okay. So let's now talk about where AI-forward teams, like those doing this cloud-based agent thing, are using and taking advantage of context engines. Definitely, and especially, during the planning phase. This is where you get the biggest bang for the buck, unquestionably. Get the context engine involved, use a skill to bring it in, connect it to the MCP server, and watch it do its thing. It's also useful to do this during review, so you get planning at the start and review at the end. Because if you get an agent to do review, it's basically just going to pay attention to the code and try to understand where the break points are, security concerns, that kind of thing. Without the organizational context, it doesn't understand the motivation behind the code. That's the really important thing.
Ticket enrichment: this is a super cool use case. You create a ticket for a new feature, and then you just ask an agent that's connected to a context engine to fill in the blanks. It works.

Triage: I use this all the time. When I see an issue in production, I just whack it into an agent connected to the context engine, and it instantly brings up all the past issues related to it and starts operating right away.
Increasingly, we're seeing this one: incident management. We just wired up Sentry and Datadog, and this is already proving to be a super cool use case. It can see the signals, act on all of them, and relate them to code, to past incidents, and to discussions you've had in Slack. Having all those things come together at once is almost magical. And finally, I think this one's actually my favorite, and it's the one customers use the most: customer success, sales, and engineering support. What a lot of big teams do is have engineering support channels where other teams can come in and ask questions. If you put a context engine into one of those channels, you can have it automatically answer a lot of questions and save engineers a ton of time.
All right. So, how do teams make a context engine their own? Skills. Definitely build skills that you can use to curate context in a GitHub repo. And you can build other skills around it, like typing "ticket enrich," giving it the issue ID, and letting it use the context engine to build the enrichment. Or workflows, like this one: prepare an incident timeline. You send it off to your agent, the context engine brings everything together, and it's magical.

And you can wire this up to all kinds of agents. One thing a lot of customers like to do is wire it up to Claude Code in their CI system. We actually have a code review component, so you don't have to do this if you're using Unblocked, but people use this for other things, not just code review. As soon as you wire up a context engine in the background, give it an API key, and let it run on its own, it can do some pretty insane stuff.
So I'm just going to show a quick example of what wiring up a context engine can do. This is a PR that my colleague wrote, and Unblocked went through and provided a kind of review of it. At the bottom of the review, here's the review part. You can see that Richie, who was the author of this PR, responded, "Very cool, this is something I would say." Now, the reason for the comment, which was that you've basically duplicated a bunch of tests and can DRY that up a little, is that this was a best practice distilled from a bunch of other PRs. And the funny part is that the author of those PRs was Richie. He's the one who actually instilled the best practice in the organization. So that was a cool little moment when we discovered it.
Here's another example. This is a fairly long transcript, and I'm not going to show the whole thing, but we sent the agent on a mission to do a big task. Without Unblocked, it took quite a while (you can see the transcript is quite long), and it missed a whole bunch of stuff. With Unblocked, it was a lot more compact: it got to the answer very quickly and correctly. And because we're now AI-forward and lazy, we took both transcripts, ran them through Claude, and said, "Hey Claude, why don't you do an analysis of both of these and give us your result." I won't bore you with the details, but the verdict at the end was: the context-engine plan is what I'd ship with; the other one is good for a prototype, but it's missing a whole bunch of stuff that's important to this organization and was previously discussed.
Okay, so this is essentially what I've been trying to say: AI-generated code should feel like it was written by someone who's been on your team for 20 years. If it doesn't yet, that's fine. It will. If you wire up Unblocked, you'll see a huge difference in agent performance, and if you're building one of these things yourself, absolutely take all of these lessons, build, and let's see where that goes.
So, just before we get into the workshop component, maybe we'll have five to ten minutes of Q&A.
>> I'm Brandon.
>> And this is Brandon. He'll help with this.
>> Thanks. So, it's clear what it does for you and what kinds of problems it solves. But to me, the big question mark is: what is the thing? What is the artifact that fits the bill? Is it a program you install, an API that's hosted remotely, or an MCP server? What is it?
>> It's all of those things. I'll explain what Unblocked is; maybe I can just show a quick demo of it. Broadly speaking, there are a bunch of different surfaces to a context engine. You want to get it into your agent flow, and you can do that with an MCP server, or with a CLI tool, for example. We also have this dashboard surface where you can ask questions about your code. This is a pretty basic one, but you can see it understands who I am and what I've been working on. And then we have Slack connectivity as well, so you can bring Unblocked into Slack, drive it in conversations, and have it auto-answer things. Does that make sense? Did I answer your question?
>> Okay.
>> Yes: API, CLI, MCP.
>> Yeah. Yeah, sorry.
>> Thank you. So, my question is: as far as I understand, this is a knowledge-management and retrieval application.
>> Yeah.
>> And does this relate somehow to things like the LLM wiki, as was made popular recently by Andrej Karpathy, or the decision traces and context graphs
>> that were discussed a lot a few months ago?
>> Yeah. So, you can think of all of those things as useful components of a context engine. A context engine has to do much more than that, though. Agents are really good at recursing through a wiki, for example, but it depends on how you build the wiki, because there are things like organizational memories, best practices, and the experts in your organization that are used as pivot points for context retrieval. A wiki doesn't solve those problems unless it has that kind of structure, though you could build a structure with it. I think Karpathy discovered that if you treat a wiki kind of like a file system, you can break it down and have the agent whack through it like a file system. Agents are, by the way, highly optimized for file-system traversal.
>> Yeah. The compilation step. Exactly.
>> Yeah. Yeah, sorry.
Sorry, maybe the same question, but is it a general-purpose context engine, or is it targeted at code? Would it be useful as, say, a business-domain expert? So, building up a business domain, and then having all my other AI agents use this context engine as context for the business. Or would you say it's more just for the code part of it?
>> It's definitely engineering-focused; the integrations are focused on engineering activities, so SCM integrations and the other tools that engineers use. We are increasingly seeing customers use it for other purposes, though. Business intelligence is a key one, and that's usually useful when people in business functions are trying to get an understanding of the product and how it works. We don't have, say, Salesforce integrations wired up for that, so you couldn't use it to understand anything sales-related. It's really primarily an engineering-focused context engine. That's not to say that won't change.
Yeah,
>> On the governance thing, if you're respecting access rights, how can it do synthesis across stuff and then develop new knowledge internally that it could then surface to people?
>> Yes, you're correct to point that out. The synthesis is compartmentalized. There are places that are naturally compartmentalized, like individual repositories; that's kind of the level of access. So you can synthesize historical data within that boundary and then correlate it with public Slack information, and that's one way to do synthesis without crossing organizational boundaries. The other way is to look at and tag synthesized information when it crosses those organizational boundaries. You can take a group-ID approach to that problem, where you attach group-ID tags to the synthesized information and then only retrieve it if the person asking has access to those groups. So first take the compartmentalized approach, because that's where you'll get the most mileage, and then build up from there. I mean, this is the core problem with using a technology like GraphRAG, right? GraphRAG is like a pyramid: it builds up in layers and summarizes each layer, but that unavoidably crosses permission boundaries. So you have to create compartmentalized pockets.
Yeah.
It's a good question.
>> Yeah.
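A minimal sketch of that group-ID tagging idea (editor's illustration; the data model and the llm() helper are hypothetical):

```python
def synthesize(docs: list[dict], llm) -> dict:
    """Synthesized knowledge inherits the union of its sources' group
    tags, so only people who can see every source can retrieve it."""
    summary = llm("Synthesize: " + "\n".join(d["text"] for d in docs))
    groups = set().union(*(d["groups"] for d in docs))
    return {"text": summary, "groups": groups}

def visible(item: dict, user_groups: set[str]) -> bool:
    # Retrieve only if the user belongs to every group the item requires.
    return item["groups"] <= user_groups
```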
>> Yes. You've talked a lot about all the different sources of information you consume and about putting them all together. When it's synthesizing those down, is that still naive RAG, vector search, all that stuff under the hood? Or is it agents deciding what's appropriate? It's probably a combination of all of them, but what is that step?
>> Yeah, you're right, it is a combination of all of them. Knowledge-graph buildup happens in a bunch of different ways. The PR thing I showed you, for example: first you build a naive knowledge graph procedurally, and then from there you can use an LLM to distill down, summarize, and build up, those types of techniques. Our context engine first builds a knowledge graph from the base, trying to leverage all the different entities. It's kind of like a PageRank thing, where it builds up the relationships procedurally, and then of course it vectorizes the data. And then there are procedural tools that fetch data at runtime. A lot of the distillation for conflict resolution happens in two places. One is at data-ingestion time: there are tags that relate data to each other, so we can see if we can deconflict at that level, and rank items against each other at that level. And then, of course, at runtime you pass the candidates to a judge with the criteria, and it does additional deconfliction in real time.
>> Does that make sense?
>> Yeah.
>> Okay.
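As a toy version of that procedural buildup (editor's sketch using networkx; the edges are illustrative, and vectorization plus the runtime judge would layer on top of this skeleton):

```python
import networkx as nx

# Build the graph procedurally from explicit links: PR reviews, Slack
# messages that reference PRs, commits touching shared files, etc.
G = nx.DiGraph()
G.add_edge("alice", "payments-service", kind="authored")
G.add_edge("bob", "payments-service", kind="reviewed")
G.add_edge("slack:#payments", "payments-service", kind="discusses")

# PageRank-style centrality as a prior on which entities matter most:
# entities with many inbound links rank high.
rank = nx.pagerank(G.reverse())
print(sorted(rank.items(), key=lambda kv: -kv[1])[0])
```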
>> One more question. You mentioned conflicts, but at some point you get conflicts where something means revenue for one company and something with a totally different meaning for another. How can you recognize that? How do you get humans in the loop, how do you use their ontologies, and how do you use that when you run into it? I'm very curious about that.
>> Yeah. I can show you a quick thing here. You'll notice that, at the bottom, the references that were used for the answer are delivered both to the human in this interface and to the agent. So if the context engine isn't able to do the deconfliction, at this point the human can step in and guide the agent.
>> So you can literally just reply and say "that's not correct," or you can come here
>> and
>> oh yeah, sorry
>> yeah, or you can mark it "not helpful" and give the reason why. It is a bit of a manual process at this stage, but the signals build up over time.
>> It's funny, right? You might capture a lot of human intelligence this way.
>> Yeah, that's amazing.
>> For a typical customer, how much feedback do you get? Do you have that metric?
>> Oh, it's huge. It's amazing. I was actually really surprised by how willing people are to give feedback.
>> In the thousands, or hundreds?
>> At small team size, small being 20 to 30 people, it's in the hundreds. At large team size, 100 to 200 people, it's hundreds and hundreds
>> oh wow
>> of pieces of feedback. Yeah.
>> People just really like to interact with agents and tell them in natural language what's wrong. It's just a totally natural thing to do.
>> Yeah.
>> Cool. All right. Are we good for Q&A?
>> And then we can get on.
>> Ask questions as we hack. But
>> yeah, let's get on to the workshop part of this. Actually, what I'll do is I'll just do this first, so you can do it now if you'd like, and I'll come back to this slide in a sec. The idea here is that we're going to get everyone to join a Slack workspace we created, then get everyone into the repo where the sample code lives, and then we'll just start hacking away on it together. Okay.
>> Yeah. I've got some people coming.
>> Nice.
>> When you drop in, you'll see an AI Engineering London channel. Hopefully there's a link to this drop
>> and
>> the Unblocked link will not work until you do step two. Yes.
>> To get into the GitHub or
>> Oh, no.
>> I've got many people coming in, so I'm really hopeful that
>> is it network? Yeah.
>> We will find out.
>> Okay. While folks are doing that, I'm just going to show you what we're getting into. This is the GitHub organization. What we're working on is a social graph builder. What it's going to do is look at a source code repository. You can run it on your own repo; it's not going to upload anything, it's all local, so you can see this thing build up against your own organization. It's going to do a bunch of things. We're going to get a social graph out of it, and I'll show you what that looks like. We're going to understand who the experts are and which parts of the code they work on. And then there's going to be a little interactive visualization. The goal of this exercise is to get the thing up and running and just start hacking away on it, so start submitting PRs as soon as we get it going. This is what it looks like.
This graph here is our organization, Unblocked. What you're seeing is a relationship graph that shows who's reviewing whose PRs and who's getting reviewed. Essentially, this thing is a distillation of all the different teams within Unblocked, and it's pretty accurate, actually. I ran this all the way back to the start of 2025. When you run the thing, I'd recommend doing it over a shorter timeline, because it will be a little slow if you go all the way back to the start of 2025; it could take like 15 minutes. But it has effectively distilled who the teams are, and the only AI step in this is labeling the teams. You don't have to run the AI step if you don't want to; it'll just use the parts of the code that people work on the most. This tab here shows the experts in the organization and what they work on. It's broken down by project area and path, and shows which areas of the code have good coverage. Coverage is defined mostly by whether a high-contributing organizational expert is present and whether it's an actively contributed-to part of the code.
And then finally, we have this interactive graph that breaks things down by team area and shows who the major contributors are. I'm over here on the AI team. So that's it. Let's get everybody in and we'll start hacking away at this.
>> Yes, absolutely.
>> Yeah,
>> many of you who have already put your GitHub in should have an invite, so please give it a check. GitHub is the worst.
>> Yeah,
>> we will. It is MIT licensed. We will be making it public later, but for now we needed it locked down.
Oh, did you still
I'll show you.
>> What have you done?
So, I think the rest of this session is going to be just hacking away. In a sec here, I'll take this down, if everyone's got it, so that Brandon and I can concentrate on working with you all to build features. Oh, and when you submit PRs, by the way, you'll notice that Unblocked is sitting there as a code reviewer. So don't feel badly if it sprays on your PR a little bit.
>> Depends on how much
You got one heck of a username. Good work.
>> Is everyone good with this? I'll take it down. Okay, cool.
That is it.
>> Brandon, you're on top of the invites. Okay, cool.
>> There are a few more. I'm on to Chris,
who's given two.
>> That's okay. I'm going to send both. Don't worry.
>> That's just where I am in this list.
Oh, I forgot to mention a couple of things here, actually.
>> Yeah.
>> Coming back live. Yeah, good. Just a couple of things. If you're looking for something to implement and need help coming up with ideas, there is a set of predefined issues you can hack away on. You can just grab one of these, whack it into Claude, and see how it does when it's connected to the context engine.

The MCP server for Unblocked is here. If you want instructions on how to wire it up to Claude Code or another agent, you can grab them from here.
All right. So, I'm at Lars. There are two more in here. So I'm still going, by the way, for those just adding.
What's going on, Brandon?
>> Oh, sorry. Just one of the usernames is
>> Oh, okay.
You should have
just behind
Christopher, did you not get invited yet?
>> No, I didn't.
>> That's weird.
>> Let me double check. You should have one, but
>> yeah, I should. You should have an email. I'm up to, like, one of you. So hard.
>> It's like five clicks to add a member. I'm like,
>> I was going to say, you should be able to use the CLI for this. What's going on?
>> That's right.
>> No, they keep putting it in my PR and I don't want it there. Copilot's going to review for me
>> very poorly, but it will
question.
>> Yeah, for sure.
>> Hold on, let me grab you the mic.
>> Hopefully that's on.
>> Does it work? Yes. So, I guess the context engine works very well for asynchronous agents, so that you don't need to specify things at your keyboard, because they can fetch what they need. That's one of the main uses, I guess. And
>> so it plays very well, I think, with agents like Copilot on GitHub. Can you share, if you can, which agents are used most with Unblocked? Because in the wild, as developers with our laptops, I think Claude Code is much more used than Copilot, but maybe you see a different picture.
>> Okay, so I'm going to take this off the screen for a sec
>> and try to see if I can pull that up for you. But the answer is yes, we do know roughly what that breakdown looks like. So let me grab that.
Okay. I think this gives you kind of the rough picture.
>> Okay. So this is kind of the rough picture here. Unfortunately, because of the way this is laid out, I should probably extend the screen, but I'll just step over here. So, Claude Code is by far the most used. The next one is Cursor, which seems fairly obvious. This last one here is kind of a catch-all, but what's really interesting is that a lot of people use Claude Desktop, which was very unexpected, but that is the case. And then VS Code and Codex account for a much smaller component. It seems like everyone's using either Cursor or Claude Code. I would have expected more fully asynchronous agents, like something people would just run from a PR.
>> Okay, you can run Claude from a PR, but it's less common. Maybe sometimes you use Copilot because it's built in.
>> Yeah, actually, this one here, Claude Code, may capture some of that traffic. That's probably what you're seeing, because people will wire up Claude Code in CI
>> and do things like that.
>> Thanks.
>> No problem.
I've got a potentially dumb question.
>> There are no dumb questions
>> this. Well, we'll see.
>> Actually, you know,
>> you soon.
>> I had a teacher in grade three who used to tell me, "There are no dumb questions, only dumb people." Go on.
>> I could be one of them. From your point of view, right, you can use subagents from an exploratory standpoint.
>> Yeah.
>> How does that, plus memory, plus just storing snippets of information, compare? I'm thinking of the social graph that you just showed, right?
>> Yeah.
>> Even in an organization that's several thousand people, you would be able to store that in a very small file, no?
>> You would, as the graph that you showed.
>> Uh, oh, I see: the social graph component. Yes, it can be compact.
>> I'm trying to understand how this compares. What's the kind of USP compared to exploratory agents repeating that work?
>> I see what you're saying. Okay. So, there are two components to that. One is that an exploratory agent would have to do this every time. When it starts from ground zero, yes, it might be possible for it to reconstitute a sort of social-graph hierarchy, but it would have to do two things in order to do that. One, it would actually have to write code to constitute the graph, at least the way agents and models are today. You wouldn't be able to just have it run basic tools around the organization and figure out the who's who; it would have to write something like that social-graph algorithm, run it, and then get the distillation out the back end. At that point you're basically rebuilding this component anyway, so you might as well short-circuit it and just run and use it. Maybe I should explain some of the motivation for that thing, actually. I realize now that I may not have done that effectively.
have done that effectively. Um, social
graph is not just about conveying
information about who the experts are.
It's used within the context engine as a
pivot point um into more like important
context. So understanding who the
experts are in a particular code area
acts as a jump point because um another
part of a context engine which happens
at the ingestion and processing layer is
um distilling the um we call it bottling
the expert but it's essentially
distilling what that individual has
worked on in the past. uh where they sit
in the in the kind of hierarchy of the
organization um the the decisions that
they've made based on Slack
conversations that they've had based on
their PR comments all this kind of stuff
um when you distill that down it's and
you pass it to the agent then what
happens is like let's say that I'm a new
employee and I'm coming to work on a
particular area of code um there are a
bunch of different ways of loading
context for that code one is you know
semantic search via vector vector
search. Right? So that's kind of layer
one. Another layer is uh pre-built
memories. And then the the third layer
is bottling unbottling the expert for
that area of code. And getting that
expert's learnings into context is is a
really powerful mechanism. It helps
drive the rest of the retrieval in an
agentic loop and it helps um the agent
uh directionally like where to go next.
Does that make sense?
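Here is a sketch of that three-layer loading order (editor's illustration; the helper functions are hypothetical stand-ins passed in as parameters):

```python
def load_context(code_area, query, semantic_search, load_memories,
                 top_expert, unbottle):
    """Assemble context in layers: vector search first, then pre-built
    memories, then the 'bottled expert' profile for this code area."""
    context = list(semantic_search(code_area, query, k=10))  # layer 1
    context += load_memories(code_area)   # layer 2: distilled conventions
    expert = top_expert(code_area)        # layer 3: who knows this area best
    context.append(unbottle(expert))      # their past decisions and rationale
    return context
```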
>> Right. I think everybody's in now.
>> Awesome.
>> So, let's
>> Okay. So, if we're all in, then the next thing here, once I get this back up on the screen.
>> I'm still sending invites. I saw someone just come in, so please keep coming and we can keep going.
>> Yep. So, feel free to basically just fire this repo at your agent and get it to run it. If you literally just say to Claude Code, "run this against my repo," be sure to give it a time range or a PR limit; otherwise it'll go off the rails and take a really long time to finish. So say something like "process the last 300 PRs," or "process up until September 2025," something like that. There's enough information in the readme that it should be able to just do it and run it against your repo.
>> I cloned it and said, read the readme and make it happen.
>> Yeah.
>> Can I ask another: what are your plans for the coming year or so?
>> For Unblocked?
>> Is it about Unblocked, or about this side project?
>> This.
>> So, I've sort of alluded to this before, but where the puck is going is fully autonomous agents. We're very focused on making sure that autonomous agent flows are highly optimized. As I was saying at the beginning of the conversation, you cannot run those things effectively without very finely tuned context.
>> Yeah. So, when you think about it, at some point. I've read about things like tracing what agents do and getting runbooks out of those. Is that the path you're investing in? Or is it retrieval, or what?
>> Are you talking specifically about incident management, that sort of thing?
>> No, I'm thinking actually more from a business perspective. How can we extract business knowledge that's really deeply embedded into systems nobody knows anymore, where some people think they know, but they don't?
>> Yeah.
>> And documents, human knowledge, right? Tested knowledge.
>> Yep. So, there are two ways of servicing that: either at the product level or through the context engine itself. And increasingly, what we see is that people leverage agents to do their work even at that level. They'll go to Claude Code, connect the Unblocked context engine, say "do this thing for me," and then the context engine will find all the things it needs to do that task and surface that data.
>> Yeah.
>> For us, that means the first near-term roadmap item is the API.
>> Yes. And the CLI.
>> CLI, API.
Cool. I'm going to lift this off again. Hopefully people start submitting some PRs and we can
>> Yeah,
you're in that GitHub. Let me actually repost it in the Slack channel, because that link will
>> so this org will stay up until the end of the week, at which point we'll basically bring it down and release this as open source. Everyone who contributes is obviously going to get credited, so your name will be on it.
Should we set up the repo locally and then start doing it, basically? I just finished setting up.
>> Yeah, just clone the repo. The easiest thing to do is to take an agent like Claude, launch it from that repo directory, and just say, "please bootstrap and launch this product," and away it will go.
If you run into any kind of technical things, we're here, obviously. Yeah.
>> Hold on, let's get you the mic. Oh, you've got it. I've got a lapel mic now.
>> Awesome. Cool.
>> Yeah.
>> Can you hear me? Yeah. Perfect. So, on the slide where you had the performance numbers, and you were at 80% and without Unblocked it was 20%.
>> Yeah.
>> And now I see that you're basically hooking up Unblocked to Claude Code. So, is it a fair comparison? Did you use vanilla Claude Code with access to the MCPs and the skills,
>> Yeah,
>> and then Claude Code hooked up with Unblocked with the same MCPs and the same skills,
>> Yeah,
>> and do the performance comparison there? Because then you'd still have a lot of alpha from, I guess, whatever you're cooking inside Unblocked. Was that the comparison, or was it done with vanilla Claude Code without context?
>> No, it was done with MCP servers like GitHub and Slack wired up.
>> I see.
>> Yeah, cool.
>> We basically got parity with all the MCP servers of every SaaS vendor in one. It was vanilla Claude with all the MCPs versus Claude with Unblocked only
>> and then do the task, and
>> same context, like the same context file
>> same prompt
>> and same access. Yeah.
>> It's pretty fun. Yeah. Oh, thank you.
>> Maybe two questions. One: I see that a lot of these social graphs are built with the traditional network calculations and statistical aspects of networks. Is this the approach you began with, and did it already work best? Because most memory systems work more on filtering (episodic memory, and so on), and this is really a nice scoring system. That's the first question: is it like this inside Unblocked, too? Second question:
>> you mentioned that it works with Teams, the Microsoft environment. I wonder what differences you observed when building social graphs for different environments, because on GitHub I imagine it's very different than on SharePoint, Teams, etc. Is it also network-stats based, or is it something different?
>> So, our first implementation was incredibly naive: it just used the number of PR contributions and compared that directly with the number of PRs reviewed by each person, a simple numbers game. That didn't produce accurate team clusters. So then we got on to the algorithms you see here. Unblocked does a little more than this, so this is kind of a middle road. Another strategy Unblocked uses is experts by vector clusters. When we ingest the source code and vectorize it, we understand who the top contributors are for each piece of source code. So when we look up individuals, we can see what they've been working on and which clusters are in proximity, and then relate people based on their cluster proximity. That's more of an ML-type approach. And then there's a final layer, a sort of LLM-heavy layer, that does distillations of a whole bunch of different context elements: things people have worked on in the past, conversations they've been having in Slack. When you take all of that and weigh it against the procedurally generated graph, you get a much more accurate distillation. With this one here, you'll notice some people get pulled into team clusters when they're really operating across many different teams, for example, and this won't account for that. A toy version of the procedural baseline follows below.
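Editor's sketch of that baseline with networkx; community detection stands in for the team-clustering step, and the PR data is made up:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Edge (reviewer, author) weighted by how many of the author's PRs
# that reviewer reviewed; communities then approximate teams.
review_pairs = [("alice", "bob"), ("alice", "bob"), ("carol", "dave")]
G = nx.Graph()
for reviewer, author in review_pairs:
    weight = G.get_edge_data(reviewer, author, {"weight": 0})["weight"]
    G.add_edge(reviewer, author, weight=weight + 1)

teams = greedy_modularity_communities(G, weight="weight")
print([sorted(team) for team in teams])  # [['alice', 'bob'], ['carol', 'dave']]
```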
>> And for different environments, different algorithms? Let's say that you... I don't want to take
>> Right. So, no, this algorithm is purely SCM-based. The algorithms for Slack or Teams, you're right, are quite a bit different, because you don't have these review points.
>> So there it becomes: who's the most active in particular channels? Then you need a distillation or a summary of what each channel is about, you need to vectorize that, and you need to score it against the most frequent contributors. But that's not enough. You have to relate it back to the SCM data in order to figure out who the real experts are. One of the problems I've personally experienced in some organizations I've worked at is the noisy junior engineer: they're very noisy, they love to talk, but the signal-to-noise ratio is not great. And just because someone doesn't say a lot of things doesn't mean their messages aren't impactful. So part of this game is about assessing the impact when people say certain things: how does that relate to the PRs that get spawned off as a consequence, and how many of those PRs get merged? That sort of thing. One way to express that impact idea is sketched below.
>> Yeah.
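Editor's sketch of that outcome-based scoring; the message-to-PR linkage is assumed to exist upstream, and the field names are hypothetical:

```python
def message_impact(messages: list[dict]) -> dict[str, float]:
    """Score authors by outcomes, not volume: credit messages that
    spawned PRs, weighted by whether those PRs actually merged."""
    scores: dict[str, float] = {}
    for m in messages:
        merged = sum(1 for pr in m["spawned_prs"] if pr["merged"])
        # A quiet engineer whose messages lead to merged PRs outscores
        # a noisy one whose messages lead nowhere.
        scores[m["author"]] = scores.get(m["author"], 0.0) + merged
    return scores
```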
>> Oh, is there not?
>> There should be. Okay, check that.
>> Well, you should be able to open a pull request. You can't push to main.
>> Okay.
>> So if that's the situation, we'll check.
>> Yeah, you should be able to create a branch.
>> Oh, no. He can't fork the repo either.
>> Oh. Yeah, forks might be disabled.
>> This will be open source at the end of the week, and all your contributions will be on it. What's really fun is using the social graph tool later against your own repo and showing your team.
>> Yeah.
>> Oh, sorry.
>> Oh, I'll come over.
I like that: unblocked tried to answer that question for you.
>> Oh, you see that? The Slack auto-response. Sorry.
Are you in here? That's a camera. Sorry.
>> Oh, it's okay. I was just...
Oh, okay. Um, let me check. That should not be the case. Okay. Let me know if you still need a GitHub invite.
>> Just check the members. I think there might... yeah, there might be an issue here. Just a second.
Oh, these were direct assignments. So I think we have to pull people into the whole project, because they're not org-assigned.
>> Oh, GitHub, I love you.
>> Nine nines of uptime.
>> Yeah, we'll fix this one here. Yeah, slam everybody in.
Come on.
>> You got it up? I'm trying to, because now we just need to add people.
>> Go to Settings, then Collaborators. It's unblocked; you all have write access. It is the name of the company.
>> Just validate that for us if you would.
>> Yeah, please let me know.
>> Perfect.
>> All right.
All right, we're getting real PRs now.
There we go. Nice. Nice.
>> Hell yeah.
Now let's do fun things.
Okay, nice. Looks good to me.
>> What?
>> I think we have our first approved PR.
I'm just sending ridiculous chats to unblocked so you can see it try to answer questions in Slack as PRs come up. I'm going to see what it says about this.
>> Ask it to...
>> It's like, oh, let me think about it.
>> Oh, did you ask it about the PR?
>> Yeah, but the PR, I think you accepted it. So we'll see what happens.
>> Yeah, I mean, it did approve it. So unblocked was like, this looks good to me, man.
>> Unblocked was down.
>> Only visible to you. Oh, no.
>> That was such a good answer, though.
>> Nice PR. Good job, unblocked. Great answer.
>> Yep.
>> Oh, yeah. Yeah, I'll put it back up. One sec.
>> Uh, where did it go? Actually, I lost it over here.
>> Oh, yeah. Maybe it can't handle the sources or something.
>> Do you want the app?
>> Yeah, for sure.
>> Oh, yeah. Yeah, of course. We were focused on you building, but yeah.
>> What am I supposed to do?
>> Oh, no, it's okay. I mean, let's go.
>> Sorry. So this thing that I showed before, it is the project that exists in that repo.
>> So the...
>> Oh, so the idea is: think about features that you want to add, or things you want to fix, or new components, and then just hack away at it and submit a PR.
>> Sorry.
>> Yeah, my bad.
>> Um, do you want to open up a terminal session and show the MCP?
>> Oh, sure. Yeah, because people can obviously use it, but they don't have all our source. Yeah.
>> I'm a consultant and wanted to propose it to a client, but I can't show them the context. Maybe to give them an idea?
>> Well, one thing that you could do, if you're visiting clients, is ask them to run the tool on their repo, and then it will generate this result for them, so they can see on their own project what the value is. Right.
>> I think, Peter, he's just asking about our product specifically, not this.
>> Oh, unblocked. You're asking about unblocked. My bad, man.
>> We were driving this way.
>> Sorry, single-track mind. Okay. So your question is: how can you demonstrate the value of unblocked to customers?
>> To see the value, yeah.
>> Yeah, you can make conflicts emerge in your app, and then there is the compliance layer, which is very interesting for corporate clients.
>> I was thinking about how this translates to a UX, because, you know, for many people... I understand it's mainly for coding, yeah. And whether this is for technical people, or maybe people overseeing engineers, or the engineer themselves; I mean, just to see how your platform works. But if it's out of context, it's okay.
>> No, no, that's totally fine. So this dashboard is the front-end customer interface to the product. You come in here and you can ask any question about your codebase or your organization and get an answer for it here. Right now this is attached to (sorry, I lost my cursor) the test org that we have, but I could use it against unblocked, and I have a little hot thing here that I can show.
Oops.
>> So, the source mark engine is an internal component that we use to track source code changes through time, including where changes move between files and so on. As a demonstration, you can book your customers into a demo with us and we can demonstrate this, or you can wire it up to your own organization and demonstrate this flow to customers, and try to find use cases where data sources conflict. The challenge with context engines is that it's really hard to demonstrate the value to someone without actually wiring it up, so there is a little bit of overhead there, where people have to connect it to all their integrations. The good thing is that unblocked has a free enterprise trial period, so people can try out the product in its fullest form before paying for it. Yeah.
>> So if some of that information is incorrect, you can just reply in the chatbot or flag it in the references?
>> Exactly, yeah. You can reply here, or you can say "not helpful" and explain why, and then it will distill that for the next round.
>> So it will adjust some weights or confidence scores internally?
>> Well, internally, what it does is construct task memory. It looks for those kinds of repeated signals, and this is actually where the experts graph comes in; it's used a lot. The experts graph provides weight. So when an expert comes in and says "that's not correct," it's going to get more weight, and we distill a memory for it. If it's just a new engineer who says "that's not right," that's not really a trustworthy source yet. You have to have a trustworthy source to base that on. Does that make sense?
>> Yeah, it makes a lot of sense. It's like a social network.
>> Exactly, somehow.
>> Yeah. Yeah.
>> Thanks.
>> No problem.
>> Cool.
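A minimal sketch of how expert-weighted feedback could gate memory distillation. The weights, threshold, and names here are all assumptions for illustration, not Unblocked's actual values:

```python
# Hypothetical expert weights drawn from the experts graph (0..1).
expert_weight = {"senior_sam": 0.9, "new_nina": 0.1}

# Feedback events on the same answer: (person, correction_text).
feedback = [
    ("new_nina", "the retry limit is 5, not 3"),
    ("senior_sam", "the retry limit is 5, not 3"),
]

DISTILL_THRESHOLD = 0.8  # assumed knob; tune per deployment

def should_distill(events):
    """Distill a memory only once trusted sources have vouched for it.
    Repeated signals from low-weight sources can still add up over time."""
    total = sum(expert_weight.get(person, 0.05) for person, _ in events)
    return total >= DISTILL_THRESHOLD

if should_distill(feedback):
    # In the real system this would write a durable task-memory record
    # that gets hydrated into seed context on later runs.
    print("distill memory:", feedback[0][1])
```

The design choice mirrors what's described above: the expert's correction alone would cross the threshold, while the new engineer's would not until it repeats.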
>> Oh, under the hood. Well, when it's presented to the AI, it's presented as files. But under the hood we store it in database tables and such. The memories are constituted from a bunch of different sources, so they're not just flat-file-based; the whole memory construct gets hydrated at runtime.
>> So do you just give it tools to query your database based on whatever criteria the user has?
>> Yeah. So yes, there are a bunch of tools for data retrieval. For memory specifically, though, you can't really leave it up to the agent to do memory hydration, because that's part of the seed context. In order to get the agent to go in the right direction, you have to seed it with the appropriate data, and experts context is a good jumping-off point for the agent. So, yeah.
Yep.
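Here's a toy sketch of that hydration step, with hypothetical stores and field names: the memory construct is assembled from several sources at runtime and only then rendered as a file-like blob, since that's how it's ultimately shown to the model:

```python
import json

# Toy stand-ins for the stores a memory is constituted from.
db_rows = {
    "experts": {"payments": ["alice", "bob"]},
    "memories": [{"topic": "payments", "note": "retry limit is 5"}],
    "best_practices": ["never log card numbers"],
}

def hydrate_seed_context(task_topic):
    """Assemble the memory construct at runtime and render it as a file.
    Hydration is deliberately not left to the agent: it's seed context,
    so it has to be in place before the agent starts."""
    memory = {
        "experts": db_rows["experts"].get(task_topic, []),
        "memories": [m for m in db_rows["memories"] if m["topic"] == task_topic],
        "best_practices": db_rows["best_practices"],
    }
    return "# MEMORY.md (hydrated)\n" + json.dumps(memory, indent=2)

print(hydrate_seed_context("payments"))
```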
Is there any official benchmark that tracks the type of value you're trying to bring? Because I feel like it's not exactly coding, or it is, but I'm curious whether there are any public benchmarks you're tracking yourselves against.
>> So we do have some internal benchmarks. You're right, it's a little bit squishy.
>> Have you heard Boris Cherny talk about Claude Code? He's the creator of Claude Code.
>> The creator of Claude Code, yeah. So he did this interview where they were talking about how they measure success for Claude Code internally. This may have changed, because there are a lot of benchmarks now that they have, like the talk benchmark; you've probably seen that one. But what it really distills down to is vibes. And so the most important thing in systems like this is to capture sentiment. If your sentiment is trending upwards, that's a good thing. Our sentiment right now, on a scale of minus 100 to 100, is somewhere around a 60 score; on a normalized zero-to-one scale, that's (60 + 100) / 200, so roughly 0.75 to 0.8.
>> So the vibe would be captured by something like less back-and-forth on the PRs, or maybe, I don't know, you having less back-and-forth with Claude to get your stuff done?
>> Yeah. So the vibes are: are people satisfied, right? Satisfaction can come from a lot of different sources, and dissatisfaction can come from a lot of different sources, so the way to think about it is that it encodes all of those things. But you can capture specific metrics, and we do, like how long things take, and we're actually currently working really hard to bring response times down. Although, here's the interesting thing: as we move towards a more autonomous universe, response times for MCP servers are actually less and less important. The more important thing is that they get the answer absolutely bang on.
>> Yeah.
>> And the reason is that the amount of time a context engine spends collecting all that information and distilling it is a microcosm of what the full task takes to implement and traverse. So if you can spend a little bit more time and cut the implementation down by 60, 70, 80 percent, that's a huge win, right? And, go ahead.
>> Sorry, a very small follow-up. Actually, I'm curious: do you have any rough numbers on how much time it spends retrieving context versus executing the task, to your point? Is it 10% right now, 90%? I have no idea; I only have my own experience.
>> Yeah, agent context collection is probably close to that number; it's like 90%. The actual code-writing part is really, really fast, if you just watch what an agent is doing. When it writes the code, output tokens are, by the way, the thing that drags down performance. Everyone used to think it was input tokens. We've run tons of experiments on this: you can bring the input token size up, and time to first output token is now pretty good, pretty highly optimized. The thing that really impacts performance is output tokens. So you have to be judicious with the way you collect and supply context back to the agent, so that it remains tight on its output loops as well.
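A back-of-the-envelope model of why output tokens dominate wall-clock time; every rate below is invented for illustration and varies by model and provider:

```python
# Invented, illustrative rates.
TTFT_PER_1K_INPUT_S = 0.05   # extra time-to-first-token per 1k input tokens
DECODE_TOKENS_PER_S = 60.0   # sequential output decode rate

def wall_clock_s(input_tokens, output_tokens, base_ttft_s=0.5):
    # Input (prefill) is processed largely in parallel, so it's cheap;
    # output is generated token by token, so it dominates. Verbose tool
    # results the agent has to re-emit are what really hurt.
    ttft = base_ttft_s + TTFT_PER_1K_INPUT_S * input_tokens / 1000
    return ttft + output_tokens / DECODE_TOKENS_PER_S

# Doubling input barely moves the needle; doubling output nearly doubles it.
print(round(wall_clock_s(20_000, 2_000), 1))   # ~35s
print(round(wall_clock_s(40_000, 2_000), 1))   # ~36s
print(round(wall_clock_s(20_000, 4_000), 1))   # ~68s
```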
>> For one benchmark that Peter mentioned in the talk, we gave an ambitious task, because obviously how much time you're adding with a context engine is prompt-dependent. The ambitious task we gave was to implement the new adaptive thinking mode in Anthropic's tool chain when they introduced it. As mentioned, with unblocked, with a context engine, that took 25 minutes of wall-clock time; the other case, without, was two and a half hours, 2 hours and 25 minutes. The main reason for that was that even though we gave it all the data, we ran the prompt and its first output was totally wrong. So the human had to loop again and say, no, no, no, this, this, this; and the next output was wrong, and the next. Once you do four loops, you have a two-and-a-half-hour wall-clock time, versus, obviously, the 25 minutes when no corrections were required.
>> So, as mentioned, think of it as a waterfall. The more high-quality, high-signal, correct context you have up front, the better every single thing the agent does will be, until it says it's done, whether it got it right or not. Yeah.
>> Yeah.
>> It's got it.
>> You also mentioned that the token usage on tool calls and information search really decreased. I know that a lot of these tools, or aggregators for tool use, have insane token usage. So maybe you have some estimates: let's say I need a summary of one Slack conversation, or of how people interact there. That would be something like 60k tokens on Composio. I wonder how many tokens it would be using unblocked.
>> Yeah, lower. We're still very vibes-based there; it's hard to get real data from other customers or people in the market. But again, for that same task I keep coming back to: it went from 21 million tokens of total usage to 10 million tokens with the context engine. Part of that, though, is because you didn't have to doom-loop.
>> Of course, that looping increased a lot of the token expense, so we did drop it by 50% on a large task. Obviously, if you're just asking it to center a div, you're not going to get a lot of gain; that's probably in the training data. But for any feature or fix... again, a lot of what people are putting through unblocked is what an engineer is doing every day. It's very rare that you're doing a task that's so, I don't know, minor that... then again, I've asked Claude to do git push, so I bet I'm not the only one. I was like, you do it. And then: why did that cost me 30 cents?
I don't know.
>> Yeah, I did all the effort to put my GPG keys in the right place, so I'm like, Claude, go.
Any more questions while you all ship?
Any confusion? Anything I can unblock
for you? It's my purpose in life.
>> Sorry, you may have answered this question already, but are you using knowledge-base RAG in unblocked, or what exactly is the tech that you're surfacing?
>> Oh, so many things. I can come talk to you on the side; I'll take my mic off. I'm just going to answer that question.
>> Sure.
>> Oh, it's real time, basically. So I guess there are two parts to that question. One is how much, or how frequently, unblocked updates the data on the back end. It's real time for many of the integrations, and on a cron job for others, because those particular integrations don't have webhooks, basically.
>> Yeah. But that means that rebuilding the graph data has to happen on a very frequent basis.
>> Yeah.
>> No, it's incremental. Our social graph builder algorithm has an incremental component to it, so we don't have to rerun the whole thing. But also, social graphs are less sensitive to frequent changes in the data, because it's unlikely that a single change is going to make a huge impact on the experts graph, unless your organization is brand new.
>> Yeah.
>> Yes, yeah. As an example, we do best-practices distillation on a much lower cadence, basically week by week, because it just doesn't change that much.
>> Yeah.
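A tiny sketch of the incremental idea: fold each new event into existing edge weights instead of rebuilding the whole graph. The names and structure are hypothetical, and it also illustrates why one event barely moves a mature graph:

```python
from collections import defaultdict

# Edge weights of a toy experts graph: (reviewer, author) -> interactions.
edges = defaultdict(float, {("alice", "bob"): 41.0, ("carol", "bob"): 3.0})

def apply_event(event):
    """Fold one new review event into the graph incrementally.
    No full rebuild: a single event nudges one edge weight, which is
    why frequent data changes don't destabilize a mature graph."""
    edges[(event["reviewer"], event["author"])] += 1.0

apply_event({"reviewer": "alice", "author": "bob"})
print(dict(edges))  # alice->bob nudges from 41 to 42; ranking unchanged
```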
Oh, yeah, repeat your question. That's a good question, so I want to make sure we get that one down.
>> In terms of customer privacy and data retention...
>> Yeah, from my point of view, I'm thinking of enterprise SaaS or even on-premise-type deployments. I'm not suggesting that you do that; I'm just thinking of that kind of customer modality. Do you get pushback? How do they feel about you holding data? It's another processor in the loop.
Well, the privacy discussions happen at the organizational level, so we don't actually run into a lot of friction. There are definitely environments, like in government and at banks, that have super sensitive needs, and for those needs we have an on-prem solution, but it's definitely not the path that I would recommend over staying cloud-based. We have very large enterprise organizations that are entirely cloud-based, fully cloud-based. The secret sauce is encoded less in source code now and more in the reasoning, so organizations tend to be a little bit more sensitive around things like Slack data, for instance. But as for the way we store data, we have a whole white paper about how we protect customer data, and it's never been a problem.
>> Yeah.
>> Pardon me, can you run on-prem?
>> Yes, we do have an on-prem solution, but as I say, it's not the recommended approach. For sensitive environments, though, for sure. Yeah.
>> Oh, why is it not recommended? Well, the cloud-based integrations get updated more frequently, and so there are software patches; it's a little bit harder to maintain within an organization. There's one customer, a bank, where administering the platform becomes quite difficult because they have network isolation, so now one of us has to sit within that network and administer the platform, or we have to train individuals within the company to administer it. So it's just more of a maintenance and handholding exercise. But yeah.
>> Yeah, exactly. That's exactly right.
>> Yeah. Thank you very much. Thanks for coming.