Mergeable by default: Building the context engine to save time and tokens — Peter Werry, Unblocked
Channel: aiDotEngineer
Published at: 2026-05-03
YouTube video id: 5ID22ACI7IM
Source: https://www.youtube.com/watch?v=5ID22ACI7IM
All right, thanks everyone. Sorry about the wait. This is going to be a bit of a strange session because there's a workshop component to it, so everyone will be coding on their laptops. I'm Peter, and this is my colleague Brandon. We're going to break this session into two parts. The first is a talk about what context engines are useful for, how you might go about building one, and what to think about. Then we'll launch into the second part. Quick agenda: we'll cover three myths that are circulating right now about context engines, then I'll go over a few lessons we learned along the way building one of these things, and finally we'll do the workshop: we're going to build an engineering social graph, a component that is super useful in a context engine. First, just a show of hands: does everyone know what I mean by "context engine," or does anyone want clarification on that? >> Okay. So, in the world of AI agents, when you start coding, agents are basically at ground zero. They have no context about your code, your organization, nothing. Typically, the first thing they do is rip around your codebase, based on the task you give them, to gain some background understanding before they start the task itself. Context engineering is the art of supplying all the context you need, and most importantly none of the context you don't need, in a highly optimized way, so that when the agent starts to run, it executes the task in a streamlined way that's in line with your organization's best practices and expectations. Okay, we'll get to this after. So, not long ago, as in four years ago or less, you were the context engine. When your agent needed something, you would prompt it: you'd grab the issue ticket and hand it all the information it needed to start its task. And in many cases, even when it was ripping around gathering background context, by the end of its task it sometimes got things wrong. In fact, many times it did, and you'd have to reset it and re-guide it toward the solution you were thinking of. Or, if it completely missed the mark, you'd have to say, "No, not the JavaScript, dummy. It's the Python source code I want you to look at." So let's remember how you built context in an organization, taking AI out of the picture for a sec. Pretend that, pre-AI, you just joined an organization, and let's remember how context got built up. Over time, you would accumulate it through experience. You'd start a job, maybe do a little code spelunking to figure out how things work. You'd maybe latch onto a mentor. Eventually you'd live through real things like incidents and outages. Those are the painful things that stick with you; those are the battle scars. And that is what constitutes organizational context: the learnings along the way, the "why did we do things the way we did."
And now you're good at your job, because after all that painful experience you know what questions to ask. You know where to look when an incident happens. And this is the goal; this is what we want for our AI agents. I'm going to lift this adoption curve from Vimath, and I may have butchered his last name, so sorry, Vim, if you see this. Let's start at the beginning, four years ago in 2022. Everyone remembers fancy autocomplete, right? Back then, context windows were pretty limited; I'm not sure everyone even remembers this, but it was about 8k tokens, which is not a ton. So tokens were highly optimized, and agentic IDEs like Cursor focused just on the code surrounding the spot you wanted to autocomplete. Basically, they took some code before and some code after, put it into a model, and said: this user is working on this piece of code, what's the most likely next thing? And that's what got printed out. It got progressively better than that as language servers were integrated; you could pull the callers of the source code into context, and then the LLMs were really good at completing code. At those levels, you were the context engine. In many circumstances, this is where most people are now: parallel agents hooked up with MCP and skills. Just out of curiosity, has anyone gone beyond curated context into the last few degrees of agentic freedom, shall we say, where you have background agents running in the cloud doing stuff in YOLO mode? Is anyone experimenting with that? Okay, cool. That's very cool; that's bleeding edge. But let's take a moment to recognize that today's bleeding edge is yesterday's news in six months. The puck (I'm Canadian, so I'm going to say this) is going down the line toward background agents for sure. And one of the things we run into right now is this: we're becoming the bottleneck as humans. I'm not sure if people have tried managing parallel agents and working on several tasks at once, but everyone's starting to feel this cognitive disconnect, because you're context switching all the time and it's just really painful. It is very difficult to move from the mode where you're the human managing context into background-agent mode unless you have some kind of context engine that knows how your code operates, how your organization works, and understands the motivations for historical changes. So, Andrej nailed it: systems are intelligent. We're reaching the exponential on code intelligence pretty soon. Everyone's seen the release about Mythos, even though we all haven't had a chance to really try it out yet. The promise is that, from a code intelligence perspective, this thing is pretty much close to perfect. So now the bottleneck is context. Without context, and I'm just going to re-emphasize this point, you'll probably end up in doom loops. Does everyone know what a doom loop is? A doom loop is when you're struggling with the agent: it's not quite doing what you want, and you have to keep iterating on it.
The worst-case scenario is you run the thing in YOLO mode, it finishes the entire task, and it's completely wrong, and you have to go back and correct various stages. When you have a context engine, you can get there faster. The problem is that access doesn't equal understanding. We have customers at various parts of this journey (I'm just going to go back to here). One of the interesting things we've noted is that people feel they understand their organization best. So when it comes to feeding the right context to these agents, people will try to build some semblance of what a context engine actually is. Maybe they'll build a RAG system, or some other way to feed organizational data to an agent. Unfortunately, though, access doesn't mean understanding. You could just wire up a bunch of MCP servers, and the agent still won't understand what the relationships between all that data are, how it got there, and why it is the way it is. And then there's another problem, which I'll talk about a little later, called satisfaction of search. Just remember that term; I'll come back to it. Okay, I just wanted to show you this. This was something we actually implemented, and we did it in two parts. One was without any context engine, just wired up to a bunch of MCP servers. It did a pretty good job, but when it reached the end, it missed the fact that we had some legacy code that depended on the old way of controlling thinking on Anthropic models: they have adaptive thinking now, but it used to be that you had to supply a token budget, and that's how you increased the size of the thinking window. We had some code that depended on this, and there were reasons for that which the agent didn't understand or see, so it basically clobbered all that code. But when we added the context engine, it saw those reasons and implemented the change the right way: it made the appropriate changes in the right places and included backwards compatibility for the code using the old method. Okay, so now for the myths. Myth one: naive RAG over my docs is a context engine. If you implement, say, vector search, or just a couple of search methods, you're going to run into a few issues. One is the satisfaction-of-search problem, where the agent searches like crazy, consumes your tokens, and in the worst case reaches compaction without ever finding the endgame. There are a few other things you need. You need personalization when you build a retrieval system, because if you just RAG all your data, especially in a very large organization, there will be conflicts in the data that you have to resolve. Retrieval won't be focused on the task you're trying to perform; it might pull in code from other parts of your organization, especially if you have a really big org with tons of different repos, and it's just going to create a huge mess. So you need some element of personalization. And then here again: connect a bunch of MCPs and I'm done? Nope, definitely not.
That is the thing that really puts an emphasis on the satisfaction-of-search point, and I'll explain it in a sec. And finally: a bigger context window will solve this. Way back, when the models were starting to get big, people were really excited about a million tokens in your context window. The first model that tried this, I think it might have been Claude, actually. Was it Claude? >> I think it was >> or OpenAI? Okay. Gemini. >> Gemini. >> Yes, sorry. So Gemini was the first model to try this, and it was really good at finding the needle in the haystack. You could feed it a huge document and, as long as you knew what you were looking for ahead of time, it could find it. But it wasn't good at all at reasoning across different data sources, understanding the real meaning behind a problem, and then recommending appropriate solutions. None of that was possible. Obviously, things have gotten much better. Now, the problem is that most organizations have more than a million tokens' worth of context, so trying to fit it all into the context window isn't going to work anyway. Let's project out to the future and imagine you could fit 10 million, 50 million tokens. At the current rate of memory consumption just to operate the models, that's not going to be possible for a really long time. And even if it were, and you fit all that context in your context window, you're still going to run into problems understanding what's true, what's false, and how to select the right information. Okay, so now I'm coming back to that second point, satisfaction of search. This is a term that actually comes out of the medical field, from radiology. The idea is that when technicians are looking at X-rays for the cause of symptoms, they might find something on the X-ray that explains those symptoms, and then they stop. That's a dangerous thing medically, because there might be other indicators, for things like cancer, that get missed. So satisfaction of search is a real problem in radiology, and there are lots of protocols to prevent stopping as soon as you find the first thing. This is what happens with agents. When they search around in, say, Notion, your code, and Confluence, they'll stumble across what looks like the thing they're looking for, they'll stop, and then they'll proceed. But the real golden nuggets of information might be somewhere the agent wouldn't think to look, like a past Slack conversation or an incident report. So here's the classic iceberg meme. Code that compiles: that's the baseline. Does the agent produce code that compiles? But everything that's actually important happens underneath. Understanding the user's original intent, what was rejected in the past by the team, what was tried before but failed. How are you going to surface that kind of content just by looking at docs and code? You need to understand it somehow. And even worse, it's sometimes hard to know when things were deleted, the absence of information. So you need history as well, leading up to decisions. This is why we think you need a context engine.
A context engine understands who you are, what team you work on, who you work with, who the experts are in your organization, and what decisions led up to the current iteration of your codebase. It's able to resolve conflicts, a true-versus-false type situation: what's true, what's not. Sometimes that truthiness is a gray area, so the context engine also needs to know when to tell the agent that it wasn't able to resolve a conflict, and then learn from additional user input. The third point is super important in any large organization or enterprise. There are often repositories that not everybody can access, secret projects, that sort of thing. So it's really important that you flow the access controls up. I'll give you an example everyone will appreciate, which is Slack. Our context engine integrates with Slack or Microsoft Teams. When you have private channels, that's really highly sensitive, right? You could be discussing HR information, or something you just really don't want everyone else to see. So when Unblocked answers questions, it will only use private-channel information if the person asking the question has access to it, and those answers are not public; they're private to you. And then finally, of course, delivering the right context at the right time. This is about token efficiency; it's about getting to the answer as quickly as possible. So here's a high-level overview of how a context engine might work. On the left we've got data-source inputs: planning tools, docs, conversations, code, PRs, basically anything relevant to getting work done at the engineering level. On the right side we have the outputs. This all can flow to coding agents via MCP or CLI tools. You can custom-build apps through the API. We have a code review component that plugs right into your SCM and provides code reviews, and of course integrations with messaging apps. These are the six broad requirements we think are important. There's actually much more than this, but these are the high-level things. Again, unified system context: this is about building relationships between data. But it's more than just recognizing when one piece of data is related to another. For example, in Slack you might have conversations about PRs. That's an easy linkage, because people post links back and forth. What's less easy is understanding the reason why decisions were made, or your organization's best practices. To understand that, you have to go a little deeper: do things like distill historical pull request comments down to their core essence, and when you see repeated patterns, pull those patterns together and store them as, quote-unquote, memories, so that when someone is working on a similar piece of code you can load those memories, and the agent can see them and go, "Oh yeah, right, this is the way this organization does this particular thing."
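To make that distillation idea concrete, here's a rough Python sketch of the flow just described: summarize each historical PR comment, cluster repeated patterns, and store the repeats as memories keyed by code area. The names `summarize`, `embed`, and `similar` are hypothetical stand-ins for an LLM summarizer, an embedding model, and a vector-similarity check; this is an illustration, not Unblocked's actual implementation.

```python
from collections import defaultdict

def distill_review_memories(pr_comments, summarize, embed, similar, min_repeats=3):
    """pr_comments: list of (file_path, comment_text) from historical PRs."""
    # 1. Reduce each review comment to its core essence.
    essences = [(path, summarize(text)) for path, text in pr_comments]

    # 2. Group essences that say roughly the same thing.
    clusters = defaultdict(list)
    for path, essence in essences:
        vec = tuple(embed(essence))
        key = next((k for k in clusters if similar(k, vec)), vec)
        clusters[key].append((path, essence))

    # 3. Repeated patterns become stored "memories" keyed by code area,
    #    to be loaded later when an agent touches similar code.
    memories = []
    for members in clusters.values():
        if len(members) >= min_repeats:
            memories.append({
                "areas": sorted({p for p, _ in members}),
                "rule": members[0][1],  # representative phrasing of the pattern
            })
    return memories
```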
Conflict resolution is super important. We initially took a naive approach to it and based it just on recency: we biased toward newer stuff. Unfortunately, in the fullness of all your context, recency is not enough. Often you have people writing documents or chatting on their messaging platforms, and they might be saying things that are not completely aligned with how the system actually works. So then we started to bias toward code. We had recency, and we figured the main branch is definitely your source of truth. But not always, because sometimes what's important is what happens next, not the way the system currently works. When you're working on a task, what you really want is for the agent to understand where you're going, not necessarily where you've been. Where you've been helps it understand what not to do; where you're going helps it understand what to do. In the Slack case, looking at the conversations your organization's experts are having matters more than understanding what every random engineer is talking about. Targeted retrieval and personal relevance are very related, so I'll talk about them together briefly. When you're pulling context in, it's important that you only pull in context relevant to the task at hand, and probably relevant to you. Here's a technique that's kind of interesting, sketched below. You can tell which repos a person works on most by the number of PRs they submit, and then, if you're doing vector retrieval, you can do a deep retrieval on those focused repositories and a wider retrieval on the rest of the source code, and bias the selection toward the focused repositories, because that's more likely where someone's going to be working and spending their time.
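A minimal sketch of that two-tier retrieval, assuming a vector index with a `search(query, repos, k)` method; the interface, the chunk structure, and the bias weight are assumptions for illustration, not a real API.

```python
from collections import Counter

def focused_repos(pr_history, author, top_n=3):
    """pr_history: list of (author, repo) for submitted PRs.
    Rank repos by how many PRs this person has contributed to them."""
    counts = Counter(repo for a, repo in pr_history if a == author)
    return {repo for repo, _ in counts.most_common(top_n)}

def personalized_search(index, query, author, pr_history, k=20, bias=1.5):
    focus = focused_repos(pr_history, author)
    deep = index.search(query, repos=focus, k=k)      # deep pass on focused repos
    wide = index.search(query, repos=None, k=k // 2)  # wider, shallower pass
    # Bias the merged ranking toward the repos the person actually works in.
    ranked = sorted(
        ((score * (bias if chunk["repo"] in focus else 1.0), chunk)
         for score, chunk in deep + wide),
        key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:k]]
```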
And then we've talked about data governance, so I don't think I need to go over that again. Super important, though. This was a little experiment we ran with a larger task. I fully admit that some of these numbers are a bit wonky; this is basically Claude outputting numbers, so don't trust them. Trust the vibe of the thing, not necessarily the numbers. Basically, without the context engine active, it really missed the mark on a lot of stuff, just because it didn't understand how the existing implementation really worked, why it was the way it was, and what had been tried before and failed. So it made a lot of those same mistakes. With the context engine turned on, it nailed it. The key numbers, though, are the time and tokens. Without the context engine, it took two and a half hours to finish the task with 21 million tokens, which is a lot of tokens. With the context engine, it took only 25 minutes and 10 million tokens. It's a pretty dramatic difference. Okay, so the hard lessons. These are just samples, by the way, but they're the ones we thought were interesting. First of all, initially we optimized for access, not understanding. Our first premise was that if we just wire up a bunch of tools and provide a knowledge graph, the agent will be able to traverse the knowledge graph, execute a bunch of retrieval-specific tools for particular integrations, and figure everything out. That does not work; you'll have to go a little deeper than that. The second lesson is that we hid conflicts instead of surfacing them. By hiding conflicts, I don't mean we just ignored them. What we did instead was try to resolve them using those naive strategies, and we didn't surface the conflicts we weren't able to resolve. This was a really good learning: a context engine (I mean, we'll get there eventually, probably) can't always tell what the truth elements are, and when it can't, you should surface that and learn from it. That's the key thing. And finally, I think a lot of folks have tried this, and it's a really bad idea: when a context engine supplies an answer, do not cache the answer and try to serve that same answer up again for a similar question. The reason is fairly obvious in retrospect: everything changes constantly. Code changes, docs change, the reasons for things change. So this just doesn't work. The other thing is that if you try to use previous answers as context for new answers, you regress toward a mean. If the model is misbehaving or doing something bad and you continuously bring that into context, you're obviously going to pollute the context. And this is what happens. Okay. So now let's talk about where AI-forward teams, those doing the cloud-based-agent thing, are taking advantage of context engines. Definitely, and especially, during the planning phase. This is where you get the biggest bang for buck, unquestionably. Get the context engine involved, use a skill to bring it in, connect it to the MCP server, and watch it do its thing. It's also useful during review, so you get planning at the start and review at the end. Because if you get an agent to do review, it's basically just going to pay attention to the code and try to understand where the break points are, security concerns, that kind of thing. Without the organizational context, it doesn't understand the motivation, and that's the really important thing. Ticket enrichment: this is a super cool use case. You create a ticket for a new feature and then ask an agent that's connected to a context engine to fill in the blanks. It works. Triage: I use this all the time. When I see an issue in production, I just whack it into an agent connected to the context engine, and it instantly brings up all the past issues related to it and starts operating right away. Increasingly we're seeing this one: incident management. We just wired up Datadog and... sorry >> Sentry and Datadog. >> Sorry. And this is already proving to be a super cool use case. It can see the signals, act on them, and relate them to code, to past incidents, and to discussions you've had in Slack. Having all those things come together at once is almost magical. And finally, I think this one's actually my favorite, and it's the one customers use the most: customer success, sales, and engineering support. What a lot of big teams do is have engineering support channels where other teams can come in and ask questions. If you put a context engine into one of these, you can have it automatically answer a lot of questions and save engineers a ton of time. All right. So, how do teams make a context engine their own? Skills.
So definitely build skills that you can use to curate context in a GitHub repo. And you can build other skills around it, like typing "ticket enrich," giving it the issue ID, and letting it use the context engine to build the enrichment. Or workflows like this one, "prepare an incident timeline": you just send it off to your agent, the context engine brings everything together, and it's magical. And this thing here, you can wire up to all kinds of agents. One thing a lot of customers like to do is wire this up to Claude Code in their CI system. We actually have a code review component, so you don't have to do this if you're using Unblocked, but people use this for other things, not just code review. As soon as you wire up a context engine in the background, give it an API key, and let it run on its own, it can do some pretty insane stuff. So I'm just going to show a quick example of what wiring up a context engine can do. This is a PR that my colleague wrote, and Unblocked went through and provided a review of it; at the bottom, here's the review part. You can see that Richie, the author of this PR, responded, "Very cool, this is something I would say." Now, the reason for the comment, which was "you've basically duplicated a bunch of tests, you can DRY that up a little bit," is that this was a best practice distilled from a bunch of other PRs, and the funny part is that the author of those PRs was Richie. He's the one who actually instilled the best practice in the organization. So that was a cool little moment when we discovered it. Here's another example. This is a fairly long transcript, and I'm not going to show the whole thing, but we sent the agent on a mission to do a large task. Without Unblocked it took quite a while, as you can see from how long the transcript is, and it missed a whole bunch of stuff. With Unblocked it was a lot more compact; it got to the answer very quickly, and correctly. And because we're now AI-forward and lazy, we took both transcripts, ran them into Claude, and said, "Hey Claude, why don't you do an analysis of both of these and give us your result." I won't bore you with the details, but at the end the verdict was that the context-engine plan is the one I'd ship with; the other one is good for a prototype, but it's missing a whole bunch of stuff that's important to this organization and was previously discussed. Okay, so this is essentially what I've been trying to say. AI-generated code should feel like it was written by someone who's been on your team for 20 years. If it doesn't yet, that's fine; it will. If you wire up Unblocked, you'll see a huge difference in agent performance, and if you're building one of these things, absolutely take all of this and build, and let's see where that goes. Just before we get into the workshop component, maybe we'll have five or ten minutes of Q&A. >> I'm Brandon. >> And this is Brandon; he'll help with this. >> Thanks. So it's clear what it does for you and what kinds of problems it solves. But to me, a big question mark is: what is the thing?
What is the artifact that fits the bill? Is it a program you install, an API hosted remotely, an MCP server? What is it? >> It's all of those things. I'll explain what Unblocked is; maybe I can just show a quick demo of it. Broadly speaking, there are a bunch of different surfaces to a context engine. You want to get it into your agent flow, and you can do that with an MCP server, or with a CLI tool, for example. We also have this dashboard surface where you can ask questions about your code. This is a pretty basic one, but you can see it understands who I am and what I've been working on. And we have Slack connectivity as well, so you can bring Unblocked into Slack, drive it in conversations, and have it auto-answer things. Does that make sense? Did I answer your question? >> Okay. >> Yes: API, CLI, MCP. >> Yeah. >> Sorry. >> Thank you. So my question is: as far as I understand, this is a knowledge-management and retrieval application. >> Yeah. >> Does this relate somehow to things like the LLM wiki, as made popular recently by Andrej Karpathy, or the decision traces and context graphs >> that were discussed a lot a few months ago? >> Yeah. So you can think of all of those as useful components of a context engine. A context engine has to do much more than that. Agents are really good at recursing through a wiki, for example, but it depends on how you build that wiki, because there are things like organizational memories, best practices, and experts in your organization that are used as pivot points for context retrieval. A wiki doesn't solve those problems unless you build a structure with it. And I think Karpathy discovered that if you treat a wiki kind of like a file system, you can break it down and have the agent whack through it like a file system. Agents, by the way, are highly optimized for file-system traversal. >> Yeah, the compilation step. >> Exactly. >> Yeah. Sorry, maybe the same question, but is it a general-purpose context engine, or is it targeted at code? Would it be useful as, say, a business-domain expert, building up a business domain and then having this context engine serve as context for all my other agents on the business side? Or would you say it's more just for the code part? >> It's definitely engineering focused; the integrations are focused on engineering activities, so SCM integrations and the other tools that engineers use. We are increasingly seeing customers use it for other purposes. Business intelligence is a key one, usually when people in business functions are trying to get an understanding of the product and its function. We don't have, say, Salesforce integrations wired up, so you couldn't use it to understand anything sales-related. It's really primarily an engineering-focused context engine. That's not to say that won't change. >> Yeah. On the governance thing: if you're respecting access rights, how can it do synthesis across stuff and develop new knowledge internally that it could then surface to people? >> Yes, you're correct to point that out. The synthesis is compartmentalized.
So there are places that are compartmentalized, like individual repositories; that's the level of access. You can synthesize historical data within that boundary and then correlate it with public Slack information, and that's one way to do synthesis without crossing organizational boundaries. The other way is to tag when synthesized information crosses those boundaries. You can take a group-ID approach to the problem, where you attach group-ID tags to the synthesized information and only retrieve it if the person asking has access to those groups (there's a small sketch of this below). So first take the compartmentalized approach, because that's where you'll get the most mileage, and then build up from there. This is the core problem with a technology like GraphRAG, right? GraphRAG is like a pyramid that builds up in layers and summarizes each layer, but that unavoidably crosses permission boundaries. So you have to create compartmentalized pockets. It's a good question. >> Yes. You've talked a lot about all the different sources of information you consume and putting them together. When it's synthesizing those down, is that still naive RAG, vector search, all that stuff under the hood? Or is it agents deciding what's appropriate? Probably some combination of all of them, but what is that step? >> Yeah, you're right, it is a combination of all of them. Knowledge-graph buildup happens in a bunch of different ways. The PR thing I showed you, for example: first you build a naive knowledge graph procedurally, and from there you can use an LLM to distill down, summarize, and build up; those types of techniques. Our context engine first builds a knowledge graph from the base, trying to leverage all the different entities. It's kind of like a PageRank thing, where it builds up the relationships procedurally, and then of course it vectorizes data. Then there are procedural tools that fetch data at runtime. A lot of the distillation for conflict resolution happens in two places. One is at data-ingestion time: there are tags that relate data to each other, so we can try to deconflict at that level and rank items against each other. And then at runtime you pass things to a judge with the criteria, and it does additional deconfliction in real time. Does that make sense? >> Yeah. >> Okay.
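Going back to the group-ID approach mentioned a moment ago, here's a toy Python sketch of the idea: synthesized facts carry the union of their sources' access groups, and retrieval filters on the caller's groups. All names and structures are illustrative, not how Unblocked actually stores things.

```python
from dataclasses import dataclass, field

@dataclass
class SynthesizedFact:
    text: str
    # Groups whose data contributed to this fact; a reader must belong to
    # all of them, since the fact may leak any of its sources.
    required_groups: frozenset = field(default_factory=frozenset)

def synthesize(sources):
    """sources: list of (text, groups). Returns one fact tagged with every
    source group. The join is a stand-in for LLM summarization."""
    merged = " / ".join(t for t, _ in sources)
    groups = frozenset().union(*(g for _, g in sources))
    return SynthesizedFact(merged, groups)

def retrieve(facts, user_groups):
    """Only return facts whose every required group the user belongs to."""
    return [f for f in facts if f.required_groups <= user_groups]

# Usage: a fact synthesized from a private repo and a private channel is
# visible only to someone in both groups.
fact = synthesize([("auth redesign decision", {"repo:core"}),
                   ("headcount note", {"slack:leadership"})])
print(retrieve([fact], {"repo:core"}))                      # [] - filtered out
print(retrieve([fact], {"repo:core", "slack:leadership"}))  # [fact]
```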
>> One more question. So, I was curious: you said conflicts, but at some point you get conflicts where something that means revenue for one company means something totally different for another. How can you recognize that? How do you get humans in the loop, how do you use their ontologies, and how do you use that when you run into it? I'm very curious about that. >> Yeah, so I can show you a quick thing here. You'll notice at the bottom that the references used for answers are delivered both to the human in this interface and to the agent. So if the agent, if the context engine, isn't able to do the deconfliction, then at this point the human can step in and guide the agent once there are enough signals. >> So you can literally just reply and say, "That's not correct," or you can come here >> and >> oh yeah, sorry >> or you can do this: mark it "not helpful" and give the reason why. It is a bit of a manual process at this stage, but the signals build up over time. >> It's funny, right? You might capture a lot of human intelligence this way. >> Yeah, that's amazing. >> Yeah. >> For a typical customer, do you have a metric for how much? >> Oh, it's huge. It's amazing. I was actually really surprised by how willing people are to give feedback. >> Thousands, or hundreds? >> At small team size, like 20 or 30 people, it's in the hundreds; at large team size, 100 to 200 people, it's hundreds and hundreds >> oh wow >> of pieces of feedback. People really like to interact with agents and tell them in natural language what's wrong. It's just a totally natural thing to do. >> Yeah. >> Cool. All right, are we good for Q&A? >> And then we can get on >> ask questions as we hack. >> Yeah, let's get on to the workshop part of this. Actually, what I'll do is this first; you can do this now if you'd like, and I'll come back to this slide in a sec. The idea is we're going to get everyone to join a Slack workspace we created, then get everyone into the repo where the sample code lives, and then we'll just start hacking away on it together. Okay. >> Yeah, I've got some people coming in. >> Nice. >> When you drop in, you'll see an AI engineering London channel; hopefully there's a link to this in there, and >> the Unblocked link will not work until you do step two. Yes. >> To get into the GitHub, or >> Oh, no. >> I've got many people coming in, so I'm really hopeful. >> Is it the network? Yeah. >> We will find out. >> Okay, while folks are doing that, I'm just going to show you what we're getting into. This is the GitHub organization. What we're working on is a social graph builder. It's going to look at a source-code repository, and you can run it on your own repo; it's not going to upload anything, it's all local, so you can see it build up against your own organization. It's going to do a bunch of things. We're going to get a social graph out of it, and I'll show you what that looks like; we're going to understand who the experts are and which parts of the code they work on; and there's a little interactive visualization. The goal of this exercise is to get the thing up and running and start hacking on it, so start submitting PRs as soon as we get going. This is what it looks like. This graph here is our organization, Unblocked, and what you're seeing is a relationship graph of who's reviewing whose PRs and who's getting reviewed. Essentially, it's a distillation of all the different teams within Unblocked, and it's pretty accurate, actually. I ran this all the way back to the start of 2025.
When you run the thing, I'd recommend a shorter time range, because it will be a little slow if you go all the way back to the start of 2025; it could take like 15 minutes. But it has effectively distilled who the teams are, and the only AI step is labeling the teams. You don't have to run the AI step if you don't want to; it'll just use the parts of the code people work on the most. This tab here shows the experts in the organization and what they work on, broken down by project area and path, and shows which areas of the code have good coverage. Coverage is defined mostly by whether a high-contributing organizational expert is present and whether it's an actively contributed-to part of the code. And finally, there's this interactive graph that breaks things down by team area and shows who the major contributors are. I'm over here on the AI team. So that's it. Let's get everybody in and we'll start hacking away at this. >> Yes, absolutely. >> Many of you who put your GitHub in should have an invite already, so please give it a check. GitHub is the worst. >> Yeah. >> It is MIT licensed; we will be making it public later, but for now we needed it locked down. >> So I think the rest of this session is going to be just hacking away. In a sec I'll take this down, if everyone's got it, so that Brandon and I can concentrate on working with you to build features. Oh, when you submit PRs, by the way, you'll notice Unblocked sitting there as a code reviewer. So don't feel bad if it sprays on your PR a little bit. >> You've got one heck of a username. Good work. >> Is everyone good with this? I'll take it down. Okay, cool. >> Brandon, you're on top of the invites? Okay, cool. >> There are a few more. Chris has given two. >> That's okay, I'm going to send both. Don't worry. That's just where I am in this list. >> Oh, I forgot to mention a couple of things, actually. If you're looking for something to implement and need help coming up with ideas, there is a set of predefined issues you can hack away on. Just grab one of these, whack it into Claude, and see how it does when it's connected to the context engine. The MCP server for Unblocked is here, so if you want instructions on how to wire it up to Claude Code or another agent, you can grab them from here. All right, I'm at Lars; there are two more in here, so I'm still going, for those just adding. What's going on, Brandon? >> Oh, sorry, just one of the usernames is >> Oh, okay. >> Christopher, did you not get invited yet? >> No, I didn't. >> That's weird. Let me double-check. You should have one; you should have an email. >> It's like five clicks to add a member. >> I was going to say, we should be able to use the CLI for this. >> That's right. >> No, they keep putting it in my PR and I don't want it there. Copilot's going to review for me, very poorly, but it will question things. >> Yeah, for sure. >> Hold on, let me grab you the mic.
>> Hopefully that's on. Does it work? Yes. So I guess the context engine works very well for asynchronous agents, so that you don't need to specify things at your keyboard, because they can fetch what they need. That's one of the main uses, I guess. And it plays very well, I think, with agents like Copilot on GitHub. Can you share, if you can, which agents are used most with Unblocked? Because in the wild, as developers with our laptops, I think Claude Code is much more used than Copilot, but maybe you see a different picture. >> Okay, I'm going to take this off the screen for a sec and try to see if I can pull that up for you. The answer is yes, we do know roughly what that breakdown looks like. Let me grab it. Okay, I think this gives you the rough picture. Unfortunately, because of the way this is displayed, I should probably extend the screen, but I'll just step over here. Claude Code is by far the most used, followed by Cursor, which seems fairly obvious. This last one here is kind of a catch-all, but what's really interesting is that a lot of people use Claude Desktop, which was very unexpected, but that's the case. VS Code and Codex account for a much smaller component. So yeah, it seems like everyone's using either Cursor or Claude Code. >> I would have expected more of a totally asynchronous agent, something people would just run from a PR. Okay, you can run Claude Code from a PR, but it's less common. Maybe sometimes you use Copilot because it's built in. >> Yeah, actually this one here, Claude Code, may capture some of that traffic. That's probably what you're seeing, because people will wire up Claude Code in CI and do things like that. >> Thanks. >> No problem. >> I've got a potentially dumb question. >> There are no dumb questions. >> Well, we'll see. >> I had a teacher in grade three who used to tell me, "There are no dumb questions, only dumb people." Go on. >> I could be one of them. So, from your point of view: you can use subagents from an exploratory standpoint. >> Yeah. >> How does that, plus memory, plus storing snippets of information, compare? I'm thinking of the social graph you just showed. Even in an organization of several thousand people, you'd be able to store that in a very small file, no? >> You would. Ah, I see, the social graph component. Yes, it can be compact. >> I'm trying to understand how this compares; what's the USP compared to exploratory agents repeating that work? >> I see what you're saying. Okay. There are two components to that. One is that an exploratory agent would have to do this every time. When it starts from ground zero, yes, it might be possible for it to reconstitute a sort of social-graph hierarchy, but it would have to do two things to get there. One, it would actually have to write code to constitute the graph, at least the way agents and models are today. You wouldn't be able to just have it run basic tools around the organization and figure out the who's who.
It would have to write something like that social-graph algorithm, run it, and then get the distillation out the back end. At that point you're basically rebuilding this component anyway, so you short-circuit it: just run it and use it. Maybe I should explain some of the motivation for that thing, actually; I realize now that I may not have done that effectively. The social graph is not just about conveying who the experts are. It's used within the context engine as a pivot point into more important context. Understanding who the experts are in a particular code area acts as a jump point, because another part of a context engine, which happens at the ingestion and processing layer, is what we call bottling the expert: distilling what that individual has worked on in the past, where they sit in the hierarchy of the organization, and the decisions they've made, based on the Slack conversations they've had and their PR comments, all that kind of stuff. When you distill that down and pass it to the agent, then, say I'm a new employee coming to work on a particular area of code, there are a bunch of different ways of loading context for that code. One is semantic search via vector search; that's layer one. Another layer is pre-built memories. And the third layer is unbottling the expert for that area of code. Getting that expert's learnings into context is a really powerful mechanism: it helps drive the rest of the retrieval in an agentic loop, and it helps the agent directionally, like where to go next. Does that make sense? >> Right. I think everybody's in now. >> Awesome. >> So, once I get this back up on the screen... >> I'm still sending invites; I saw someone just come in, so please keep coming and we can keep going. >> Yep. So, feel free to basically just fire this repo at your agent and get it to run it. If you literally just say to Claude Code, "Run this against my repo," be sure to give it a time range or a PR limit; otherwise it'll go off the rails and take a really long time to finish. Say something like "process the last 300 PRs" or "process up until September 2025." There's enough information in the README that it should be able to just do it and run against your repo. >> I cloned it and said, "Read the README and make it happen." >> Yeah. >> Can I ask another question: what are your plans for the coming year? >> For Unblocked? >> Is it about Unblocked, or about this side project? >> This. >> So, I've sort of alluded to this before, but where the puck is going is fully autonomous agents. We're very focused on making sure that autonomous agent flows are highly optimized. As I was saying at the beginning of the conversation, you cannot run those things effectively without very finely tuned context. >> Yeah.
>> So, when you think about it: at some point I read about things like tracing what agents do and getting runbooks out of those. Is that the path you're investing in? Is it retrieval, or what? >> Are you talking specifically about incident management, that sort of thing? >> No, I'm actually thinking more from a business perspective. How can we extract business knowledge that's really deeply embedded in systems nobody knows anymore, where some people think they know but they don't >> Yeah. >> and in documents, human knowledge, tested knowledge? >> Yep. So there are two ways of servicing that: either at the product level or through the context engine itself. Increasingly, what we see is that people leverage agents to do their work even at that level. They'll go to Claude Code, connect the Unblocked context engine, and say, "Do this thing for me," and the context engine will find all the things it needs for that task and surface that data. >> Yeah. >> For us, that means the near-term roadmap is the API. >> Yes, like the CLI and API question. Cool. I'm going to lift this off again. Hopefully people start submitting some PRs and we can >> Yeah, you're in that GitHub. Let me repost the link in the Slack channel, because that link will expire. >> So this org will stay up until the end of the week, at which point we'll bring it down and release this as open source, and everyone who contributes is obviously going to get credited; your name will be on it. >> Should we set up the repo locally and then start? I just finished setting up. >> Yeah, just clone the repo. The easiest thing to do is take an agent like Claude, launch it from that directory, and just say, "Please bootstrap and launch this product," and away it will go. If you run into any kind of technical things, we're here, obviously. >> Hold on, let's get you the mic. Oh, you've got it; I've got a lapel mic now. >> Awesome. Cool. Can you hear me? Yeah, perfect. So, on the slide where you had the performance, you were at 80% and without Unblocked it was 20%. >> Yeah. >> And now I see that you're basically hooking up Unblocked to Claude Code. Is it a fair comparison to say: I use vanilla Claude Code with access to the MCPs and the skills, then I use Claude Code hooked up with Unblocked with the same MCPs and the same skills, and there I do the performance comparison, and you still have a lot of alpha from whatever you're cooking inside Unblocked? Was that the comparison, or was it done without >> Was it done with vanilla Claude Code but without context? >> No, it was done with MCP servers like GitHub and Slack wired up. >> I see. >> Yeah, cool. >> We basically got parity with all the MCP servers of every SaaS vendor in one: it was vanilla Claude with all the MCPs versus Claude with Unblocked only, >> then do the task, and >> same context file, same prompt, and same access. Yeah. >> It's pretty fun. Yeah. Oh, thank you. >> Maybe two questions.
One is: I see that a lot of these social graphs are built with traditional network calculations and the statistical aspects of networks. Is this the approach you began with, and did it already work best? Because most memory systems work more on filtering, like episodic memory and so on, and this is really a nice scoring system. That's the first question: is it like this in Unblocked too? Second question: you mentioned it works with Teams, the Microsoft environment. What differences did you observe building social graphs for different environments? On GitHub I imagine it's very different than on SharePoint, Teams, etc. Is it also network-stats based, or something different? >> So, our first implementation was incredibly naive: it just used the number of PR contributions and compared that directly with the number of PRs reviewed by each person, a simple numbers game, and that didn't produce accurate team clusters. Then we got to the algorithms you see here (sketched below). Unblocked does a bit more than this, so this is kind of a middle road. Another strategy Unblocked uses is experts by vector clusters. When we ingest the source code and vectorize it, we understand who the top contributors are for each piece of source code. So when we look up individuals, we can see what they've been working on and what the clusters in proximity are, and then relate people based on their cluster proximity. That's more of an ML-type approach. And then there's a final layer, a sort of LLM-heavy layer, that does distillations of a whole bunch of different context elements: things people have worked on in the past, conversations they've been having in Slack. When you take all that and weigh it against the procedurally generated graph, you get a much more accurate distillation. With this one here, you'll notice some people get pulled into team clusters even though they operate across many different teams, for example, and this won't account for that. >> Different algorithms for different environments, let's say. I don't want to take up too much time. >> No, this algorithm is purely SCM based. You're right that Slack and Teams are quite a bit different, because you don't have these review points. So then it becomes: who's the most active in particular channels? Then you need a distillation or summary of what that channel is about, you need to vectorize that, and you need to score it against the most frequent contributors. But it's not enough; you have to relate that back to the SCM data to figure out who the real experts are. One of the problems I've personally experienced in some organizations I've worked at is the noisy junior engineer. They're very noisy, they love to talk, but the signal-to-noise ratio is not great. And just because someone's not saying a lot doesn't mean their messages aren't impactful. So part of this game is assessing impact: when people say certain things, how does that relate to the PRs that get spawned off as a consequence? How many of those PRs get merged? That sort of thing.
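As a concrete illustration of that procedural review graph, here's a minimal sketch using networkx: people are nodes, "A reviewed B's PR" edges accumulate weight, and community detection recovers team clusters. The PR tuples are toy data, and the workshop tool does quite a bit more than this.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def build_review_graph(prs):
    """prs: list of (author, [reviewers]) taken from SCM history."""
    g = nx.Graph()
    for author, reviewers in prs:
        for reviewer in reviewers:
            if reviewer == author:
                continue
            # Repeated review relationships bind people into teams.
            w = g.get_edge_data(author, reviewer, {"weight": 0})["weight"]
            g.add_edge(author, reviewer, weight=w + 1)
    return g

prs = [("alice", ["bob"]), ("bob", ["alice"]), ("alice", ["bob", "carol"]),
       ("carol", ["dan"]), ("dan", ["carol"])]
g = build_review_graph(prs)
teams = greedy_modularity_communities(g, weight="weight")
print([sorted(team) for team in teams])  # e.g. [['alice', 'bob'], ['carol', 'dan']]
```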
>> Yeah. >> Oh, is there not? >> There should be. Okay, check that. >> Well, you should be able to open a pull request; you can't push to main. >> Okay. >> If that's the situation, we'll check. >> Yeah, you should be able to create a branch. >> Oh, no. He can't fork the repo either. >> Hmm, yeah, forks might be disabled. >> This will be open source at the end of the week, and all your contributions will be on it. What's really fun is using the social graph tool later against your own repo and showing your team. >> Yeah. >> Oh, sorry. >> Unblocked tried to answer that question for you. >> Oh, you see that? The Slack auto-response. Sorry, are you in here? That's a camera. Sorry. >> Oh, it's okay. Let me check; that should not be the case. Let me know if you still need a GitHub invite. >> Just check the members. Yeah, there might be an issue here. Just a second. Oh, these were direct assignments, so I think we have to pull people into the whole project, because they're not org-assigned. >> Oh, GitHub, I love you. >> Zero nines of uptime. >> Yeah, we'll fix this one. Slam everybody in. Come on. >> I'm trying, because now we just need to add people. >> Go to Settings, Collaborators: "unblocked." You all have write access. It is the name of the company. >> Just validate that for us, if you would. >> Yeah, please let me know. >> Perfect. >> All right, we're getting real PRs now. There we go. Nice. >> Hell yeah. Now let's do fun things. Okay, nice. Looks good to me. >> What? >> I think we have our first approved PR. I'm just sending ridiculous chats to Unblocked so you can see it try to answer questions in Slack as PRs come up. I'm going to see what it says about this. >> Ask it to >> It's like, "Oh, let me think about it." >> Oh, did you ask it about the PR? >> Yeah, but the PR, I think, you accepted. So we'll see what happens. >> Yeah, I mean, it did approve it. Unblocked was like, "This looks good to me, man." >> Only visible to you. Oh no. >> That was such a good answer, though. >> "Nice PR, good job." Unblocked, great answer. >> Yep. Oh yeah, I'll put it back up, one sec. Where did it go? Actually, I lost it over here. >> Oh yeah, handle the sources or something. >> Do you want the app? >> Yeah, for sure. >> Of course. We were focused on you building, but yeah. >> What am I supposed to do? >> Oh, it's okay. So, this thing that I showed before is the project that exists in that repo. The idea is: think about features you want to add, things you want to fix, or new components, and then just hack away at it and submit a PR. >> Sorry. >> Yeah, my bad. >> Do you want to open up a terminal session and show the MCP? >> Oh, sure. Yeah, because people can obviously use it, but they don't have all our source. >> As a consultant, I wanted to try to propose it to a client, but I cannot show the context; maybe just get an idea.
Well, one thing that you could do if you're visiting clients is ask them to run the tool on their own repo, and then it will generate this result for them, so they can see the value on their own project. Right? >> I think, Peter, he's just asking about our product specifically, not this. >> Oh, Unblocked. You're asking about Unblocked. My bad, man. >> We're driving this way. >> Sorry, single-track mind. Okay. So your question is how you can demonstrate the value of Unblocked to customers? >> See the value, yeah. >> You can make conflicts emerge in your app, and then there is the compliance layer, which is very interesting for corporate clients. >> I was thinking how this translates to a UX, because, you know... many people... I understand it's mainly for coding. >> Yeah. >> And whether this is for technical people, or maybe people overseeing some engineers, or the engineer themselves. I mean, just to see how your platform works. But if it's out of context, it's okay. >> No, no, that's totally fine. So this dashboard is the front-end customer interface to the product. You come in here and you can ask any question about your codebase or your organization and get an answer for it. Right now this is attached to, sorry, I lost my cursor, this is attached to the test org that we have, but I could use it against Unblocked itself. I have a little thing here that I can show. Oops. >> So the source mark engine is an internal component that we use to track source code changes through time, including where changes move between files and so on. So as a demonstration, you can book your customers into a demo with us, or you can wire it up to your own organization, demonstrate this flow to customers, and try to find use cases where data sources conflict and demonstrate that. The challenge with context engines is that it's really hard to demonstrate the value to someone without actually wiring it up, so there is a bit of overhead where people have to connect all their integrations. The good thing is that Unblocked has a free enterprise trial period, so people can try the product in its fullest form before paying for it. >> So if some of that information is incorrect, you can just reply in the chatbot or flag it in the references? >> Exactly. You can reply here, or you can say "not helpful" and explain why, and then it will distill that for the next round. >> So it will adjust some weights or confidence scores internally? >> Well, internally what it does is construct task memory. It looks for those kinds of repeated signals, and this is actually where the experts graph comes in; it's used a lot. The experts graph provides weight. So when an expert comes in and says "that's not correct," it's going to get more weight, and we distill a memory from it. If it's just a new engineer who says "that's not right," they're not really a trustworthy source yet. You have to have a trustworthy source to base that on. Does that make sense? >> Yeah, it makes a lot of sense. It's like a social network. >> Exactly, somehow. >> Yeah. >> Thanks. >> No problem. >> Cool.
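As a rough illustration of that gating, here is a hedged Python sketch: a correction only gets distilled into task memory when it comes from someone the experts graph scores highly for the topic, or when the same signal is repeated by several people. The threshold, data shapes, and repeat count are assumptions for the example, not Unblocked's real values.

```python
from collections import Counter

EXPERTISE_THRESHOLD = 0.6   # assumed cutoff, not Unblocked's real value
REPEAT_THRESHOLD = 3        # assumed: repeated signals can also earn trust

repeat_counts = Counter()   # (topic, correction) -> times flagged

def should_distill(sender, topic, correction, experts_graph):
    """experts_graph is assumed to map (person, topic) -> score in [0, 1]."""
    expertise = experts_graph.get((sender, topic), 0.0)
    if expertise >= EXPERTISE_THRESHOLD:
        return True  # trustworthy source: distill the memory immediately
    # Otherwise only distill once several people repeat the same correction.
    repeat_counts[(topic, correction)] += 1
    return repeat_counts[(topic, correction)] >= REPEAT_THRESHOLD

# Usage: an expert's flag lands right away; a new engineer's does not.
experts = {("alice", "billing"): 0.9, ("dave", "billing"): 0.1}
print(should_distill("alice", "billing", "invoices are idempotent", experts))  # True
print(should_distill("dave", "billing", "invoices are idempotent", experts))   # False
```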
>> Oh, under the hood. Well, when it's presented to the AI, it's presented as files, but under the hood we store it in database tables and so on. The memories are constituted from a bunch of different sources, so they're not just flat-file based; the whole memory construct gets hydrated at runtime. >> So do you just expose tools over your database, based on whatever criteria the user gives? >> Yes, there are a bunch of tools for data retrieval. For memory specifically, though, you can't really leave memory hydration up to the agent, because that's part of the seed context. In order to get the agent to go in the right direction, you have to seed it with the appropriate data, and experts context is a good jumping-off point for the agent. >> Is there any official benchmark that tracks the type of value you're trying to bring? Because I feel like it's not exactly coding, or it is, but I'm curious if there are any public things you're tracking yourselves against. >> We do have some internal benchmarks. You're right, it's a little bit squishy. Have you heard Boris Cherny, the creator of Claude Code at Anthropic, talk? He did this interview where they were talking about how they measure success for Claude Code internally. This may have changed, because there are a lot of benchmarks now, like the tau benchmark; you've probably seen that one. But what it really distills down to is vibes. And so the most important thing in systems like this is to capture sentiment. If your sentiment is trending upwards, that's a good thing. Our sentiment right now, on a scale of minus 100 to 100, is somewhere around a 60 score, so on a normalized scale that's about 0.75 to 0.8 (that is, (60 + 100) / 200 = 0.8). >> So the vibe would be captured by something like less back and forth on the PRs, or maybe, I don't know, you having less back and forth with Claude to get your stuff done? >> Yeah. The vibes are: are people satisfied? Satisfaction can come from a lot of different sources, and dissatisfaction can come from a lot of different sources, so the way to think about it is that it encodes all of those things. But you can capture specific metrics, and we do: how long things take, for example. We're actually working really hard right now to bring response times down, although here's the interesting thing: as we move towards a more autonomous universe, response times for MCP servers are actually less and less important. The more important thing is that they get the answer absolutely bang on. >> Yeah. >> And the reason is that the amount of time a context engine spends collecting all that information and distilling it is a microcosm of what the full task takes to implement and traverse. So if you can spend a little bit more time and cut the implementation down by 60, 70, 80 percent, that's a huge win, right? Go ahead. >> Sorry, very small follow-up. I'm curious, do you have any rough numbers on how much time it spends retrieving context versus executing the task? Like, is it 10 percent right now, 90 percent? I have no idea; I mean, I have my own experience.
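Here is a minimal sketch of what runtime hydration into seed context could look like, under stated assumptions: the shapes and names below are invented for illustration (the talk only says memories live in database tables, come from several sources, and are presented to the agent as files before the task starts).

```python
from dataclasses import dataclass

@dataclass
class Memory:
    topic: str
    text: str

def hydrate_seed_context(topic, memories, experts):
    """Assemble the memory construct at runtime from several sources and
    present it to the agent as files (a path -> contents mapping). In the
    real system the store is database tables, not flat files; this is a
    hypothetical simplification."""
    relevant = [m.text for m in memories if m.topic == topic]
    return {
        "context/memories.md": "\n".join(relevant),
        "context/experts.md": "\n".join(experts),
    }

# Usage: these files are written into the agent's workspace before the
# task starts, so they are part of the seed context rather than something
# the agent has to discover through its own tool calls.
files = hydrate_seed_context(
    "billing-service",
    [Memory("billing-service", "Invoices are idempotent by invoice_id.")],
    ["alice", "bob"],
)
print(files["context/experts.md"])  # prints "alice" and "bob", one per line
```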
>> Yeah, agent context collection is probably close to that number; it's like 90 percent. The actual code-writing part is really, really fast, if you just watch what an agent is doing. And by the way, when it writes the code, output tokens are the thing that drags down performance. Everyone used to think it was input tokens. We've run tons of experiments on this: you can bring the input token count up, and time to first output token now is pretty good; it's highly optimized. The thing that really impacts performance is output tokens. So you have to be judicious with the way you collect and supply context back to the agent, so that it stays tight on its output loops as well. >> For one benchmark that Peter mentioned in the talk, we gave an ambitious task, because obviously how much time you're adding with a context engine is prompt-dependent. The ambitious task we gave was to implement the new adaptive thinking mode in Anthropic's toolchain when they introduced it. As mentioned, with Unblocked, with a context engine, it was a 25-minute wall-clock time. The other case, without, was two and a half hours; it was 2 hours and 25 minutes. The main reason for that: we gave it all the data, we ran the prompt, and its first output was totally wrong. So the human had to loop again and say "no, no, no, this, this, this," and the next output was wrong, and the next output. Once you do four loops, you have a two-and-a-half-hour wall-clock time, versus the 25 minutes when no corrections are required. As mentioned, think of it as a waterfall: the more high-quality, high-signal, correct context you have up front, the better every single thing the agent does until it says it's done, whether it got it right or not. >> Yeah. >> You also mentioned that the token usage on tool calls and information search really decreased. I know a lot of these tools, these aggregators for tool use, have insane token usage. So maybe you have some estimates. Let's say I need a summary of one Slack conversation, or of how people interact there; that would be like 60k tokens on Composio. I wonder how many tokens it would be using Unblocked. >> Lower. We're still very vibes-based there; it's hard to get real data from other customers or people in the market. But, sticking with that same task: it went from 21 million tokens of total usage to 10 million tokens with the context engine. Part of that, though, is because you didn't have to doom-loop. When you doom-loop, of course, that inflates the token expense a lot. So we did drop it by 50 percent on a large task. Obviously, if you're asking "yo, center a div," you're not going to get a lot of gain; that's probably in the training data. But any feature or fix, a lot of what people are putting through Unblocked, is what an engineer is doing every day. It's very rare that you're doing a task so minor that... then again, I've asked Claude to do a git push, so I bet I'm not the only one. I was like, "you do it." And then: why did that cost me 30 cents? I don't know.
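A back-of-envelope latency model makes the output-token claim concrete. This is a hedged sketch, not measured data: the time-to-first-token and decode-rate numbers are placeholder assumptions, chosen only to show why a large, well-chosen prompt with a tight answer beats a small prompt that triggers a verbose correction loop.

```python
def estimated_latency_s(input_tokens, output_tokens,
                        ttft_s=1.0, decode_tok_per_s=60.0):
    """Toy latency model for a single LLM call.

    ttft_s bundles prompt processing (time to first output token), which
    the talk says stays fairly flat as input grows on modern serving
    stacks. Decoding is sequential, so it scales linearly with output
    length. Both constants are assumed placeholders."""
    return ttft_s + output_tokens / decode_tok_per_s

# Big prompt, tight answer: input barely matters.
print(estimated_latency_s(input_tokens=50_000, output_tokens=500))   # ~9.3 s
# Small prompt that doom-loops into verbose wrong answers: output dominates.
print(estimated_latency_s(input_tokens=2_000, output_tokens=8_000))  # ~134 s
```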
>> Yeah, I did all the work to put my GPG keys in the right place, so I'm like, "Claude, go." Any more questions while you all ship? Any confusion? Anything I can unblock for you? It's my purpose in life. >> Sorry, you may have answered this question already, but are you using knowledge-base RAG in Unblocked, or what exactly is the tech that you're surfacing? >> Oh, so many things. I can come talk to you at the side; I'll take my mic off. I'm just going to answer that question. >> Sure. >> Oh, it's real time, basically. I guess there are two parts to that question. One is how much, or how frequently, Unblocked updates the data on the back end. It's real time for many of the integrations, and on a cron job for others, because those particular integrations don't have webhooks, basically. >> Yeah. But that means rebuilding the graph data has to happen on a very frequent basis. >> No, it's incremental. Our social graph builder algorithm has an incremental component to it, so we don't have to rerun the whole thing. Also, social graphs are less sensitive to frequent changes in data, because it's unlikely that a single change will make a huge impact on the experts graph, unless your organization is brand new. >> Yes. So, as an example, we do best-practices distillation on a much lower cadence, basically week by week, because it just doesn't change that much. >> Oh, yeah, repeat your question. That's a good question, so I want to make sure we get that one down. >> In terms of customer privacy, data retention... >> Yeah, from my point of view I'm thinking of enterprise SaaS or even on-premise type deployments. I'm not suggesting that you do that; I'm just thinking of that customer modality. Do you get pushback? How do they feel about you holding data? It's another processor in the loop. >> Well, the privacy discussions happen at the organizational level, and we don't actually run into a lot of friction. There are definitely environments, in government and at banks, that have super sensitive needs, and for those needs we have an on-prem solution, but it's definitely not the path I would recommend. Stay cloud-based; we have very large enterprise organizations that are entirely cloud-based, fully cloud-based. The secret sauce is encoded less in source code now and more in the reasoning. Organizations tend to be a bit more sensitive around things like Slack data, for instance, but as for the way we store data, we have a whole white paper about how we protect customer data, and it's never been a problem. >> Pardon me, can you run on-prem? >> Yes, we do have an on-prem solution, but as I say, it's not the recommended approach; for sensitive environments, for sure. >> Oh, why is it not recommended? >> Well, the cloud-based integrations get updated more frequently, and so there are software patches. It's a little bit harder to maintain within an organization.
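A hedged sketch of what an incremental builder like that might look like: each webhook event folds into existing edge weights instead of triggering a full rebuild, and a mild decay keeps any single event from moving an established graph much. The event shape, decay rate, and class name are all assumptions for illustration, not Unblocked's actual implementation.

```python
from collections import defaultdict

class IncrementalExpertsGraph:
    """Illustrative incremental social-graph builder. New SCM events
    update edge weights in place, so the whole graph never has to be
    rebuilt from scratch."""

    def __init__(self, decay=0.999):
        self.decay = decay
        self.edges = defaultdict(float)  # (person_a, person_b) -> weight

    def on_merged_pr(self, author, reviewers):
        # Gentle global decay: old collaborations fade slowly, and one
        # new event barely shifts an established graph, which is why
        # frequent small updates are safe here.
        for key in list(self.edges):
            self.edges[key] *= self.decay
        for reviewer in reviewers:
            pair = tuple(sorted((author, reviewer)))
            self.edges[pair] += 1.0

# Usage: one webhook, one cheap in-place update.
graph = IncrementalExpertsGraph()
graph.on_merged_pr("alice", ["bob", "carol"])
print(graph.edges[("alice", "bob")])  # 1.0
```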
There's one customer, a bank, where administering the platform becomes quite difficult because they have network isolation. So now either one of us has to sit within that network and administer the platform, or we have to train individuals within the company to administer it. It's more of a maintenance and hand-holding exercise. But yeah. >> Yeah, exactly. That's exactly right. >> Yeah. Thank you very much. Thanks for coming.