Agentic GraphRAG: AI’s Logical Edge — Stephen Chin, Neo4j
Channel: aiDotEngineer
Published at: 2025-07-21
YouTube video id: AvVoJBxgSQk
Source: https://www.youtube.com/watch?v=AvVoJBxgSQk
[Music] My name is Stephen Chin. I run the developer relations team at Neo4j. I actually just started writing a new book on GraphRAG with O'Reilly, so that was my weekend: writing chapters. First chapter done, and I think we'll put the pre-release version up soon. What I'm going to talk about for the next ten minutes is agentic GraphRAG and a little bit about how you can accomplish it. Okay, so first of all, who here is familiar with graph databases? >> Okay, that's really good. You are well ahead of the curve. How many folks have given GraphRAG a try, building systems using graphs? >> Okay, so one guy in the back; he's the expert. If you have any questions, ask him. Just to set the context: we have a problem with a lot of agentic systems, where they're not meeting their use cases, per Gartner's prediction of doom and utter failure, and they have a lot of hallucinations in them. Here's an example. I'm going to breeze through this a little bit, but I basically asked the o3 reasoning model to solve a question that touches on bias, reasoning, and math. The basic problem is that if you don't give it deep enough information sources, it doesn't do a good job of answering things correctly. The question is how many girls you can fit in a classroom. So there's a tech, computer-science bias, and it's a little bit about math and reasoning, because I ask it about a grid. These are the grid sizes you could choose for the number of girls you can fit. It inaccurately anchors on an article about the non-attacking kings problem and assumes it's a square grid. So the audience always wins against the reasoning AI, even though it takes 40 seconds to noodle on this.
Then I ask whether the girls and the boys are going to go to home economics or to sports, and again, this is both a bias test and I also misled it by giving it information about the ratio of girls to boys. It comes up with an answer calculating that all the girls will go to cooking class, which is horrible. I mean, I'm the chef at home and I love cooking. So those are the biases. Now, this is kind of funny, because we can reason about the situation ourselves and it's a problem we can think through. But imagine if this were in life sciences, about drug discovery, or if you were solving a supply chain issue. The fact that the LLM has gone and inserted biases and done incorrect reasoning along the way means you're going to get the wrong business results, and it's very hard to figure this out. So basically the problem is that the LLM is good at extrapolating information, at doing language tasks, at figuring things out, and it gives the impression of intelligence where there's no real intelligence, no real human reasoning behind it. We over-ascribe what it can do, and there are a bunch of things it can't do well. Those are things that knowledge graphs are actually really good at. So one way we can solve this problem is by throwing more compute at it: more agents, right? Agentic systems are good at improving the quality of results because you have LLMs talking to each other and reasoning with each other about the problem. Basically, agents observe, they think, and they take actions, and you have different types of agents doing different things in the workflow. An agentic runtime might look something like this: you have an orchestration layer which is coordinating the agents, and you have some GenAI models hooked up.
You have some tools, and these all collaborate to give you better results than one LLM can give you on its own. Now, the challenge with this is that it's a very monolithic architecture. It's hard to maintain, it's hard to swap out the tools, and it also puts you in a situation where you can't secure the system; you can't do a bunch of things. A good way to solve this is using MCP: you use MCP servers as the tools your agents talk to. With MCP you have your client and your servers, and the servers talk to your data sources, so you can give the agents files or database records, or you can give them a graph database as the system of record. We built a bunch of tools at Neo4j on top of MCP. We built a Cypher tool, and Cypher is the query language for graph databases, so when you ask the MCP server, it gives you capabilities to generate Cypher queries from prompts or questions or whatever you pass it. We have a memory module, which gives you agent memory you can plug into agentic systems. And then we also have MCP on top of our cloud APIs if you want to provision databases or do different things on top of them. I think this is a pattern you'll see from a lot of vendors and builders: you can plug these tools into your agent architecture and use them together with your graph. Typically, agents themselves are represented as some sort of graph; this is a picture of a LangGraph agent, and you can layer memory on top of it. These are all folks who just spoke on our panel, so Zep, Cognee, Mem0, all talking about their approaches to agent memory. They actually all run on top of Neo4j as the core graph database, and some of them are pluggable.
So you can choose your graph database of choice, but they're a really good way of giving memory to your agents: memory that's graph-based and matches the way LLMs want to store, communicate, and retrieve information. Also, one of our other speakers on the graph track showed his architecture for doing video search and summarization, and if you notice, they do a bunch of this already. They do short-term memory, they have a whole bunch of lookup information, and they give you a choice: they have both a GraphRAG pipeline and a vector pipeline. I highlighted the GraphRAG pipeline in red. The reason they're doing this is that when you use GraphRAG, you get advantages in the results coming back, and typically a lower rate of hallucinations. A direct LLM gives you a very generic response to a healthcare question. With baseline RAG you get better results back, but they're incomplete, because it's doing vector similarity, and similarity is not relevance; it doesn't mean the system actually understands the problem. Then there's a system that does GraphRAG, and the typical pattern that gets you there is: first do a vector search. You could use a vector database, and we also support vector search on top of Neo4j. That gives you back results, and it's a good way of translating the question into vectors. Then you have mappings from the vector embeddings to your graph, and you pull back the nodes which are relevant. So in this case, if you're asking about emphysema, you'd get back the node for emphysema, pull back all the related nodes, diagnoses, and conditions, and you see it lists them all. It's a very effective way to get really good responses back when you're dealing with a mix of structured and unstructured data.
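The pattern just described, vector similarity first and then graph expansion, can be sketched in plain Python over a toy in-memory graph. The medical nodes, edges, and three-dimensional embeddings here are invented for illustration; in Neo4j the first step would go through a vector index and the second through Cypher traversal:

```python
import math

# Toy in-memory knowledge graph: node -> neighbours (invented example).
GRAPH = {
    "Emphysema": ["Shortness of breath", "COPD", "Smoking"],
    "COPD": ["Emphysema", "Chronic bronchitis"],
    "Asthma": ["Wheezing"],
}

# Toy embeddings keyed by node name; a real system stores these
# as properties on the graph nodes themselves.
EMBEDDINGS = {
    "Emphysema": [0.9, 0.1, 0.0],
    "COPD": [0.8, 0.2, 0.1],
    "Asthma": [0.1, 0.9, 0.2],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def graphrag_context(question_embedding, k=1):
    """Step 1: vector search over node embeddings.
    Step 2: expand to graph neighbours so the LLM context
    includes related entities, not just similar text."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda n: cosine(question_embedding, EMBEDDINGS[n]),
        reverse=True,
    )
    context = []
    for node in ranked[:k]:
        context.append(node)
        context.extend(GRAPH.get(node, []))  # the graph-expansion step
    return context

# A question "about emphysema" would embed near the Emphysema node:
print(graphrag_context([0.88, 0.12, 0.05]))
```

The payoff is visible even in the toy: a pure vector search would return only the single most similar item, while the expansion step pulls in related diagnoses and causes that a similarity score alone would never surface.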
Here's a quick architecture of how you could put this together, using traversal and vector similarity. You take in the question and run the query against either vectors or the knowledge graph. Graph data science and graph analytics are helpful as well, for community algorithms, groupings, and things like that; some of this is in the Microsoft GraphRAG paper and other research going on in this area. Then you feed the results back as context to the LLM to improve the quality of the answers. Quickly, some of the patterns I've talked about. Text-to-Cypher is what our MCP server does. It can be good, but sometimes it can be really bad, because LLM-generated Cypher is not as good as you need it to be for some cases. The one I was talking about is vector search with graph context: you do a vector search and then use the results to pull back related nodes as graph context. That's a really good pattern to start with. You can also do pre- and post-filtering of vector results to bubble the most relevant things to the top of the context. For certain systems this works quite well, because all you want to do is make sure the LLM gets the important things higher up in the context window. Even though LLMs now have larger context windows, what the evals show is that they ignore most of it and look at the stuff at the top. Okay, a quick example of a company doing this. Klarna basically replaced a bunch of their SaaS systems with a big GraphRAG project; they're one of our customers. They took enterprise wikis, HR systems, and internal documentation. The result: 250k employee questions answered in the first year, 2,000 daily queries processed, and 85% employee adoption. Really good adoption of this technology. I'll give you a couple of resources, and I think we're at time. Is that about right, crew? I have five minutes? >> Oh, okay.
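The pre/post-filtering idea, re-ordering vector hits so the most relevant land at the top of the context window, can be sketched like this. The blending weight, the graph-proximity signal, and the example data are all arbitrary choices made for illustration, not a recommended configuration:

```python
# Post-filtering sketch: rerank vector hits by blending vector similarity
# with a graph signal (here, hop distance from the query entity).
# Weights and data are illustrative only.

hits = [
    # (document, vector_similarity, hops_from_query_entity_in_graph)
    ("General wellness article", 0.82, 4),
    ("Emphysema treatment guide", 0.80, 1),
    ("COPD overview", 0.78, 2),
]

def rerank(hits, alpha=0.6):
    """Blend raw similarity with graph proximity (fewer hops = better),
    so graph-relevant results bubble to the top of the context."""
    def score(hit):
        _, sim, hops = hit
        proximity = 1.0 / (1 + hops)
        return alpha * sim + (1 - alpha) * proximity
    return sorted(hits, key=score, reverse=True)

for doc, _, _ in rerank(hits):
    print(doc)
```

Note how the ordering flips: the wellness article wins on raw similarity, but once graph proximity is blended in, the treatment guide one hop from the query entity comes out on top, which is exactly the "similarity is not relevance" point.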
>> Oh, I see, there are five minutes between sessions. I was rushing to finish even quicker. Okay, so we have time for questions, which is great. One resource I'd recommend is the Neo4j certified developer program. With the number of hands in the room from folks who said they know graphs, I'm pretty sure you could all pass the Neo4j certification exam if you took it today, and we'll mail you a Neo4j certified t-shirt. You get a little LinkedIn badge to put on your profile, and it's a nice way to show the world that you actually know this stuff, that you know graph technology. We'll probably add additional certifications for GraphRAG and other topics in the future, but the base certified developer class is a good way to get the base knowledge, and we do have classes in GraphAcademy on building chatbots using LLMs and all of this as well. The second resource is our NODES conference. The Neo4j NODES conference is an annual conference; it runs across three time zones, 24 hours, all free content and free sessions, so come out and check out some of this content. Okay, I'll let people finish who want to grab the QR code. Thank you very much for coming, and we have a few minutes for questions. So what we'll do is: anyone who has questions, just raise your hand and shout it out. If anyone wants to leave the room, feel free to do that as well; I don't want to keep you trapped in here. Okay. So, in the back. >> To be clear, you embed the data, do a semantic search, and grab the results? Is that the correct basic pattern? >> Yeah. Okay. So the question is what the pattern is for doing this type of search, and that's exactly right. Basically, you're using the LLM for what it's good at, which is language translation.
So the user can enter whatever convoluted question they want, which would never translate to a beautiful Cypher query. First you tell the LLM to do a vector search on it, to find vector similarity. Then, because you've generated the graph and the embeddings to point to each other, you can go to the graph and say: these embeddings all point to this node in the graph, so it's probably about, in this case, emphysema. Now I want to pull back the related nodes. You could use cosine similarity, or community-grouping algorithms, or other algorithms to figure out what's relevant, and then pass that as context. >> So in a simple architecture, the chunks you embed and your nodes are the same? >> Yeah. So what we typically do, and this is what will happen if you import unstructured data into Neo4j: we create a node structure out of it using LLMs, and then you hang your text embeddings off as properties on the nodes. >> Who does that? >> We have a couple of integrations for this. We have a Neo4j Python library that will do a lot of this, and we also have an integration with LangChain. >> So you can use LangChain, or LlamaIndex... >> Or Haystack. >> You can pick whichever you want. >> Yeah, whatever framework you want to use, you can choose, and we pretty much have integrations with all of them to help with the associations. >> Okay. And there were a bunch of hands, but I don't know who was first, so... >> You can go. >> You said you have a memory server. Does it handle the logic itself, or does it expose that to the framework? >> Okay. So the question is about how the MCP server for memory works. That is a great question, but I actually don't know the answer. It's an open-source MCP server.
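The ingestion shape described in that exchange, chunk nodes carrying their text embedding as a property and linked to entity nodes extracted by an LLM, can be modeled in a minimal in-memory sketch. The embedding function and entity extractor here are stubs standing in for a real embedding model and LLM; in practice the Neo4j Python library or a LangChain integration would do this against a live database:

```python
# Sketch of the ingestion shape: each text chunk becomes a node whose
# embedding hangs off it as a property, linked to extracted entity nodes.
# fake_embed and fake_extract_entities are stubs for a real embedding
# model and an LLM entity extractor.

def fake_embed(text: str) -> list[float]:
    # Stub: a real embedder returns a high-dimensional vector.
    return [float(len(text) % 7), float(text.lower().count("e"))]

def fake_extract_entities(text: str) -> list[str]:
    # Stub: a real pipeline asks an LLM to pull entities from the chunk.
    known = ["Emphysema", "COPD"]
    return [e for e in known if e.lower() in text.lower()]

def ingest(chunks):
    """Turn raw text chunks into (nodes, edges) in the shape described:
    Chunk nodes with embedding properties, MENTIONS edges to entities."""
    nodes, edges = [], []
    for i, text in enumerate(chunks):
        chunk_node = {
            "id": f"chunk-{i}",
            "label": "Chunk",
            "text": text,
            "embedding": fake_embed(text),  # embedding as a node property
        }
        nodes.append(chunk_node)
        for entity in fake_extract_entities(text):
            edges.append((chunk_node["id"], "MENTIONS", entity))
    return nodes, edges

nodes, edges = ingest(["Emphysema is a form of COPD."])
print(edges)
```

This is why the answer to "are the chunks and the nodes the same?" is roughly yes: the chunk is itself a node, and the embedding is just one property on it, with the extracted entities hanging off as neighbours.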
Now, if you want the answer: the session I'm giving with Michael Hunger and Jesús, in a bit. Michael's team built all the MCP servers, so he actually will know the answer. It's today, I think right after this or shortly after, and Michael will go on for hours if you ask him that. So that's a great question. Okay, and we're at time, so you get the last question. >> Yeah, so I want to ask your opinion on the LangChain and LangGraph frameworks. Are they complementary, or... what's your perspective on that? >> Okay, so the question is LangChain versus LangGraph. I thought LangGraph was the agent framework that the LangChain folks built, no? Did you use any of that? I'm also new to it. So, we have a bunch of experiments and prototypes with LangGraph for doing agents and things, and we have integrations with LangChain. >> We also integrate with all the other memory vendors. I would say, from our perspective, use the tool that's best for you, and we'll integrate with everything. >> Yeah. Okay, thanks for coming. [Music]