Practical GraphRAG: Making LLMs smarter with Knowledge Graphs — Michael, Jesus, and Stephen, Neo4j

Channel: aiDotEngineer

Published at: 2025-07-22

YouTube video id: XNneh6-eyPg

Source: https://www.youtube.com/watch?v=XNneh6-eyPg

We're talking about GraphRAG today. That's the GraphRAG track, of course. We want to look at patterns for successful graph applications, for making LLMs a little bit smarter by putting a knowledge graph into the picture. My name is Michael Hunger. I'm VP of product innovation at Neo4j.

My name is Stephen Chin. I lead developer relations at Neo4j. And we're actually co-authoring. This is fun, because we're both already authors, we've been friends for years, and we finally get to co-author a book. We're co-authoring GraphRAG: The Definitive Guide for O'Reilly. So basically we didn't sleep this past weekend because we had a book deadline.
Yep.
So I'm going to talk at a high level about what GraphRAG is, why it's important, and what we're seeing in the media, and then Michael's going to drill down into all the details and patterns and give you a bunch of takeaways and things you can do. If you want to know how to do GraphRAG, Michael's quick dive is probably the best introduction you can get. So I'm also excited.
Awesome. Let's get going. Okay, so the case for GraphRAG is where we're going to start. The challenge with using LLMs, and with other patterns built on them, is basically that they don't have your enterprise domain knowledge, they don't verify or explain their answers, they're subject to hallucinations, and they have ethical and data-bias concerns. You can see that very much in our friendly parrot here: LLMs behave and act like parrots in all these ways, except they're not cute birds. So we want to do better than this with GraphRAG, and figure out how we can use domain-specific knowledge to get accurate, contextual, and explainable answers.

Really, I think what a lot of companies and the industry are figuring out is that it's a data problem. You need good data; you need data you can power your system with. One of the patterns for this is RAG: you stick your external data into a RAG system and get results back from a database. But vector databases and plain RAG fall short because they're missing your full data set; they only pull back a fraction of the information via vector-similarity algorithms. Typically, a lot of the modern vector databases everyone's using are easy to get started with, but they're not robust or mature; they don't give you the scalability and fallback you need to build a strong enterprise system. And vector similarity is not the same as relevance. The results you get back from a basic RAG system are related to the topic, but they're not complete, they're typically not very relevant, and it's very hard to explain what's coming out of the system. So we need an answer, a lifeline.
Yeah.
GraphRAG.
And what GraphRAG does is bring the knowledge and the context of the environment to what LLMs are good at. You can think of this kind of like the human brain. Our right brain is the more creative side: it builds things and extrapolates from information. Our left brain is the logical part: that's what actually does reasoning, has facts, and can enrich data. And that side is built on knowledge graphs. A knowledge graph is a collection of nodes, relationships, and properties. Here's a really simple example of a knowledge graph: you have two people who live together, and you have a car. But when you look into the details, it's actually a little more complex than it seems at first, because they both have a car, but the owner of the car is not the person who drives it. This is kind of like my family: my wife does all the bills, but then she hands me the keys whenever we get on the freeway. She hates driving. So, knowledge graphs are also a
great way of getting really rich data. Here's an example of the Stack Overflow data built into a knowledge graph, where you can see all of the rich metadata and the complexity of the results. And we can use this to evolve RAG into a more complete system, basically GraphRAG, where we get better relevancy and more relevant results. We get more context, because now we can actually pull back all of the related information with graph-closeness algorithms. We can explain what's going on, because it's no longer just vectors, no longer statistical probabilities coming out of a vector database: we actually have nodes, we have structure, we have semantics we can look at, and we can add security and role-based access on top of this. So it's context-rich and grounded. This gives us a lot of power, and it gives us the ability to start explaining what we're doing: we can visualize it, analyze it, and log all of it.

Now, this is one of the initial papers, the GraphRAG paper from Microsoft Research, where they showed that you could actually get not only better results but lower token costs; it was actually less expensive to run a GraphRAG algorithm. There have been a lot of papers since then covering the research and interesting work going on in the GraphRAG area, and this is just a quick view of the different studies and results coming out. Even the early data.world study showed a three-times improvement in GraphRAG capabilities,
and the analysts are showing GraphRAG trending up. This is the Gartner hype cycle from 2024, and you can see generative AI is on the downtrend, RAG is getting over the hump, but GraphRAG and a bunch of related techniques are actually providing and breathing more life into the AI ecosystem. There are a lot of great reports from Gartner showing that it's grounded in facts and it resolves hallucinations: together, knowledge graphs and AI are solving these problems. And it's getting a lot of adoption by industry leaders, by big organizations who are taking advantage of this and actually shipping production applications and making it work. Take LinkedIn customer support: they wrote a great research paper showing that using a knowledge graph in customer-support scenarios gave them better results, allowing them to improve answer quality and reduce the response time for getting back to customers; median per-issue resolution time was reduced by 28.6%. I also mentioned the data.world study, which was basically a comparison of doing RAG on SQL versus RAG on graph databases, and they showed a three-times improvement in the accuracy of LLM responses. And with that, let's chat about patterns, Michael, because I think everyone's here to learn how to do this.
Exactly. So let's look at how to actually do this. If you look at GraphRAG, there are actually two sides to the coin. One: you don't start in a vacuum; you have to create your knowledge graph first, and there are basically multiple steps to get there. Initially you take your unstructured information and put it into a lexical graph, which represents documents, chunks, and their relationships. In a second step, you extract entities, for instance using LLMs with a graph schema, pulling the entities and their relationships out of the text. And in a third phase, you enrich this graph, for instance with graph algorithms doing things like PageRank and community summarization. Then, when you have this built-up knowledge graph, you do GraphRAG as the search mechanism, either with local search or global search or other approaches.

So let's first look at the knowledge-graph construction phase a little bit. As always in data engineering, if you want higher-quality outputs, you have to put in more effort at the beginning; nothing comes for free, there's no free lunch after all. But what you do at the beginning pays off multiple times, because what you get out of your unstructured documents is high-quality, highly structured information, which you can then use to extract contextual information for your queries, and that's what allows rich retrieval at the end.
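The three construction phases can be sketched end to end in plain Python. This is a minimal, database-free sketch under my own assumptions: fixed-size chunking and Jaccard token overlap stand in for real semantic splitting and vector embeddings, `extract_entities` is a stub where the schema-guided LLM call would go, and all function names are hypothetical rather than Neo4j APIs.

```python
# Phase 1: lexical graph (chunks + NEXT / HAS_CHUNK relationships).
# Phase 2: entity extraction (stubbed; an LLM prompted with the schema goes here).
# Phase 3: enrichment (a toy similarity edge standing in for graph algorithms).

def chunk(text, size=40):
    """Split text into fixed-size chunks (a real pipeline splits semantically)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_lexical_graph(doc_id, text):
    """Phase 1: chunk nodes plus predecessor/successor and parent relationships."""
    nodes = [{"id": f"{doc_id}-c{i}", "text": t} for i, t in enumerate(chunk(text))]
    rels = [(a["id"], "NEXT", b["id"]) for a, b in zip(nodes, nodes[1:])]
    rels += [(doc_id, "HAS_CHUNK", n["id"]) for n in nodes]
    return nodes, rels

def extract_entities(chunk_node, schema=("Person", "Company")):
    """Phase 2 stub: an LLM given the graph schema would return typed entities;
    here we just pick capitalized words as stand-in entity mentions."""
    return [w for w in chunk_node["text"].split() if w.istitle()]

def jaccard(a, b):
    """Toy text similarity standing in for vector-embedding similarity."""
    sa, sb = set(a["text"].lower().split()), set(b["text"].lower().split())
    return len(sa & sb) / max(1, len(sa | sb))

def enrich(nodes, rels, threshold=0.5):
    """Phase 3: connect chunks whose similarity clears a threshold (kNN-style)."""
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if jaccard(a, b) >= threshold:
                rels.append((a["id"], "SIMILAR", b["id"]))
    return rels
```

In a real pipeline the `SIMILAR` relationships would carry the similarity score as a weighted property, and the extracted entities would become their own nodes linked back to the chunks they were mentioned in.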
Okay. After seeing GraphRAG used by a number of users and customers, and after looking at research papers, we saw a number of patterns emerging in terms of how we structure these graphs, how we query them, and so on. So we started to collect these patterns and put them on graphrag.com, and I want to show what that looks like. We basically have example graphs; each pattern has a name, a description, and the context where it applies, and we also show the queries used for extracting this information. For instance, here's a mix of a lexical graph and a domain graph, together with the query that fetches this information. Let's look at the three
steps in a little more detail on the graph-model side. On one side, for lexical graphs, you represent documents and their elements. An element could be something as simple as a chunk. But if you have structured documents, you can also model, say, a book which has chapters, which have sections, which have paragraphs, where the paragraph is the semantically cohesive unit you would use to create a vector embedding for later vector search. What's really interesting in the graph is that you can connect all these things up: you know exactly who's the predecessor and successor of a chunk, and who's the parent of an element. Using something like vector or text similarity, you can also connect the chunks themselves into a k-nearest-neighbor or similarity graph, where you store the similarity between two chunks as a weighted score on the relationship between them. Then you can use all these relationships when you extract context in the retrieval phase, for instance to find related chunks by document, by temporal sequence, by similarity, and so on. That's the lexical side. In practice it looks like this: you have, for instance, an RFP and you want to break it up in a structured way. You create the relationships between these chunks, or subsections of the text, compute the vector embeddings, and then you do it at scale and get a full lexical graph out of that. The next
phase is entity extraction, which is something that has been around for quite some time in NLP, but LLMs take it to the next level with their multi-language understanding, high flexibility, and good language skills for extraction. You basically provide a graph schema and an instruction prompt to the LLM, plus your pieces of text; with large context windows you can now put in 10,000 or 100,000 tokens for extraction. You can also put in already-existing ground truth. For instance, if you have existing structured data where your entities, say products, genes, partners, or clients, already exist, you can pass those in as part of the prompt, so the LLM doesn't do open-ended extraction but more of a recognition-and-matching approach: it finds your entities and then extracts relationships between them, and you can store additional facts and information on the relationships and entities as well. So in the first part you have the lexical graph, which represents document structure, and in the second part you extract the relevant entities and their relationships. If you already have an existing knowledge graph, you can also connect to it. Imagine you have a CRM where customers, clients, and leads are already in your knowledge graph, and you want to enrich it with, for instance, protocols from call transcripts: you then connect those to your existing structured data as well. So that's also a
possibility. In the next phase, you can run graph algorithms for enrichment. These can, for instance, do clustering on the entity graph, generating communities that an LLM can then summarize. That last one is especially interesting, because what you identify are cross-document topics. Each document is basically a temporal, vertical slice of information, but this looks at which topics recur across many different documents, so you find topic clusters across documents as well. Now, if you look
at the second phase, the search phase, which is basically the retrieval part of RAG, what we see is that in a GraphRAG retriever you don't just do a simple vector lookup to get results returned. You do an initial index search, which could be vector search, full-text search, hybrid search, spatial search, or other kinds of search, to find the entry points in your graph. Then, starting from these entry points, you follow the relationships up to a certain degree, or up to a certain relevancy, to fetch additional context. This context can come from the user question, and it can also be external user context. For instance, when someone from, say, your finance department is looking at your data, you return different information than when someone from the engineering department is looking at it. So the retriever also takes this external context into account when deciding how much and which context to retrieve, and then you return it to the LLM to generate the answer. Not just text fragments like you would in vector search: you also return these more complete subsets of the contextual graph to the LLM. Modern LLMs are actually trained more on graph processing, so they can deal with these additional pattern structures, node-relationship-node patterns, that you provide as additional context. And of course, as I mentioned, you can enrich the graph using graph algorithms, doing things like clustering, link prediction, and PageRank. Now, let's look at
some practical examples. We don't have too much time left. One is knowledge-graph construction from unstructured sources. There are a number of libraries for this; you've already heard about some today from people who do these kinds of things. One thing we built is a tool that lets you take PDFs, YouTube transcripts, local documents, web articles, and Wikipedia pages and extract your data into a graph. Let me just switch over to the demo here. So this is the tool. I uploaded information from different Wikipedia pages, YouTube videos, articles, and so on, and here's, for instance, a Google DeepMind extraction. You can use a lot of different LLMs here. Under graph enhancement you can also, if you want, provide a graph schema. For instance, you can say a person works for a company, add these patterns to your schema, and then the LLM uses this information to drive the extraction as well. And so
if you look at the data that has been extracted from DeepMind, which is this one here, we can actually see two aspects from the Wikipedia article. One is the document with its chunks, which is this part of the graph, and the second is the entities that have been extracted from the article. So you actually see the connected knowledge graph of entities: companies, locations, people, and technologies. It followed our schema to extract this. And then if I want to run GraphRAG, you have a number of different retrievers here: a vector retriever, graph plus full-text, entity retrievers, and others you can select. All of this is also an open-source project, so you can just go to GitHub and have a look. I ran this before, because the internet is not so reliable here: "What has DeepMind worked on?", and I get a detailed explanation. If I want to, I can look at the details here. It shows me which sources it used, the DeepMind Wikipedia article and another PDF, and I see which chunks have been used, which is basically the full-text and hybrid search. But I also see which entities have been used from the graph. So from an explainability perspective, I can really see that these are the entities that were retrieved by the graph retriever and passed to the LLM, in addition to the text connected to those entities, so it gets a richer response. And then you can also do evaluation with RAGAS as well.
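The retrieval flow behind those retrievers, an index search for entry points followed by relationship expansion, can be sketched in a few lines of plain Python. This is a toy, in-memory version under my own assumptions: dicts stand in for Neo4j, token overlap stands in for a vector index, and the node names and relationship types are illustrative only.

```python
# Toy sketch of the GraphRAG retrieval flow: an index search finds entry
# points, then relationships are expanded a fixed number of hops to collect
# extra context (texts plus graph patterns) to hand to the LLM.

GRAPH = {
    "DeepMind":  {"text": "DeepMind is an AI research lab.",
                  "rels": [("ACQUIRED_BY", "Google"), ("CREATED", "AlphaFold")]},
    "Google":    {"text": "Google is a technology company.", "rels": []},
    "AlphaFold": {"text": "AlphaFold predicts protein structures.", "rels": []},
}

def index_search(question, k=1):
    """Entry-point search: score nodes by token overlap with the question
    (a stand-in for vector, full-text, or hybrid search)."""
    def score(node_id):
        tokens = set(GRAPH[node_id]["text"].lower().split()) | {node_id.lower()}
        return len(tokens & set(question.lower().split()))
    return sorted(GRAPH, key=score, reverse=True)[:k]

def retrieve(question, hops=1):
    """Expand relationships from the entry points up to a fixed degree."""
    frontier = index_search(question)
    seen = set(frontier)
    for _ in range(hops):
        frontier = [t for n in frontier for _, t in GRAPH[n]["rels"] if t not in seen]
        seen |= set(frontier)
    # Return node texts plus node-relationship-node patterns as LLM context.
    context = [GRAPH[n]["text"] for n in sorted(seen)]
    patterns = [f"({s})-[:{r}]->({t})"
                for s in seen for r, t in GRAPH[s]["rels"] if t in seen]
    return context, patterns
```

A real retriever would also weight the expansion by relevancy and apply the external user context (for example, role-based filtering) before returning the subgraph.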
While I'm on the screen, let me show you another thing we worked on, which is more of an agentic approach: you put these individual retrievers into a configuration of domain-specific retrievers, each running an individual Cypher query. For instance, if you look at, say, this one, it has the query here, and basically a tool with inputs and a description. We can then run an agentic loop using these tools, doing GraphRAG with each individual tool, taking the responses, and then making deeper tool calls. I'll show you a deeper example in a minute. So this is basically what I showed you; it's all available as open-source libraries, and you can use it yourself from Python as well. Here it's shown in NeoConverse, which can output not just text but also charts and other visualizations, network visualizations as well.
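That tool configuration can be sketched as data plus a planner loop. In this minimal sketch, each domain-specific retriever is registered with a description and a parameterized Cypher query; a keyword match and hard-coded arguments stand in for the LLM planner, and every tool name, query, and return value is hypothetical.

```python
# Sketch of the agentic retriever configuration: tools with descriptions and
# parameterized Cypher queries, plus a loop that runs the planned tool calls.
# A keyword match stands in for the LLM that would break down the question.

TOOLS = {
    "customer_orders": {
        "description": "orders placed by a customer",
        "query": "MATCH (c:Customer {name: $name})-[:PLACED]->(o:Order) RETURN o",
        "run": lambda name: [f"order-1 for {name}", f"order-2 for {name}"],
    },
    "product_info": {
        "description": "details about a product",
        "query": "MATCH (p:Product {sku: $sku}) RETURN p",
        "run": lambda sku: [f"details for {sku}"],
    },
}

def plan(question):
    """Stand-in for the LLM planner: break the question into (tool, arg) tasks.
    A real planner would extract the parameters from the question itself."""
    tasks = []
    if "order" in question.lower():
        tasks.append(("customer_orders", "Alice"))
    if "product" in question.lower():
        tasks.append(("product_info", "SKU-42"))
    return tasks

def agent_loop(question):
    """Run each planned tool call in sequence and collect the responses,
    which would then be handed to the LLM to generate the answer."""
    results = []
    for tool_name, arg in plan(question):
        results.extend(TOOLS[tool_name]["run"](arg))
    return results
```

In the real agentic setup the loop can also feed one tool's response into the next call, which is what makes the deeper, multi-step retrieval possible.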
What's interesting in the agentic approach is that you don't just use vector search to retrieve your data: you break the user question down into individual tasks, extract parameters, and run these individual tools, which are then executed either in sequence or in a loop to return the data. You get these outputs back, and for each of them, different individual tools are called and used. The last thing I want to show is the GraphRAG Python package, which encapsulates all of this, construction and retrieval, in one package. You can build the knowledge graph, implement the retrievers, and create the pipelines here. Here's an example where I pass in PDFs plus a graph schema, it runs the input into Neo4j, and then I can visualize the data later in a Python notebook. And with that, with one second left, I leave you with the takeaway: on graphrag.com you'll find all of these resources and a lot of the patterns, and we'd love to have contributions and would love to talk more. I'm outside at the booth if you have more questions.
Yeah. That was great, and I think you got it all from the expert on the tooling; Michael's team actually builds a lot of the tools, like the Knowledge Graph Builder. We're very excited you all came to the GraphRAG track and hope to chat with you all more. If you have questions for me or Michael, just meet us at the Neo4j booth across the way. Thank you.
Thank you.