The Flow of GraphRAG
Motivation
Current RAG techniques cannot answer global questions about a corpus. For example, we may want to know what the main topics of the corpus are. The answer usually does not exist verbatim in the corpus; producing it requires understanding the whole corpus and summarizing it. Such global questions are called query-focused summarization (QFS) problems in the paper[^1]. A naive RAG technique cannot handle this situation.
Moreover, it is unrealistic to put all the text of the corpus into the context window of an LLM. Even if we could, the LLM may miss information in the middle of the context window.
So the authors proposed GraphRAG[^1] to solve the aforementioned problems.
GraphRAG
To get a better understanding of the GraphRAG algorithm, it helps to have some background in Knowledge Graphs (KG), community detection, etc.
Document -> TextChunk
```mermaid
flowchart LR
    Document --> TextChunk1 & TextChunk2 & dots(...)
```
This step is quite intuitive: the document is split into multiple text chunks for further processing.
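For instance, here is a minimal sketch of token-based chunking, using tiktoken for token counting; the `chunk_size` and `overlap` values are illustrative, not the paper's settings:

```python
import tiktoken


def split_into_chunks(text: str, chunk_size: int = 600, overlap: int = 100) -> list[str]:
    """Split text into overlapping windows of roughly `chunk_size` tokens."""
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    chunks = []
    step = chunk_size - overlap  # consecutive chunks share `overlap` tokens
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(enc.decode(window))
    return chunks
```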
TextChunk -> Entity & Relationship & Claim
```mermaid
flowchart LR
    TextChunk --> Entity & Relationship & Claim
```
By leveraging LLMs, we can extract the following information:

- Entity with `title`, `type`, and `description`
- Relationship with `source`, `description`, and `target`
- Claim

If we find duplicate entities/relationships (identical in every field except `description`), they are merged by creating a description array. That is to say, an entity/relationship may have multiple descriptions.
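As a concrete illustration, here is a minimal sketch of that merge step; the dataclass fields mirror the list above, but the names and the matching key are my assumptions, not the official implementation:

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    title: str
    type: str
    descriptions: list[str] = field(default_factory=list)  # duplicates get merged here


def merge_entities(entities: list[Entity]) -> dict[tuple[str, str], Entity]:
    """Merge entities that share (title, type), collecting all their descriptions."""
    merged: dict[tuple[str, str], Entity] = {}
    for e in entities:
        key = (e.title, e.type)
        if key in merged:
            merged[key].descriptions.extend(e.descriptions)
        else:
            merged[key] = Entity(e.title, e.type, list(e.descriptions))
    return merged
```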
The extraction method is to write a prompt using few-shot learning and in-context learning techniques. Clearly, the key to extraction quality is the prompt itself.
Besides crafting the prompt, the authors[^1] also suggest performing the extraction in multiple rounds to encourage the LLM to find entities missed in the prior round. The trick is telling the LLM that MANY entities were missed in the last extraction.
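Here is a rough sketch of that multi-round "gleaning" loop; `llm_complete` is a hypothetical wrapper around any chat-completion API, and a real implementation would also carry the conversation history between rounds:

```python
from typing import Callable


def extract_with_gleaning(
    chunk: str,
    llm_complete: Callable[[str], str],  # hypothetical chat-completion wrapper
    max_rounds: int = 2,
) -> list[str]:
    """Extract once, then claim entities were missed to trigger re-extraction."""
    rounds = [llm_complete(f"Extract all entities and relationships from:\n{chunk}")]
    for _ in range(max_rounds):
        followup = llm_complete(
            "MANY entities were missed in the last extraction. "
            "Add the missing ones using the same format."
        )
        if not followup.strip():  # nothing new found; stop early
            break
        rounds.append(followup)
    return rounds
```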
Notice: a larger text chunk size results in fewer LLM calls, making the extraction process significantly faster. However, the authors[^1] found that the number of extracted entities/relationships also decreases. To apply the GraphRAG technique, the text chunk size should be set to a reasonable value; see Figure 2 in the original paper.
Graph Communities Detection
```mermaid
flowchart LR
    Document("Document in the form of entities, relationships, and claims")
    subgraph community_tree
        root(Community1)
        root2(Community2)
        root3(Community3)
        root --> root2 & root3
        root --> dots(...)
        root2 --> dots1(...)
        root3 --> dots2(...)
        dots --> dots3(...)
    end
    Document --> community_tree
```
In the previous steps, we effectively built a KG from the documents. Next, we can apply the Leiden community detection algorithm to this KG. The output of the Leiden algorithm is a list of communities. Note that these communities are tree-structured (hereinafter referred to as the community tree), which means they have a hierarchical structure.
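If I understand correctly, the official implementation uses graspologic's hierarchical Leiden for this step; here is a minimal sketch on a toy networkx graph (the `max_cluster_size` value is illustrative):

```python
import networkx as nx
from graspologic.partition import hierarchical_leiden

# Toy stand-in for the KG: nodes are entities, edges are relationships.
kg = nx.karate_club_graph()

# Returns one row per (node, level) assignment, which forms the community tree.
clusters = hierarchical_leiden(kg, max_cluster_size=10)
for c in clusters:
    print(c.level, c.cluster, c.node)
```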
Graph Communities -> Community Summaries
```mermaid
flowchart LR
    Community --> CommunitySummary
```
The community summaries are also generated by the LLM. Different types of communities call for different strategies:
- Leaf-level community
  - Sort all relationships in decreasing order of combined `source` and `target` degrees. 🤔 I guess the degree value indicates the importance of a relationship.
  - For each relationship, add the following text to the LLM's context window until the token limit is reached:
    - The `description` of the `source`
    - The `description` of the `target`
    - The linked claims
    - The `description` of the relationship
- Higher-level community (its children are sub-communities)
  - If the token limit allows, pretend that the sub-communities do not exist and process all element summaries according to the leaf-level method above.
  - Otherwise, sort the sub-communities by their `description` length in decreasing order (🤔 I guess more text carries more information), and then substitute each sub-community's (shorter) summary for its (longer) merged element descriptions, just as at the leaf level, until the context fits. In my opinion, the purpose of substitution is to keep as much fine-grained information as possible.
To summarize, community summaries are generated in a bottom-up manner.
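Here is a minimal sketch of the leaf-level context packing described above; the `Node`/`Rel` types and the whitespace-based token counter are simplifications of mine, not the paper's exact data model:

```python
from dataclasses import dataclass


@dataclass
class Node:
    description: str
    degree: int  # number of edges touching this entity in the KG


@dataclass
class Rel:
    source: Node
    target: Node
    description: str
    claims: list[str]


def num_tokens(text: str) -> int:
    return len(text.split())  # crude stand-in for a real tokenizer count


def build_leaf_context(relationships: list[Rel], token_limit: int) -> str:
    """Pack relationship texts into a context, most prominent first."""
    # Higher combined endpoint degree first, per the sorting rule above.
    ranked = sorted(
        relationships,
        key=lambda r: r.source.degree + r.target.degree,
        reverse=True,
    )
    parts: list[str] = []
    used = 0
    for rel in ranked:
        block = "\n".join(
            [rel.source.description, rel.target.description]
            + rel.claims
            + [rel.description]
        )
        cost = num_tokens(block)
        if used + cost > token_limit:
            break
        parts.append(block)
        used += cost
    return "\n\n".join(parts)
```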
Community Summaries -> Community Answers -> Global Answer
```mermaid
flowchart LR
    a(Randomly shuffled community summaries)
    a --> Chunk1 & Chunk2 & annoy(...)
    Chunk1 --> p(Intermediate Answer1)
    Chunk2 --> q(Intermediate Answer2)
    annoy --> r(...)
    window(Context Window)
    p & q & r --> window
    window --> ans(Global Answer)
```
Now each community is associated with a community summary. These summaries can be exploited to answer the user's query with the following steps:
- Shuffle the community summaries randomly and split them into multiple chunks with a pre-specified token size. This step ensures that the relevant information is distributed across chunks.
- Map phase: Generate intermediate answers in parallel, one for each chunk. The LLM is also asked to score how helpful each generated answer is.
- Reduce phase: Sort all the intermediate answers in decreasing order of score and add them to the context window until the token limit is hit. The LLM then gives the global answer based on this information from all communities.
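Putting the two phases together, here is a minimal sketch of the map-reduce flow; the `answer_and_score` and `final_answer` callables are hypothetical stand-ins for the real prompt/LLM plumbing:

```python
import random
from typing import Callable


def global_answer(
    query: str,
    summaries: list[str],
    answer_and_score: Callable[[str, str], tuple[str, int]],  # map-step LLM call
    final_answer: Callable[[str, str], str],                  # reduce-step LLM call
    token_limit: int,
    group_size: int = 8,  # grouping by count for simplicity; the paper splits by token size
) -> str:
    random.shuffle(summaries)  # spread relevant info evenly across chunks
    chunks = [
        "\n".join(summaries[i:i + group_size])
        for i in range(0, len(summaries), group_size)
    ]

    # Map: one scored intermediate answer per chunk; the paper filters out score-0 answers.
    results = [answer_and_score(query, c) for c in chunks]
    scored = [(score, answer) for answer, score in results if score > 0]

    # Reduce: pack the most helpful answers into one context, then answer globally.
    scored.sort(key=lambda t: t[0], reverse=True)
    context: list[str] = []
    used = 0
    for score, answer in scored:
        cost = len(answer.split())  # crude token-count stand-in
        if used + cost > token_limit:
            break
        context.append(answer)
        used += cost
    return final_answer(query, "\n\n".join(context))
```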
Wrap-up
That's all for this post about the GraphRAG technique. In my opinion, the flow of GraphRAG is quite intuitive and reasonable. Microsoft also provides the official implementation. However, it only supports OpenAI's LLMs. After some searching on Google, I found a LangChain implementation called langchain-graphrag and plan to check it out to see if it works :)
[^1]: Edge, Darren, et al. "From Local to Global: A Graph RAG Approach to Query-Focused Summarization." arXiv preprint arXiv:2404.16130 (2024).