Adaptive RAG implementation using LangGraph

Jul 2, 2024

Introduction

Retrieval-Augmented Generation (RAG) is a technique for giving large language models (LLMs) flexible, dynamic knowledge without retraining them. For example:

  • If you want an AI chatbot that answers questions about your service, you could fine-tune a model on your data
  • However, fine-tuning is expensive, and the knowledge it bakes in is frozen at training time, so it isn't always the best option
  • RAG offers an alternative: retrieve relevant documents from your vector database at query time and pass them to the LLM as context, so answers stay grounded in your data (see the sketch after this list)
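
Here is a minimal retrieve-then-generate sketch of that idea. The sample documents, the FAISS store, and the OpenAI embeddings below are assumptions for illustration only; any vector store and embedding model would do:

from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_groq import ChatGroq
from langchain_openai import OpenAIEmbeddings

# Hypothetical snippets standing in for your own knowledge base
docs = [
    "Our service offers a free tier with 100 requests per day.",
    "Paid plans start at $10/month and include priority support.",
]
retriever = FAISS.from_texts(docs, OpenAIEmbeddings()).as_retriever()

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using only this context:\n\n{context}"),
    ("human", "{question}"),
])
llm = ChatGroq(model="llama3-70b-8192", temperature=0)

# Retrieved documents are injected as context before the LLM sees the question
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(rag_chain.invoke("How much do paid plans cost?"))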

Today, we'll explore an advanced RAG technique called Adaptive RAG. By using flow engineering, we can build LLM pipelines that handle complex prompts while staying efficient on simple ones.

Note: This article is based on LangChain's implementation of Adaptive RAG using LangGraph. For more details, read the original Adaptive RAG paper.

Overview

Definition

According to the paper, Adaptive-RAG is a question-answering framework that dynamically selects the most suitable strategy based on query complexity:

  1. No retrieval for the simplest queries
  2. Single-step retrieval for moderate complexity
  3. Multi-step retrieval for the most complex queries

The authors introduced a complexity classifier that labels incoming prompts and routes each label through a different process. The reported performance gains are significant:

[Performance comparison chart placeholder]
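
To make the idea concrete, here is a rough sketch of the dispatch logic. Everything below is a hypothetical stand-in: the paper trains a small language model as the complexity classifier, whereas the heuristic and stub functions here only mark where the real pieces go:

from typing import Literal

Label = Literal["simple", "moderate", "complex"]

def classify(question: str) -> Label:
    # Stand-in heuristic; the paper uses a trained classifier instead
    if "compare" in question or " and " in question:
        return "complex"
    if "our" in question:
        return "moderate"
    return "simple"

def retrieve(query: str) -> list[str]:
    return []  # placeholder for a vector-store or web lookup

def generate(question: str, docs: list[str] | None = None) -> str:
    return f"<answer to {question!r} using {len(docs or [])} docs>"  # placeholder LLM call

def answer(question: str, max_hops: int = 3) -> str:
    label = classify(question)
    if label == "simple":
        return generate(question)                      # 1. no retrieval
    if label == "moderate":
        return generate(question, retrieve(question))  # 2. single-step retrieval
    docs: list[str] = []                               # 3. multi-step retrieval:
    draft = ""                                         #    interleave lookup and generation
    for _ in range(max_hops):
        docs += retrieve(f"{question} {draft}")
        draft = generate(question, docs)
    return draft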

Advantages

  • Improved Efficiency and Accuracy: Simple queries skip needless retrieval, while complex queries get the multi-step treatment they need, making the chatbot both faster and more accurate
  • Scalability and Flexibility: A single pipeline handles a wider range of query types effectively

Implementation Flowchart

We can implement Adaptive RAG using a classifier-based approach:

[User Query] → [Query Analysis]
    ├─ [Simple] → [Direct Answer]
    ├─ [Moderate] → [Single Retrieval] → [LLM Generation]
    └─ [Complex] → [Multi-Step Retrieval] → [LLM Generation]

Key components:

  1. Query Analysis (Classifier): Routes prompts to appropriate processes
  2. Flexible Routing:
    • Use RAG for domain-specific queries
    • Use web search (e.g., Wikipedia) for general knowledge
    • Different LLMs can be assigned per route

Practical Implementation

Classifier Example

import os
from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_groq import ChatGroq

class RouteQuery(BaseModel):
    """Route a user query to the most relevant data source."""
    datasource: Literal["vectorstore", "web_search"] = Field(
        ..., description="Given a user question, choose to route it to web search or the vectorstore."
    )

llm = ChatGroq(model="llama3-70b-8192", temperature=0, api_key=os.environ["GROQ_API_KEY"])
structured_llm_router = llm.with_structured_output(RouteQuery)

system = """You are an expert at routing user questions to a vectorstore or web search.
The vectorstore contains information about gym exercises and routines.
Use the vectorstore for questions on these topics. Otherwise, use web search."""

route_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

question_router = route_prompt | structured_llm_router
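
Invoking the router returns the structured route. The outputs shown below are illustrative:

question_router.invoke({"question": "What are the best exercises for leg day?"})
# -> RouteQuery(datasource='vectorstore')
question_router.invoke({"question": "Who won the 2022 World Cup?"})
# -> RouteQuery(datasource='web_search')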

LangGraph Control Flow
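
The graph below references a GraphState schema and several node functions (translate, retrieve, grade_documents, and so on) that are defined in the notebook. As a minimal sketch, assuming a TypedDict-based state as in LangGraph's documentation (the notebook's exact fields may differ):

from typing import List
from typing_extensions import TypedDict

class GraphState(TypedDict):
    question: str         # the (possibly translated) user question
    generation: str       # the LLM's current answer
    documents: List[str]  # retrieved or web-searched documents
    language: str         # detected language of the original question

Each node is a plain function that takes the current state and returns a dict containing only the keys it updates; for example, a retrieve node might return {"documents": retriever.invoke(state["question"])}. With the state sketched, the control flow wires the nodes together: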

from langgraph.graph import END, StateGraph

workflow = StateGraph(GraphState)

# Define nodes
workflow.add_node("translation", translate)  # translate to document language
workflow.add_node("translation_back", translate_back)  # translate back
workflow.add_node("detect_language", detect_language)
workflow.add_node("web_search", web_search)
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)

# Build graph
workflow.set_conditional_entry_point(
    route_question,
    {
        "web_search": "web_search",
        "vectorstore": "detect_language",
    },
)

# Add edges
workflow.add_edge("web_search", "generate")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_edge("detect_language", "translation")
workflow.add_edge("translation", "retrieve")

# Conditional edges
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "web_search": "web_search",
        "generate": "generate",
    },
)

workflow.add_conditional_edges(
    "generate",
    grade_generation_v_documents_and_question,
    {
        "not supported": "generate",
        "useful": "translation_back",
        "not useful": "web_search",
    },
)

workflow.add_edge("translation_back", END)

# Compile
app = workflow.compile()
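
The conditional edges call small decision functions whose string return values name the next node. Their real prompts and grading chains live in the notebook; a sketch of their shape, reusing the router from earlier:

def route_question(state: GraphState) -> str:
    # Returns "web_search" or "vectorstore"
    return question_router.invoke({"question": state["question"]}).datasource

def decide_to_generate(state: GraphState) -> str:
    # Fall back to web search when no retrieved document was graded relevant
    return "generate" if state["documents"] else "web_search"

def grade_generation_v_documents_and_question(state: GraphState) -> str:
    # Grades the answer against the documents ("not supported" -> regenerate)
    # and against the question ("not useful" -> web search); sketch only
    return "useful"

Once compiled, the graph runs like any other LangChain runnable (output is illustrative):

result = app.invoke({"question": "Suggest a three-day full-body gym routine"})
print(result["generation"])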

See the full implementation notebook for complete details.

Conclusion

Adaptive RAG provides significant benefits:

  • Dynamic retrieval based on query complexity
  • Optimized performance through intelligent routing
  • Scalable architecture using LangGraph

This approach enables sophisticated AI systems that can handle diverse query types while maintaining efficiency. The combination of classification and conditional workflows makes Adaptive RAG a powerful pattern for production-grade LLM applications.

Ismail Drissi