Adaptive RAG implementation using LangGraph

Jul 2, 2024

Introduction

Retrieval-Augmented Generation (RAG) is a technique for giving large language models (LLMs) flexible, dynamic knowledge without retraining them. For example:

  • If you want an AI chatbot that answers questions about your service, you could fine-tune a model on your data
  • However, fine-tuning is expensive, and the knowledge it bakes in is frozen at training time, so it isn't always the best option
  • RAG offers an alternative: retrieve relevant documents from your vector database at query time and pass them to the LLM as context, so answers stay grounded in your data (see the sketch after this list)
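
Here is a minimal retrieve-then-generate sketch of that idea. The sample documents, the FAISS store, and the OpenAI embeddings below are assumptions for illustration only; any vector store and embedding model would do:

from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_groq import ChatGroq
from langchain_openai import OpenAIEmbeddings

# Hypothetical snippets standing in for your own knowledge base
docs = [
    "Our service offers a free tier with 100 requests per day.",
    "Paid plans start at $10/month and include priority support.",
]
retriever = FAISS.from_texts(docs, OpenAIEmbeddings()).as_retriever()

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using only this context:\n\n{context}"),
    ("human", "{question}"),
])
llm = ChatGroq(model="llama3-70b-8192", temperature=0)

# Retrieved documents are injected as context before the LLM sees the question
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(rag_chain.invoke("How much do paid plans cost?"))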

Today, we'll explore an advanced RAG technique called Adaptive RAG. By using flow engineering, we can build LLM pipelines that handle complex prompts while staying efficient on simple ones.

Note: This article is based on LangChain's implementation of Adaptive RAG using LangGraph. For more details, read the original Adaptive RAG paper.

Overview

Definition

According to the paper, Adaptive-RAG is a question-answering framework that dynamically selects the most suitable strategy based on query complexity:

  1. No retrieval for the simplest queries
  2. Single-step retrieval for moderate complexity
  3. Multi-step retrieval for the most complex queries

The authors introduced a complexity classifier that labels incoming prompts and routes each label through a different process. The reported performance gains are significant:

[Performance comparison chart placeholder]
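
To make the idea concrete, here is a rough sketch of the dispatch logic. Everything below is a hypothetical stand-in: the paper trains a small language model as the complexity classifier, whereas the heuristic and stub functions here only mark where the real pieces go:

from typing import Literal

Label = Literal["simple", "moderate", "complex"]

def classify(question: str) -> Label:
    # Stand-in heuristic; the paper uses a trained classifier instead
    if "compare" in question or " and " in question:
        return "complex"
    if "our" in question:
        return "moderate"
    return "simple"

def retrieve(query: str) -> list[str]:
    return []  # placeholder for a vector-store or web lookup

def generate(question: str, docs: list[str] | None = None) -> str:
    return f"<answer to {question!r} using {len(docs or [])} docs>"  # placeholder LLM call

def answer(question: str, max_hops: int = 3) -> str:
    label = classify(question)
    if label == "simple":
        return generate(question)                      # 1. no retrieval
    if label == "moderate":
        return generate(question, retrieve(question))  # 2. single-step retrieval
    docs: list[str] = []                               # 3. multi-step retrieval:
    draft = ""                                         #    interleave lookup and generation
    for _ in range(max_hops):
        docs += retrieve(f"{question} {draft}")
        draft = generate(question, docs)
    return draft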

Advantages

  • Improved Efficiency and Accuracy: Simple queries skip needless retrieval, while complex queries get the multi-step treatment they need, making the chatbot both faster and more accurate
  • Scalability and Flexibility: A single pipeline handles a wider range of query types effectively

Implementation Flowchart

We can implement Adaptive RAG using a classifier-based approach:

[User Query] → [Query Analysis]
    ├─ [Simple] → [Direct Answer]
    ├─ [Moderate] → [Single Retrieval] → [LLM Generation]
    └─ [Complex] → [Multi-Step Retrieval] → [LLM Generation]

Key components:

  1. Query Analysis (Classifier): Routes prompts to appropriate processes
  2. Flexible Routing:
    • Use RAG for domain-specific queries
    • Use web search (e.g., Wikipedia) for general knowledge
    • Different LLMs can be assigned per route

Practical Implementation

Classifier Example

import os
from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_groq import ChatGroq

class RouteQuery(BaseModel):
    """Route a user query to the most relevant data source."""
    datasource: Literal["vectorstore", "web_search"] = Field(
        ..., description="Given a user question, choose to route it to web search or the vectorstore."
    )

llm = ChatGroq(model="llama3-70b-8192", temperature=0, api_key=os.environ["GROQ_API_KEY"])
structured_llm_router = llm.with_structured_output(RouteQuery)

system = """You are an expert at routing user questions to a vectorstore or web search.
The vectorstore contains information about gym exercises and routines.
Use the vectorstore for questions on these topics. Otherwise, use web search."""

route_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

question_router = route_prompt | structured_llm_router
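
Invoking the router returns the structured route. The outputs shown below are illustrative:

question_router.invoke({"question": "What are the best exercises for leg day?"})
# -> RouteQuery(datasource='vectorstore')
question_router.invoke({"question": "Who won the 2022 World Cup?"})
# -> RouteQuery(datasource='web_search')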

LangGraph Control Flow
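
The graph below references a GraphState schema and several node functions (translate, retrieve, grade_documents, and so on) that are defined in the notebook. As a minimal sketch, assuming a TypedDict-based state as in LangGraph's documentation (the notebook's exact fields may differ):

from typing import List
from typing_extensions import TypedDict

class GraphState(TypedDict):
    question: str         # the (possibly translated) user question
    generation: str       # the LLM's current answer
    documents: List[str]  # retrieved or web-searched documents
    language: str         # detected language of the original question

Each node is a plain function that takes the current state and returns a dict containing only the keys it updates; for example, a retrieve node might return {"documents": retriever.invoke(state["question"])}. With the state sketched, the control flow wires the nodes together: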

from langgraph.graph import END, StateGraph

workflow = StateGraph(GraphState)

# Define nodes
workflow.add_node("translation", translate)  # translate to document language
workflow.add_node("translation_back", translate_back)  # translate back
workflow.add_node("detect_language", detect_language)
workflow.add_node("web_search", web_search)
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)

# Build graph
workflow.set_conditional_entry_point(
    route_question,
    {
        "web_search": "web_search",
        "vectorstore": "detect_language",
    },
)

# Add edges
workflow.add_edge("web_search", "generate")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_edge("detect_language", "translation")
workflow.add_edge("translation", "retrieve")

# Conditional edges
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "web_search": "web_search",
        "generate": "generate",
    },
)

workflow.add_conditional_edges(
    "generate",
    grade_generation_v_documents_and_question,
    {
        "not supported": "generate",
        "useful": "translation_back",
        "not useful": "web_search",
    },
)

workflow.add_edge("translation_back", END)

# Compile
app = workflow.compile()
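
The conditional edges call small decision functions whose string return values name the next node. Their real prompts and grading chains live in the notebook; a sketch of their shape, reusing the router from earlier:

def route_question(state: GraphState) -> str:
    # Returns "web_search" or "vectorstore"
    return question_router.invoke({"question": state["question"]}).datasource

def decide_to_generate(state: GraphState) -> str:
    # Fall back to web search when no retrieved document was graded relevant
    return "generate" if state["documents"] else "web_search"

def grade_generation_v_documents_and_question(state: GraphState) -> str:
    # Grades the answer against the documents ("not supported" -> regenerate)
    # and against the question ("not useful" -> web search); sketch only
    return "useful"

Once compiled, the graph runs like any other LangChain runnable (output is illustrative):

result = app.invoke({"question": "Suggest a three-day full-body gym routine"})
print(result["generation"])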

See the full implementation notebook for complete details.

Conclusion

Adaptive RAG provides significant benefits:

  • Dynamic retrieval based on query complexity
  • Optimized performance through intelligent routing
  • Scalable architecture using LangGraph

This approach enables sophisticated AI systems that can handle diverse query types while maintaining efficiency. The combination of classification and conditional workflows makes Adaptive RAG a powerful pattern for production-grade LLM applications.

Ismail Drissi