LangGraph 201: Adding Human Oversight to Your Deep Research Agent

Losing control of your AI agent in the middle of a workflow is a common pain point. If you have built your own agentic applications, you have most likely already seen this happen.

While LLMs today are highly capable, they are still not ready to run fully autonomously in a complex workflow. For any practical agentic application, human input is still much needed for making critical decisions and necessary course corrections.

This is where human-in-the-loop patterns come in. And the good news is, you can easily implement them in LangGraph.

In my previous post (LangGraph 101: Let’s build a deep research agent), we thoroughly explained the core concepts of LangGraph and walked through in detail how to use LangGraph to build a practical deep research agent. We showed how the research agent can autonomously search, evaluate results, and iterate until it finds sufficient evidence to reach a comprehensive answer.

One loose end from that blog is that the agent ran completely autonomously from start to finish. There is no entry point for human guidance or feedback.

Let’s fix that in this tutorial!

So, here’s our game plan: we’ll take the same research agent and enhance it with human-in-the-loop functionality. You’ll see exactly how to implement checkpoints that allow human feedback to make your agents more reliable and trustworthy.

If you’re new to LangGraph or want a refresher on the core LangGraph concepts, I highly encourage you to check out my previous post. I’ll try to make the current post self-contained, but may skip some explanations given the space constraint. You can find more detailed descriptions in my previous post.

1. Problem Statement

In this post, we build upon the deep research agent from the previous post and add human-in-the-loop checkpoints so that the user can review the agent’s decisions and provide feedback.

As a quick reminder, our deep research agent works like this:

It takes in a user query, autonomously searches the web, examines the search results it obtains, and then decides if enough information has been found. If so, it proceeds with creating a well-crafted mini-report with proper citations; otherwise, it circles back to dig deeper with more searches.

The illustration below shows the delta we’re building: the left depicts the workflow of the original deep research agent, and the right represents the same agentic workflow with human-in-the-loop augmentation.

Figure 1. High-level flowcharts. Left: Without human-in-the-loop. Right: Two checkpoints where humans can provide inputs. (Image by author)

Notice that we’ve added two human-in-the-loop checkpoints in the enhanced workflow on the right:

Checkpoint 1 is introduced right after the agent generates its initial search queries. The objective here is to allow the user to review and refine the search strategy before any web searches start.
Checkpoint 2 happens during the iterative search loop. This is when the agent decides if it needs more information, i.e., whether to conduct more searches. Adding a checkpoint here gives the user the opportunity to look at what the agent has found so far, determine whether sufficient information has indeed been gathered, and if not, decide which further search queries to use.

Simply by adding these two checkpoints, we effectively transform a fully autonomous agentic workflow into an LLM-human collaborative one. The agent still handles the heavy lifting, e.g., generating queries, searching, synthesizing results, and proposing further queries, but now the user has intervention points to weave in their judgment.

This is a human-in-the-loop research workflow in action.

2. Mental Model: Graphs, Edges, Nodes, and Human-in-the-Loop

Let’s first establish a solid mental model before looking at the code. We’ll briefly discuss LangGraph’s core concepts and its human-in-the-loop mechanism. For a more thorough discussion of LangGraph in general, please refer to LangGraph 101: Let’s build a deep research agent.

2.1 Workflow Representation

LangGraph represents workflows as directed graphs. Each step in your agent’s workflow becomes a node. Essentially, a node is a function where all the actual work is done. To link the nodes, LangGraph uses edges, which basically define how the workflow moves from one step to the next.

Specific to our research agent, nodes are the boxes in Figure 1, handling tasks such as “generate search queries,” “search the web,” or “reflect on results.” Edges are the arrows, determining the flow, such as whether to continue searching or generate the final answer.

2.2 State Management

Now, as our research agent moves through different nodes, it needs to keep track of things it has learned and generated. LangGraph realizes this functionality by maintaining a central state object, which you can think of as a shared whiteboard that every node in the graph can look at and write on.

This way, each node can receive the current state, do its work, and return only the parts it wants to update. LangGraph then automatically merges these updates into the main state before passing it to the next node.

This approach allows LangGraph to handle all the state management at the framework level, so that individual nodes only need to focus on their specific tasks. It makes workflows highly modular: you can easily add, remove, or reorder nodes without breaking the state flow.
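
To make the whiteboard analogy concrete, here is a minimal plain-Python sketch of the merge step. It is an illustration only, not LangGraph's actual implementation: the merge_update helper and the REDUCERS table are hypothetical names invented for this example. The assumption is that keys with a reducer (here operator.add) accumulate, while all other keys are simply overwritten by the node's update.

```python
import operator

# Hypothetical reducer table: keys listed here accumulate, all others are overwritten.
REDUCERS = {"web_research_result": operator.add}


def merge_update(state: dict, update: dict) -> dict:
    """Return a new state with a node's partial update merged in."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)  # accumulate
        else:
            merged[key] = value  # overwrite
    return merged


state = {"web_research_result": ["summary A"], "follow_up_queries": ["old query"]}
update = {"web_research_result": ["summary B"], "follow_up_queries": ["new query"]}

new_state = merge_update(state, update)
print(new_state)
```

The node only returned two keys, yet the accumulated research results survive while the follow-up queries are replaced. The same distinction matters later when human input needs to overwrite what the LLM proposed.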

2.3 Human-in-the-Loop

Now, let’s talk about human-in-the-loop. In LangGraph, this is achieved by introducing an interruption mechanism. Here is how this pattern works:

Inside a node, you insert a checkpoint. When the graph execution reaches this designated checkpoint, LangGraph pauses the workflow and presents relevant information to the human.
The human can then review this information and decide whether to edit or approve what the agent suggests.
Once the human provides the input, the workflow resumes the graph run (identified by an ID) from exactly the same node. The node restarts from the top, but when it reaches the inserted checkpoint, it fetches the human’s input instead of pausing. The graph execution continues from there.
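
The pause-and-resume steps above can be sketched with a toy model in plain Python. This is not how LangGraph implements interrupts internally; the Paused exception and the make_interrupt factory are hypothetical stand-ins, used only to show that the node body runs twice: once until it pauses, and once from the top when the human input is available.

```python
class Paused(Exception):
    """Raised by the toy interrupt() to stop the node and surface a payload."""

    def __init__(self, payload):
        super().__init__("paused")
        self.payload = payload


def make_interrupt(resume_value=None):
    """Build a toy interrupt(): pause when no human input exists, else return it."""

    def interrupt(payload):
        if resume_value is None:
            raise Paused(payload)
        return resume_value

    return interrupt


calls = []  # records every time the code above the checkpoint runs


def review_node(state, interrupt):
    calls.append("top of node")  # executed again on resume
    human = interrupt({"suggested": state["query_list"]})
    return {"query_list": human["queries"]}


state = {"query_list": ["q1", "q2"]}

# First pass: no human input yet, so the node pauses and exposes its payload.
paused_payload = None
try:
    review_node(state, make_interrupt())
except Paused as pause:
    paused_payload = pause.payload

# Resume: the node restarts from the top, and interrupt() now returns the input.
update = review_node(state, make_interrupt({"queries": ["q1 refined"]}))
print(paused_payload, update, len(calls))
```

Because everything above the checkpoint is executed again on resume, it is a good idea to keep that part of a node cheap and free of side effects.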

With this conceptual foundation in place, let’s see how to translate this human-in-the-loop augmented deep research agent into an actual implementation.

3. From Concept to Code

In this post, we’ll build upon Google’s open-sourced implementation built with LangGraph and Gemini (under the Apache-2.0 license). It’s a full-stack implementation, but for now, we’ll focus only on the backend logic (the backend/src/agent/ directory) where the research agent is defined.

Once you have forked the repo, you’ll see the following key files:

configuration.py: defines the Configuration class that manages all configurable parameters for the research agent.
graph.py: the main orchestration file that defines the LangGraph workflow. We’ll mainly work with this file.
prompts.py: contains all the prompt templates used by different nodes.
state.py: defines the TypedDict classes that represent the state passed between graph nodes.
tools_and_schemas.py: defines Pydantic models for LLMs to produce structured outputs.
utils.py: utility functions for processing search data, e.g., extracting and formatting URLs, adding citations, etc.

Let’s start with graph.py and work from there.

3.1 Workflow

As a reminder, we aim to augment the existing deep research agent with human-in-the-loop verifications. Earlier, we mentioned that we want to add two checkpoints. In the flowchart below, you can see that two new nodes are added to the existing workflow.

Figure 2. Flowchart for the human-in-the-loop augmented deep research agent. Checkpoints are added as nodes to interrupt the workflow. (Image by author)

In LangGraph, the translation from flowchart to code is straightforward. Let’s start with creating the graph itself:

from langgraph.graph import StateGraph
from agent.state import (
    OverallState,
    QueryGenerationState,
    ReflectionState,
    WebSearchState,
)
from agent.configuration import Configuration

# Create our Agent Graph
builder = StateGraph(OverallState, config_schema=Configuration)

Here, we use StateGraph to define a state-aware graph. It accepts an OverallState class that defines what information can move between nodes, and a Configuration class that defines runtime-tunable parameters.

Once we have the graph container, we can add nodes to it:

# Define the nodes we'll cycle between
builder.add_node("generate_query", generate_query)
builder.add_node("web_research", web_research)
builder.add_node("reflection", reflection)
builder.add_node("finalize_answer", finalize_answer)

# New human-in-the-loop nodes
builder.add_node("review_initial_queries", review_initial_queries)
builder.add_node("review_follow_up_plan", review_follow_up_plan)

The add_node() method takes the node’s name as the first argument and the function that gets executed when the node runs as the second argument. Note that we have added two new human-in-the-loop nodes compared to the original implementation.

If you cross-compare the node names with the flowchart in Figure 2, you’ll see that, essentially, we have one node reserved for every step. Later, we’ll examine the detailed implementation of those functions one by one.

Okay, now that we have the nodes defined, let’s add edges to connect them and define the execution order:

from langgraph.graph import START, END

# Set the entrypoint as `generate_query`
# This means that this node is the first one called
builder.add_edge(START, "generate_query")

# Checkpoint #1
builder.add_edge("generate_query", "review_initial_queries")

# Add conditional edge to continue with search queries in a parallel branch
builder.add_conditional_edges(
    "review_initial_queries", continue_to_web_research, ["web_research"]
)

# Reflect on the web research
builder.add_edge("web_research", "reflection")

# Checkpoint #2
builder.add_edge("reflection", "review_follow_up_plan")

# Evaluate the research
builder.add_conditional_edges(
    "review_follow_up_plan", evaluate_research, ["web_research", "finalize_answer"]
)
# Finalize the answer
builder.add_edge("finalize_answer", END)

Note that we have wired the two human-in-the-loop checkpoints directly into the workflow:

Checkpoint 1: after the generate_query node, the initial search queries are routed to review_initial_queries. Here, humans can review and edit or approve the proposed search queries before any web searches begin.
Checkpoint 2: after the reflection node, the produced assessment, including the sufficiency flag and (if any) the proposed follow-up search queries, is routed to review_follow_up_plan. Here, humans can evaluate whether the assessment is accurate and adjust the follow-up plan accordingly.

The routing functions, i.e., continue_to_web_research and evaluate_research, handle the routing logic based on human decisions at these checkpoints.

A quick note on builder.add_conditional_edges(): it is used to add conditional edges so that the flow can jump to different branches at runtime. Here we pass it three arguments: the source node, a routing function, and a list of possible destination nodes. The routing function examines the current state and returns the name of the next node to go to. continue_to_web_research is special here, as it doesn’t actually perform “decision-making” but rather enables parallel searching when multiple queries are generated (or suggested by the human) in the first step. We’ll see its implementation later.

Finally, we put everything together and compile the graph into an executable agent:

from langgraph.checkpoint.memory import InMemorySaver

checkpointer = InMemorySaver()
graph = builder.compile(name="pro-search-agent", checkpointer=checkpointer)

Note that we have added a checkpointer object here, which is crucial for achieving human-in-the-loop functionality.

When your graph execution gets interrupted, LangGraph needs to dump the current state of the graph somewhere. This state may include things like all the work done so far, the data collected, and, of course, exactly where the execution paused. All of this information is important to allow the graph to resume seamlessly when human input is provided.

To save this “snapshot”, we have a couple of options. For development and testing purposes, InMemorySaver is a perfect choice. It simply stores the graph state in memory, making it fast and simple to work with.

For production deployment, however, you’ll want to use something more sophisticated. For those cases, a proper database-backed checkpointer like PostgresSaver or SqliteSaver would be a good option.

LangGraph abstracts this away, so switching from development to production requires changing only this one line of code, while the rest of your graph logic stays unchanged. For now, we’ll just stick with in-memory persistence.

Next up, we’ll take a closer look at individual nodes and see what actions they take.

For the nodes that existed in the original implementation, I’ll keep the discussion brief since I’ve already covered them in detail in my previous post. In this post, our main focus will be on the two new human-in-the-loop nodes and how they implement the interrupt patterns we talked about earlier.

3.2 LLM Models

Most of the nodes in our deep research agent are powered by LLMs. In configuration.py, we have defined the following Gemini models to drive our nodes:

from pydantic import BaseModel, Field

class Configuration(BaseModel):
    """The configuration for the agent."""

    query_generator_model: str = Field(
        default="gemini-2.5-flash",
        metadata={
            "description": "The name of the language model to use for the agent's query generation."
        },
    )

    reflection_model: str = Field(
        default="gemini-2.5-flash",
        metadata={
            "description": "The name of the language model to use for the agent's reflection."
        },
    )

    answer_model: str = Field(
        default="gemini-2.5-pro",
        metadata={
            "description": "The name of the language model to use for the agent's answer."
        },
    )

Note that they may be different from the original implementation. I recommend the Gemini-2.5 series models.

3.3 Node #1: Generate Queries

The generate_query node generates the initial search queries based on the user’s question. Here is how this node is implemented:

import os

from langchain_core.runnables import RunnableConfig
from langchain_google_genai import ChatGoogleGenerativeAI
from agent.prompts import (
    get_current_date,
    query_writer_instructions,
)
from agent.tools_and_schemas import SearchQueryList
from agent.utils import get_research_topic

def generate_query(
    state: OverallState,
    config: RunnableConfig
) -> QueryGenerationState:
    """LangGraph node that generates search queries
       based on the User's question.

    Args:
        state: Current graph state containing the User's question
        config: Configuration for the runnable, including LLM
                provider settings

    Returns:
        Dictionary with state update, including the query_list key
        containing the generated queries
    """
    configurable = Configuration.from_runnable_config(config)

    # check for custom initial search query count
    if state.get("initial_search_query_count") is None:
        state["initial_search_query_count"] = configurable.number_of_initial_queries

    # init Gemini model
    llm = ChatGoogleGenerativeAI(
        model=configurable.query_generator_model,
        temperature=1.0,
        max_retries=2,
        api_key=os.getenv("GEMINI_API_KEY"),
    )
    structured_llm = llm.with_structured_output(SearchQueryList)

    # Format the prompt
    current_date = get_current_date()
    formatted_prompt = query_writer_instructions.format(
        current_date=current_date,
        research_topic=get_research_topic(state["messages"]),
        number_queries=state["initial_search_query_count"],
    )
    # Generate the search queries
    result = structured_llm.invoke(formatted_prompt)

    return {"query_list": result.query}

The LLM’s output is enforced by using the SearchQueryList schema:

from typing import List
from pydantic import BaseModel, Field

class SearchQueryList(BaseModel):
    query: List[str] = Field(
        description="A list of search queries to be used for web research."
    )
    rationale: str = Field(
        description="A brief explanation of why these queries are relevant to the research topic."
    )

3.4 Node #2: Review Initial Queries

This is our first checkpoint. The idea here is that the user can review the initial queries proposed by the LLM and decide whether to edit or approve the LLM’s output. Here is how we can implement it:

from langgraph.types import interrupt

def review_initial_queries(state: QueryGenerationState) -> QueryGenerationState:

    # Retrieve LLM's proposals
    suggested = state["query_list"]

    # Interruption mechanism
    human = interrupt({
        "kind": "review_initial_queries",
        "suggested": suggested,
        "instructions": "Approve as-is, or return queries=[...]"
    })
    final_queries = human["queries"]

    # Limit the total number of queries
    cap = state.get("initial_search_query_count")
    if cap:
        final_queries = final_queries[:cap]

    return {"query_list": final_queries}

Let’s break down what’s happening in this checkpoint node:

First, we extract the search queries that were proposed by the previous generate_query node. These queries are what the human wants to review.
The interrupt() function is where the magic happens. When the node execution hits this function, the entire graph is paused and the payload is presented to the human. The payload is defined in the dictionary that is passed to the interrupt() function. As shown in the code, there are three fields: kind, which identifies the semantics associated with this checkpoint; suggested, which contains the list of the LLM’s proposed search queries; and instructions, which is a simple text that gives guidance on what the human should do. Of course, the payload passed to interrupt() can be any dictionary structure you want. It’s primarily a UI/UX concern.
At this point, your application’s frontend can display this content to the user. I’ll show you how to interact with it in the demo section later.
When the human provides their feedback, the graph resumes execution. A key thing to note is that the interrupt() call now returns the human’s input instead of pausing. The human feedback needs to provide a queries field that contains the approved list of search queries. That’s what the review_initial_queries node expects.
Finally, we apply the configured limits to prevent excessive searches.

That’s it! Present the LLM’s proposal, pause, incorporate human feedback, and resume. That’s the foundation of all human-in-the-loop nodes in LangGraph.
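
Since the node reads the queries field directly, a reviewer who approves as-is without sending that field would trigger a KeyError. One way to make the checkpoint more forgiving is sketched below. The resolve_queries helper is hypothetical (it is not part of the original code), and the fallback-to-suggestions behavior is an assumption about what "approve as-is" should mean.

```python
from typing import Any, List, Optional


def resolve_queries(
    human: Any, suggested: List[str], cap: Optional[int] = None
) -> List[str]:
    """Hypothetical helper: turn raw human feedback into the final query list.

    Falls back to the LLM's suggestions when the reviewer approves as-is
    (no usable "queries" field), drops blank entries, and applies the cap.
    """
    queries = human.get("queries") if isinstance(human, dict) else None
    if not queries:
        queries = suggested
    cleaned = [q.strip() for q in queries if isinstance(q, str) and q.strip()]
    return cleaned[:cap] if cap else cleaned


edited = resolve_queries({"queries": [" a ", "", "b", "c"]}, ["x"], cap=2)
approved = resolve_queries({}, ["x", "y"])
print(edited, approved)
```

Inside the node, the call would replace the direct human["queries"] lookup and the separate capping step.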

3.5 Parallel Web Searches

After the human approves the initial queries, we route them to the web research node. This is achieved via the following routing function:

from langgraph.types import Send

def continue_to_web_research(state: QueryGenerationState):
    """LangGraph routing function that sends the search queries to the web research node.

    This is used to spawn n web research nodes, one for each search query.
    """
    return [
        Send("web_research", {"search_query": search_query, "id": int(idx)})
        for idx, search_query in enumerate(state["query_list"])
    ]

This function takes the approved query list and creates parallel web_research tasks, one for each query. Using LangGraph’s Send mechanism, we can launch multiple web searches concurrently.
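
To see what the fan-out produces without running a graph, here is a small sketch that uses a stand-in class. FakeSend is hypothetical and only mimics the two pieces of information a Send carries (a target node and the state handed to it); the example queries are made up.

```python
from dataclasses import dataclass


@dataclass
class FakeSend:
    """Stand-in for LangGraph's Send: a target node plus the state it receives."""

    node: str
    arg: dict


def fan_out(query_list):
    """Create one web_research task per query, each with its own numeric id."""
    return [
        FakeSend("web_research", {"search_query": query, "id": idx})
        for idx, query in enumerate(query_list)
    ]


tasks = fan_out(["quantum error correction", "quantum hardware roadmap"])
for task in tasks:
    print(task.node, task.arg)
```

Each task receives its own small state rather than the full graph state, which is why the web research node works with a dedicated WebSearchState.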

3.6 Node #3: Web Research

This is where the actual web searching happens:

from agent.prompts import web_searcher_instructions
from agent.utils import get_citations, insert_citation_markers, resolve_urls

def web_research(state: WebSearchState, config: RunnableConfig) -> OverallState:
    """LangGraph node that performs web research using the native Google Search API tool.

    Executes a web search using the native Google Search API tool in combination with Gemini 2.0 Flash.

    Args:
        state: Current graph state containing the search query and research loop count
        config: Configuration for the runnable, including search API settings

    Returns:
        Dictionary with state update, including sources_gathered, search_query, and web_research_result
    """
    # Configure
    configurable = Configuration.from_runnable_config(config)
    formatted_prompt = web_searcher_instructions.format(
        current_date=get_current_date(),
        research_topic=state["search_query"],
    )

    # Uses the google genai client as the langchain client doesn't return grounding metadata
    # (genai_client is the module-level google.genai Client created in graph.py)
    response = genai_client.models.generate_content(
        model=configurable.query_generator_model,
        contents=formatted_prompt,
        config={
            "tools": [{"google_search": {}}],
            "temperature": 0,
        },
    )

    # resolve the urls to short urls for saving tokens and time
    gm = getattr(response.candidates[0], "grounding_metadata", None)
    chunks = getattr(gm, "grounding_chunks", None) if gm is not None else None
    resolved_urls = resolve_urls(chunks or [], state["id"])

    # Gets the citations and adds them to the generated text
    citations = get_citations(response, resolved_urls) if resolved_urls else []
    modified_text = insert_citation_markers(response.text, citations)
    sources_gathered = [item for citation in citations for item in citation["segments"]]

    return {
        "sources_gathered": sources_gathered,
        "search_query": [state["search_query"]],
        "web_research_result": [modified_text],
    }

The code is mostly self-explanatory. We first configure the search, then call Google’s Search API via Gemini with search tools enabled. Once we obtain the search results, we extract URLs, resolve citations, and then format the search results with proper citation markers. Finally, we update the state with gathered sources and formatted search results.

Note that we have hardened the URL resolving and citation retrieving against scenarios where the search results do not return any grounding data. Therefore, you’ll see that the implementation for getting the citations and adding them to the generated text is slightly different from the original version. Also, we have implemented an updated version of the resolve_urls function:

def resolve_urls(urls_to_resolve, id):
    """
    Create a map from original URL -> short URL.
    Accepts None or empty; returns {} in that case.
    """
    if not urls_to_resolve:
        return {}

    prefix = "https://vertexaisearch.cloud.google.com/id/"
    urls = []

    # Collect the URI of every chunk that actually carries web data
    for site in urls_to_resolve:
        uri = None
        try:
            web = getattr(site, "web", None)
            uri = getattr(web, "uri", None) if web is not None else None
        except Exception:
            uri = None
        if uri:
            urls.append(uri)

    if not urls:
        return {}

    # Remember the index of the first occurrence of each URL
    index_by_url = {}
    for i, u in enumerate(urls):
        index_by_url.setdefault(u, i)

    # Build stable short links
    resolved_map = {u: f"{prefix}{id}/{index_by_url[u]}" for u in index_by_url}

    return resolved_map

This updated version can be used as a drop-in replacement for the original resolve_urls function, as the original one doesn’t handle edge cases properly.
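
As a quick sanity check of those edge cases, the sketch below runs a condensed copy of the function on fake grounding chunks. The SimpleNamespace objects are stand-ins for the real grounding chunk objects, and the example.com URLs are made up; only the web.uri attribute path is taken from the code above.

```python
from types import SimpleNamespace


def resolve_urls(urls_to_resolve, id):
    """Condensed copy of the function above: map original URL -> short URL."""
    prefix = "https://vertexaisearch.cloud.google.com/id/"
    urls = []
    for site in urls_to_resolve or []:
        uri = getattr(getattr(site, "web", None), "uri", None)
        if uri:
            urls.append(uri)
    index_by_url = {}
    for i, u in enumerate(urls):
        index_by_url.setdefault(u, i)  # keep the first index of each URL
    return {u: f"{prefix}{id}/{i}" for u, i in index_by_url.items()}


chunks = [
    SimpleNamespace(web=SimpleNamespace(uri="https://example.com/a")),
    SimpleNamespace(web=None),  # chunk without web data is skipped
    SimpleNamespace(web=SimpleNamespace(uri="https://example.com/b")),
    SimpleNamespace(web=SimpleNamespace(uri="https://example.com/a")),  # duplicate
]

short = resolve_urls(chunks, id=3)
empty = resolve_urls(None, id=0)
print(short, empty)
```

Missing grounding data yields an empty map instead of an exception, and a duplicated URL maps to a single short link.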

3.7 Node #4: Reflection

The reflection node analyzes the gathered web research results to determine if more information is needed.

from agent.prompts import reflection_instructions
from agent.tools_and_schemas import Reflection

def reflection(state: OverallState, config: RunnableConfig) -> ReflectionState:
    """LangGraph node that identifies knowledge gaps and generates potential follow-up queries.

    Analyzes the current summary to identify areas for further research and generates
    potential follow-up queries. Uses structured output to extract
    the follow-up query in JSON format.

    Args:
        state: Current graph state containing the running summary and research topic
        config: Configuration for the runnable, including LLM provider settings

    Returns:
        Dictionary with state update, including the follow_up_queries key containing the generated follow-up queries
    """
    configurable = Configuration.from_runnable_config(config)
    # Increment the research loop count and get the reasoning model
    state["research_loop_count"] = state.get("research_loop_count", 0) + 1
    reflection_model = state.get("reflection_model") or configurable.reflection_model

    # Format the prompt
    current_date = get_current_date()
    formatted_prompt = reflection_instructions.format(
        current_date=current_date,
        research_topic=get_research_topic(state["messages"]),
        summaries="\n\n---\n\n".join(state["web_research_result"]),
    )
    # init Reasoning Model
    llm = ChatGoogleGenerativeAI(
        model=reflection_model,
        temperature=1.0,
        max_retries=2,
        api_key=os.getenv("GEMINI_API_KEY"),
    )
    result = llm.with_structured_output(Reflection).invoke(formatted_prompt)

    return {
        "is_sufficient": result.is_sufficient,
        "knowledge_gap": result.knowledge_gap,
        "follow_up_queries": result.follow_up_queries,
        "research_loop_count": state["research_loop_count"],
        "number_of_ran_queries": len(state["search_query"]),
    }

This assessment feeds directly into our second human-in-the-loop checkpoint.

Note that we can update the ReflectionState schema in the state.py file as follows:

class ReflectionState(TypedDict):
    is_sufficient: bool
    knowledge_gap: str
    follow_up_queries: list
    research_loop_count: int
    number_of_ran_queries: int

Instead of using an additive reducer, we use a plain list for follow_up_queries so that human input can directly overwrite what the LLM has proposed.

3.8 Node #5: Review Follow-Up Plan

The goal of this checkpoint is to allow humans to validate the LLM’s assessment and decide whether to continue researching:

def review_follow_up_plan(state: ReflectionState) -> ReflectionState:

    human = interrupt({
        "kind": "review_follow_up_plan",
        "is_sufficient": state["is_sufficient"],
        "knowledge_gap": state["knowledge_gap"],
        "suggested": state["follow_up_queries"],
        "instructions": (
            "To finish research: {'is_sufficient': true}\n"
            "To continue with modified queries: {'follow_up_queries': [...], 'knowledge_gap': '...'}\n"
            "To add/modify queries only: {'follow_up_queries': [...]}"
        ),
    })

    # The human declares the research sufficient: keep the LLM's fields as-is
    if human.get("is_sufficient", False) is True:
        return {
            "is_sufficient": True,
            "knowledge_gap": state["knowledge_gap"],
            "follow_up_queries": state["follow_up_queries"],
        }

    # Otherwise continue researching with the human-approved follow-up queries
    return {
        "is_sufficient": False,
        "knowledge_gap": human.get("knowledge_gap", state["knowledge_gap"]),
        "follow_up_queries": human["follow_up_queries"],
    }

Following the same pattern, we first design the payload that will be shown to the human. This payload includes the kind of this interruption, a binary flag indicating if the research is sufficient, the knowledge gap identified by the LLM, follow-up queries suggested by the LLM, and a small tip on what feedback the human should enter.

Upon examination, the human can directly declare that the research is sufficient. Or the human can keep the sufficiency flag as False, and edit or approve what the reflection node’s LLM has proposed.

Either way, the results will be sent to the research evaluation function, which will route to the corresponding next node.

3.9 Routing Logic: Continue or Finalize

After the human review, this routing function determines the next step:

def evaluate_research(
    state: ReflectionState,
    config: RunnableConfig,
) -> OverallState:
    """LangGraph routing function that determines the next step in the research flow.

    Controls the research loop by deciding whether to continue gathering information
    or to finalize the summary based on the configured maximum number of research loops.

    Args:
        state: Current graph state containing the research loop count
        config: Configuration for the runnable, including max_research_loops setting

    Returns:
        Either the string "finalize_answer", or a list of Send objects targeting "web_research"
    """
    configurable = Configuration.from_runnable_config(config)
    max_research_loops = (
        state.get("max_research_loops")
        if state.get("max_research_loops") is not None
        else configurable.max_research_loops
    )
    if state["is_sufficient"] or state["research_loop_count"] >= max_research_loops:
        return "finalize_answer"
    else:
        return [
            Send(
                "web_research",
                {
                    "search_query": follow_up_query,
                    "id": state["number_of_ran_queries"] + int(idx),
                },
            )
            for idx, follow_up_query in enumerate(state["follow_up_queries"])
        ]

If the human concludes that the research is sufficient or we have already reached the maximum research loop limit, this function routes to finalize_answer. Otherwise, it spawns new web research tasks (in parallel) using the human-approved follow-up queries.
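
The decision itself is small enough to check by hand. The sketch below is a toy version of the routing rule with a hypothetical next_step function; it returns plain (query, id) tuples instead of Send objects so that it runs without LangGraph. It also shows how the id offset continues from the number of queries already run.

```python
def next_step(is_sufficient, loop_count, max_loops, follow_ups, ran_queries):
    """Toy routing rule: a node name, or a list of (query, id) search tasks."""
    if is_sufficient or loop_count >= max_loops:
        return "finalize_answer"
    # ids continue after the queries that have already been run
    return [(query, ran_queries + idx) for idx, query in enumerate(follow_ups)]


keep_searching = next_step(False, 1, 2, ["a", "b"], ran_queries=3)
loop_limit_hit = next_step(False, 2, 2, ["a"], ran_queries=3)
human_says_done = next_step(True, 1, 5, [], ran_queries=3)
print(keep_searching, loop_limit_hit, human_says_done)
```

Note that the loop limit wins even when the human asks for more searches, so max_research_loops acts as a hard budget.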

3.10 Node #6: Finalize Reply

That is the ultimate node of our graph, which synthesizes all of the gathered info right into a complete reply with correct citations:

def finalize_answer(state: OverallState, config: RunnableConfig):
    """LangGraph node that finalizes the analysis abstract.

    Prepares the ultimate output by deduplicating and formatting sources, then
    combining them with the operating abstract to create a well-structured
    analysis report with correct citations.

    Args:
        state: Present graph state containing the operating abstract and sources gathered

    Returns:
        Dictionary with state replace, together with running_summary key containing the formatted remaining abstract with sources
    """
    configurable = Configuration.from_runnable_config(config)
    answer_model = state.get("answer_model") or configurable.answer_model

    # Format the prompt
    current_date = get_current_date()
    formatted_prompt = answer_instructions.format(
        current_date=current_date,
        research_topic=get_research_topic(state["messages"]),
        summaries="\n---\n\n".join(state["web_research_result"]),
    )

    # Init the reasoning model, default to Gemini 2.5 Flash
    llm = ChatGoogleGenerativeAI(
        model=answer_model,
        temperature=0,
        max_retries=2,
        api_key=os.getenv("GEMINI_API_KEY"),
    )
    result = llm.invoke(formatted_prompt)

    # Replace the short urls with the original urls and add all used urls to the sources_gathered
    unique_sources = []
    for source in state["sources_gathered"]:
        if source["short_url"] in result.content:
            result.content = result.content.replace(
                source["short_url"], source["value"]
            )
            unique_sources.append(source)

    return {
        "messages": [AIMessage(content=result.content)],
        "sources_gathered": unique_sources,
    }
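
The citation step is easy to check in isolation. Below is a minimal sketch that mirrors the loop above on made-up data. `resolve_citations` is a hypothetical helper name, and the sample URLs are invented for illustration; only the `short_url` and `value` key names come from the node.

```python
def resolve_citations(text: str, sources: list[dict]) -> tuple[str, list[dict]]:
    """Swap short URLs for the originals and keep only sources the text cites."""
    used = []
    for source in sources:
        if source["short_url"] in text:
            text = text.replace(source["short_url"], source["value"])
            used.append(source)
    return text, used


# Invented sample data: the second source is never cited in the draft
sources = [
    {"short_url": "https://short.example/id/0", "value": "https://example.com/qec-paper"},
    {"short_url": "https://short.example/id/1", "value": "https://example.com/unused"},
]
draft = "Error correction improved [source](https://short.example/id/0)."

final_text, used_sources = resolve_citations(draft, sources)
```

Here the uncited second source is dropped, which is why the node returns `unique_sources` rather than everything in `sources_gathered`.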

With this, our human-in-the-loop research workflow is now complete.

4. Running the Agent: Handling Interrupts and Resumptions

In this section, let's take our newly enhanced deep research agent for a ride! We'll walk through a complete interaction in a Jupyter Notebook where a human guides the research process at both checkpoints.

To run our current agent, you need to obtain a Gemini API key. You can get the key from Google AI Studio. Once you have the key, remember to create the .env file and paste in your Gemini API key: GEMINI_API_KEY="your_actual_api_key_here".

4.1 Starting the Research

As an example, in the first cell, let's ask the agent about quantum computing developments:

from agent import graph
from langgraph.types import Command

config = {"configurable": {"thread_id": "session_1"}}

Q = "What are the latest developments in quantum computing?"
result = graph.invoke({"messages": [{"role": "user", "content": Q}]}, config=config)

Note that we have supplied a thread ID in the configuration. This is a crucial piece for achieving human-in-the-loop workflows. Internally, LangGraph uses this ID to persist the state. Later, when we resume, LangGraph will know which execution to resume.

4.2 Checkpoint #1: Review Initial Queries

After running the first cell, the graph executes until it hits our first checkpoint. If you print the result in the next cell:

result

You'd see something like this:

{'messages': [HumanMessage(content='What are the latest developments in quantum computing?', additional_kwargs={}, response_metadata={}, id='68beb541-aedb-4393-bb12-a7f1a22cb4f7')],
 'search_query': [],
 'web_research_result': [],
 'sources_gathered': [],
 'approved_initial_queries': [],
 'approved_followup_queries': [],
 '__interrupt__': [Interrupt(value={'kind': 'review_initial_queries', 'suggested': ['quantum computing breakthroughs 2024 2025', 'quantum computing hardware developments 2024 2025', 'quantum algorithms and software advancements 2024 2025'], 'instructions': 'Approve as-is, or return queries=[...]'}, id='4c23dab27cc98fa0789c61ca14aa6425')]}

Notice that a new key is created: __interrupt__, which contains the payload sent back for the human to review. All the keys of the returned payload are exactly the ones we defined in the node.

Now, as a user, we can proceed to edit or approve the search queries. For now, let's say we're happy with the LLM's suggestions, so we can simply accept them. This can be done by re-sending the LLM's suggestions back to the node:

# Human input
human_edit = {"queries": result["__interrupt__"][0].value["suggested"]}

# Resume the graph
result = graph.invoke(Command(resume=human_edit), config=config)

Running this cell takes a bit of time, as the graph launches the searches and synthesizes the research results. Afterward, the reflection node reviews the results and proposes follow-up queries.

4.3 Checkpoint #2: Review Follow-Up Queries

In a new cell, if we now run:

result["__interrupt__"][0].value

You'd see the payload with the keys defined in the corresponding node:

{'kind': 'review_follow_up_plan',
 'is_sufficient': False,
 'knowledge_gap': 'The summaries provide high-level progress in quantum error correction (QEC) but lack specific technical details about the various types of quantum error-correcting codes being developed and how these codes are being implemented and adapted for different qubit modalities (e.g., superconducting, trapped-ion, neutral atom, photonic, topological). A deeper understanding of the underlying error correction schemes and their practical realization would provide more technical depth.',
 'suggested': ['What are the different types of quantum error-correcting codes currently being developed and implemented (e.g., surface codes, topological codes, etc.), and what are the specific technical challenges and strategies for their realization in various quantum computing hardware modalities such as superconducting, trapped-ion, neutral atom, photonic, and topological qubits?'],
 'instructions': "To finish research: {'is_sufficient': true}\nTo proceed with modified queries: {'follow_up_queries': [...], 'knowledge_gap': '...'}\nTo add/modify queries only: {'follow_up_queries': [...]}"}

Let's say we agree with what the LLM has proposed, but we also want to add a new search query of our own:

human_edit = {
    "follow_up_queries": [
        result["__interrupt__"][0].value["suggested"][0],
        'fault-tolerant quantum computing demonstrations IBM Google IonQ PsiQuantum 2024 2025'
    ]
}

result = graph.invoke(Command(resume=human_edit), config=config)

We can resume the graph again, and that's all it takes to interact with a human-in-the-loop agent.
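
The instructions string in the payload lists three reply shapes. As a rough sketch, assuming the node accepts exactly those shapes, a small helper can build each one from the interrupt payload. `build_follow_up_resume` and its parameters are hypothetical names introduced here for illustration, and the sample payload is abbreviated.

```python
def build_follow_up_resume(payload, stop=False, extra_queries=(), knowledge_gap=None):
    """Build a resume value in one of the shapes listed in the payload's instructions."""
    if stop:
        # Shape 1: end the research loop
        return {"is_sufficient": True}
    # Shape 3: keep the suggested queries and append our own
    resume = {"follow_up_queries": [*payload["suggested"], *extra_queries]}
    if knowledge_gap is not None:
        # Shape 2: also override the knowledge gap description
        resume["knowledge_gap"] = knowledge_gap
    return resume


# Abbreviated stand-in for result["__interrupt__"][0].value
payload = {"kind": "review_follow_up_plan", "suggested": ["QEC code types by hardware modality"]}

stop_now = build_follow_up_resume(payload, stop=True)
add_one = build_follow_up_resume(payload, extra_queries=["fault-tolerant demos 2024 2025"])
reframed = build_follow_up_resume(payload, knowledge_gap="Need concrete hardware demonstrations")
```

Any of these dictionaries can then be passed as the resume value, for example graph.invoke(Command(resume=add_one), config=config).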

5. Conclusion

In this post, we've successfully augmented our deep research agent with human-in-the-loop functionalities. Instead of running fully autonomously, we now have a built-in mechanism to prevent the agent from going off-track while still enjoying the efficiency of automated research.

Technically, this is achieved by using LangGraph's interrupt() mechanism inside carefully chosen nodes. A good mental model is this: the node hits "pause," you edit or approve, you press "play," the node restarts with your input, and it moves on. All of this happens without disrupting the underlying graph structure.

Now that you have all this knowledge, are you ready to build the next human-AI collaborative workflow?
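
The pause/play mental model can be made tangible with a toy, pure-Python simulation. This is not LangGraph's implementation: `Pause`, `ToyRunner`, and `review_node` are hypothetical names, and the per-thread dictionary is only a stand-in for a real checkpointer. The point it illustrates is that on resume the node runs again from its first line, and this time the interrupt call returns the human's input instead of pausing.

```python
class Pause(Exception):
    """Raised by the toy interrupt() to hand a payload to the human."""

    def __init__(self, payload):
        super().__init__("paused")
        self.payload = payload


class ToyRunner:
    """Keeps one saved state per thread_id, like a very small checkpointer."""

    def __init__(self, node):
        self.node = node
        self.saved = {}  # thread_id -> last saved state

    def start(self, thread_id, state):
        self.saved[thread_id] = state
        return self._run(thread_id, resume=None)

    def resume(self, thread_id, value):
        return self._run(thread_id, resume=value)

    def _run(self, thread_id, resume):
        state = self.saved[thread_id]

        def interrupt(payload):
            if resume is None:
                raise Pause(payload)  # first pass: stop and surface the payload
            return resume  # second pass: hand back the human's input

        try:
            update = self.node(state, interrupt)  # the node always starts from the top
        except Pause as pause:
            return {"__interrupt__": pause.payload}
        self.saved[thread_id] = {**state, **update}
        return self.saved[thread_id]


calls = []


def review_node(state, interrupt):
    calls.append("node started")  # recorded on every (re)start
    answer = interrupt({"suggested": state["queries"]})
    return {"queries": answer["queries"]}


runner = ToyRunner(review_node)
paused = runner.start("session_1", {"queries": ["q1"]})
final = runner.resume("session_1", {"queries": ["q1", "q2"]})
```

After both calls, `calls` holds two entries, one per start of the node. That is a useful reminder to keep any code placed before an interrupt cheap and free of side effects you would not want repeated.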
