If you're building chatbots using LangGraph, you've likely noticed that, by default, they don't remember previous messages across calls. That's where Short Term Memory in LangGraph becomes incredibly powerful.
In this blog post, we'll explore how to equip your chatbot with short-term memory using LangGraph's native tools, InMemorySaver and trim_messages.
Why Short Term Memory in LangGraph Matters
Most real-world chatbots need some memory, not just for a smarter experience, but to keep context. Short Term Memory in LangGraph allows your bot to remember what users said earlier, so it can give coherent, contextual responses.
Let's begin with a stateless chatbot example.
Example 1: LangGraph Chatbot Without Short Term Memory
Here's a simple implementation of a LangGraph chatbot that does not retain memory across invocations:
from typing import TypedDict, Annotated, Sequence
from langchain_core.messages import BaseMessage, HumanMessage
from langgraph.graph.message import add_messages
from langgraph.graph import StateGraph, END, START
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
# Define the agent state: a message list managed by the add_messages reducer
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]

def simple_agent(state: AgentState) -> AgentState:
    # Respond using only the messages present in the current state
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

def create_stateless_agent():
    workflow = StateGraph(AgentState)
    workflow.add_node("agent", simple_agent)
    workflow.add_edge(START, "agent")
    workflow.add_edge("agent", END)
    # No checkpointer: nothing persists between invocations
    return workflow.compile()
Each time you invoke this agent, it only responds to the current input, with no awareness of the previous ones.
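To see this statelessness in action, here is a quick sketch that calls the agent twice; the demonstrate_stateless helper is my addition, not part of the original example. In the second call, the model has no idea who Bob is:

def demonstrate_stateless():
    app = create_stateless_agent()
    result1 = app.invoke({"messages": [HumanMessage(content="Hi, my name is Bob")]})
    # Each invoke starts from a fresh, empty state: the name "Bob" is gone
    result2 = app.invoke({"messages": [HumanMessage(content="What's my name?")]})
    print([msg.content for msg in result2["messages"]])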
Here is what our agent looks like as a Mermaid graph (START → agent → END).
Example 2: Adding Short Term Memory in LangGraph with trim_messages and InMemorySaver
To add short term memory in LangGraph, we'll make two enhancements (a standalone sketch of trim_messages follows this list):
- Use trim_messages to avoid token overload.
- Add a checkpointer (InMemorySaver) so the agent remembers the thread history.
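To build intuition first, here is a small standalone sketch of trim_messages outside the graph; the tiny max_tokens budget is a deliberate assumption to force trimming on a short history:

from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.messages.utils import count_tokens_approximately, trim_messages

history = [
    HumanMessage(content="Hi, my name is Bob"),
    AIMessage(content="Hi Bob! How can I help you today?"),
    HumanMessage(content="What's my name?"),
]
# Keep the most recent messages that fit the budget, starting on a
# human message so the model sees a valid turn order
trimmed = trim_messages(
    history,
    token_counter=count_tokens_approximately,
    strategy="last",
    max_tokens=20,
    start_on="human",
    allow_partial=True,
)
print([m.content for m in trimmed])  # oldest messages are dropped first

Now let's wire the same call into the agent node: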
from langchain_core.messages.utils import count_tokens_approximately, trim_messages
from langgraph.checkpoint.memory import InMemorySaver
def simple_agent(state: AgentState) -> AgentState:
trimmed_messages = trim_messages(
state["messages"],
token_counter=count_tokens_approximately,
strategy="last",
max_tokens=500,
start_on="human",
allow_partial=True
)
response = llm.invoke(trimmed_messages)
return {"messages": [response]}
def create_agent_with_memory():
workflow = StateGraph(AgentState)
workflow.add_node("agent", simple_agent)
workflow.add_edge(START, "agent")
workflow.add_edge("agent", END)
checkpointer = InMemorySaver()
return workflow.compile(checkpointer=checkpointer)
Now this agent can maintain short-term conversational memory, recalling prior messages for a given thread_id.
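Memory is scoped per thread: the checkpointer keeps a separate history for each thread_id you pass in the config. A quick sketch (the thread IDs here are arbitrary examples):

app = create_agent_with_memory()
config_a = {"configurable": {"thread_id": "alice"}}
config_b = {"configurable": {"thread_id": "bob"}}
app.invoke({"messages": [HumanMessage(content="Hi, I'm Alice")]}, config_a)
# A different thread_id starts from a fresh, empty history,
# so the model has never seen Alice's introduction
result = app.invoke({"messages": [HumanMessage(content="What's my name?")]}, config_b)
print([msg.content for msg in result["messages"]])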
Demonstration of Short Term Memory in LangGraph
def demonstrate_with_memory():
    app = create_agent_with_memory()
    config = {"configurable": {"thread_id": "demo_thread"}}

    print("\nFirst Invocation:")
    result1 = app.invoke({"messages": [HumanMessage(content="Hi, my name is Bob")]}, config)
    print([msg.content for msg in result1['messages']])

    print("\nSecond Invocation:")
    result2 = app.invoke({"messages": [HumanMessage(content="What's my name?")]}, config)
    print([msg.content for msg in result2['messages']])

    print("\nThird Invocation:")
    result3 = app.invoke({"messages": [HumanMessage(content="Do you remember me?")]}, config)
    print([msg.content for msg in result3['messages']])
Now the chatbot remembers that your name is Bob in the second and third calls, thanks to short term memory in LangGraph.
Here are the logs (produced by the full example at the end of this post, which also prints message counts):
First Invocation:
Output messages count: 2
Messages: ['Hi, my name is Bob', "Hi Bob, it's nice to meet you! How can I help you today?"]

Second Invocation:
Output messages count: 4
Previous messages still there: True
Messages: [
    'Hi, my name is Bob',
    "Hi Bob, it's nice to meet you! How can I help you today?",
    "What's my name?",
    'Your name is Bob. You just told me!']

Third Invocation:
Total conversation length: 6
All previous messages preserved: True
Messages: [
    'Hi, my name is Bob',
    "Hi Bob, it's nice to meet you! How can I help you today?",
    "What's my name?",
    'Your name is Bob. You just told me!',
    'Do you remember me?',
    "Yes, Bob. I remember you told me your name is Bob. I don't have personal memories like humans do, but I retain information from our current conversation. So, I remember you!"]
Benefits of Adding Short Term Memory in LangGraph
- Contextual Awareness
- Improved UX
- Minimal code changes
- Short-term only: keeps responses focused, not bloated
Here is the full code example:
from typing import TypedDict, Annotated, Sequence
from langchain_core.messages import BaseMessage, HumanMessage
from langchain_core.messages.utils import count_tokens_approximately, trim_messages
from langgraph.graph.message import add_messages
from langgraph.graph import StateGraph, END, START
from langgraph.checkpoint.memory import InMemorySaver
from langchain_google_genai import ChatGoogleGenerativeAI
# Initialize LLM
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
# Define the agent state
class AgentState(TypedDict):
messages: Annotated[Sequence[BaseMessage], add_messages]
def simple_agent(state: AgentState) -> AgentState:
"""Simple agent that just responds to the latest message."""
trimmed_messages = trim_messages(
state["messages"],
token_counter=count_tokens_approximately,
strategy="last",
max_tokens=500,
start_on="human",
allow_partial=True
)
response = llm.invoke(trimmed_messages)
return {"messages": [response]}
def create_agent_with_memory():
"""Create agent WITH memory (InMemorySaver checkpointer)."""
workflow = StateGraph(AgentState)
workflow.add_node("agent", simple_agent)
workflow.add_edge(START, "agent")
workflow.add_edge("agent", END)
# Compile WITH checkpointer for memory
checkpointer = InMemorySaver()
return workflow.compile(checkpointer=checkpointer)
def demonstrate_with_memory():
"""Demonstrate agent behavior WITH memory."""
print("\nπ§ AGENT WITH MEMORY")
print("=" * 50)
app = create_agent_with_memory()
config = {"configurable": {"thread_id": "demo_thread"}}
# First invocation
print("\nπ First Invocation:")
result1 = app.invoke({"messages": [HumanMessage(content="Hi, my name is Bob")]}, config)
print(f"Output messages count: {len(result1['messages'])}")
print(f"Messages: {[msg.content for msg in result1['messages']]}")
# Second invocation - REMEMBERS PREVIOUS
print("\nπ Second Invocation:")
result2 = app.invoke({"messages": [HumanMessage(content="What's my name?")]}, config)
print(f"Output messages count: {len(result2['messages'])}")
print("Previous messages still there:", len(result2['messages']) > 2)
print(f"Messages: {[msg.content for msg in result2['messages']]}")
# Third invocation - STILL REMEMBERS ALL
print("\nπ Third Invocation:")
result3 = app.invoke({"messages": [HumanMessage(content="Do you remember me?")]}, config)
print(f"Total conversation length: {len(result3['messages'])}")
print("All previous messages preserved:", len(result3['messages']) > 4)
print(f"Messages: {[msg.content for msg in result3['messages']]}")
if __name__ == "__main__":
print("π¬ MEMORY COMPARISON DEMONSTRATION")
print("=" * 60)
demonstrate_with_memory()
print("\n" + "=" * 60)
FAQs: Short Term Memory in LangGraph
What is short term memory in LangGraph?
It refers to the chatbot's ability to retain conversation context across invocations, using in-memory checkpoints and message trimming.
What is trim_messages used for?
It ensures token limits are respected while still keeping the most recent message context.
What is InMemorySaver?
It's a built-in, in-memory checkpointer that lets LangGraph remember previous messages based on thread ID.
How is this different from long-term memory?
Short term memory is volatile and typically used only during active sessions. Long-term memory requires persistent storage, such as a vector database or file system.
Can I scale this beyond in-memory?
Yes, LangGraph supports custom checkpointers. You can persist messages in databases like Redis or SQLite, or use LangChain's VectorStore Memory.
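For example, swapping InMemorySaver for a persistent checkpointer is a one-line change at compile time. Here is a sketch, assuming the separate langgraph-checkpoint-sqlite package is installed (the database filename is an arbitrary example):

import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

def create_agent_with_sqlite_memory():
    workflow = StateGraph(AgentState)
    workflow.add_node("agent", simple_agent)
    workflow.add_edge(START, "agent")
    workflow.add_edge("agent", END)
    # Checkpoints go to a local SQLite file instead of process memory,
    # so thread histories survive restarts
    conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
    return workflow.compile(checkpointer=SqliteSaver(conn))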