How to restrict conversation history size in LangChain applications

I’m working on a chatbot project and need to control how many past messages get sent to my language model. I want to keep only the most recent messages to avoid hitting token limits. I’m trying to use RunnableWithMessageHistory along with a custom filtering approach, but I’m running into some issues.

I have three main problems I can’t figure out:

  1. My filtering function doesn’t seem to work with RunnablePassthrough.
  2. I’m not sure if I’m handling the user input correctly with HumanMessage and input_messages_key.
  3. Should I put my message filtering logic inside the get_session_history function instead?

Here’s my current code:

from typing import List, Union
from langchain_openai import ChatOpenAI
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.runnables import RunnablePassthrough


chat_store = {}
llm_model = "gpt-3.5-turbo"
sys_prompt = "You are a helpful assistant"
llm = ChatOpenAI(model=llm_model)

def retrieve_chat_history(chat_id: str) -> BaseChatMessageHistory:
    if chat_id not in chat_store:
        chat_store[chat_id] = ChatMessageHistory()
    return chat_store[chat_id]

def limit_conversation(msgs: List[Union[HumanMessage, AIMessage]]) -> List[Union[HumanMessage, AIMessage]]:
    return msgs[-3:]

def handle_user_input(user_text: str, chat_history: BaseChatMessageHistory, chat_id: str) -> AIMessage:
    template = ChatPromptTemplate.from_messages([
        SystemMessage(sys_prompt),
        MessagesPlaceholder(variable_name="conversation"),
        HumanMessage("{input_text}")
    ])

    chain = RunnablePassthrough.assign(conversation=lambda x: limit_conversation(x["conversation"])) | template | llm

    with_history = RunnableWithMessageHistory(
        runnable=chain,
        get_session_history=retrieve_chat_history,
        history_messages_key="conversation",
        input_messages_key="current_input",
    )

    result: AIMessage = with_history.invoke(
        {"current_input": [HumanMessage(content=user_text)]},
        config={"configurable": {"session_id": chat_id}},
    )
    return result


chat_id = "demo_chat"
old_messages = [
    HumanMessage(content="My favorite hobby is reading books"),
    AIMessage(content="Reading is wonderful!"),
    HumanMessage(content="Test message one"),
    HumanMessage(content="Test message two"),
    HumanMessage(content="Test message three"),
    HumanMessage(content="My name is Alice"),
    AIMessage(content="Nice to meet you Alice!"),
]

history = retrieve_chat_history(chat_id)
for msg in old_messages:
    history.add_message(msg)

user_query = "What's my name?"
response = handle_user_input(user_query, history, chat_id)
print(response.content)

The bot should remember only the last few messages, but it seems to use the full history instead. Any ideas what I’m doing wrong?

Your filtering isn't having the effect you expect. Try moving the limit_conversation logic into retrieve_chat_history, so the history is already trimmed to the most recent messages by the time RunnableWithMessageHistory reads it. Also check input_messages_key: it is set to current_input, but your prompt template expects input_text, so the two don't line up.

Trimming inside the chain is hard to reason about here because RunnableWithMessageHistory loads and saves the conversation history on its own. A more predictable place to limit it is where the history is retrieved. Replace retrieve_chat_history with a version that wraps the original history and returns only the last N messages from its messages property, while still passing new messages through to the underlying store. Also change input_messages_key to input_text, since that is the variable your prompt template uses, not current_input, and pass the string directly in the invoke call: {"input_text": user_text}. Note too that HumanMessage("{input_text}") inside ChatPromptTemplate.from_messages is treated as a fixed message, so the placeholder is never filled in and your question never reaches the model; use the tuple form ("human", "{input_text}") instead. This way the history is filtered when it is retrieved, rather than partway through the chain.