How to maintain conversation history in a Python ChatGPT bot with langchain and the OpenAI API?

Hey everyone! I’m pretty new to coding and working on my first big project with OpenAI’s API. I need some help with keeping track of conversation history in my chatbot.

Basically, I want to build a bot that can read PDF files and answer questions about them. The main issue is that when I ask follow-up questions, the bot doesn’t remember what we talked about before.

For example, if I ask “What is a butterfly?” and then ask “What do they eat?”, the bot should know I’m still talking about butterflies. But right now it just gives me an error or doesn’t understand the context.

I’m using langchain with OpenAI and trying to implement ConversationBufferMemory, but something isn’t working right. The bot works fine for single questions but fails when I need it to remember previous parts of our chat.

Here’s my current code:

import os
import openai
import gradio as gr
from langchain.chat_models import ChatOpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import PyPDFLoader
from langchain.prompts import PromptTemplate
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

os.environ["OPENAI_API_KEY"] = 'your-api-key-here'

def load_pdf_data():
    pdf_loader = PyPDFLoader('documents/butterfly_guide.pdf')
    documents = pdf_loader.load()
    return documents

pdf_data = load_pdf_data()

splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
text_chunks = splitter.split_documents(pdf_data)

embedding_model = OpenAIEmbeddings()
vector_store = Chroma.from_documents(text_chunks, embedding_model)
doc_retriever = vector_store.as_retriever(search_type="similarity")

prompt_template = """{question}"""
custom_prompt = PromptTemplate(template=prompt_template, input_variables=["question"])

conversation_memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

def get_bot_response(user_question, history):
    if user_question:
        chat_model = ChatOpenAI(temperature=0.3, model_name="gpt-3.5-turbo")
        qa_chain = ConversationalRetrievalChain.from_llm(
            chat_model, 
            doc_retriever, 
            custom_prompt, 
            verbose=True, 
            memory=conversation_memory
        )
        response = qa_chain({"question": user_question, "chat_history": history})
    return response["answer"]

def chat_interface(user_input, chat_history):
    chat_history = chat_history or []
    formatted_history = list(sum(chat_history, ()))
    formatted_history.append(user_input)
    combined_input = ' '.join(formatted_history)
    bot_response = get_bot_response(user_input, chat_history)
    chat_history.append((user_input, bot_response))
    return chat_history, chat_history

with gr.Blocks() as interface:
    gr.Markdown("""<h1><center>PDF Chat Assistant</center></h1>""")
    chat_display = gr.Chatbot()
    session_state = gr.State()
    input_box = gr.Textbox(placeholder="Ask me anything about the document...")
    send_button = gr.Button("Send Message")
    send_button.click(chat_interface, inputs=[input_box, session_state], outputs=[chat_display, session_state])

interface.launch(share=True)

I’ve been stuck on this for days and every time I try to fix it, something else breaks. Any help would be amazing!

Your problem is bad memory initialization and prompt handling. You’re creating ConversationBufferMemory globally but never connecting it to your conversation flow; move the memory creation into the chain setup so each session gets its own memory instance. Your prompt template is also broken - the QA step of ConversationalRetrievalChain needs both {context} and {question} variables, not just {question}. And stop manipulating chat_history manually in your chat_interface function: ditch the formatted_history logic and let langchain handle memory state - the chain maintains context automatically once you configure it right. All three fixes are in the sketch below.
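Untested sketch against the legacy langchain API you’re already importing, reusing your doc_retriever. Note that combine_docs_chain_kwargs is what routes a custom prompt to the answering step - your current code passes custom_prompt positionally, which (in this legacy API) lands on the question-condensing prompt instead:

prompt_template = """Use the following context to answer the question:
{context}

Question: {question}

Answer:"""
custom_prompt = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "question"],
)

def build_chain():
    # Fresh memory per chain so each session keeps its own history.
    memory = ConversationBufferMemory(
        memory_key="chat_history", return_messages=True
    )
    chat_model = ChatOpenAI(temperature=0.3, model_name="gpt-3.5-turbo")
    return ConversationalRetrievalChain.from_llm(
        chat_model,
        retriever=doc_retriever,
        memory=memory,
        # Route the custom prompt to the answer-generation step,
        # not the question-condensing step.
        combine_docs_chain_kwargs={"prompt": custom_prompt},
        verbose=True,
    )

# For true per-user isolation, call build_chain() once per session
# (e.g. keep the chain in a gr.State) instead of globally like this.
qa_chain = build_chain()

def get_bot_response(user_question):
    # With memory attached, pass only the question; the chain
    # reads and updates chat_history itself.
    response = qa_chain({"question": user_question})
    return response["answer"]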

You’re missing the key part - you need to pass memory to the chain properly. The ConversationalRetrievalChain isn’t using your memory object because you’re overriding it with gradio’s history. Just remove the chat_history parameter completely from the qa_chain call and let langchain handle it. Also, your prompt template needs the {context} variable or the retrieved documents never reach the model.
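Something like this (untested, same legacy API as your imports):

# Build the chain once with memory=conversation_memory attached, then:
# before: qa_chain({"question": user_question, "chat_history": history})
# after:
response = qa_chain({"question": user_question})  # chain fills chat_history from memory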

I hit the same issue building my first RAG chatbot. The problem’s in how you’re handling memory. You’re creating a ConversationBufferMemory instance but then passing gradio’s chat_history directly to the chain - that’s mixing two different memory systems and they clash. Just let ConversationalRetrievalChain handle memory internally. Drop the chat_history parameter from your qa_chain call and only pass the question. The chain will automatically use the ConversationBufferMemory you set up to keep context. Your prompt template’s too basic too - ConversationalRetrievalChain needs specific variables like context and question. Change your get_bot_response function to just call response = qa_chain({“question”: user_question}) without chat_history. The memory object tracks conversations behind the scenes. This worked way better for me than juggling two separate history systems.