How do I maintain conversation history in a Python ChatGPT bot with LangChain for PDF document queries?

I’m new to coding and need help with my ChatGPT integration project.

I’m building a chatbot that reads PDF files and answers questions about them. The bot works for single questions but fails to remember previous conversation context.

What I want to achieve:

  • User asks: “Tell me about butterflies”
  • Bot responds with butterfly information
  • User asks: “How big are they?”
  • Bot should know “they” refers to butterflies from the previous question

Current problem: When I ask follow-up questions, the bot doesn’t understand what I’m referring to from earlier in our chat.

import os
import gradio as gr
from langchain.chat_models import ChatOpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import TokenTextSplitter
from langchain.document_loaders import PyPDFLoader
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

os.environ["OPENAI_API_KEY"] = 'your-key-here'

def load_pdf_data():
    pdf_loader = PyPDFLoader('files/butterfly_guide.pdf')
    documents = pdf_loader.load()
    return documents

pdf_data = load_pdf_data()

splitter = TokenTextSplitter(chunk_size=800, chunk_overlap=100)
split_docs = splitter.split_documents(pdf_data)

embedding_model = OpenAIEmbeddings()
vector_store = Chroma.from_documents(split_docs, embedding_model)
doc_retriever = vector_store.as_retriever(search_type="similarity")

conversation_memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

def create_answer(user_query, conversation_history):
    if user_query:
        chat_model = ChatOpenAI(temperature=0.3, model_name="gpt-3.5-turbo")
        qa_chain = ConversationalRetrievalChain.from_llm(
            chat_model, 
            doc_retriever, 
            verbose=True, 
            memory=conversation_memory
        )
        response = qa_chain({"question": user_query, "chat_history": conversation_history})
    return response["answer"]

def chat_interface(user_input, chat_state):
    chat_state = chat_state or []
    flattened_history = list(sum(chat_state, ()))
    answer = create_answer(user_input, chat_state)
    chat_state.append((user_input, answer))
    return chat_state, chat_state

with gr.Blocks() as app:
    gr.Markdown("## PDF Document Chat Assistant")
    chatbot_ui = gr.Chatbot()
    chat_memory = gr.State()
    input_box = gr.Textbox(placeholder="Ask questions about the document...")
    send_btn = gr.Button("Send")
    send_btn.click(chat_interface, inputs=[input_box, chat_memory], outputs=[chatbot_ui, chat_memory])

app.launch(share=True)

The issue: My bot treats each question as completely separate and doesn’t maintain context between questions. How can I fix the memory handling to make conversations flow naturally?

Had the exact same problem with LangChain’s memory when I started my project. The issue is how you’re handling chat history in your ConversationalRetrievalChain.

Don’t create a new chain instance every time you call create_answer - that wipes the memory clean. Set up your ConversationalRetrievalChain globally, just like you did with your vector store. Then your create_answer function can use the existing chain directly, and you won’t need to pass chat history since ConversationBufferMemory handles that automatically.

Also recommend setting return_source_documents=True in your chain config. Makes it way easier to see what context the model’s actually using and debug conversation issues.
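
Roughly something like this (an untested sketch reusing the names and imports from your snippet; note that once return_source_documents=True is set, the memory also needs output_key="answer", otherwise the chain complains about having more than one output key):

chat_model = ChatOpenAI(temperature=0.3, model_name="gpt-3.5-turbo")

# Memory and chain are created once, at module level, so they persist across calls
conversation_memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer",  # needed because the chain also returns source documents
)

qa_chain = ConversationalRetrievalChain.from_llm(
    chat_model,
    doc_retriever,
    memory=conversation_memory,
    return_source_documents=True,
    verbose=True,
)

def create_answer(user_query):
    # The chain pulls chat_history from its own memory, so only the new question is passed in
    result = qa_chain({"question": user_query})
    print(result["source_documents"])  # handy while debugging what context was retrieved
    return result["answer"]

def chat_interface(user_input, chat_state):
    chat_state = chat_state or []
    answer = create_answer(user_input)
    chat_state.append((user_input, answer))
    return chat_state, chat_state

The Gradio wiring can stay exactly as you have it. You can also drop the flattened_history line in chat_interface, since that value is never used anywhere.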