Combining conversation memory with source attribution in a LangChain RAG implementation

Dave_17Sketch · July 24, 2025, 9:31am

I’m working with LangChain in Python and I can get either conversation history or source references working separately, but not together. When I use memory management, I lose the ability to show which documents were used for the answer. When I show sources, I can’t maintain conversation context.

For conversation history I use:

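# Import paths as of LangChain 0.2/0.3
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

assistant_prompt = (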
    "You help users by answering questions. "
    "Use the retrieved information below to provide answers. "
    "If uncertain, state you don't know. Keep responses "
    "brief and under three sentences."
    "\n\n"
    "{context}"
)

reformulate_prompt = (
    "Based on chat history and the current user query "
    "that may reference previous messages, create a "
    "standalone question that makes sense without the history. "
    "Don't answer it, just rephrase if necessary."
)

reformat_template = ChatPromptTemplate.from_messages(
    [
        ("system", reformulate_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

aware_retriever = create_history_aware_retriever(
    self.model, self.doc_retriever, reformat_template
)

response_template = ChatPromptTemplate.from_messages(
    [
        ("system", assistant_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

answer_chain = create_stuff_documents_chain(self.model, response_template)
full_chain = create_retrieval_chain(aware_retriever, answer_chain)

For source tracking I use:

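from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

SOURCE_TEMPLATE = """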
You assist with questions using provided context. Answer based on the retrieved information only. If you don't have enough information, say so. Keep answers short and under three sentences.
Include the page numbers you referenced at the end of your response.

<context>
{context}
</context>

Question: {question}"""

source_prompt = ChatPromptTemplate.from_template(SOURCE_TEMPLATE)

def prepare_docs(documents):
    return "\n\n".join(f"Page {doc.metadata['page_number'] + 1}:\n{doc.page_content}" for doc in documents)

source_chain = (
    RunnablePassthrough.assign(context=lambda x: prepare_docs(x["context"]))
    | source_prompt
    | self.model
    | StrOutputParser()
)

relevant_docs = self.vector_db.similarity_search(user_question)
response = source_chain.invoke({"context": relevant_docs, "question": user_question})

I think I need to create a custom retriever or modify the existing one to handle both requirements. Any ideas on how to merge these approaches?

danielr · August 3, 2025, 10:00pm

I hit this exact problem last year building a chatbot for our internal docs. The solution is combining both approaches in one chain.

Here’s what worked:

COMBINED_TEMPLATE = """
You help users by answering questions using the provided context.
Use the retrieved information below to provide answers.
If uncertain, state you don't know. Keep responses brief and under three sentences.
Always include the page numbers you referenced at the end.

<context>
{context}
</context>
"""

# Custom function to format docs with page info
def format_docs_with_sources(docs):
    formatted = "\n\n".join(f"Page {doc.metadata['page_number'] + 1}:\n{doc.page_content}" for doc in docs)
    return formatted

# Modified response template
response_template = ChatPromptTemplate.from_messages([
    ("system", COMBINED_TEMPLATE),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

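# Build the answer chain by hand so format_docs_with_sources is actually applied.
# create_stuff_documents_chain would render each document with only its
# page_content by default, which drops the page numbers.
# (RunnablePassthrough and StrOutputParser are imported as in your source-tracking snippet.)
answer_chain = (
    RunnablePassthrough.assign(context=lambda x: format_docs_with_sources(x["context"]))
    | response_template
    | self.model
    | StrOutputParser()
)

# The retrieval chain hands the retrieved documents to answer_chain as x["context"]
# and keeps the original Document objects in its own output
full_chain = create_retrieval_chain(aware_retriever, answer_chain)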

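# Run the chain and return the answer together with its source documents
def invoke_with_sources(chain, inputs):
    result = chain.invoke(inputs)
    # create_retrieval_chain returns a dict: result["answer"] is the reply and
    # result["context"] holds the retrieved Document objects (with metadata)
    return result["answer"], result["context"]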

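Here’s the key: create_retrieval_chain already passes the retrieved documents to your answer chain, and it also returns them under the "context" key of its output. Add source attribution instructions to your prompt template and make sure the documents carry the metadata you need (here, page_number). One caveat: create_stuff_documents_chain renders each document with only its page_content by default, so the page numbers have to be written into the context text explicitly, either with a formatter like format_docs_with_sources or with a custom document_prompt.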

Your history aware retriever still works for reformulating questions, and the final answer includes both conversation context and source references.
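
To make the two halves concrete, here is a minimal sketch of one conversation turn that keeps the history and pulls page numbers out of the chain output. It assumes the chain returns a dict with "answer" and "context" keys, as create_retrieval_chain does. Doc, FakeChain and run_turn are hypothetical names used only for illustration, and the (role, text) history format is assumed to be accepted by MessagesPlaceholder.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (assumption: same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def run_turn(chain, question, chat_history):
    """Invoke the chain once, record the turn, and return (answer, pages)."""
    result = chain.invoke({"input": question, "chat_history": chat_history})
    # Collect 1-based page numbers from the retrieved documents, skipping
    # documents without a page_number and removing duplicates.
    pages = sorted(
        {
            doc.metadata["page_number"] + 1
            for doc in result["context"]
            if "page_number" in doc.metadata
        }
    )
    # Append the turn as (role, text) pairs so the next call sees it as history.
    chat_history.append(("human", question))
    chat_history.append(("ai", result["answer"]))
    return result["answer"], pages


class FakeChain:
    """Hypothetical test double mimicking the output shape of create_retrieval_chain."""

    def invoke(self, inputs):
        docs = [
            Doc("Refunds take 5 days.", {"page_number": 3}),
            Doc("Contact support.", {"page_number": 0}),
        ]
        return {**inputs, "context": docs, "answer": "Refunds take 5 days."}


history = []
answer, pages = run_turn(FakeChain(), "How long do refunds take?", history)
print(answer, pages)
```

In a real application you would pass full_chain instead of FakeChain. Reading the pages from the Document metadata, rather than asking the model to list them, means the cited pages do not depend on the model following the prompt instruction.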

henryg · August 3, 2025, 5:41am

i’ve been dealing with this too. here’s what worked for me: update the document formatter used by the retrieval chain so it includes source info. create_retrieval_chain already passes the retrieved docs through to the answer step, so you can format them there before they reach the prompt template. just adapt your prepare_docs function and plug it into the existing chain instead of building new ones.
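
A rough sketch of what an adapted prepare_docs could look like, assuming page_number is 0-based as in the original post. Doc is a hypothetical stand-in for the LangChain Document class, and the "Page unknown" fallback is an assumption, added so that a document without page metadata does not raise a KeyError.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (assumption: same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def prepare_docs(documents):
    """Join documents into one context string, labelling each with its page."""
    parts = []
    for doc in documents:
        page = doc.metadata.get("page_number")
        # page_number is assumed to be 0-based, as in the original formatter
        label = f"Page {page + 1}" if page is not None else "Page unknown"
        parts.append(f"{label}:\n{doc.page_content}")
    return "\n\n".join(parts)


docs = [Doc("Intro text", {"page_number": 0}), Doc("No page here")]
context_text = prepare_docs(docs)
print(context_text)
```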

DancingButterfly · July 31, 2025, 4:37pm

Here’s the thing: create_retrieval_chain already gives you the retrieved documents in its output, under the "context" key. No need for separate implementations. Keep your history-aware retriever as is, but tweak your assistant prompt for source attribution. Don’t rely on the {context} text alone; use the actual Document objects that get passed through, since they carry the page metadata.

I had luck with a custom RunnableLambda that processes the documents before they hit the prompt template. The lambda formats them with page numbers while keeping the conversation flow intact. Your aware_retriever handles history reformulation, the documents flow through your custom formatter, and then they reach the model with both context and source info. The two features work at different pipeline stages, so you don’t need separate chains.
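
As an illustration of that formatting stage, here is a minimal sketch of the kind of function such a lambda could wrap. It takes the dict flowing through the chain and swaps the Document list for page-labelled text while keeping the originals. Doc is a hypothetical stand-in for the LangChain Document class, and the "source_documents" key is an assumed name, not something LangChain defines for this chain.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (assumption: same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(inputs):
    """Replace the Document list under "context" with page-labelled text."""
    docs = inputs["context"]
    text = "\n\n".join(
        f"Page {doc.metadata['page_number'] + 1}:\n{doc.page_content}" for doc in docs
    )
    # Return a new dict so the caller's inputs stay untouched; "source_documents"
    # is a hypothetical key that preserves the original Document objects.
    return {**inputs, "context": text, "source_documents": docs}


step_input = {
    "input": "What is the fee?",
    "chat_history": [],
    "context": [Doc("The fee is $5.", {"page_number": 1})],
}
step_output = format_context(step_input)
print(step_output["context"])
```

In LangChain this function could be wrapped as RunnableLambda(format_context) (from langchain_core.runnables) and placed ahead of the prompt template, so the history and the question pass through unchanged.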