I’m building a conversational RAG system with LangChain that needs to handle chat history and provide source citations. I have the basic functionality working, but I’m stuck on adding citation capabilities.
```python
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory

# language_model and document_retriever are set up elsewhere

# Question reformulation for chat context
reformat_prompt_template = """Based on the conversation history and current user query, \
create a standalone question that doesn't need prior context to understand. \
Don't provide an answer, just rephrase the question if necessary."""
reformat_prompt = ChatPromptTemplate.from_messages([
    ("system", reformat_prompt_template),
    MessagesPlaceholder("conversation_history"),
    ("human", "{user_query}"),
])
context_aware_retriever = create_history_aware_retriever(
    language_model, document_retriever, reformat_prompt
)

# Response generation with context
response_template = """Answer the user's question using the provided context.
If the information isn't available, state that you don't have enough information.
Keep responses under five sentences.

Context: {retrieved_context}"""
response_prompt = ChatPromptTemplate.from_messages([
    ("system", response_template),
    MessagesPlaceholder("conversation_history"),
    ("human", "{user_query}"),
])
document_chain = create_stuff_documents_chain(
    language_model,
    response_prompt,
    document_variable_name="retrieved_context",  # prompt uses a non-default variable name
)
full_rag_chain = create_retrieval_chain(context_aware_retriever, document_chain)

# Session management
session_storage = {}

def retrieve_chat_history(session_key: str) -> BaseChatMessageHistory:
    if session_key not in session_storage:
        session_storage[session_key] = ChatMessageHistory()  # fresh history per new session
    return session_storage[session_key]

conversational_chain = RunnableWithMessageHistory(
    full_rag_chain,
    retrieve_chat_history,
    input_messages_key="user_query",
    history_messages_key="conversation_history",
    output_messages_key="answer",  # create_retrieval_chain returns the reply under "answer"
)

result = conversational_chain.invoke(
    {"user_query": input_text},
    config={"configurable": {"session_id": "session_001"}},
)
```
The chat history part works fine but I can’t figure out how to add source citations like the LangChain documentation shows with bind_tools. Has anyone successfully combined these features? I’m not sure where to modify the chain to include citation extraction.
Been dealing with citation tracking in RAG systems for years and hit this same problem. The trick is intercepting retrieval before it hits your document chain.
Add a custom function that grabs document metadata during retrieval:
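The snippet didn’t come through, so here’s a minimal sketch of what such a function could look like. The `SimpleDoc` class is a stand-in for `langchain_core.documents.Document` so the sketch runs standalone, and the `[n]` marker format and `add_citation_markers` name are my own choices, not from the original post:

```python
from dataclasses import dataclass, field

@dataclass
class SimpleDoc:
    """Stand-in for langchain_core.documents.Document (page_content + metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def add_citation_markers(docs):
    """Prepend a [n] marker to each doc and collect its metadata as a citation entry."""
    marked, citations = [], []
    for i, doc in enumerate(docs, start=1):
        marked.append(SimpleDoc(f"[{i}] {doc.page_content}",
                                {**doc.metadata, "citation_id": i}))
        citations.append({"id": i, "source": doc.metadata.get("source", "unknown")})
    return marked, citations

# Quick check with a dummy document
docs = [SimpleDoc("RAG needs citations.", {"source": "notes.md"})]
marked, citations = add_citation_markers(docs)
print(marked[0].page_content)  # [1] RAG needs citations.
print(citations)               # [{'id': 1, 'source': 'notes.md'}]
```

In the real chain you’d use `Document` objects and compose something like `context_aware_retriever | RunnableLambda(lambda docs: add_citation_markers(docs)[0])`, so `create_retrieval_chain` still receives a plain document list while the citation ids survive in each doc’s metadata.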
Wrap this around your context_aware_retriever with a RunnableLambda. Your retrieved context automatically gets citation markers and you get a clean citations list.
Update your response template to use the citation numbers already in the context. Works way better than making the LLM generate citations from scratch.
There’s a much cleaner way to fix this citation issue without dealing with LangChain’s bind_tools mess.
Hit the same problem last month building something similar. Modifying prompts doesn’t work well because you’re still counting on the LLM to format citations right, which gets ugly fast.
What actually worked was rebuilding the whole pipeline. Skip trying to patch citations into your existing chain and build a workflow that:

- Processes queries and grabs relevant docs
- Generates responses with automatic source tracking
- Formats citations consistently without prompt hacking
This automation approach gives you real control over citation formatting and lets you customize however you want. You can add citation validation and source ranking without touching your LangChain code.
I built mine with Latenode since it handles document processing and citation logic automatically. Just connect your retriever output to the citation formatter and you’re done.
I hit this exact problem a few months ago. Here’s what worked: modify your chain so it returns the source docs along with the response. Don’t just return the LLM output - wrap your document_chain with a RunnableLambda that pulls source info from the retrieved docs and formats everything together. The retrieved docs already carry metadata like file names and page numbers you can use for citations. Have the wrapper return a dictionary with both 'answer' and 'sources' keys, then access them via result['answer'] and result['sources'] after invoking your conversational_chain. This keeps everything in the LangChain framework without adding external tools.
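A sketch of that wrapper, in plain Python so it runs standalone (docs are modeled as dicts here; with LangChain you’d wrap this in a RunnableLambda and read `doc.metadata` instead, and the `package_answer_with_sources` name is mine):

```python
def package_answer_with_sources(chain_output: dict) -> dict:
    """Combine the LLM answer with citation info pulled from the retrieved docs."""
    sources = []
    for doc in chain_output.get("context", []):
        meta = doc.get("metadata", {})
        sources.append({"file": meta.get("source", "unknown"), "page": meta.get("page")})
    return {"answer": chain_output.get("answer", ""), "sources": sources}

# Example with the output shape create_retrieval_chain produces
fake_output = {
    "answer": "LangChain supports conversational RAG.",
    "context": [{"page_content": "...", "metadata": {"source": "guide.pdf", "page": 3}}],
}
result = package_answer_with_sources(fake_output)
print(result["sources"])  # [{'file': 'guide.pdf', 'page': 3}]
```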
you’ll need to tweak your response_template to include source info. just add “cite your sources using [doc_id]” to the end of your system prompt and make sure your retrieved docs have metadata. create_retrieval_chain already passes the retrieved docs through in its output - just access result[‘context’] after you invoke it (there’s no ‘source_documents’ key in this chain).
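For concreteness, here’s what that tweak could look like against the template from the question (the `[doc_id]` wording is just the suggestion above, not anything LangChain requires):

```python
# Amended system prompt: same template as in the question, plus a citation instruction
response_template = """Answer the user's question using the provided context.
If the information isn't available, state that you don't have enough information.
Keep responses under five sentences.
Cite your sources using [doc_id] markers.

Context: {retrieved_context}"""

# After result = conversational_chain.invoke(...), the retrieved Documents
# (with their metadata) are available for building a citation list:
# sources = [doc.metadata for doc in result["context"]]
```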