Problem Description
I’m having trouble with a LangChain agent that processes CSV data and has memory capabilities. The agent correctly remembers context from previous queries, but it returns the wrong data on follow-up questions.
import json

from langchain.agents import create_csv_agent
from langchain.llms import OpenAI


def process_csv_data(request_json: str):
    '''
    Handle CSV data extraction tasks.
    Input format: JSON string containing query and filename:
        { "query": "<your_question>", "filename": "<csv_file>" }
    Example usage:
        { "query": "Show me the oldest person in data.csv", "filename": "data.csv" }
    Parameters:
        request_json (str): JSON-formatted request string
    Output:
        Extracted information from the CSV file
    '''
    parsed_request = json.loads(request_json)
    user_query = parsed_request["query"]
    csv_filename = parsed_request["filename"]
    # Build the sub-agent against the file named in the request
    data_agent = create_csv_agent(llm=OpenAI(), path=csv_filename, verbose=True)
    return data_agent.run(user_query)
from langchain.agents import Tool

input_template = '{"query":"<your_question>","filename":"<csv_file>"}'
tools_description = f'Tool for CSV file operations. Required input format: {input_template}'

csv_processing_tool = Tool(
    name="process_csv_data",
    func=process_csv_data,
    description=tools_description,
    verbose=True,
)
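As a sanity check on the tool’s input contract (pure standard-library Python, no LangChain required), the template round-trips through `json` as expected; the concrete query/filename values below are only illustrative:

```python
import json

# The same template string the tool description advertises
input_template = '{"query":"<your_question>","filename":"<csv_file>"}'

# The template itself is valid JSON, so its field names can be inspected
template_fields = json.loads(input_template)
assert set(template_fields) == {"query", "filename"}

# A concrete request built the way process_csv_data expects to receive it
request_json = json.dumps(
    {"query": "find the longest name", "filename": "people.csv"}
)
parsed = json.loads(request_json)
assert parsed["filename"] == "people.csv"
```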
from langchain.agents import ZeroShotAgent
from langchain.memory import ConversationBufferWindowMemory
tools_list = [csv_processing_tool]
system_prefix = """Engage in conversation with user while maintaining context from previous interactions. Always consider chat history when answering new questions. If a user asks about specific data mentioned earlier, reference that information in your response. Available tools:"""
conversation_suffix = """Start now!
{chat_history}
User Input: {input}
{agent_scratchpad}"""
agent_prompt = ZeroShotAgent.create_prompt(
    tools=tools_list,
    prefix=system_prefix,
    suffix=conversation_suffix,
    input_variables=["input", "chat_history", "agent_scratchpad"],
)
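To see where memory enters the picture, the suffix can be rendered by hand with plain `str.format`, outside LangChain; the history and input values below are hypothetical stand-ins for what the executor injects at runtime:

```python
conversation_suffix = """Start now!
{chat_history}
User Input: {input}
{agent_scratchpad}"""

# Hypothetical values standing in for what the executor fills in at runtime
rendered = conversation_suffix.format(
    chat_history="Human: find the longest name\nAI: The longest name is Johnson.",
    input='{"query": "what is this person\'s birth date?", "filename": "people.csv"}',
    agent_scratchpad="",
)

# The prior answer ("Johnson") reaches the LLM only through {chat_history}
assert "Johnson" in rendered
assert rendered.startswith("Start now!")
```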
conversation_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True,
)
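Note that with `k=5` only the last five exchanges survive in `{chat_history}`. A toy standard-library model of that windowing (not LangChain code, just an illustration of the eviction behavior) looks like:

```python
from collections import deque

k = 5  # mirrors ConversationBufferWindowMemory(k=5)
history = deque(maxlen=k)  # oldest exchange is evicted once the window is full

for turn in range(8):
    history.append((f"user turn {turn}", f"agent turn {turn}"))

# Turns 0-2 have been dropped; only turns 3-7 remain in the window
assert len(history) == 5
assert history[0][0] == "user turn 3"
```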
from langchain.chains import LLMChain
from langchain.agents import AgentExecutor
llm_instance = LLMChain(llm=OpenAI(temperature=0), prompt=agent_prompt)
zero_shot_agent = ZeroShotAgent(llm_chain=llm_instance, tools=tools_list, verbose=True)
executor_chain = AgentExecutor.from_agent_and_tools(
    agent=zero_shot_agent,
    tools=tools_list,
    verbose=True,
    memory=conversation_memory,
)
First I query for the longest name:
request_data = {"input": {"query": "find the longest name", "filename": "people.csv"}}
json_request = json.dumps(request_data)
response = executor_chain(json_request)
This correctly returns “Johnson” as the longest name.
Then I ask a follow-up question:
follow_up = {"input": {"query": "what is this person's birth date?", "filename": "people.csv"}}
follow_up_json = json.dumps(follow_up)
result = executor_chain(follow_up_json)
In its first observation the agent correctly recognizes I’m asking about Johnson’s birth date, but it then returns the birth date from the first row of the dataset instead of Johnson’s actual birth date.
How can I fix this memory issue so the agent properly connects previous context with new data queries? I’m using GPT-3.5-turbo and have tried different prompt configurations without success.