I’m building an application that uses a LangChain agent to analyze pandas DataFrames. The agent receives comprehensive instructions via system prompts, along with conversation history and table metadata. The problem: the agent doesn’t reliably translate free-form user text into the correct pandas operations.
The agent is supposed to answer data questions and generate plotly charts. Most of the time it works fine, but sometimes it misinterprets what users are asking for.
For example, country lookups give mixed results. Puerto Rico’s code in my data is “PRI”, but the agent sometimes searches for “PR” instead and finds nothing. Mexico works when users type “Mexico” or “MEX”, since both values appear in different DataFrame columns — but if someone types “MX”, the agent may search for that exact string and return empty results with no charts.
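To make the failure mode concrete, here is a minimal reproduction — the column names and values are assumptions, not my real schema, but this is effectively the code the agent generates for a query like “show me data for MX”:

```python
import pandas as pd

# Hypothetical toy data standing in for one of my real DataFrames.
df = pd.DataFrame({
    "country_name": ["Mexico", "Puerto Rico", "Canada"],
    "iso3": ["MEX", "PRI", "CAN"],
    "gdp_trillions": [1.4, 0.1, 2.1],
})

# The agent takes the user's literal token and does an exact match:
result = df[df["iso3"] == "MX"]  # "MX" never appears in the column
print(result.empty)              # True -> empty answer, no chart
```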
Case sensitivity causes similar failures, even though my system prompt explicitly instructs the agent to match case-insensitively.
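The case issue is the same shape of problem — the generated comparison is exact when it needs to be normalized. A minimal contrast (again with a hypothetical column):

```python
import pandas as pd

df = pd.DataFrame({"country_name": ["Mexico", "Puerto Rico"]})

# What the agent tends to emit -- case-sensitive, so "mexico" misses:
miss = (df["country_name"] == "mexico").any()

# What the generated code would need instead -- normalize before comparing:
hit = df["country_name"].str.lower().eq("mexico").any()

print(miss, hit)  # False True
</imports>```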
The prompt already includes metadata for every column across my 3 datasets, and the agent has conversation memory. Temperature is set to 0. I don’t want to hardcode a fix for every possible country variation.
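One direction I’m considering, to avoid per-country hardcoding: give the agent a generic lookup tool that resolves user text against the values actually present in the data, so the model never has to guess the literal string. This is only a sketch under my own assumptions (function name, columns, and the 0.75 cutoff are all made up), using stdlib `difflib` for the fuzzy fallback:

```python
from difflib import get_close_matches
from typing import List, Optional

import pandas as pd

def resolve_entity(user_text: str, df: pd.DataFrame,
                   columns: List[str]) -> Optional[str]:
    """Map free-form input to a canonical value that exists in the data.

    Hypothetical helper: try an exact case-insensitive hit first, then a
    fuzzy fallback for near-misses. Returns None when nothing matches.
    """
    query = user_text.strip().lower()
    for col in columns:
        values = [str(v) for v in df[col].dropna().unique()]
        # 1) exact case-insensitive match
        for v in values:
            if v.lower() == query:
                return v
        # 2) fuzzy fallback, e.g. "MX" -> "MEX"
        close = get_close_matches(query, [v.lower() for v in values],
                                  n=1, cutoff=0.75)
        if close:
            return next(v for v in values if v.lower() == close[0])
    return None

df = pd.DataFrame({"country_name": ["Mexico", "Puerto Rico"],
                   "iso3": ["MEX", "PRI"]})
print(resolve_entity("mexico", df, ["country_name", "iso3"]))  # Mexico
print(resolve_entity("MX", df, ["country_name", "iso3"]))      # MEX
```

The idea would be to register something like this as a LangChain tool so the agent calls it before filtering, rather than relying on prompt instructions alone — but I’m not sure this is the right architecture, hence the question.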
How can I make the agent more reliable at understanding user queries?