I have a LangChain AI agent that queries pandas DataFrames, but it gives inconsistent results when processing user input.
The agent works with three datasets and can create Plotly visualizations. Most of the time it performs well, but sometimes it misinterprets user queries.
For example, when users ask about Puerto Rico, the agent should look for “PRI”, but it sometimes searches for “PR” instead, which returns nothing. The same issue happens with Mexico: users might type “MX”, but the agent finds no matches even though “MEX” and “Mexico” exist in the data.
I’ve added column metadata to the pre-prompt and enabled chat history memory. The temperature is set to 0. Capitalization in user questions also sometimes causes problems, even though the prompt includes instructions to handle it.
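To make the failure concrete, here is a minimal sketch of the kind of lookup that comes back empty. The DataFrame, the column names (`country_code`, `country_name`, `value`), the placeholder values, and the `exact_match` helper are all made up for illustration and are not my real data or the agent's actual generated code; the point is only that an exact-equality filter misses anything that is not the stored code.

```python
import pandas as pd

# Hypothetical stand-in for one of the three datasets.
# Column names and values are placeholders, not the real data.
df = pd.DataFrame({
    "country_code": ["PRI", "MEX", "USA"],
    "country_name": ["Puerto Rico", "Mexico", "United States"],
    "value": [1, 2, 3],
})

def exact_match(frame: pd.DataFrame, term: str) -> pd.DataFrame:
    """Return rows whose country_code equals term exactly (case-sensitive)."""
    return frame[frame["country_code"] == term]

# The stored code matches; the variants users actually type do not.
for term in ["PRI", "PR", "MX", "pri"]:
    print(term, len(exact_match(df, term)))
```

Only “PRI” returns a row here; “PR”, “MX”, and lowercase “pri” all return empty frames, which matches what I see from the agent.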
How can I make the agent more reliable at interpreting different ways users refer to the same data without hardcoding every possible variation?