I’m working with LangChain and OpenAI GPT-3.5 using conversational agents. My setup includes tools that process user queries, and I need the final output in JSON format instead of a plain string.
The issue is that my agent returns string responses but I noticed the intermediate steps contain JSON data. I tried setting return_intermediate_steps=True to capture this.
I’m also using conversation memory, and when the agent runs I get this validation error:
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for AIMessage
content
  str type expected (type=type_error.str)
It seems the AIMessage expects string content but receives JSON format data. I want to avoid modifying LangChain’s core code. Is there a way to convert the intermediate steps JSON to string format or handle this differently?
This validation error hit me while building a document processing agent last year. You’re trying to store intermediate_steps data in ConversationBufferMemory, but those steps have complex nested objects that break Pydantic’s string requirement for AIMessage content. Don’t configure memory to handle intermediate_steps - process the agent response in two phases instead. First, run your agent with memory focused on actual conversation flow. Then extract JSON from intermediate_steps separately:
result = my_agent({"input": user_query})
conversation_output = result["output"] # String for memory
steps_data = result.get("intermediate_steps", []) # Complex objects for processing
Treat intermediate_steps as metadata, not conversational content. Let your memory track the human-readable conversation while handling structured data extraction separately. This kept my agents stable and still gave me the JSON outputs I needed downstream.
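To make that split concrete, here’s a minimal sketch of serializing the steps into a JSON string. It assumes each step is an (action, observation) pair, which is the shape the agent returns with return_intermediate_steps=True; the MockAction class is just a stand-in for LangChain’s AgentAction so the sketch runs on its own.

```python
import json

def steps_to_json(intermediate_steps):
    # Each step is assumed to be an (action, observation) pair,
    # the shape returned when return_intermediate_steps=True.
    serialized = []
    for action, observation in intermediate_steps:
        serialized.append({
            "tool": getattr(action, "tool", None),
            "tool_input": getattr(action, "tool_input", None),
            "observation": str(observation),
        })
    return json.dumps(serialized)

# Stand-in for LangChain's AgentAction, purely for illustration.
class MockAction:
    def __init__(self, tool, tool_input):
        self.tool = tool
        self.tool_input = tool_input

steps = [(MockAction("search", "weather today"), "sunny, 22C")]
steps_json = steps_to_json(steps)
```

The string that comes back is safe to log, store, or hand to any downstream consumer without ever touching the memory object.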
Had the exact same issue building a chatbot that needed structured outputs. The memory config is what’s breaking things - intermediate_steps contains objects that AIMessage can’t store as strings. Drop the output_key parameter entirely and let memory handle the default "output" string. Pull your JSON data after the agent runs, not while it’s storing to memory. If you absolutely need intermediate steps in memory, stringify them first. But honestly? Keep them separate - it’s way cleaner long-term.
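Here’s a sketch of that “stringify first” fallback, assuming the agent’s result dict has the usual "output" and "intermediate_steps" keys:

```python
import json

def split_agent_result(result):
    # Keep the plain string for conversation memory; AIMessage's
    # content field only validates against str.
    memory_text = result["output"]
    # Stringify the complex step objects before they go anywhere
    # near memory (or keep them out of memory entirely).
    steps_as_string = json.dumps(
        [str(step) for step in result.get("intermediate_steps", [])]
    )
    return memory_text, steps_as_string

result = {"output": "Done.", "intermediate_steps": [("search", '{"hits": 3}')]}
text, steps = split_agent_result(result)
```

Only text ever reaches memory; steps is already a string, so even if you do decide to stash it somewhere, it won’t trip the validator.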
I had this same issue and fixed it by scrapping manual memory setup completely. The validation error pops up because you’re shoving complex objects into fields that only accept strings.
Skip the ConversationBufferMemory headaches - I built an automated workflow that handles conversation memory and JSON extraction together. The agent runs with basic memory while a separate process grabs the steps and formats them into clean JSON.
No more messing with output_key settings or manual parsing. The automation converts data types, keeps conversation context, and spits out proper JSON without touching LangChain’s core code.
I use Latenode to run everything. It connects the LangChain agent, processes responses live, sends string content to memory, and pulls JSON from the steps. All automatic, zero validation errors.
Best part? Need to scale across multiple agents or add processing steps? Just tweak the automation instead of rewriting code.
Been dealing with this validation headache for years. The problem is you’re forcing ConversationBufferMemory to store complex objects when it wants simple strings.
Drop output_key="intermediate_steps" from your memory config entirely, so memory only ever sees the agent’s string output.
Handle JSON extraction outside the memory system. After your agent runs, grab the intermediate steps and parse them separately:
result = my_agent({"input": "your query"})  # call the agent directly: .run() returns a bare string, not a dict
if "intermediate_steps" in result:
    json_data = extract_json_from_steps(result["intermediate_steps"])
Learned this building a customer service bot that needed structured outputs. Memory handles conversation context, not complex data parsing. Keep them separate and your validation errors disappear.
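extract_json_from_steps above isn’t a LangChain function - you’d write it yourself. One possible hypothetical implementation, assuming your tools return JSON strings (or already-structured dicts/lists) as their observations:

```python
import json

def extract_json_from_steps(intermediate_steps):
    # Hypothetical helper: pull JSON payloads out of tool observations.
    # Assumes each step is an (action, observation) pair.
    payloads = []
    for _action, observation in intermediate_steps:
        if isinstance(observation, (dict, list)):
            payloads.append(observation)  # already structured
        elif isinstance(observation, str):
            try:
                payloads.append(json.loads(observation))
            except ValueError:
                continue  # plain-text observation, nothing to extract
    return payloads

steps = [("lookup", '{"price": 9.99}'), ("chat", "no JSON here")]
data = extract_json_from_steps(steps)
```

Observations that aren’t valid JSON are just skipped, so mixed tool outputs don’t blow up the extraction.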
I hit this exact issue a few months ago when extracting structured data from conversational agents. You’re mixing up the output_key in your memory config with what you actually want.

The problem? You set output_key="intermediate_steps" in ConversationBufferMemory. Those intermediate_steps have complex data structures, not strings - but AIMessage’s content field only accepts strings. That’s your validation error right there.

Here’s what worked for me: keep the default output_key (should be "output") and handle JSON extraction separately. You can still use return_intermediate_steps=True on your agent, just don’t store those steps directly in conversation memory. Change your memory setup to:

chat_memory = ConversationBufferMemory(return_messages=True)

Then extract the JSON data from intermediate_steps in your main logic after the agent runs, not in memory. You’ll keep the conversation flow while still getting the structured data you need.
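Putting it all together, here’s a runnable sketch of the whole pattern. fake_agent is a stub standing in for an AgentExecutor called with return_intermediate_steps=True, so no LangChain install is needed to see the flow:

```python
import json

def fake_agent(inputs):
    # Stub for an agent call: returns the same dict shape as an
    # AgentExecutor configured with return_intermediate_steps=True.
    action = {"tool": "search", "tool_input": inputs["input"]}
    return {
        "output": "The answer is 42.",
        "intermediate_steps": [(action, '{"answer": 42}')],
    }

result = fake_agent({"input": "meaning of life"})

# Phase 1: only the plain string goes toward conversation memory,
# so AIMessage's str-only content validation is satisfied.
conversation_text = result["output"]

# Phase 2: structured data is parsed out of the steps separately,
# never passing through the memory object at all.
structured = [json.loads(obs) for _action, obs in result["intermediate_steps"]]
```

Swap fake_agent for your real agent and the two-phase split stays exactly the same.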