String or buffer error when chaining LCEL langchain RAG components

I’m encountering an unusual error in my langchain Python project while trying to assemble a RAG system with multiple chains. The error message reads TypeError: expected string or buffer, but I’ve verified that all my variables are indeed strings.

Each chain works perfectly when I test it on its own; however, when I connect them, the whole thing fails. The error occurs during the embedding phase of the retrieval step.

Here’s how I’ve organized my legal chatbot:

from database.setup import db_vector
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
import os
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_core.documents import Document

load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')
language_model = OpenAI()
document_retriever = db_vector.as_retriever()

first_template = '''
You assist with legal exam questions in Brazil.
Answer this question: {question}
'''

initial_prompt = ChatPromptTemplate.from_template(first_template)
parser = StrOutputParser()
model_instance = language_model

first_chain = initial_prompt | model_instance | parser

second_template = '''
Enhance this {answer} using context from {context}
and provide additional legal references to improve it.
'''

enhancement_prompt = ChatPromptTemplate.from_template(second_template)
parallel_setup = RunnableParallel(
    {'context': document_retriever, 'answer': RunnablePassthrough()}
)

full_chain = ({'answer': first_chain} | parallel_setup | enhancement_prompt | model_instance | StrOutputParser())

result = full_chain.invoke('what is constitutional law')
print(result)

The traceback points to the tiktoken encoding step inside the embedding function. Has anyone else faced this issue? What could I be doing wrong with the chain connections?

Been there with similar chaining nightmares. Your problem is the data flow between chains - the OpenAI model output isn’t playing nice with the retriever input format.

Your first_chain does end with a parser that returns a string, but the {'answer': first_chain} wrapper hands the parallel setup a dict, and the retriever wants a clean string for embedding. It tries to embed whatever format arrives, which is what breaks the tiktoken encoding.

Skip the complex chain debugging and automate this RAG workflow instead. Break it into separate automated steps: document retrieval, question processing, and response enhancement. Add proper data validation between each stage.

This lets you handle data transformations cleanly, add error checking at each step, and ditch the brittle chaining issues. Plus you can tweak the logic flow without rewriting chain structures.

I’ve automated similar legal document processing systems - way more reliable than debugging complex langchain pipelines, and each step handles its own data type conversions and error cases.
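
Roughly what I mean, as a minimal sketch reusing the objects from your post (the function names and assertion checks are just placeholders, not a fixed recipe):

def retrieve_context(question):
    # validate before the embedding step - this is where the tiktoken error came from
    assert isinstance(question, str), f"retriever needs a string, got {type(question)}"
    return document_retriever.invoke(question)  # returns a list of Documents

def draft_answer(question):
    answer = first_chain.invoke({'question': question})
    assert isinstance(answer, str), f"expected a string answer, got {type(answer)}"
    return answer

def enhance(question):
    enhancement_chain = enhancement_prompt | model_instance | StrOutputParser()
    return enhancement_chain.invoke({
        'answer': draft_answer(question),
        'context': retrieve_context(question),
    })

print(enhance('what is constitutional law'))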

this screams chaining structure problem. you wrap everything in {'answer': first_chain}, so RunnablePassthrough() and the retriever both get handed that dict instead of a plain string.
simplify it - ditch the nested dictionary and go with: first_chain | RunnableParallel({'context': document_retriever, 'answer': RunnablePassthrough()}) | enhancement_prompt.
that tiktoken error? happens when you feed non-string data to embeddings.
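
spelled out, that looks something like this (just a sketch, reusing the names from your post):

full_chain = (
    first_chain  # emits a plain string, so the retriever embeds a string instead of a dict
    | RunnableParallel({'context': document_retriever, 'answer': RunnablePassthrough()})
    | enhancement_prompt
    | model_instance
    | StrOutputParser()
)

result = full_chain.invoke({'question': 'what is constitutional law'})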

I’ve encountered this issue before, and it often stems from a data type mismatch within your chaining setup. Specifically, the RunnableParallel component might be sending the entire output as 'answer', while your retriever expects a different input type. The tiktoken error frequently arises when an embedding function receives an unexpected format rather than a straightforward string.

Consider adding print statements before your parallel setup to verify the data types being processed. A solution I found effective was to ensure that outputs are converted to strings at each step.

Also, confirm that your database retriever can appropriately process the query format you’re using; sometimes the problem lies in the improper data transfer between components rather than in the individual elements themselves. It’s possible that the output from your OpenAI model does not align with what the subsequent phase anticipates.
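
For example, one quick way to add those print checks in LCEL is a pass-through RunnableLambda (a rough sketch; show_type is just a throwaway debugging helper):

from langchain_core.runnables import RunnableLambda

def show_type(value):
    # print what the previous step emitted, then pass it along unchanged
    print(type(value), repr(value)[:120])
    return value

probe = {'answer': first_chain} | RunnableLambda(show_type)
probe.invoke({'question': 'what is constitutional law'})
# the probe shows a dict coming out of the first step - not the plain string the retriever expects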

Your chain has a logic error. You generate an answer in first_chain, then try to use that same output for document retrieval in the parallel setup.

The problem? Your retriever needs the original question, not the generated answer. The embedding function chokes when you feed it an AI response (here, wrapped in a dict by the {'answer': first_chain} step) instead of the user’s query - that’s what trips the tiktoken encoding.

I’ve seen this exact issue multiple times at work. Here’s the fix - capture the original input before it gets transformed:

def get_original_question(inputs):
    # make sure the retriever always receives a plain string
    return inputs if isinstance(inputs, str) else str(inputs)

parallel_setup = RunnableParallel({
    # both branches receive the original user input:
    # the retriever embeds the question, first_chain drafts the answer
    'context': lambda x: document_retriever.get_relevant_documents(get_original_question(x)),
    'answer': first_chain
})

full_chain = parallel_setup | enhancement_prompt | model_instance | StrOutputParser()

Your retriever should search based on “what is constitutional law”, not whatever answer the first chain generates about constitutional law.

That tiktoken error happens because embeddings expect clean query strings, not formatted AI responses with tokens and metadata.
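
If you also want the {context} slot in the enhancement prompt to be readable text instead of the repr of Document objects, you could add a small formatting step on top of that - just a sketch, format_docs is a hypothetical helper:

def format_docs(docs):
    # join the retrieved documents into one plain string for the prompt
    return "\n\n".join(doc.page_content for doc in docs)

parallel_setup = RunnableParallel({
    'context': lambda x: format_docs(
        document_retriever.get_relevant_documents(get_original_question(x))
    ),
    'answer': first_chain
})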

The problem’s with how your output parser talks to the retriever. Even though first_chain uses StrOutputParser() and should return a clean string, your retriever’s still getting garbage data: the {'answer': first_chain} step wraps that string in a dict before the parallel setup sees it, so the context retrieval never gets a plain string to embed. Extract the question string explicitly before sending it to the retriever instead of trusting the chain output. I fixed this by wrapping the retriever call with a lambda that forces string conversion: 'context': lambda x: document_retriever.get_relevant_documents(str(x)).

Also check your OpenAI model setup - if you’re using OpenAI() instead of ChatOpenAI() with ChatPromptTemplate, you’ll get format mismatches that mess up the whole chain.
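
Concretely, those two changes layered onto your original structure would look roughly like this (a sketch only - it keeps the rest of your chain as-is):

from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI()  # pairs with ChatPromptTemplate; OpenAI() is the completion-style LLM

first_chain = initial_prompt | chat_model | StrOutputParser()

parallel_setup = RunnableParallel({
    # force whatever arrives here into a plain string before it hits the embeddings
    'context': lambda x: document_retriever.get_relevant_documents(str(x)),
    'answer': RunnablePassthrough()
})

full_chain = ({'answer': first_chain} | parallel_setup
              | enhancement_prompt | chat_model | StrOutputParser())

result = full_chain.invoke({'question': 'what is constitutional law'})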