Adding bind_tools method to custom LangChain LLM implementation

mikezhang · August 8, 2025, 9:11pm

I’m working on a local agent using LangChain with a custom LLM wrapper. When I try to use create_tool_calling_agent() I get this error: ValueError: This function requires a .bind_tools method be implemented on the LLM.

Here’s my custom LLM setup:

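import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from langchain_core.language_models.llms import LLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"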

tokenizer = AutoTokenizer.from_pretrained(model_name)
llm_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

def generate_response(user_prompt):
    token_ids = tokenizer.encode(user_prompt, return_tensors='pt').to(llm_model.device)
    
    stop_tokens = [
        tokenizer.eos_token_id,
        tokenizer.convert_tokens_to_ids("<|eot_id|>")
    ]
    
    generated = llm_model.generate(
        token_ids,
        max_new_tokens=200,
        eos_token_id=stop_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.85
    )
    
    new_tokens = generated[0][token_ids.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

class MyCustomLLM(LLM):
    def _call(self, prompt, stop=None, run_manager=None, **kwargs):
        output = generate_response(prompt)
        return output
    
    @property
    def _identifying_params(self):
        return {"model_type": "custom_llama"}
    
    @property 
    def _llm_type(self):
        return "custom"

And my agent code:

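from langchain.agents import create_tool_calling_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.prompts import ChatPromptTemplate

tools = [TavilySearchResults(max_results=3)]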
my_llm = MyCustomLLM()

system_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an AI assistant. Use tools when necessary."),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"), 
    ("placeholder", "{agent_scratchpad}")
])

agent = create_tool_calling_agent(my_llm, tools, system_prompt)

How do I add the bind_tools method to make this work? Also wondering about proper streaming implementation since I think the executor might use that instead of the basic call method.

avamtz · August 16, 2025, 10:21pm

you should extend BaseChatModel rather than LLM to enable tool calling. BaseChatModel declares bind_tools, but the default implementation just raises NotImplementedError, so you still have to override it, and you implement _generate instead of _call. make sure your model can actually handle tool schemas, or it won’t work even with the correct binding.

John_Clever · August 15, 2025, 12:53am

You need to add tool schema handling to your custom LLM class. Override the bind_tools method and update your generation logic to handle tool calls. Here’s what worked for me:

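class MyCustomLLM(LLM):
    bound_tools: list = []
    # _llm_type and _identifying_params stay the same as in the original post

    def bind_tools(self, tools, **kwargs):
        # Build a copy with the tools attached. Merging into one dict avoids
        # passing bound_tools twice (once explicitly, once via __dict__).
        params = {**self.__dict__, "bound_tools": list(tools)}
        return self.__class__(**params)

    def _format_tools_for_prompt(self):
        # One "name: description" line per bound tool
        return "\n".join(f"{t.name}: {t.description}" for t in self.bound_tools)

    def _call(self, prompt, stop=None, run_manager=None, **kwargs):
        if self.bound_tools:
            tool_schemas = self._format_tools_for_prompt()
            prompt = f"{prompt}\n\nAvailable tools:\n{tool_schemas}"

        return generate_response(prompt)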

The tricky bit is parsing tool responses correctly. Llama models don’t output structured tool calls like OpenAI does, so you’ll probably need custom parsing logic for the responses.
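For the parsing side, here is a rough sketch of what that custom logic could look like. It assumes you have prompted the model to emit a tool call as a JSON object shaped like {"tool": ..., "args": {...}}. That shape and the extract_tool_call name are made up for illustration; they are not a Llama or LangChain convention.

```python
import json
import re
from typing import Optional


def extract_tool_call(text: str) -> Optional[dict]:
    """Return the first JSON object in `text` that looks like a tool call, else None.

    Assumes the (hypothetical) convention {"tool": "<name>", "args": {...}}.
    """
    decoder = json.JSONDecoder()
    # Try to decode a JSON object starting at every "{" in the reply,
    # so surrounding chatter before or after the object is tolerated.
    for match in re.finditer(r"\{", text):
        try:
            obj, _ = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if (
            isinstance(obj, dict)
            and isinstance(obj.get("tool"), str)
            and isinstance(obj.get("args"), dict)
        ):
            return {"name": obj["tool"], "args": obj["args"]}
    return None


reply = 'Sure, let me look that up.\n{"tool": "tavily_search", "args": {"query": "LangChain bind_tools"}}'
call = extract_tool_call(reply)
```

If this returns None you can treat the reply as a plain-text answer. Anything stricter (schema validation, retries on malformed JSON) would sit on top of this.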

HappyDancer99 · August 14, 2025, 12:39pm

Had the same problem with my custom Llama wrapper. bind_tools is just the start - the real headache is getting Llama 3 to actually stick to tool calling formats. The model basically ignores tool schemas unless you beat it over the head with examples in your prompts or fine-tune it yourself. I gave up and switched to ReAct-style agents instead. Llama handles that way better right out of the box. If you’re dead set on tool calling, you’ll need to either fine-tune on tool examples or just use a model that’s actually built for function calling.
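To give an idea of what "examples in your prompts" can mean in practice, here is a minimal sketch that puts the tool list and one worked example call in front of the question. The build_tool_prompt helper, the dict layout for tools, and the {"tool": ..., "args": ...} reply format are all assumptions for illustration, not anything Llama 3 or LangChain prescribes.

```python
import json


def build_tool_prompt(user_input: str, tools: list) -> str:
    """Render tool descriptions plus one worked example ahead of the user's question.

    Each tool is a plain dict with "name", "description" and "example_args"
    (a hypothetical layout chosen for this sketch).
    """
    tool_lines = "\n".join(f"- {t['name']}: {t['description']}" for t in tools)
    first = tools[0]
    example = json.dumps({"tool": first["name"], "args": first["example_args"]})
    return (
        "You can call these tools:\n"
        f"{tool_lines}\n\n"
        "To call a tool, reply with one JSON object and nothing else, for example:\n"
        f"{example}\n\n"
        "If no tool is needed, answer in plain text.\n\n"
        f"Question: {user_input}"
    )


tools = [
    {
        "name": "web_search",
        "description": "Search the web for current information.",
        "example_args": {"query": "weather in Paris"},
    }
]
prompt = build_tool_prompt("Who won the 2022 World Cup?", tools)
```

Even with an example like this the model may drift from the format, which is why the ReAct route tends to be less painful.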

sapphireSkies · August 13, 2025, 10:56pm

Everyone’s talking about custom implementations, but you’re overengineering this.

I’ve been down this rabbit hole with local LLMs and tool calling. You can hack together bind_tools methods and wrestle with prompt formatting, but you’ll spend more time debugging than building.

What changed everything for me was moving this workflow to Latenode. Instead of fighting LangChain’s tool calling requirements, I set up the agent logic as automated workflows. The platform handles all the API calls, tool integrations, and response parsing without custom LLM wrappers.

I connected my local Llama instance through a simple HTTP endpoint, then built the agent behavior using Latenode’s visual workflow builder. Tools like Tavily search just plug right in. No bind_tools headaches, no prompt engineering nightmares.

Best part? When I want to switch models or add new tools, it’s drag and drop. No code changes.

Your current setup would work perfectly as individual workflow nodes. Local LLM for reasoning, tools as separate API calls, and Latenode orchestrating everything.

Way cleaner than monkey patching LangChain classes.

Liam23 · August 13, 2025, 7:35pm

just inherit from BaseLLM and add the method manually. worked for me with a similar setup, but you’ll need to handle the tool schema conversion yourself since llama doesn’t output reliable json.
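a rough idea of what doing the schema conversion yourself can look like, using only the standard library. it turns a plain Python function into an OpenAI-style function schema dict. function_to_tool_schema and web_search are hypothetical names, and the type mapping only covers a few basic annotations.

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema type names;
# anything not listed here falls back to "string".
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_to_tool_schema(fn) -> dict:
    """Build an OpenAI-style tool schema from a function's signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def web_search(query: str, max_results: int = 3) -> str:
    """Search the web and return the top results."""
    return ""


schema = function_to_tool_schema(web_search)
```

you would then serialize the schema into the prompt yourself, since the model never sees it otherwise.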

nateharris · August 13, 2025, 7:19pm

Been there with local Llama models. Everyone suggests bind_tools and yeah, it works technically, but you’ll hit a wall with actual tool calling.

Here’s the basic bind_tools setup:

class MyCustomLLM(LLM):
    tools: list = []
    
    def bind_tools(self, tools, **kwargs):
        bound = self.__class__(**self.__dict__)
        bound.tools = tools
        return bound
    
    def _call(self, prompt, stop=None, run_manager=None, **kwargs):
        if self.tools:
            # Add tool descriptions to prompt
            tool_info = "\n".join([f"{tool.name}: {tool.description}" for tool in self.tools])
            prompt = f"{prompt}\n\nTools available: {tool_info}"
        
        return generate_response(prompt)

Spent weeks fighting this though. Llama 3 just doesn’t follow function calling protocols reliably without specific training. Sometimes it calls tools, sometimes ignores them, often formats responses wrong.

What actually worked? Ditched create_tool_calling_agent completely and switched to create_react_agent. Same tools, same setup, but works with regular text generation instead of expecting structured calls. ReAct format is way more natural for base Llama models.

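Just change your agent creation. One catch: create_react_agent needs a prompt that contains {tools}, {tool_names}, and {agent_scratchpad}, so the chat prompt from the original post won't work as is. Pull a ReAct prompt instead:

react_prompt = hub.pull("hwchase17/react")  # needs: from langchain import hub
agent = create_react_agent(my_llm, tools, react_prompt)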

You’ll get consistent results without forcing tool calling behavior the model wasn’t trained for.
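To show why plain text generation is enough here, this is a simplified sketch of what parsing a ReAct step involves. It assumes the usual "Action:" / "Action Input:" / "Final Answer:" markers. parse_react_step is a hypothetical helper for illustration, not LangChain's actual parser, which handles more edge cases.

```python
import re


def parse_react_step(text: str) -> dict:
    """Parse one ReAct-style model reply into either a tool action or a final answer."""
    if "Final Answer:" in text:
        return {"type": "finish", "output": text.split("Final Answer:", 1)[1].strip()}

    match = re.search(
        r"Action\s*:\s*(.+?)\s*\n\s*Action Input\s*:\s*(.*)", text, re.DOTALL
    )
    if match is None:
        raise ValueError(f"Could not parse model output: {text!r}")

    tool = match.group(1).strip()
    # Drop anything the model invented after its own input, e.g. a fake Observation
    tool_input = match.group(2).split("\nObservation:", 1)[0].strip()
    return {"type": "action", "tool": tool, "tool_input": tool_input}


step = parse_react_step(
    "Thought: I should search.\n"
    "Action: tavily_search_results_json\n"
    "Action Input: LangChain bind_tools"
)
done = parse_react_step("Thought: I know this.\nFinal Answer: 42")
```

The model only has to write a couple of labelled lines, which base Llama models manage far more reliably than strict JSON tool calls.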