I’ve been exploring LangGraph recently and I’m really impressed with what it can do. However, I’m struggling to understand one key aspect - the tool selection process.
My main question is about how the system determines which tools to use for specific tasks. What information does LangGraph rely on to understand a tool’s capabilities? I want to make sure I’m describing my tools effectively so the AI can use them properly.
As far as I can tell, the system only has access to the function definition and its parameters. Should I create very descriptive function names to provide more context about what each tool does?
Here’s a basic example I’ve been working with:
from langchain_core.tools import tool

@tool
def get_user_input(message: str) -> str:
    """Pause execution to collect input from user."""
    user_data = pause_for_input({"message": message})
    return user_data["response"]
I’m wondering how the system knows that get_user_input will actually pause the workflow and wait for external input. The function name alone seems insufficient to convey that this tool changes the execution flow rather than just processing data.
What’s the best approach for making tool purposes clear to the LLM?
The LLM gets confused without enough context about how tools actually behave. I've found it helps to think like the model - it only sees text descriptions. Your get_user_input example would work much better if you explicitly mentioned the pause/wait behavior in both the name AND the docstring. Maybe try wait_for_user_response instead of get_user_input - it makes the blocking behavior obvious.
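As a sketch of that renaming suggestion, here is the asker's tool with the blocking behavior stated in both the name and the docstring. `pause_for_input` is the asker's own helper and is stubbed here so the example runs standalone - the stub's behavior is purely hypothetical.

```python
# Stub for the asker's helper, so this sketch is self-contained.
def pause_for_input(payload: dict) -> dict:
    return {"response": f"(user reply to: {payload['message']})"}

def wait_for_user_response(message: str) -> str:
    """Blocks the workflow and waits for a human reply.

    Execution pauses here until the user answers the given message.
    """
    user_data = pause_for_input({"message": message})
    return user_data["response"]
```

The name alone now tells the model this call waits on a human rather than returning immediately.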
Tool selection in LangGraph comes down to how clearly you explain what each tool does through metadata. Sure, docstrings and function names matter, but I’ve found the agent also looks at parameter structure and return types to figure out tool behavior.
For your pause scenario, add metadata that marks the tool as an interaction point. Some frameworks let you tag tools with categories or custom attributes that affect selection logic. The agent needs to know this isn’t just another data processing function.
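To make the tagging idea concrete, here is a minimal, purely hypothetical sketch of category metadata on tool functions - the `category` attribute and `TOOL_REGISTRY` are illustrative, not a LangGraph API:

```python
# Hypothetical registry that records a behavioral category per tool.
TOOL_REGISTRY = {}

def tool_with_category(category: str):
    """Decorator that tags a function with a category and registers it."""
    def decorator(fn):
        fn.category = category            # mark the tool's behavioral class
        TOOL_REGISTRY[fn.__name__] = fn
        return fn
    return decorator

@tool_with_category("human_interaction")
def get_user_input(message: str) -> str:
    """Pause execution to collect input from user."""
    return f"(reply to: {message})"       # stub body

@tool_with_category("data_processing")
def summarize(text: str) -> str:
    """Summarize a block of text."""
    return text[:40]                      # stub body

# Selection logic can now treat interaction points differently:
interaction_tools = [name for name, fn in TOOL_REGISTRY.items()
                     if fn.category == "human_interaction"]
```

With a category attached, routing code can single out interaction points instead of treating every tool as a data transform.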
What really helped me was putting prerequisites right in the docstring. If your tool needs specific workflow states or has dependencies, say so upfront. Your get_user_input tool should make it clear it’s only for when you actually need human input - not routine data collection.
The model learns from your workflow patterns too. Use certain tools at specific decision points consistently, and it’ll start recognizing those patterns. Organize your tools in logical groups so the agent builds better mental models of when to use each one.
Function names matter, but they're just one piece. Parameter names and types actually carry more weight than most people think. The LLM reads your entire function signature to figure out what data goes in and comes out. Your message: str parameter hints that this tool talks to someone, which gives good context. You're spot on that the execution flow needs clearer docs, though.

I learned this the hard way with database tools. I had query_database() and update_database() with similar parameter names, and even with good docstrings, the agent kept using the update function for reads. I had to make the parameter names much more specific.

For tools that change workflow like yours, I put explicit warnings right at the top of the docstring. "WARNING: This tool suspends execution" - something that obvious. LLMs seem to really focus on these behavioral descriptions when picking tools.

Also think about where your tool sits in the workflow. If you've got conditional logic that should only trigger in certain states, spell that out clearly. The agent pulls context from previous steps when choosing what to use next.
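A hypothetical reconstruction of that database example - parameter names that make read vs. write intent unmistakable. The function bodies are stubs; only the signatures and docstrings matter for tool selection:

```python
def query_database(select_statement: str) -> list:
    """Read-only: run a SELECT and return matching rows. Never modifies data."""
    return []  # stub - a real implementation would execute the query

def update_database(write_statement: str, confirm_write: bool = False) -> int:
    """WARNING: modifies data. Only for INSERT/UPDATE/DELETE statements.

    Args:
        write_statement: The mutating SQL statement to execute.
        confirm_write: Must be True to acknowledge this changes data.
    """
    return 0   # stub - a real implementation would return rows affected
```

With select_statement vs. write_statement (plus an explicit confirm_write flag), the signatures alone make it hard for the agent to pick the mutating tool for a read.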
LangGraph uses way more than just function names for tool selection. The docstring is critical - that’s where you describe what the tool does and when to use it.
Your example’s on the right track but needs more detail in the docstring. I’ve hit this issue tons of times building agent workflows. The LLM reads the docstring to understand the tool’s purpose, side effects, and when to use it.
Here’s how I’d fix your example:
from langchain_core.tools import tool

@tool
def get_user_input(message: str) -> str:
    """Pauses workflow execution and displays a message to collect user input.

    This tool will halt the current workflow and wait for a human response.
    Use it when you need clarification, approval, or additional information
    from the user. The workflow will not continue until the user provides input.

    Args:
        message: The prompt/question to display to the user

    Returns:
        The user's text response
    """
    user_data = pause_for_input({"message": message})
    return user_data["response"]
Be explicit about behavior, especially side effects like pausing execution. I include guidance on when to use the tool - helps the agent make better decisions during tool selection.
For complex tools, I sometimes add examples in the docstring showing typical use cases. The LLM uses all this context when deciding which tool fits the current task.
One more tip - if you have similar tools, make sure their docstrings clearly differentiate their purposes. Ambiguity leads to wrong tool choices.
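As a hypothetical sketch of that differentiation tip, here are two overlapping tools whose docstrings draw a hard boundary with explicit "use when / do NOT use when" lines (the tool names and bodies are made up for illustration):

```python
def search_docs(query: str) -> str:
    """Search internal documentation.

    Use when: the answer likely exists in company docs.
    Do NOT use when: you need live or external data - use web_search instead.
    """
    return f"docs hit for {query!r}"  # stub body

def web_search(query: str) -> str:
    """Search the public web.

    Use when: you need current or external information.
    Do NOT use when: the question is about internal systems - use search_docs.
    """
    return f"web hit for {query!r}"   # stub body
```

Each docstring names the sibling tool it should lose to, which removes the ambiguity that causes wrong selections.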
Classic LangGraph issue that trips up tons of developers. The system builds tool selection from multiple signals working together.
Everyone talks about docstrings, but here’s what most people miss - the LLM looks at execution context when evaluating tools. Your get_user_input example is tricky because it breaks normal function flow.
I hit this exact problem building approval workflows. The breakthrough was realizing the model needs to understand how the tool impacts conversation state, not just its output.
Structure your tool description around conversation flow:
@tool
def get_user_input(message: str) -> str:
    """Stops the AI conversation and requests human input.

    Use this when you need information that only the user can provide.
    The conversation will not continue until the user responds.
    This changes the conversation from AI-driven to user-driven.
    """
    user_data = pause_for_input({"message": message})
    return user_data["response"]
“Stops the AI conversation” was my lightbulb moment. It frames the tool in terms the LLM gets - conversation control instead of technical execution.
Another tip - group similar behavioral tools together in your codebase. When you’ve got multiple tools affecting workflow state, keep them close. The model picks up on these patterns when making selections.
Test your tools by running scenarios where multiple tools could work. You’ll quickly see if your descriptions are clear enough for proper selection.
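One crude way to spot descriptions that are too similar, before running full agent scenarios: score each tool's docstring against a task description by keyword overlap. This is not how an LLM actually selects tools - it's a purely illustrative heuristic for catching near-duplicate descriptions early:

```python
def overlap_score(task: str, docstring: str) -> int:
    """Count words shared between a task description and a tool docstring."""
    return len(set(task.lower().split()) & set(docstring.lower().split()))

# Hypothetical tool descriptions to compare.
tools = {
    "get_user_input": "Pauses the workflow and waits for human input",
    "summarize_text": "Condenses a block of text into a short summary",
}

task = "ask the human for approval and wait for their input"
ranked = sorted(tools, key=lambda name: overlap_score(task, tools[name]),
                reverse=True)
# If the wrong tool ranks first, or scores tie, the descriptions
# need sharper differentiation.
```

If two tools score nearly the same against a task they shouldn't share, that's the ambiguity the reply above warns about.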