How to monitor execution steps in Llama_index AI Agent workflow

I’m working on creating an AI Agent with the Llama_index Python framework and need to track the execution steps during runtime. I’ve attempted several methods but haven’t had success yet.

First approach was setting the verbose parameter:

from llama_index.core.agent.workflow import FunctionAgent
my_agent = FunctionAgent(
    tools=[my_tool],
    llm=language_model,
    system_prompt="You are a helpful AI assistant",
    verbose=True,
    allow_parallel_tool_calls=True,
)

This didn’t produce any debug output. Then I attempted using the callback system:

from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

# Setup debug handler
debug_handler = LlamaDebugHandler()
manager = CallbackManager([debug_handler])

my_agent = FunctionAgent(
    tools=[my_tool],
    llm=language_model,
    system_prompt="You are a helpful assistant that will help me answer questions",
    callback_manager=manager,
    allow_parallel_tool_calls=True,
)

Unfortunately, this approach also failed to show the step-by-step execution. I’m currently using Llama_index version 0.12.31. What’s the correct way to enable step visibility?

Had this exact headache a few months back when we migrated our agent pipeline to the newer Llama_index version. Use the workflow handler instead of the regular callback manager.

Try this:

from llama_index.core.workflow import Context, StartEvent, StopEvent, Workflow, step

class MyWorkflow(Workflow):
    def __init__(self, **kwargs):
        # verbose=True prints each step as the workflow runs
        super().__init__(verbose=True, **kwargs)

    @step
    async def agent_step(self, ctx: Context, ev: StartEvent) -> StopEvent:
        result = ...  # your agent logic here
        return StopEvent(result=result)

Workflows have built-in step tracking that actually works, unlike the agent verbose flag. You can also add custom logging inside each step method to see exactly what’s happening.
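
For instance, something like this (just a sketch: LoggedWorkflow and the logger name are placeholders of mine, and the logging calls are the only addition over the workflow above):

import logging

from llama_index.core.workflow import Context, StartEvent, StopEvent, Workflow, step

logger = logging.getLogger("my_workflow")

class LoggedWorkflow(Workflow):
    @step
    async def agent_step(self, ctx: Context, ev: StartEvent) -> StopEvent:
        logger.info("agent_step received: %s", ev)  # log the incoming event
        result = ...  # your agent logic here
        logger.info("agent_step produced: %r", result)
        return StopEvent(result=result)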

Another option: use the workflow events system. Each step fires events you can capture and log. Way more reliable than the callback system.
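
For example, on 0.12.x you can stream events off the handler that agent.run() returns. A rough sketch (the event classes come from llama_index.core.agent.workflow; run_with_step_logging is just my name for the helper):

from llama_index.core.agent.workflow import (
    AgentOutput,
    FunctionAgent,
    ToolCall,
    ToolCallResult,
)

async def run_with_step_logging(agent: FunctionAgent, question: str):
    handler = agent.run(question)  # starts the workflow and returns a handler
    async for event in handler.stream_events():
        if isinstance(event, ToolCall):
            print(f"-> calling {event.tool_name} with {event.tool_kwargs}")
        elif isinstance(event, ToolCallResult):
            print(f"<- {event.tool_name} returned {event.tool_output}")
        elif isinstance(event, AgentOutput):
            print(f"== step output: {event.response}")
    return await handler  # final response once the run completes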

If you’re stuck with FunctionAgent, try setting LLAMA_INDEX_DEBUG=1 before running your script. Sometimes that catches things the other methods miss.

You’re encountering a common issue with LlamaIndex’s workflow debug output; I’ve faced similar challenges when debugging agents. The verbose parameter and callback handlers don’t work well here, so what you actually need is the workflow event system.

I got it working by creating a custom event handler that captures workflow events as they occur. Subclassing the event handler and overriding its processing method gives you control over the logging at each step (see the sketch below). Alternatively, use the workflow’s step decorator with manual logging: just add print statements or logging calls inside each step function. That’s generally more reliable than depending on the framework’s automatic debug features.

One key point: make sure you run the agent in the correct context. Debug information only appears while the workflow is actively processing, not during the setup phase, and if you test with trivial queries the workflow may finish too quickly to produce any useful debug output.
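
If you want the custom-handler route, here is a rough sketch against the llama_index.core.instrumentation module (the newer observability layer; PrintEveryEvent is just an illustrative name, and the exact event fields may vary by version):

from llama_index.core.instrumentation import get_dispatcher
from llama_index.core.instrumentation.event_handlers import BaseEventHandler

class PrintEveryEvent(BaseEventHandler):
    @classmethod
    def class_name(cls) -> str:
        return "PrintEveryEvent"

    def handle(self, event, **kwargs) -> None:
        # print every instrumentation event the framework fires
        print(f"[{event.timestamp}] {event.class_name()}")

# attach to the root dispatcher so events from all components reach the handler
get_dispatcher().add_event_handler(PrintEveryEvent())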

Try using the workflow debugger rather than the verbose param. I faced the same issue - you have to enable logging at the workflow level, not the agent level. Just add logging.basicConfig(level=logging.DEBUG) before your agent creation and it should show the steps correctly.
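
Something like this (reusing my_tool / language_model from your question):

import logging
import sys

from llama_index.core.agent.workflow import FunctionAgent

# enable DEBUG logging before the agent is created
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

my_agent = FunctionAgent(
    tools=[my_tool],      # your existing tool
    llm=language_model,   # your existing LLM
    system_prompt="You are a helpful AI assistant",
)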

This is a common issue with newer Llama_index versions. Your callback manager approach is on the right track, but you need to explicitly enable debug output after setting it up. After creating your debug handler, set debug_handler.print_trace_on_end = True (or pass print_trace_on_end=True to the constructor) and enable the global handler with set_global_handler('simple') from llama_index.core. Also make sure you’re actually running the agent rather than just initializing it; for the workflow-based FunctionAgent that means awaiting agent.run() (the older agent classes used .chat() or .query()). I had the same problem until I realized debug output only shows during actual execution, not setup. The verbose parameter got deprecated in recent versions, which is why that didn’t work for you.
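
Putting that together, a rough sketch (reuses your placeholders; agent.run() has to be awaited, so wrap it in asyncio.run() if you’re in a plain script):

import asyncio

from llama_index.core import set_global_handler
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

set_global_handler("simple")  # prints LLM inputs/outputs as they happen

debug_handler = LlamaDebugHandler(print_trace_on_end=True)
manager = CallbackManager([debug_handler])

my_agent = FunctionAgent(
    tools=[my_tool],
    llm=language_model,
    system_prompt="You are a helpful AI assistant",
    callback_manager=manager,
)

async def main():
    # debug output only appears once the agent actually executes
    response = await my_agent.run("your question here")
    print(response)

asyncio.run(main())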