I have built a text processing system with Autogen AG2 that uses multiple agents working together. My setup includes a parser agent that breaks down content and a processor agent that handles the analysis. These agents communicate through GroupChat functionality.
I know AG2 works well with AgentOps for tracking, but I want to use LangSmith for monitoring instead since it fits better with my existing tools.
Here’s my current implementation in the TextProcessor class:
How do I add LangSmith tracing to track what happens in this AG2 GroupChat? I tried a few approaches, but since AG2 handles the LLM calls internally, none of them captured anything. Any examples or documentation would be helpful.
I’ve had good luck with similar setups. Skip wrapping clients and use LangSmith’s callback handlers directly instead. Here’s what works: build a custom LangSmithCallbackHandler that inherits from langchain’s BaseCallbackHandler, override the on_llm_start and on_llm_end methods, then drop this handler into your agents’ llm_config under ‘callbacks’. This way you’ll catch both LLM calls and the agent-to-agent chatter in GroupChat. I track speaker changes, message content, and timing like this.

Quick heads up on your code - add error handling around callback initialization, because LangSmith fails silently when the API key’s messed up. Also, set custom tags for each agent type so you can filter traces by parser vs processor in the dashboard. Your AG2 setup stays untouched and you get detailed visibility into the whole multi-agent conversation.
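A minimal sketch of such a handler - it just records tagged events locally; a real version would forward them to the LangSmith client. The fallback base class only exists so the sketch runs without langchain installed, and the tag names are illustrative:

```python
import time

try:
    from langchain_core.callbacks import BaseCallbackHandler
except ImportError:  # stand-in so the sketch runs without langchain installed
    class BaseCallbackHandler:
        pass

class LangSmithCallbackHandler(BaseCallbackHandler):
    """Records LLM call timing and content, tagged per agent type."""

    def __init__(self, agent_tag):
        self.agent_tag = agent_tag  # e.g. "parser" or "processor"
        self.events = []
        self._start = None

    def on_llm_start(self, serialized, prompts, **kwargs):
        # called when an LLM request begins
        self._start = time.monotonic()
        self.events.append({"tag": self.agent_tag, "event": "start", "prompts": prompts})

    def on_llm_end(self, response, **kwargs):
        # called when the LLM response arrives; record elapsed time
        elapsed = time.monotonic() - self._start if self._start else None
        self.events.append({"tag": self.agent_tag, "event": "end", "elapsed": elapsed})
```

Wire one instance per agent type so parser and processor traces stay filterable, and wrap the LangSmith client setup in a try/except that logs loudly, since a bad API key otherwise fails silently.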
I hit this exact same issue trying to get proper observability into my AG2 workflows. Most solutions either miss the internal GroupChat orchestration or break when AG2 updates.
What fixed it for me was LangSmith’s automatic tracing through environment variables plus monkey patching AG2’s core message handling. I patch the generate_oai_reply method on ConversableAgent right after importing autogen, before any agents are created - this wraps it in LangSmith’s trace context and captures every single LLM interaction without messing with your agent configs.
```python
import langsmith
import autogen

# Patch right after importing autogen, before any agents are created
original_generate = autogen.ConversableAgent.generate_oai_reply

def patched_generate(self, messages=None, sender=None, config=None):
    with langsmith.trace(
        name=f"ag2_{self.name}_generation",
        inputs={
            "sender": sender.name if sender else None,
            "message_count": len(messages or []),
        },
    ) as run:
        result = original_generate(self, messages, sender, config)
        run.end(outputs={"response": str(result)})
        return result

autogen.ConversableAgent.generate_oai_reply = patched_generate
```
This captures everything - GroupChat manager decisions, individual agent responses, and speaker selection logic - all in one clean trace tree. Way better than trying to wrap clients or callbacks that get lost in AG2’s message routing.
Actually, there’s an easier way - just wrap your entire execute method in LangSmith’s trace context manager instead of patching internals or touching individual llm configs. Use with langsmith.trace("ag2_session") as run: and call initiate_chat inside it. That puts the whole session under one root trace without changing your agent setup.
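A sketch of that wrapper, assuming a TextProcessor that exposes a user_proxy agent and a GroupChatManager as manager (both names are assumptions; the except branch is a no-op stand-in so the sketch runs without langsmith installed):

```python
try:
    from langsmith import trace
except ImportError:  # no-op stand-in so the sketch runs without langsmith installed
    from contextlib import contextmanager

    @contextmanager
    def trace(name, inputs=None, **kwargs):
        class _Run:
            def end(self, outputs=None):
                pass
        yield _Run()

def run_session(processor, task):
    """Wrap one whole GroupChat session in a single root trace."""
    with trace("ag2_session", inputs={"task": task}) as run:
        result = processor.user_proxy.initiate_chat(processor.manager, message=task)
        run.end(outputs={"result": str(result)})
        return result
```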
Been down this exact path with AG2 and LangSmith integration. The trick is using the @traceable decorator from langsmith on your custom methods and then hooking into AG2’s message flow.
Wrap the OpenAI client in your llm_config and use traceable decorators on your main methods. This catches most LLM calls that AG2 makes internally.
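A sketch of the decorator half using langsmith’s @traceable (the fallback identity decorator only exists so the sketch runs without langsmith installed; process_text and the tags are illustrative). For the client half, langsmith.wrappers.wrap_openai is the documented way to instrument an OpenAI client, though how cleanly it reaches AG2’s internal client can vary by AG2 version:

```python
try:
    from langsmith import traceable
except ImportError:  # identity-decorator stand-in when langsmith isn't installed
    def traceable(func=None, **kwargs):
        if func is None:
            return lambda f: f
        return func

@traceable(name="process_text", tags=["ag2", "text-processor"])
def process_text(processor, text):
    # AG2 drives the agents internally; this puts the whole call under one run
    return processor.execute(text)
```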
One gotcha - set your LANGCHAIN_API_KEY and LANGCHAIN_PROJECT environment variables before running. LangSmith will show each agent’s messages and conversation flow in separate traces.
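For example (the project name is illustrative; LANGCHAIN_TRACING_V2 also needs to be on or nothing gets sent):

```shell
# set these before launching the script; LangSmith picks them up automatically
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="<your-langsmith-api-key>"
export LANGCHAIN_PROJECT="ag2-text-processor"
```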
Both approaches work but they’re way more complex than needed. I’ve hit similar monitoring issues and patching AG2 internals always turns into a mess.
I just build the whole thing as a Latenode automation instead. Set up nodes for each agent - parser, processor, coordinator - and handle GroupChat logic with Latenode’s conditional routing.
Best part? Built-in monitoring and logging for every step. No wrapping OpenAI clients or juggling context managers. You see exactly what each agent’s doing, its timing, and its failures.
Migrated a similar multi-agent system last month. Instead of wrestling with AG2’s internal LLM calls, I rebuilt the agent interactions as a Latenode workflow. Each agent = separate node, GroupChat speaker selection = routing logic.
Keep your existing agent code but orchestrate through Latenode instead of AG2’s GroupChat. Bonus: retry logic, error handling, and detailed logs included.