Setting up LangSmith tracking with Autogen AG2 GroupChat for monitoring multi-agent conversations

I’m building a file processing system with Autogen AG2 that uses multiple agents working together. My setup includes a processor agent that breaks files into sections and a reviewer agent that examines each part. These agents communicate through a GroupChat configuration.

I know AG2 works well with AgentOps for monitoring, but I want to use LangSmith instead since my other projects already use it.

Here’s my current implementation in the FileProcessor class:

import autogen

# review_agent, process_agent, and config are project-local modules
import review_agent
import process_agent
import config

class FileProcessor:
    def __init__(self):
        print("Setting up FileProcessor")
        self.reviewer = review_agent.ReviewAgent().agent
        self.processor = process_agent.ProcessAgent().agent
        self.coordinator = autogen.UserProxyAgent(
            name="Coordinator", code_execution_config=False
        )
        print("FileProcessor ready with all agents")
    
    def execute(self):
        print("Starting file processing chat")
        self.chat_group = autogen.GroupChat(
            agents=[self.coordinator, self.processor, self.reviewer],
            messages=[],
            max_round=15,
            speaker_selection_method=self._handle_transitions,
        )
        self.chat_manager = autogen.GroupChatManager(
            groupchat=self.chat_group, llm_config=config.model_settings
        )

        try:
            self.coordinator.initiate_chat(
                self.chat_manager, message="Process file_id:" + str(self.file_id)
            )
            print("Chat started successfully")
        except Exception as error:
            print(f"Chat failed: {error}")

My question: What’s the best way to add LangSmith monitoring to this AG2 GroupChat setup?

What I’ve tried: I looked online but didn’t find clear examples. I tried some AI-generated solutions but they don’t work because autogen handles the LLM calls internally.

Any examples or documentation for this would be really helpful!

I hit the same problem migrating from AgentOps to LangSmith for my document processing pipeline. The wrapper approach works, but I found something simpler that doesn’t mess with your agent setup.

Skip wrapping each agent’s config. Instead, use LangSmith’s decorator on your execute method and monkey patch the OpenAI calls before AG2 starts up. Here’s what I did:

from langsmith import traceable
from langsmith.wrappers import wrap_openai
import openai

class FileProcessor:
    def __init__(self):
        # Patch OpenAI globally before any AG2 initialization. wrap_openai
        # wraps a client *instance* (openai>=1.x), so patch the openai.OpenAI
        # class itself: every client AG2 builds internally then gets traced.
        if not hasattr(openai, '_langsmith_wrapped'):
            _original_openai = openai.OpenAI

            def _traced_openai(*args, **kwargs):
                return wrap_openai(_original_openai(*args, **kwargs))

            openai.OpenAI = _traced_openai
            openai._langsmith_wrapped = True
            
        print("Setting up FileProcessor")
        self.reviewer = review_agent.ReviewAgent().agent
        self.processor = process_agent.ProcessAgent().agent
        self.coordinator = autogen.UserProxyAgent(
            name="Coordinator", code_execution_config=False
        )
    
    @traceable(project_name="autogen-file-processing")
    def execute(self):
        # Your existing code unchanged
        self.chat_group = autogen.GroupChat(
            agents=[self.coordinator, self.processor, self.reviewer],
            messages=[],
            max_round=15,
            speaker_selection_method=self._handle_transitions,
        )
        
        self.chat_manager = autogen.GroupChatManager(
            groupchat=self.chat_group, llm_config=config.model_settings
        )
        
        try:
            self.coordinator.initiate_chat(
                self.chat_manager, message="Process file_id:" + str(self.file_id)
            )
        except Exception as error:
            print(f"Chat failed: {error}")

This catches all LLM calls without touching your existing agent setup. The global patch means every OpenAI call goes through LangSmith no matter how AG2 handles internal routing. I’ve run this in production for three months with zero issues.
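A quick way to sanity check that the patch took (a sketch; it assumes a FileProcessor has already been constructed so the patch in __init__ has run):

import openai

# If the patch ran, the module-level flag is set and any client created
# from here on carries a LangSmith-traced chat.completions.create.
assert getattr(openai, '_langsmith_wrapped', False)

client = openai.OpenAI()
print(client.chat.completions.create)  # traced wrapper, not the raw SDK method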

Don’t forget to set LANGSMITH_TRACING=true and LANGSMITH_API_KEY in your environment before the process starts.
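If it’s easier, you can set them from Python before anything initializes (values here are placeholders):

import os

# Placeholder values; in practice you'd export these in your shell or .env
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = "<your-langsmith-api-key>"
os.environ["LANGSMITH_PROJECT"] = "autogen-file-processing"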

Been there with multi-agent setups. AG2’s tricky because it handles LLM calls internally, so you can’t just wrap them normally.

Here’s what worked for me in production. Use LangSmith’s context manager at the GroupChat level and instrument individual agents:

from langsmith import trace, get_current_run_tree
from langsmith.wrappers import wrap_openai
import autogen

class FileProcessor:
    def __init__(self):
        # Wrap your LLM config with LangSmith before the agents are built
        wrapped_config = self._wrap_llm_config(config.model_settings)

        self.reviewer = review_agent.ReviewAgent(llm_config=wrapped_config).agent
        self.processor = process_agent.ProcessAgent(llm_config=wrapped_config).agent
        self.coordinator = autogen.UserProxyAgent(
            name="Coordinator",
            code_execution_config=False
        )

    def _wrap_llm_config(self, original_config):
        # Wraps the OpenAI client in your config. This assumes your config
        # dict carries a pre-built client under 'client'; if you use a plain
        # config_list instead, wrap the client wherever you construct it.
        if 'client' in original_config:
            original_config['client'] = wrap_openai(original_config['client'])
        return original_config
    
    def execute(self):
        # trace() opens a parent run; runs created inside it are nested under it
        with trace(name=f"file_processing_{self.file_id}", run_type="chain"):
            self.chat_group = autogen.GroupChat(
                agents=[self.coordinator, self.processor, self.reviewer],
                messages=[],
                max_round=15,
                speaker_selection_method=self._handle_transitions,
            )
            
            # Pass wrapped config to chat manager
            wrapped_manager_config = self._wrap_llm_config(config.model_settings.copy())
            self.chat_manager = autogen.GroupChatManager(
                groupchat=self.chat_group, 
                llm_config=wrapped_manager_config
            )
            
            try:
                self.coordinator.initiate_chat(
                    self.chat_manager, 
                    message="Process file_id:" + str(self.file_id)
                )
            except Exception as error:
                print(f"Chat failed: {error}")

The key is wrapping the OpenAI client before AG2 uses it. I also add custom metadata to track which agent is speaking:

def _handle_transitions(self, last_speaker, groupchat):
    # Your existing logic plus LangSmith metadata on the active run
    run = get_current_run_tree()
    if run is not None:
        run.extra.setdefault("metadata", {}).update(
            {"current_speaker": last_speaker.name, "round": len(groupchat.messages)}
        )
    return your_speaker_selection_logic

This gives you full visibility into conversation flow and individual agent performance. I use it for monitoring 5-agent systems and it works great.

Set your LANGSMITH_API_KEY environment variable and project name before running this.

I fought with this same integration for weeks switching from AgentOps to LangSmith. The monkey patching approach works, but I kept hitting version conflicts with different OpenAI client versions.

What finally worked was using LangSmith’s run context directly around the agent communication. You can wrap each agent’s send method without touching the core LLM config:

from langsmith import trace
import autogen

class FileProcessor:
    def __init__(self):
        self.reviewer = review_agent.ReviewAgent().agent
        self.processor = process_agent.ProcessAgent().agent
        self.coordinator = autogen.UserProxyAgent(
            name="Coordinator", code_execution_config=False
        )
        
        # Override agent send methods to add tracing
        self._wrap_agents_with_tracing()
    
    def _wrap_agents_with_tracing(self):
        def make_traced_send(agent):
            # Capture each agent's original send in its own closure; a bare
            # loop variable would late-bind and point every wrapper at the
            # last agent in the list.
            original_send = agent.send

            def traced_send(message, recipient, request_reply=None, silent=False):
                return self._traced_send(
                    original_send, message, recipient, agent.name,
                    request_reply, silent,
                )

            return traced_send

        for agent in [self.reviewer, self.processor]:
            agent.send = make_traced_send(agent)

    def _traced_send(self, original_send, message, recipient, agent_name,
                     request_reply, silent):
        # One LangSmith run per send, with the outgoing message as inputs
        with trace(name=f"agent_communication_{agent_name}",
                   inputs={"message": message, "recipient": recipient.name}) as run:
            result = original_send(message, recipient, request_reply, silent)
            run.end(outputs={"response": str(result)})
            return result

This intercepts agent communications instead of LLM calls directly. You get conversation-level tracing that’s way more useful for multi-agent debugging. I can see the full conversation flow in LangSmith without any OpenAI client compatibility headaches.

Here’s the thing - you want to trace agent interactions, not just individual LLM calls. This captures the actual decision-making between agents, which is what you need when debugging failures.
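If you want the per-send runs grouped into a single conversation view, kick off execute inside a parent trace (a sketch; the run name is arbitrary):

from langsmith import trace

processor = FileProcessor()

# Parent run: the agent_communication_* runs created inside nest under it
with trace(name="file_processing_conversation", run_type="chain"):
    processor.execute()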

Honestly, just set the LangSmith environment variables and use their auto-instrumentation. Add LANGSMITH_TRACING=true and LANGSMITH_PROJECT="your-project" to your env, then import langsmith before autogen. It automatically picks up all OpenAI calls without any code changes. Works perfectly with my 4-agent pipeline.