Hello everyone! I’ve been working with OllamaUi for a while now and I’m really curious about tracking how well my models are performing. I was wondering if there’s a way to hook up LangSmith for evaluation purposes with the OllamaUi setup. I want to get better insights into model quality and response accuracy.
If LangSmith integration isn’t possible or straightforward, what other monitoring or tracing tools would you recommend? I’m looking for something that can help me understand performance metrics and maybe track conversations over time.
Has anyone here successfully set up any kind of evaluation framework with OllamaUi? Would love to hear about your experiences and what worked best for you. Thanks in advance for any suggestions or guidance you can share!
I’ve encountered similar challenges when integrating LangSmith with OllamaUi. While a direct integration isn’t available, I implemented a custom solution using a Python script that communicates with LangSmith’s API. It requires some additional setup, but it provides valuable insights into model performance. Alternatively, consider OpenTelemetry; it isn’t tailored to LLM evaluation the way LangSmith is, but it offers solid tracing of performance metrics and has comprehensive documentation to walk you through the setup.
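To give a rough idea of the script-based approach: a minimal sketch, assuming the official `langsmith` Python SDK (`@traceable` decorator) and Ollama's local HTTP API at `http://localhost:11434/api/generate`. The project name and model are placeholders, and older SDK versions read the `LANGCHAIN_*` environment variables instead of `LANGSMITH_*`.

```python
import os
import requests
from langsmith import traceable  # pip install langsmith requests

# LangSmith picks up credentials and project from the environment.
# Older SDK versions use LANGCHAIN_TRACING_V2 / LANGCHAIN_API_KEY instead.
os.environ.setdefault("LANGSMITH_TRACING", "true")
os.environ.setdefault("LANGSMITH_PROJECT", "ollama-eval")  # hypothetical project name
# os.environ["LANGSMITH_API_KEY"] = "..."                   # set your API key

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

@traceable(run_type="llm", name="ollama-generate")
def generate(model: str, prompt: str) -> str:
    """Call the local Ollama API; the decorator records inputs/outputs as a LangSmith run."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(generate("llama3", "Summarize what a vector database is in one sentence."))
```

Once runs show up in the LangSmith project, you can attach feedback scores or datasets there for evaluation; the script itself only handles the logging side.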
I tried this about six months ago. OllamaUi doesn’t support LangSmith natively, but I built a middleware workaround that logs everything to LangSmith. I intercepted the API calls between OllamaUi and Ollama, then sent the conversation data to LangSmith for analysis. It took some custom Flask work, but it tracks response quality and conversation flows really well. If you don’t want to mess with custom solutions, check out LangFuse instead: better community support for local models and far easier setup docs than hacking a LangSmith integration together. Its evaluation features aren’t as powerful as LangSmith’s, but they’ll do the job for tracking model performance.
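For anyone curious what that middleware looks like, here is a stripped-down sketch, not my exact code: a Flask proxy that sits between OllamaUi and Ollama, forwards `/api/chat` calls, and logs each exchange to LangSmith via the `langsmith` SDK's `@traceable` decorator. The port, endpoint path, and the assumption that OllamaUi can be pointed at the proxy's base URL are all specific to this example, and streaming is disabled for simplicity (a real proxy would need to relay streamed chunks).

```python
import requests
from flask import Flask, jsonify, request
from langsmith import traceable  # pip install flask requests langsmith

app = Flask(__name__)
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # the real Ollama backend

@traceable(run_type="chain", name="ollamaui-chat")
def forward_chat(payload: dict) -> dict:
    """Forward the chat payload to Ollama; inputs and outputs are recorded as a LangSmith run."""
    resp = requests.post(
        OLLAMA_CHAT_URL,
        json={**payload, "stream": False},  # force non-streaming for this simplified sketch
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()

@app.route("/api/chat", methods=["POST"])
def chat_proxy():
    # OllamaUi is configured to use this proxy (e.g. http://localhost:5001) instead of Ollama directly.
    return jsonify(forward_chat(request.get_json(force=True)))

if __name__ == "__main__":
    app.run(port=5001)
```

The nice part of the proxy design is that OllamaUi needs no code changes; only its Ollama base URL setting has to point at the proxy, and everything that passes through gets traced.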