I’ve been using LangSmith to manage prompts and track my language model workflows. At first it seemed solid and I was happy with the features, but lately I’m running into serious issues that are making my development work genuinely difficult.
The main problems are constant UI changes that break my workflow and performance so slow that the platform is sometimes completely unusable. As someone who needs reliable tools for building applications, this has become a major blocker for my projects.
I’m wondering if other developers have found good open source tools that can handle prompt management and tracing for LLM applications. What alternatives have worked well for you? I need something stable that won’t change the interface every few days and actually loads when I need to use it.
Been there. LangSmith’s instability was a nightmare when I built a production chatbot last year.
I switched to Langfuse - it’s open source and nails prompt versioning and tracing. Takes an hour to set up, then it just works. No random UI changes breaking your workflow.
For simple stuff, I use W&B (Weights & Biases) for experiments and store prompts in git. Works great without fancy dashboards.
Phoenix by Arize is another solid pick. Less popular, but rock solid. Their tracing beats LangSmith in some areas, and the UI is fast and doesn’t randomly rearrange itself.
Honestly? Consider rolling your own basic tracking with Python logging and a simple database. Did this once and it crushed any third-party tool for reliability.
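To make the roll-your-own suggestion concrete, here’s a minimal sketch of the logging-plus-database idea using only the standard library. The `traces` table schema and `record_call` helper are illustrative choices, not any established format:

```python
import json
import logging
import sqlite3
import time
import uuid

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm_trace")

def init_db(path: str = "traces.db") -> sqlite3.Connection:
    """Create (if needed) a single-table SQLite store: one row per LLM call."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS traces (
               id TEXT PRIMARY KEY,
               ts REAL,
               prompt TEXT,
               response TEXT,
               latency_ms REAL,
               meta TEXT
           )"""
    )
    return conn

def record_call(conn, prompt, response, latency_ms, **meta):
    """Insert one trace row and emit a log line; extra kwargs go into meta as JSON."""
    trace_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO traces VALUES (?, ?, ?, ?, ?, ?)",
        (trace_id, time.time(), prompt, response, latency_ms, json.dumps(meta)),
    )
    conn.commit()
    log.info("trace %s: %.0f ms", trace_id, latency_ms)
    return trace_id
```

Wrap your LLM client so every call goes through `record_call`, and you can query latency percentiles or grep prompts with plain SQL. No dashboard, but also nothing to break.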
I switched from LangSmith to MLflow about six months ago - way better experience. MLflow’s tracking is solid and the interface doesn’t change every five minutes like LangSmith’s did. That constant disruption was killing my workflow. The prompt logging and experiment management just work without headaches. You should also check out Weights & Biases. Yeah, it’s trickier to set up and there’s more of a learning curve, but it’s rock solid once you get it running. I’d start with MLflow though - migrating is pretty painless and their docs are actually helpful.
OpenLLMetry’s worth checking out. I switched to it after LangSmith kept crashing during a client project. It’s made for LLM observability and drops right into Python workflows without breaking everything. Tracing works well and it’s been rock solid for me. Love that setup’s dead simple and it doesn’t try to do a million things poorly.
Helicone’s another solid pick if you’re using OpenAI APIs. Not as fancy as the others, but it nails the basics - monitoring and logging just work. Best part? Almost zero overhead and way better uptime than LangSmith’s constant issues.