I’m working on a chat application built with a React frontend and Flask backend. I’ve integrated Langsmith to evaluate my bot’s responses, but now I need to pull the evaluation data and show it in my dashboard.
Is there a way to fetch performance statistics from Langsmith through REST endpoints? I’ve been looking through their documentation but haven’t found clear examples of how to access this data programmatically.
import requests

def fetch_bot_metrics():
    # Need to implement the API call to get Langsmith evaluation data
    endpoint = "https://api.langsmith.com/metrics"
    headers = {"Authorization": "Bearer YOUR_TOKEN"}
    response = requests.get(endpoint, headers=headers)
    return response.json()
Has anyone successfully retrieved evaluation results from Langsmith using their API? What’s the correct approach for this?
Pulling metrics from multiple endpoints and doing manual correlation sucks. Been there when I was building chatbot dashboards.
Skip wrestling with Langsmith’s API - I built a Latenode workflow that does this automatically. It grabs data from any endpoints you need, handles the correlation, and feeds clean metrics straight to your dashboard.
Set it once, forget it. No more manual API calls or matching runs to feedback. Latenode transforms the data and aggregates however you want.
Mine runs hourly to keep dashboards current. Way better than Python scripts that break whenever Langsmith updates.
Honestly, the LangSmith Python SDK beats raw REST calls every time. Just from langsmith import Client, then client.list_runs() with filters - you get everything without fighting endpoints. Way less headache than manual API work.
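Something like this is all it takes - a minimal sketch, assuming your key is in the LANGSMITH_API_KEY env var; "my-chatbot" is just a placeholder project name, and the exact filter arguments are worth double-checking against the SDK docs:

from langsmith import Client

client = Client()  # picks up LANGSMITH_API_KEY from the environment

# Pull recent runs for one project, then the feedback attached to them
runs = list(client.list_runs(project_name="my-chatbot", limit=100))
feedback = list(client.list_feedback(run_ids=[run.id for run in runs]))

for fb in feedback:
    print(fb.key, fb.score)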
Yup, had the same issue. You should switch to /runs and maybe use query params for better filtering by project. Also, double-check your API key; it needs to be copied exactly from the Langsmith settings!
Been there before with Langsmith integration. The metrics aren’t at a single endpoint like you’d expect.
I had to combine data from multiple calls to get what I needed for our internal dashboard: first grab runs from the /runs endpoint, then pull feedback separately from /feedback. The tricky part is correlating them.
Here’s what actually worked for me:
import os
import requests

def fetch_bot_metrics(project_id):
    base_url = "https://api.smith.langchain.com"
    api_key = os.environ["LANGSMITH_API_KEY"]
    headers = {"Authorization": f"Bearer {api_key}"}

    # Get runs first
    runs_response = requests.get(
        f"{base_url}/runs",
        headers=headers,
        params={"project_id": project_id, "limit": 100},
    )
    runs = runs_response.json()

    # Then feedback for those runs
    feedback_response = requests.get(
        f"{base_url}/feedback",
        headers=headers,
        params={"run_ids": [r["id"] for r in runs]},
    )
    return runs, feedback_response.json()
Note the base URL is different than what you have. Also make sure your API key has the right scope for reading evaluation data.
Aggregating the metrics yourself is annoying but gives you more control over what shows up in your dashboard.
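If it helps, here's roughly how I correlate them - a minimal sketch, assuming each feedback record comes back with run_id, key, and score fields (verify the field names against what your /feedback response actually returns):

from collections import defaultdict

def aggregate_metrics(runs, feedback_items):
    # Correlate feedback to runs via run_id, then collect scores per evaluator key
    run_ids = {r["id"] for r in runs}
    scores_by_key = defaultdict(list)
    for fb in feedback_items:
        if fb.get("run_id") in run_ids and fb.get("score") is not None:
            scores_by_key[fb["key"]].append(fb["score"])

    # One averaged number per evaluator for the dashboard
    return {
        key: sum(values) / len(values)
        for key, values in scores_by_key.items()
    }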
Use the /runs/stats endpoint - it gives you all the aggregated metrics without having to piece together multiple API calls. I ran into this same problem building my Flask dashboard and this solved it. Just pass your project_id and you'll get computed averages, success rates, and evaluation scores in one response. Way cleaner than pulling individual runs and feedback separately. The endpoint's poorly documented, but it handles everything server-side. Double-check you're hitting https://api.smith.langchain.com and that your API key has analytics permissions turned on in the Langsmith console.
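Roughly what the call looks like - a sketch only; the HTTP method and parameter names here are assumptions, so verify them against the API reference or the network tab:

import os
import requests

def fetch_project_stats(project_id):
    # Aggregated run/evaluation stats for one project, computed server-side
    response = requests.get(
        "https://api.smith.langchain.com/runs/stats",
        headers={"Authorization": f"Bearer {os.environ['LANGSMITH_API_KEY']}"},
        params={"project_id": project_id},
    )
    response.raise_for_status()
    return response.json()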
Hit this same issue last month building analytics for our customer service bot. Langsmith’s API docs are terrible - they don’t tell you which endpoints actually have the evaluation data.
I ended up using /projects/{project_id}/runs with include_stats=true. Gets you run details and evaluation metrics in one call instead of hitting multiple endpoints.
Here’s the catch though - you won’t see any evaluation data unless you’ve set up evaluators in your Langsmith project first. Took me way too long to figure that out. No evaluators = empty metrics no matter what endpoint you use.
One more thing - their rate limiting is brutal. Had to add exponential backoff to my Flask service because requests kept getting throttled. Definitely cache the results since evaluation metrics barely change anyway.
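For anyone hitting the same throttling, this is roughly the backoff wrapper I mean - a generic sketch (the endpoint shape below just mirrors the include_stats call described above; confirm it against the API reference):

import os
import time
import requests

BASE_URL = "https://api.smith.langchain.com"
HEADERS = {"Authorization": f"Bearer {os.environ['LANGSMITH_API_KEY']}"}

def get_with_backoff(url, params=None, max_retries=5):
    # Retry with exponential backoff whenever Langsmith returns 429 Too Many Requests
    delay = 1.0
    for _ in range(max_retries):
        response = requests.get(url, headers=HEADERS, params=params, timeout=30)
        if response.status_code != 429:
            response.raise_for_status()
            return response.json()
        time.sleep(delay)
        delay *= 2  # 1s, 2s, 4s, ...
    raise RuntimeError("still throttled after retries")

def fetch_runs_with_stats(project_id):
    # The /projects/{project_id}/runs call from above; endpoint shape is an assumption
    return get_with_backoff(
        f"{BASE_URL}/projects/{project_id}/runs",
        params={"include_stats": "true"},
    )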
Wrong endpoint - use /sessions or /feedback instead. The /sessions endpoint worked for me when pulling eval data. Don't forget the project_id query param or you'll get a mess of everything.
Just dealt with this and had to dig around for hours. You want the /runs endpoint but need specific query parameters for aggregated metrics. /runs/query with POST works way better than GET for complex filtering. Include your project ID and specify which evaluation metrics you want in the request body. Their docs are trash on this - I used the network tab in their web interface to see what API calls they actually make. Also check your API key permissions. Most keys are read-only by default and won’t let you access evaluation data.
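A rough sketch of that POST - the body fields ("session", "filter", "limit") are assumptions, so mirror whatever the web UI actually sends in the network tab:

import os
import requests

def query_runs(project_id):
    # POST /runs/query with filters in the JSON body instead of query params
    response = requests.post(
        "https://api.smith.langchain.com/runs/query",
        headers={
            "Authorization": f"Bearer {os.environ['LANGSMITH_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "session": [project_id],            # project/session to filter on
            "filter": 'eq(run_type, "chain")',  # example filter expression
            "limit": 100,
        },
    )
    response.raise_for_status()
    return response.json()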