I’m building an application that needs to monitor token usage across different user projects. My initial approach was to use LangSmith traces and add a project_id to the run metadata so I could retrieve usage data for all runs associated with a given project.
This seemed like a solid solution until I realized that when users delete their projects, the connection between user projects and the project_ids in LangSmith gets broken.
I’m looking for suggestions on how to handle this properly. One option I’m considering is storing the total_tokens in my local database after each API call.
Also wondering about LangGraph agents - is there a way to capture tokens consumed when calling tools within agent workflows?
Token tracking gets messy quickly, especially with agent workflows. I’ve learned that middleware between your app and LLM providers beats post-call logging every time. Middleware catches everything - direct API calls, agent tool runs, streaming responses - before anything hits your app logic, so you get a complete audit trail no matter what happens to projects or external tracking (minimal sketch below).

With LangGraph, instrument at the graph level, not individual tools. Build a custom graph wrapper that watches token usage across the whole execution; that grabs tokens from tool calls, reasoning steps, and any intermediate API requests your agent makes.

For project deletions, try a token ownership transfer system. When a project dies, its tokens don’t vanish - they move to a “deleted projects” bucket under the same user. That keeps billing accurate and lets users see their historical usage.

One gotcha: some LLM providers report different token counts for identical requests depending on load balancing. Always validate against multiple sources instead of trusting a single one.
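A minimal sketch of the middleware idea, assuming an OpenAI-style client - the `MeteredClient` name and `record_usage` callable are placeholders for your own plumbing, not a real library API:

```python
from openai import OpenAI

class MeteredClient:
    """Wraps the provider client so every call is metered
    before the response reaches application logic."""

    def __init__(self, client: OpenAI, record_usage):
        self._client = client
        self._record_usage = record_usage  # e.g. writes a row to your tokens table

    def chat_completion(self, *, user_id: str, project_id: str, **kwargs):
        response = self._client.chat.completions.create(**kwargs)
        if response.usage is not None:  # populated on non-streaming responses
            self._record_usage(
                user_id=user_id,
                project_id=project_id,
                input_tokens=response.usage.prompt_tokens,
                output_tokens=response.usage.completion_tokens,
            )
        return response
```

Because the wrapper sits in front of the provider, the usage rows exist in your own database no matter what happens to the LangSmith project afterwards.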
Been there with token tracking nightmares. Local storage works but gets messy when you scale.
Automating the whole pipeline transformed our token monitoring. Instead of manually inserting records after each call, I built a workflow that captures tokens in real time, runs them through validation rules, and syncs with billing systems automatically.
Best part? Edge cases get handled without code changes. Failed API calls, partial responses, bulk operations - configurable logic flows handle it all. When LangGraph agents make tool calls, automation catches every token event and routes it correctly.
For project deletion, the workflow keeps token history even after projects vanish. Users get accurate billing and you keep audit trails.
Daily reconciliation runs automatic comparisons with provider reports. Discrepancies trigger alerts and correction workflows. No more manual checking or mystery charges.
You can add new token sources or change business rules without touching application code. Everything flows through the automation layer.
Store tokens locally after each API call - learned this the hard way when external tracking broke our billing system.
What works for me:
Simple tokens table: user_id, project_id, timestamp, input_tokens, output_tokens, cost. Insert after every API response. You control everything and it survives project deletions.
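A minimal sketch of that table and insert path, using SQLite for brevity (swap in your own driver; the columns match the ones listed above):

```python
import sqlite3

conn = sqlite3.connect("usage.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS tokens (
        id            INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id       TEXT NOT NULL,
        project_id    TEXT NOT NULL,
        ts            TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        input_tokens  INTEGER NOT NULL,
        output_tokens INTEGER NOT NULL,
        cost          REAL NOT NULL
    )
""")

def log_usage(user_id, project_id, input_tokens, output_tokens, cost):
    """Insert one row per API response; call this right after each call."""
    conn.execute(
        "INSERT INTO tokens (user_id, project_id, input_tokens, output_tokens, cost) "
        "VALUES (?, ?, ?, ?, ?)",
        (user_id, project_id, input_tokens, output_tokens, cost),
    )
    conn.commit()
```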
For LangGraph agents, hook into the callback system. A custom callback handler captures token usage from tool calls; the framework passes token counts through callbacks, so just intercept and store them.
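Roughly what that handler can look like. With langchain’s OpenAI integrations, `on_llm_end` receives an `LLMResult` whose `llm_output` usually carries a `token_usage` dict, but other providers report usage elsewhere, so verify this for your stack; `log_usage` is the helper from the sketch above:

```python
from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult

class TokenAuditHandler(BaseCallbackHandler):
    """Captures token usage from every LLM call the agent makes."""

    def __init__(self, user_id: str, project_id: str):
        self.user_id = user_id
        self.project_id = project_id

    def on_llm_end(self, response: LLMResult, **kwargs) -> None:
        usage = (response.llm_output or {}).get("token_usage", {})
        if usage:
            log_usage(
                self.user_id,
                self.project_id,
                usage.get("prompt_tokens", 0),
                usage.get("completion_tokens", 0),
                cost=0.0,  # compute from your provider's price table if needed
            )
```

Pass it at invocation time, e.g. `graph.invoke(inputs, config={"callbacks": [TokenAuditHandler("user-1", "proj-1")]})`; callbacks supplied through the config propagate to nested runs, so it should fire for model calls made inside tools too.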
Watch out for streaming responses. Token counts sometimes come in the final chunk instead of accumulating throughout the stream.
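With OpenAI’s API, for example, you have to opt in to usage on streams and then read it from the last chunk - a sketch, reusing the `log_usage` helper from above:

```python
from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
    stream_options={"include_usage": True},  # ask for usage in the stream
)

usage = None
for chunk in stream:
    if chunk.usage is not None:  # only the final chunk carries usage
        usage = chunk.usage
    for choice in chunk.choices:  # the usage chunk has an empty choices list
        pass  # handle streamed deltas as usual

if usage is not None:
    log_usage("user-1", "proj-1", usage.prompt_tokens, usage.completion_tokens, 0.0)
```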
Run a daily reconciliation job comparing your counts with the API provider’s usage reports. Catches missed calls or double counting.
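A sketch of such a job - `fetch_provider_total` is a placeholder for whatever usage export your provider offers (CSV download, usage API, dashboard export), not a real endpoint:

```python
from datetime import date

def local_total(conn, day: date) -> int:
    """Sum of tokens logged locally for one day (tokens table from above)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(input_tokens + output_tokens), 0) "
        "FROM tokens WHERE date(ts) = ?",
        (day.isoformat(),),
    ).fetchone()
    return row[0]

def reconcile(conn, day: date, fetch_provider_total, tolerance=0.01):
    local = local_total(conn, day)
    remote = fetch_provider_total(day)  # placeholder for your provider's report
    if remote and abs(local - remote) / remote > tolerance:
        # Hook your alerting here (email, Slack, pager).
        print(f"{day}: local={local} provider={remote} drift over {tolerance:.0%}")
```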
Database storage is definitely the way to go, but think about data retention policies early. I learned this the hard way after millions of token records started slowing down my queries.

For project deletion, use soft deletes instead of hard deletes. Add a deleted_at timestamp to projects - you can still access historical token data for billing while hiding deleted projects from users.

For LangGraph agents, grab token usage from the run metadata after each tool execution. The trick is intercepting tool results before they move to the next step. I wrap my tool calls in a decorator that pulls usage stats and writes them to the database immediately.

One thing nobody’s mentioned: add idempotency keys to your token logging. Network issues cause duplicate API calls, and you don’t want to double-count tokens. Use request IDs or generate unique hashes from the call parameters to prevent duplicates (sketch below).
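A sketch of the idempotency idea: hash the call parameters into a key and let a UNIQUE index reject duplicates. It assumes a `request_key` column added to the tokens table from the earlier answer, and the names are illustrative:

```python
import hashlib
import json
import sqlite3

conn = sqlite3.connect("usage.db")
# Assumes the tokens table has an extra `request_key TEXT` column.
conn.execute(
    "CREATE UNIQUE INDEX IF NOT EXISTS idx_tokens_request ON tokens (request_key)"
)

def request_key(params: dict) -> str:
    """Stable hash of the call parameters, used as an idempotency key."""
    return hashlib.sha256(
        json.dumps(params, sort_keys=True).encode()
    ).hexdigest()

def log_once(user_id, project_id, params, input_tokens, output_tokens, cost):
    try:
        conn.execute(
            "INSERT INTO tokens (user_id, project_id, input_tokens, "
            "output_tokens, cost, request_key) VALUES (?, ?, ?, ?, ?, ?)",
            (user_id, project_id, input_tokens, output_tokens, cost,
             request_key(params)),
        )
        conn.commit()
    except sqlite3.IntegrityError:
        pass  # this call was already logged; a retry produced a duplicate
```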