I’m trying to validate my SQL query generation system with the LangSmith evaluation framework, but I keep hitting a TypeError. The system works fine on its own but fails during evaluation.
The error happens when I run a dataset evaluation with custom evaluators. It complains about an unexpected keyword argument, ‘evaluator_run_id’, that the evaluate_run method doesn’t recognize.
Hit this nightmare last month updating our eval pipeline. Langsmith’s evaluation runner always passes evaluator_run_id to every evaluator, even if they don’t support it. You need to update your correctness_checker to accept this parameter. If it’s a class inheriting from DynamicRunEvaluator, change the method signature:
    def evaluate_run(self, run, example, evaluator_run_id=None):
        # your evaluation logic stays the same
        return your_evaluation_result
You don’t actually need to use evaluator_run_id; just accepting it stops the TypeError. LangSmith uses it internally for tracking, but most custom evaluators can ignore it. If you’re using the @run_evaluator decorator, it should handle this automatically, but double-check that your function signature includes the parameter or uses flexible arguments. This broke our production evals until we patched all our custom evaluators.
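To show why simply accepting the parameter is enough, here is a minimal sketch that reproduces the error without installing anything. The `fake_runner` function and both evaluator classes are hypothetical stand-ins, not LangSmith code; the only assumption carried over from the thread is that the runner passes `evaluator_run_id` as a keyword argument.

```python
import uuid


class OldStyleEvaluator:
    """Evaluator written before the extra keyword existed."""

    def evaluate_run(self, run, example):
        return {"key": "correctness", "score": int(run["output"] == example["expected"])}


class PatchedEvaluator:
    """Same logic, but the signature tolerates evaluator_run_id."""

    def evaluate_run(self, run, example, evaluator_run_id=None):
        # evaluator_run_id is accepted and deliberately ignored
        return {"key": "correctness", "score": int(run["output"] == example["expected"])}


def fake_runner(evaluator, run, example):
    """Hypothetical stand-in for the framework's runner (not LangSmith code).

    It always passes evaluator_run_id as a keyword argument.
    """
    return evaluator.evaluate_run(run, example, evaluator_run_id=uuid.uuid4())


run = {"output": "SELECT 1"}
example = {"expected": "SELECT 1"}

# The old signature rejects the unexpected keyword with a TypeError
try:
    fake_runner(OldStyleEvaluator(), run, example)
    old_style_error = None
except TypeError as exc:
    old_style_error = str(exc)

# The patched signature accepts it and evaluates normally
patched_result = fake_runner(PatchedEvaluator(), run, example)

print(old_style_error)
print(patched_result)
```

The evaluation logic is identical in both classes; only the signature differs, which is why patching the signature alone clears the error.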
Been there. API versioning headaches like this are exactly why I ditched these evaluation frameworks and switched to Latenode.
Instead of debugging langsmith conflicts, I built my SQL evaluation pipeline there. It connects to my database, runs generated queries, compares results against expected outputs, and logs everything to a spreadsheet automatically.
The workflow’s simple: trigger evaluation → execute SQL → validate results → store metrics. No dependency hell, no weird parameter errors. Just works.
I can set it to run evaluations on schedule or trigger from my CI pipeline. Way cleaner than managing langsmith versions and custom evaluator classes.
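For anyone who wants the same trigger → execute SQL → validate results → store metrics flow in plain code, here is a rough sketch using only the standard library. It assumes an in-memory SQLite database, a made-up `users` table, and a Python list in place of the spreadsheet; `evaluate_sql` is a hypothetical helper, not part of Latenode or LangSmith.

```python
import sqlite3


def evaluate_sql(conn, generated_sql, expected_rows):
    """Run one generated query and compare its rows to the expected rows."""
    try:
        actual = conn.execute(generated_sql).fetchall()
    except sqlite3.Error as exc:
        # A query that fails to execute counts as incorrect
        return {"sql": generated_sql, "passed": False, "error": str(exc)}
    # Compare order-insensitively, since the queries have no ORDER BY
    passed = sorted(actual) == sorted(expected_rows)
    return {"sql": generated_sql, "passed": passed, "error": None}


# Illustrative schema and data (assumed for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "bob")])

# Each case pairs a generated query with the rows it should return
cases = [
    ("SELECT name FROM users WHERE id = 1", [("ada",)]),
    ("SELECT name FROM users", [("bob",), ("ada",)]),
    ("SELECT nme FROM users", [("ada",)]),  # misspelled column, fails to execute
]

# "Store metrics": collected in a list here instead of a spreadsheet
metrics = [evaluate_sql(conn, sql, expected) for sql, expected in cases]
pass_rate = sum(m["passed"] for m in metrics) / len(metrics)
print(metrics, pass_rate)
```

Comparing result rows rather than SQL text means two differently written queries that return the same data both pass.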
This issue arises when your custom evaluator’s signature doesn’t match what the installed LangSmith version expects. The framework modified its evaluation method to pass additional parameters such as evaluator_run_id, which your correctness_checker may not have been designed to handle. To resolve this, modify your custom evaluator’s evaluate_run method to accept the new parameter, even if you don’t use it. If your implementation derives from DynamicRunEvaluator, include evaluator_run_id=None in your method signature or use **kwargs to accommodate any future parameters. I ran into a similar problem while migrating my evaluation code, and adding that parameter resolved it immediately.
Hit this exact problem 3 months ago migrating our query validation system. Langsmith changed their evaluator interface without proper backward compatibility.
What fixed it for me - modify your correctness_checker evaluator to handle extra parameters:
    from langsmith.evaluation.evaluator import DynamicRunEvaluator


    class CorrectnessChecker(DynamicRunEvaluator):
        def evaluate_run(self, run, example, **kwargs):
            # your existing evaluation logic here
            # kwargs will catch evaluator_run_id and any future params
            pass
The **kwargs approach is bulletproof since langsmith keeps adding random parameters without warning. Learned this after they broke my evaluators three times.
If you’re using function-based evaluators, wrap them with the @run_evaluator decorator and make sure the function accepts **kwargs too.
Upgrading langsmith might work but I’ve seen it break other stuff. The kwargs solution keeps you compatible with old and new versions.
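To illustrate the compatibility claim, here is a small sketch where the same **kwargs-based method is called both ways. `CorrectnessChecker` here is a plain class for illustration and does not import langsmith; the two call styles and `some_future_param` are assumptions standing in for older and newer runner behavior.

```python
class CorrectnessChecker:
    """Plain class used for illustration; it does not import langsmith."""

    def evaluate_run(self, run, example, **kwargs):
        # kwargs swallows evaluator_run_id and anything added later
        score = int(run["sql"].strip().lower() == example["sql"].strip().lower())
        return {"key": "correctness", "score": score, "ignored": sorted(kwargs)}


checker = CorrectnessChecker()
run = {"sql": "SELECT id FROM users"}
example = {"sql": "select id from users"}

# Older-style call: only run and example are passed
old_call = checker.evaluate_run(run, example)

# Newer-style call: extra keyword arguments are passed along
new_call = checker.evaluate_run(
    run, example, evaluator_run_id="abc-123", some_future_param=True
)

print(old_call)
print(new_call)
```

The trade-off is that **kwargs also hides typos in keyword names, so an explicit evaluator_run_id=None is the stricter choice if you only need to handle that one parameter.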
sounds like a version mismatch. had this exact problem last week - your langsmith version’s probably outdated. run pip install --upgrade langsmith first. the evaluator_run_id parameter only exists in newer versions, so older evaluators choke on it.
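Before upgrading, it can help to confirm which version is actually installed in the environment that runs the evaluation. A minimal sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name, and the sketch makes no claim about which langsmith version introduced the parameter.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report what the current interpreter can see
print(f"langsmith: {installed_version('langsmith') or 'not installed'}")
```

Run it with the same interpreter that runs your evals, since a notebook kernel or virtualenv can have a different version than your shell.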