New Study Reveals AI Support Bots Struggle with Basic Customer Service Tasks

I’ve been looking into predictions that AI chatbots will replace customer service roles soon, but recent research left me questioning that idea.

Researchers recently published findings showing that AI customer service agents perform poorly: they succeed only about 58% of the time on straightforward questions, and when customers need to ask several follow-up questions, the success rate drops to roughly 35%.

What caught my attention is that these AI systems fail most often at the very tasks they’re supposed to excel at. For simple inquiries like checking an order’s status or resetting a password, we already have effective solutions that don’t rely on pricey AI.
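Just to make that concrete, here’s the kind of thing I mean by a solution that doesn’t need AI. This is a hypothetical toy sketch in Python (the ORDERS table and function names are mine, not from the study): an order-status check is a plain lookup and a password reset is just a trigger, no language model required.

```python
# Hypothetical sketch of "effective solutions that don't rely on AI":
# a plain lookup handles an order-status check, and a password reset is just a
# trigger for an email. All names (ORDERS, handle_order_status, ...) are made up.

ORDERS = {
    "A1001": "shipped",
    "A1002": "processing",
}

def handle_order_status(order_id: str) -> str:
    """Deterministic dictionary lookup -- no language model involved."""
    status = ORDERS.get(order_id)
    if status is None:
        return f"Order {order_id} not found; please double-check the number."
    return f"Order {order_id} is currently: {status}."

def handle_password_reset(email: str) -> str:
    """A real system would enqueue a reset email here; this just confirms the request."""
    return f"A password reset link has been sent to {email}."

if __name__ == "__main__":
    print(handle_order_status("A1001"))              # Order A1001 is currently: shipped.
    print(handle_password_reset("user@example.com"))
```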

The study also uncovered alarming issues, such as AI agents inadvertently disclosing private customer details from databases. More than half the time, one particular model failed to gather all the necessary information before attempting to assist the customer.
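The paper doesn’t spell out how the leaks happened internally, but a common safeguard for this class of problem is to filter customer records down to an explicit allowlist of fields before the bot ever sees them. Here’s a rough, hypothetical Python sketch of that idea; the field names are invented for illustration, and this isn’t a description of how the studied systems actually work.

```python
# Rough, hypothetical sketch of a field allowlist: the bot only ever receives a
# redacted view of the customer record. Field names are invented for illustration.

ALLOWED_FIELDS = {"first_name", "plan_tier", "order_count"}

def redact_record(record: dict) -> dict:
    """Return only the fields a support bot is permitted to see or repeat."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}

customer = {
    "first_name": "Dana",
    "plan_tier": "premium",
    "order_count": 7,
    "ssn": "123-45-6789",     # should never reach the bot
    "card_last4": "4242",     # should never reach the bot
}

print(redact_record(customer))
# {'first_name': 'Dana', 'plan_tier': 'premium', 'order_count': 7}
```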

A separate paper from another organization reported similar findings, stating that when these AI systems make a mistake or overlook crucial information, they are unable to rectify it during the interaction.

I’m curious to hear others’ perspectives on this. Companies are aggressively trying to replace human customer service agents with AI, but if these systems are this unreliable, wouldn’t that just leave customers frustrated? And if AI can’t handle basic support tasks effectively, how can we trust it with more significant decision-making?

Honestly, the study results don’t surprise me. I work in tech support, and we’ve been testing various AI solutions for the past year. The biggest issue I’ve noticed is that AI bots are terrible at handling context switching within a conversation. A customer might start by asking about billing, then mention a technical issue, and the bot gets completely lost trying to juggle both topics. Human agents handle this kind of conversational flow naturally, but AI systems seem to treat each input as isolated rather than as part of an ongoing dialogue.

The security concerns you mentioned are particularly worrying, because once customer data gets exposed, that’s not something you can easily fix. Companies are rushing to implement these systems without properly stress-testing them in real-world scenarios where customers don’t follow perfect scripts.
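To show what I mean by context switching, here’s a toy Python sketch. Everything in it (the keyword lists, the Conversation class) is made up for illustration, and real dialogue state tracking is far more involved. The point is just that keeping every raised topic open is what lets an agent come back to the billing question after the technical detour, whereas a stateless bot only ever reacts to the latest message.

```python
# Toy illustration of the context-switching problem. Keyword lists and class
# names are made up; real systems use far more sophisticated state tracking.

TOPIC_KEYWORDS = {
    "billing": {"invoice", "charge", "refund", "billing"},
    "technical": {"error", "crash", "login", "bug"},
}

class Conversation:
    """Keeps every topic the customer has raised open until it is resolved."""

    def __init__(self):
        self.open_topics: list[str] = []

    def handle(self, message: str) -> str:
        words = set(message.lower().split())
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords and topic not in self.open_topics:
                self.open_topics.append(topic)
        # A stateless bot would respond only to the latest message; tracking
        # open_topics is what lets an agent return to the billing question
        # after the technical detour.
        return f"Open topics so far: {self.open_topics}"

convo = Conversation()
print(convo.handle("I was double charged on my last invoice"))   # ['billing']
print(convo.handle("Also the app keeps showing a login error"))  # ['billing', 'technical']
```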