Is there a way to fix a Discord bot that doesn't recognize other users' speech?

Liam_25Meditation · April 18, 2025, 10:35am

Hey everyone! I’m having trouble with my Discord bot. It’s supposed to listen to what people say in voice channels and move them if they use a certain word. But it only works when I talk. It doesn’t pick up other people’s voices at all.

I’ve tried different libraries and methods, including Discord’s audio features and FFmpeg. Nothing seems to work right. I even added a console output to see what words are being detected, but it only shows my own speech.

Here’s a simplified version of my code:

import discord
import vosk
import asyncio
import pyaudio

bot = discord.Bot()
model = vosk.Model('path/to/model')
audio_queue = asyncio.Queue()

async def process_audio(vc):
    recognizer = vosk.KaldiRecognizer(model, 16000)
    while vc.is_connected():
        audio = await audio_queue.get()
        if recognizer.AcceptWaveform(audio):
            result = recognizer.Result()
            print(f"Heard: {result}")
            if 'target_word' in result:
                # Move users (omitted in this simplified version)
                pass

@bot.event
async def on_voice_state_update(member, before, after):
    if after.channel and bot.user not in after.channel.members:
        vc = await after.channel.connect()
        asyncio.create_task(process_audio(vc))

bot.run('YOUR_TOKEN')

Any ideas on how to make it recognize everyone’s speech? Thanks in advance!

John_Clever · April 26, 2025, 4:51pm

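I’ve encountered this problem before. The issue likely stems from where the audio comes from. pyaudio records from the local microphone of the machine running the bot, so if that is what feeds audio_queue, the bot will only ever hear you. Other users’ audio has to come from the Discord voice connection itself, which delivers each speaker as a separate stream rather than one pre-mixed signal. To overcome this, you’ll need to receive those streams and keep them separated by user.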
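Once audio arrives tagged with a user ID, keeping speakers apart can be as simple as one recognizer per user. Here is a minimal sketch of that idea. PerUserRecognizers, feed and said_word are made-up names for illustration, and the only Vosk calls assumed are AcceptWaveform() and Result(), the same two used in the question. How the (user_id, pcm) pairs reach feed() depends on whichever receive API you end up using.

```python
import json
from typing import Callable, Dict, Optional, Protocol


class Recognizer(Protocol):
    """The two vosk.KaldiRecognizer methods this sketch relies on."""

    def AcceptWaveform(self, data: bytes) -> bool: ...
    def Result(self) -> str: ...


class PerUserRecognizers:
    """Hypothetical helper: one recognizer per speaker, created on demand."""

    def __init__(self, make_recognizer: Callable[[], Recognizer]) -> None:
        self._make = make_recognizer
        self._by_user: Dict[int, Recognizer] = {}

    def feed(self, user_id: int, pcm: bytes) -> Optional[str]:
        """Feed one PCM chunk; return the finished utterance text, if any."""
        rec = self._by_user.get(user_id)
        if rec is None:
            rec = self._by_user[user_id] = self._make()
        if rec.AcceptWaveform(pcm):
            # Result() is a JSON string such as '{"text": "hello there"}'.
            return json.loads(rec.Result()).get("text", "")
        return None


def said_word(text: str, target: str) -> bool:
    """Whole-word, case-insensitive match on a recognized utterance."""
    return target.lower() in text.lower().split()
```

With Vosk you would build it as PerUserRecognizers(lambda: vosk.KaldiRecognizer(model, 16000)). Matching on the parsed text instead of the raw JSON string also avoids false hits on substrings.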

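Consider using a ready-made package like discord-speech-recognition, which is designed to handle multi-user speech recognition for Discord bots and can differentiate between speakers. Check which Discord library it targets before committing to it, since packages in this space are usually tied to one specific library.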

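Alternatively, you could receive the raw voice data yourself and process each user’s stream separately: decode it to PCM, convert it to the format your recognizer expects, and run recognition per speaker. This approach is more complex but offers greater control over audio processing.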
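One detail that tends to bite when going the raw route is the audio format. The sketch below assumes your receive library hands you decoded 48 kHz, 16-bit little-endian stereo PCM (the usual decoded format for Discord voice), while the recognizer in the question is created for 16000 Hz mono. The function name to_16k_mono is made up, and the averaging is a crude stand-in for a real resampler.

```python
import sys
from array import array


def to_16k_mono(pcm_48k_stereo: bytes) -> bytes:
    """Convert 16-bit LE 48 kHz stereo PCM to 16 kHz mono (rough sketch).

    Each output sample is the average of three consecutive stereo frames,
    i.e. six int16 values. That downmixes and decimates by 3 in one step.
    """
    # Drop any trailing bytes that do not form a full group of three frames.
    usable = len(pcm_48k_stereo) - len(pcm_48k_stereo) % 12
    samples = array("h")
    samples.frombytes(pcm_48k_stereo[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian

    out = array("h")
    for i in range(0, len(samples), 6):
        out.append(sum(samples[i:i + 6]) // 6)

    if sys.byteorder == "big":
        out.byteswap()  # emit little-endian again
    return out.tobytes()
```

A typical 20 ms frame in that input format is 3840 bytes and comes out as 640 bytes. For better recognition quality a proper resampler is probably worth it, but this is enough to check that the pipeline works end to end.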

Remember to thoroughly test your solution across different servers and user configurations to ensure reliability.

Nova56 · April 23, 2025, 2:55pm

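Hey Liam, I feel your frustration. I’ve been there with my own Discord bot projects. One thing that really helped me was switching to a voice client that can actually receive audio and attaching a sink to it. Note that stock discord.py doesn’t receive voice on its own: the listen() method comes from the discord-ext-voice-recv extension, while Pycord (which your discord.Bot() suggests you’re using) has start_recording() with a sink. Either one captures audio from all users in a voice channel, not just the bot owner.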

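Also, double-check your bot’s permissions on the server. It needs View Channel and Connect to join the voice channel, and Move Members to actually move people, and it shouldn’t be deafened. The ‘Use Voice Activity’ permission only controls whether someone can talk without push-to-talk, so it isn’t what lets the bot hear others. I once spent hours debugging only to realize I’d forgotten to grant a permission!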
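If you want to rule permissions out quickly, you can test the bot’s permission integer directly. This is a small sketch that assumes the bit positions from Discord’s documented permission flags and picks the permissions this kind of bot most plausibly needs (joining the channel and moving members). The function name missing_voice_permissions is made up.

```python
from typing import List

# Bit positions from Discord's documented permission flags.
ADMINISTRATOR = 1 << 3
VIEW_CHANNEL = 1 << 10
CONNECT = 1 << 20
MOVE_MEMBERS = 1 << 24

# Assumed requirement set for a "listen and move users" bot.
REQUIRED = {
    "VIEW_CHANNEL": VIEW_CHANNEL,
    "CONNECT": CONNECT,
    "MOVE_MEMBERS": MOVE_MEMBERS,
}


def missing_voice_permissions(permissions: int) -> List[str]:
    """Return the names of required permissions absent from the bitfield."""
    if permissions & ADMINISTRATOR:
        return []  # Administrator implies every permission
    return [name for name, bit in REQUIRED.items() if not permissions & bit]
```

In Pycord or discord.py the integer should be available as channel.permissions_for(guild.me).value, so an empty list here means permissions are probably not your problem.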

If recognition quality is still a problem once the bot is receiving everyone’s audio, you might want to look into a dedicated speech recognition service like the Google Cloud Speech-to-Text API. Just be aware it can get pricey if your bot is in many servers.

Keep at it, and don’t hesitate to ask for more help if you need it. Discord bot development can be tricky, but it’s super rewarding when you get it working!

amelial · April 23, 2025, 11:11am

hey there! i’ve had similar issues before. have u checked ur bot’s permissions? sometimes it needs specific voice perms to join and hear others. also, make sure ur using the right audio sink in ur code. a voice client with a listen() method (e.g. from the discord-ext-voice-recv extension for discord.py) might help. good luck fixing it!