Discord bot failing to detect other users' speech in voice channels?

Hey everyone, I’m having trouble with my Discord bot. It’s supposed to listen for certain keywords in voice chats and move people to another channel when they say them. The weird thing is, it only works when I talk. It doesn’t pick up on what anyone else is saying.

I’ve tried different libraries and methods, including Discord’s audio features and FFmpeg. Nothing seems to work right. I even added a console output to see what words were being detected, but it only shows my own speech.

Here’s a simplified version of my code:

import discord
import vosk
import asyncio
import pyaudio

bot = discord.Bot(intents=discord.Intents.default())
model = vosk.Model('path/to/model')

async def listen_to_vc(voice_channel):
    recognizer = vosk.KaldiRecognizer(model, 16000)
    stream = pyaudio.PyAudio().open(format=pyaudio.paInt16, channels=1, rate=16000, input=True)
    
    while voice_channel.is_connected():
        audio = stream.read(4000)
        if recognizer.AcceptWaveform(audio):
            text = recognizer.Result()
            print(f"Heard: {text}")
            if 'keyword' in text:
                await move_users(voice_channel)

@bot.event
async def on_voice_state_update(member, before, after):
    if after.channel and bot.user not in after.channel.members:
        vc = await after.channel.connect()
        asyncio.create_task(listen_to_vc(vc))

bot.run('YOUR_TOKEN_HERE')

Any ideas on why it’s not detecting other users’ speech? I’m really stuck here.

I’ve encountered a similar issue before. First, rule out the basics. There is no separate ‘Voice Activity’ intent to enable in the Discord Developer Portal: voice state events are covered by the voice_states intent, which is not privileged and is already included in discord.Intents.default(). Do check that the bot has the necessary permissions in the server settings, particularly ‘View Channels’ and ‘Connect’ for voice channels, although since your bot manages to join the channel, those are probably fine.

The more likely cause is the audio input. Your code opens the stream with pyaudio, which reads from the default input device of the machine running the bot, i.e. your own microphone, not Discord’s audio. That would explain why it only hears you. To capture everyone in the channel, use your library’s voice receive API instead. In Pycord (which your discord.Bot suggests you’re on) that is vc.start_recording() with a sink; stock discord.py has no built-in voice receive, though the discord-ext-voice-recv extension adds a listen() method.
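To illustrate the per-speaker side of this: received audio is delivered per user, so each speaker needs their own recognizer state, otherwise utterances from different people get mixed together. Below is a minimal sketch of that routing, not a definitive implementation. PerUserRecognizers and FakeRecognizer are hypothetical names; the fake only mimics the AcceptWaveform()/Result() pair that the question’s Vosk code already calls, so the example runs without a model. It also parses the JSON string that Result() returns rather than substring-matching on it.

```python
import json
from typing import Callable, Dict


class PerUserRecognizers:
    """Route PCM chunks to one recognizer per speaker (hypothetical helper)."""

    def __init__(self, make_recognizer: Callable[[], object], keyword: str):
        self._make = make_recognizer
        self._keyword = keyword.lower()
        self._by_user: Dict[int, object] = {}

    def feed(self, user_id: int, pcm: bytes) -> bool:
        """Return True when this user's finished utterance contains the keyword."""
        rec = self._by_user.get(user_id)
        if rec is None:
            rec = self._by_user[user_id] = self._make()
        if not rec.AcceptWaveform(pcm):
            return False  # utterance not finished yet
        # Result() yields a JSON string such as {"text": "..."}.
        text = json.loads(rec.Result()).get("text", "")
        return self._keyword in text.lower().split()


class FakeRecognizer:
    """Stand-in for testing: treats bytes as UTF-8 text, '.' ends an utterance."""

    def __init__(self):
        self._buf = b""

    def AcceptWaveform(self, data: bytes) -> bool:
        self._buf += data
        return self._buf.endswith(b".")

    def Result(self) -> str:
        text, self._buf = self._buf.rstrip(b".").decode(), b""
        return json.dumps({"text": text})
```

With a real model you would pass something like lambda: vosk.KaldiRecognizer(model, 16000) as make_recognizer and call feed() from wherever your library hands you each user’s audio.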

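Lastly, whichever recognizer you use, mind the audio format. Discord voice is normally decoded to 48 kHz, 16-bit stereo PCM, while your KaldiRecognizer is created for 16000 Hz mono input, so the audio has to be downmixed and resampled before you feed it in. Vosk is great and should work fine once the format matches. If you’d rather switch, something like speech_recognition is another option, but you still have to hand it audio in a format it expects.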
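On the format point, here is a rough sketch of the conversion. It assumes the received audio is 48 kHz, 16-bit little-endian stereo PCM and that the recognizer wants 16 kHz mono, as in the question’s code. discord_pcm_to_vosk is a made-up helper name, and averaging each group of three samples is a crude stand-in for a proper resampler.

```python
import struct

SOURCE_RATE = 48000  # assumed rate of the decoded Discord audio
TARGET_RATE = 16000  # rate the recognizer in the question is created with


def discord_pcm_to_vosk(pcm: bytes) -> bytes:
    """Downmix s16le stereo to mono and decimate 48 kHz -> 16 kHz."""
    usable = len(pcm) - len(pcm) % 4  # drop any incomplete stereo frame
    samples = struct.unpack(f"<{usable // 2}h", pcm[:usable])

    # Average left and right channels into a single mono sample.
    mono = [(samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)]

    # Average every 3 mono samples into 1 (48000 / 16000 == 3).
    step = SOURCE_RATE // TARGET_RATE
    out = [
        sum(mono[i:i + step]) // step
        for i in range(0, len(mono) - step + 1, step)
    ]
    return struct.pack(f"<{len(out)}h", *out)
```

The result can be passed to AcceptWaveform() in place of the raw bytes. For better recognition quality a real resampling library would be the safer choice.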

hey mate, had the same problem. check ur bot’s permissions in server settings, make sure it can view and connect to voice channels. also, drop pyaudio, it only records the mic on whatever machine the bot runs on. use your library’s voice receive instead (in pycord that’s vc.start_recording() with a sink). it worked way better for me catching everyone’s voice. good luck!

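I’ve dealt with this exact problem before, and it can be super frustrating. One thing that often gets overlooked is where the audio actually comes from. Joining a voice channel doesn’t by itself give your code anyone’s audio; you only get other users’ voice streams if you explicitly use the library’s voice receive support.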

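First, make sure the voice_states intent is on when initializing your bot. It’s already part of the defaults, so setting it explicitly is just a safeguard. You don’t need message_content for this, and enabling it in code without also enabling it in the Developer Portal will make the bot fail at login:

intents = discord.Intents.default()
intents.voice_states = True  # already True in Intents.default()
bot = discord.Bot(intents=intents)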

Also, make sure you’re on a recent version of your library. Your code uses discord.Bot, which points to Pycord rather than discord.py, and that matters here: stock discord.py doesn’t ship voice receive at all.

What worked for me was to use a voice sink instead of pyaudio. Unlike pyaudio, it actually receives the channel’s audio, one speaker at a time:

# Assumes Pycord (py-cord 2.x); stock discord.py has no discord.sinks module.
class VoiceSink(discord.sinks.Sink):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def write(self, data, user):
        # Called for each decoded PCM packet; `user` is the speaker's ID.
        self.chunks.append((user, data))

async def on_recording_stopped(sink, *args):
    # Runs after vc.stop_recording() is called.
    print(f"Captured {len(sink.chunks)} packets")

sink = VoiceSink()
vc.start_recording(sink, on_recording_stopped)

This approach has been more consistent in picking up everyone’s voice in my experience. Hope it helps!
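One practical follow-up when using a sink: assuming the library invokes write() from a worker thread rather than the event loop (worth verifying for your version), you can’t await move_users(...) directly from inside it. Here is a hedged sketch of handing the call over with asyncio.run_coroutine_threadsafe. move_users is a placeholder with an invented signature, and the thread only simulates the sink callback.

```python
import asyncio
import threading


async def move_users(moved: list, user_id: int) -> None:
    # Placeholder for the real coroutine that would call member.move_to(...).
    moved.append(user_id)


def on_keyword(loop: asyncio.AbstractEventLoop, moved: list, user_id: int):
    # Safe to call from any thread: schedules the coroutine on the given loop
    # and returns a concurrent.futures.Future.
    return asyncio.run_coroutine_threadsafe(move_users(moved, user_id), loop)


async def demo() -> list:
    loop = asyncio.get_running_loop()
    moved: list = []
    futures: list = []
    # Simulate the sink calling back from a non-event-loop thread.
    worker = threading.Thread(
        target=lambda: futures.append(on_keyword(loop, moved, 42))
    )
    worker.start()
    await asyncio.to_thread(worker.join)   # wait without blocking the loop
    await asyncio.wrap_future(futures[0])  # wait for move_users to finish
    return moved
```

In a bot, loop would be the bot’s running event loop, captured once before recording starts, and on_keyword would be called from write() when the recognizer reports a match.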