I’m running a Twitch bot using Node.js to moderate our channel’s chat. We’ve got a problem with spam raids. These bots flood the chat with weird messages. They’re usually in Russian or use odd character replacements.
Right now we filter on keywords, and that isn’t great. It only works after we’ve already seen the spam and added the new keywords. Is there a better way to catch these messages before they flood the chat? Maybe some kind of pattern recognition?
yo, i had the same issue. try using regex for weird character patterns like /[\u0400-\u04FF]{3,}/ for russian. also, track message frequency. if a user sends like 5+ msgs in 10 secs, probly spam. u could use a scoring system too, where sus stuff adds points. when it hits a threshold, mute em
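A rough sketch of that scoring idea in plain Node.js, in case it helps. The function name `scoreMessage`, the point values, and the thresholds are all made up for illustration, and nothing here talks to Twitch: you would call it from whatever message handler your bot already has and do the actual mute there.

```javascript
// Hypothetical thresholds and point values; tune them for your channel.
const WINDOW_MS = 10000;       // look-back window for message frequency
const MAX_MSGS_IN_WINDOW = 5;  // 5+ messages inside the window counts as flooding
const MUTE_THRESHOLD = 5;      // accumulated score at which the user should be muted

// Three or more consecutive Cyrillic characters (regex from the reply above).
const CYRILLIC_RUN = /[\u0400-\u04FF]{3,}/;

const recentMessages = new Map(); // username -> timestamps (ms) of recent messages
const scores = new Map();         // username -> accumulated spam score

function scoreMessage(username, text, now = Date.now()) {
  // Keep only timestamps still inside the window, then record this message.
  const times = (recentMessages.get(username) || []).filter(t => now - t < WINDOW_MS);
  times.push(now);
  recentMessages.set(username, times);

  let points = 0;
  if (CYRILLIC_RUN.test(text)) points += 2;            // suspicious script
  if (times.length >= MAX_MSGS_IN_WINDOW) points += 3; // flooding

  const total = (scores.get(username) || 0) + points;
  scores.set(username, total);
  return { total, shouldMute: total >= MUTE_THRESHOLD };
}
```

Scores only ever go up in this sketch, so a real bot would probably want to decay or reset them over time. Also keep in mind that scoring Cyrillic on its own will hit legit Russian-speaking viewers, which is why it adds points instead of muting outright.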
I’ve dealt with similar spam issues on my Twitch channel, and I found that relying solely on keyword filtering isn’t enough. What worked well for me was implementing a multi-layered approach.
First, I’d suggest using regular expressions to catch patterns like repeated characters or unusual Unicode. Something like:
/([^\s]{10,}|(.)\2{5,})/
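This catches any run of 10 or more non-whitespace characters, as well as any single character repeated six or more times in a row. Keep in mind that 10 is a low bar: ordinary long words and most URLs will match too, so you may want to raise it or treat a match as one signal among several.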
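Here is a small sketch of how that pattern could be applied and how to tell which half of it fired. The helper name `checkPattern` and the reason labels are just for illustration.

```javascript
// Pattern from above: group 2 holds the repeated character, if that branch matched.
const SUSPICIOUS = /([^\s]{10,}|(.)\2{5,})/;

function checkPattern(text) {
  const match = SUSPICIOUS.exec(text);
  if (!match) return null;
  // Group 2 is only set when the repeated-character alternative matched.
  const reason = match[2] !== undefined ? 'repeated-character' : 'long-token';
  return { matched: match[0], reason };
}

console.log(checkPattern('hello world'));     // no match
console.log(checkPattern('nooooooo'));        // repeated-character branch
console.log(checkPattern('congratulations')); // long-token branch: an ordinary word
```

Because the long-token alternative is tried first, a run of 10 or more identical characters is reported as `long-token`, not `repeated-character`.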
Additionally, consider implementing a rate limit for new accounts or first-time chatters. Most spam bots will trigger this immediately.
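A minimal sketch of a first-time-chatter limit, with some assumptions: "new" here means new to this bot process (it remembers when it first saw each username in memory), not account age, which you would have to look up from Twitch separately. The name `allowMessage` and all the numbers are placeholders.

```javascript
// Hypothetical limits; adjust to taste.
const NEWCOMER_PERIOD_MS = 5 * 60 * 1000; // treat a chatter as new for 5 minutes
const NEWCOMER_LIMIT = 2;                 // messages a newcomer may send per window
const LIMIT_WINDOW_MS = 30 * 1000;        // length of the rate-limit window

const firstSeen = new Map();   // username -> timestamp (ms) of first message seen
const newcomerLog = new Map(); // username -> timestamps of recent allowed messages

function allowMessage(username, now = Date.now()) {
  if (!firstSeen.has(username)) firstSeen.set(username, now);

  // Established chatters are not limited by this check.
  if (now - firstSeen.get(username) >= NEWCOMER_PERIOD_MS) {
    newcomerLog.delete(username);
    return true;
  }

  // Newcomer: count only the messages still inside the window.
  const times = (newcomerLog.get(username) || []).filter(t => now - t < LIMIT_WINDOW_MS);
  const allowed = times.length < NEWCOMER_LIMIT;
  if (allowed) times.push(now);
  newcomerLog.set(username, times);
  return allowed;
}
```

When this returns `false`, the bot can delete the message or time the user out. Since the maps live in memory, everyone counts as new again after a restart, so you may want to persist `firstSeen` somewhere.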
Lastly, machine learning models can be incredibly effective. There are some open-source projects that use natural language processing to identify spam patterns. It might be worth looking into integrating one of these into your bot.
Remember, it’s an ongoing battle. Keep updating your methods as spammers evolve their tactics.