How to match similar-looking characters with different Unicode values in JavaScript

I’m trying to check if a pattern exists within a text string, but I’m running into issues with characters that look identical but have different Unicode values.

For example:

  • Pattern: D (code point 68, U+0044)
  • Text: Dog, where the first character is actually code point 1044 (U+0414)

Both characters appear as the letter “D” visually, but text.includes(pattern) returns false because they have different Unicode code points.

const pattern = 'D';
const text = '\u0414og'; // First char is Cyrillic capital De (1044), written as an escape so it survives copy/paste

console.log(pattern.codePointAt(0)); // 68
console.log(text.codePointAt(0)); // 1044
console.log(text.includes(pattern)); // false

I believe these are called homoglyphs - characters that look the same but are encoded differently. Is there a native JavaScript method to handle this comparison, or do I need to implement a custom solution? I’d prefer not to maintain a large mapping table if possible.

JavaScript doesn’t have built-in homoglyph detection, but you can use the Intl.Collator API as a partial workaround. With sensitivity: 'base' it ignores case and accent differences, so variants of the same base letter (a, A, á) compare as equal:

const collator = new Intl.Collator('en', { sensitivity: 'base' });
const pattern = 'D';
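const text = '\u0414og'; // First char is Cyrillic capital De (U+0414)

console.log(collator.compare(pattern, text[0]) === 0); // false: Cyrillic and Latin letters differ at the base level
console.log(collator.compare('a', 'Á') === 0); // true: case and accent differences are ignored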

This won’t work for characters from different scripts (Cyrillic vs Latin), because the collator treats them as different base letters. For proper homoglyph detection, you’ll need a dedicated library like confusable-homoglyphs or build your own mapping table. The Unicode Consortium publishes a confusables list (confusables.txt, part of UTS #39, Unicode Security Mechanisms) that you can use to generate these mappings.
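If you go the mapping-table route, here is a minimal sketch of the idea. The table below is a tiny hand-picked subset that I typed in for illustration, not data pulled from confusables.txt, and the names CONFUSABLES, skeleton, and includesConfusable are my own. The entry for U+0414 is an assumption taken straight from the question's example. In practice you would generate the table from the Unicode data instead of maintaining it by hand.

```javascript
// Hypothetical, hand-picked subset of look-alike pairs (Cyrillic -> Latin).
// A real table would be generated from Unicode's confusables.txt.
const CONFUSABLES = new Map([
  ['\u0410', 'A'], // Cyrillic capital A
  ['\u0412', 'B'], // Cyrillic capital Ve
  ['\u0415', 'E'], // Cyrillic capital Ie
  ['\u041E', 'O'], // Cyrillic capital O
  ['\u0420', 'P'], // Cyrillic capital Er
  ['\u0421', 'C'], // Cyrillic capital Es
  ['\u0430', 'a'], // Cyrillic small a
  ['\u043E', 'o'], // Cyrillic small o
  ['\u0441', 'c'], // Cyrillic small es
  ['\u0414', 'D'], // pair from the question (assumption, added by hand)
]);

// Fold a string to a comparable "skeleton": normalize compatibility forms,
// then replace every mapped character with its Latin look-alike.
function skeleton(str) {
  return Array.from(str.normalize('NFKC'), (ch) => CONFUSABLES.get(ch) ?? ch).join('');
}

// Substring test that compares skeletons instead of raw code points.
function includesConfusable(text, pattern) {
  return skeleton(text).includes(skeleton(pattern));
}

console.log(includesConfusable('\u0414og', 'D')); // true
console.log('\u0414og'.includes('D'));            // false
```

Folding both sides means it doesn't matter whether the look-alike shows up in the pattern or in the text. Keep the original strings around for display and only use the skeletons for comparison.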

ugh, this is such a pain. i dealt with something similar for user input validation. i used string normalization first - normalize('NFD') to decompose characters, then stripped the combining accent marks. but that won’t help with a cyrillic/latin mix, since those letters don’t decompose into each other. try converting both strings to their ascii equivalents and lowercasing them before comparing? there are some npm libs for this but i can’t remember the names right now.
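for reference, the normalize-then-strip approach looks roughly like this. it's a sketch, and stripAccents is just a name i made up. it handles accented latin letters fine, but as said above it does nothing for the cyrillic case:

```javascript
// Decompose characters (NFD), then drop the combining marks (\p{M}).
// Needs an engine with Unicode property escapes in regexes.
function stripAccents(str) {
  return str.normalize('NFD').replace(/\p{M}/gu, '');
}

console.log(stripAccents('Caf\u00E9') === 'Cafe');    // true
console.log(stripAccents('\u0414og').includes('D'));  // false
```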