I need help parsing a string that has HTML markup mixed in. I want to break it down so that regular text gets split into individual characters (spaces included), while HTML tags stay intact as complete elements.
let text = 'hel<em class="highlight">lo wo</em>rld';
console.log("parsing result: " + JSON.stringify(text.split(/(<[^>]*>)|/)));
This approach gives me:
["h",null,"e",null,"l","<em class=\"highlight\">","l",null,"o",null," ",null,"w",null,"o","</em>","r",null,"l",null,"d"]
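After filtering out the null entries (in the actual array they are undefined, from the capture group not participating in the empty match; JSON.stringify prints them as null), I get the desired output: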
["h","e","l","<em class=\"highlight\">","l","o"," ","w","o","</em>","r","l","d"]
Is there a better regex pattern that can handle this parsing without creating those null entries that I have to clean up later?
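One direction that might avoid the cleanup step, offered as a sketch and not a definitive answer: match the tokens directly instead of splitting on separators. With match and the g flag, every array entry is a real match, so there are no unmatched capture groups to filter out. The helper name tokenize is hypothetical, and the pattern assumes no attribute value contains a literal ">" (a stray "<" with no closing ">" simply falls through as a single character).

```javascript
// Sketch: collect tokens with match() instead of split().
// Assumption: tags never contain ">" inside an attribute value.
function tokenize(input) {
  // First alternative: a complete tag such as <em class="x"> or </em>.
  // Second alternative: any single character, including newlines.
  // match() returns null when nothing matches, so fall back to [].
  return input.match(/<[^>]*>|[\s\S]/g) || [];
}

const text = 'hel<em class="highlight">lo wo</em>rld';
const tokens = tokenize(text);
console.log("parsing result: " + JSON.stringify(tokens));
// parsing result: ["h","e","l","<em class=\"highlight\">","l","o"," ","w","o","</em>","r","l","d"]
```

The tag alternative has to come first so that "<" starts a tag match before the single-character alternative can consume it. Note that [\s\S] without the u flag matches UTF-16 code units, so characters outside the Basic Multilingual Plane (most emoji, for example) would be split in two; adding the u flag makes it match whole code points.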