PHP: Handling Large HTML Content for OpenAI API Translation

I’m working on a project that needs to translate HTML content using the OpenAI API in PHP. The tricky part is keeping all the formatting intact. Here’s what I’m dealing with:

<p><strong data-custom="key">Hello</strong> and welcome to our <span class="special">awesome site</span>. 
Check out our <a href="#">rules</a> for more info.</p>

Right now, I’m sending the whole HTML chunk to OpenAI for translation. But when the content gets too big, I hit the token limit. I’ve tried stripping out the HTML and just sending plain text, but that messes up the formatting when I put it back together.

Has anyone figured out a good way to break up big HTML chunks for translation without wrecking the structure? I’m stumped and could really use some advice. Thanks!

hey scarlettturner, i tried splitting big html into chunks based on tags, so each chunk is one smaller section of the page.
translate each section on its own, then reassemble them in the original order.
it's not perfect, but it avoids token issues without wrecking formatting. hope it helps!
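
For anyone who wants a starting point, here is a rough sketch of that split, translate, reassemble idea in PHP. It is only one way to do it and makes a few assumptions: `splitHtmlIntoChunks` and `translateHtml` are hypothetical helper names, `$translateChunk` is a placeholder callable where your own OpenAI request would go, the chunk budget is counted in bytes as a crude stand-in for tokens, and `</p>` is assumed to be a safe boundary because paragraphs cannot nest. If your paragraphs sit inside wrapper elements, a chunk can end up with an unclosed wrapper, and a real parser such as `DOMDocument` is probably the sturdier route.

```php
<?php
declare(strict_types=1);

/**
 * Split HTML right after the given closing tags, then pack the pieces
 * into chunks of at most $maxChars bytes where possible.
 * Nothing is parsed or re-serialised, so the original markup is untouched.
 *
 * @param string[] $splitAfter Tag names whose closing tag is treated as a
 *                             safe boundary (an assumption, adjust as needed).
 * @return string[]
 */
function splitHtmlIntoChunks(string $html, int $maxChars = 4000, array $splitAfter = ['p']): array
{
    $closers = array_map(
        static fn (string $tag): string => '</' . preg_quote($tag, '~') . '>',
        $splitAfter
    );

    // Zero-width split: the lookbehind keeps each closing tag with its piece.
    $pattern = '~(?<=' . implode('|', $closers) . ')~i';
    $pieces = preg_split($pattern, $html, -1, PREG_SPLIT_NO_EMPTY);
    if ($pieces === false) {
        throw new RuntimeException('Could not split the HTML.');
    }

    $chunks = [];
    $current = '';
    foreach ($pieces as $piece) {
        // Flush before the budget would be exceeded. A single oversized
        // piece still becomes its own chunk rather than being cut mid-tag.
        if ($current !== '' && strlen($current) + strlen($piece) > $maxChars) {
            $chunks[] = $current;
            $current = '';
        }
        $current .= $piece;
    }
    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}

/**
 * Translate each chunk with the supplied callable and glue the results
 * back together in the original order.
 */
function translateHtml(string $html, callable $translateChunk, int $maxChars = 4000): string
{
    return implode('', array_map($translateChunk, splitHtmlIntoChunks($html, $maxChars)));
}
```

You would call it as `translateHtml($html, $yourTranslator)`, where `$yourTranslator` takes one HTML chunk and returns the translated chunk with the tags left in place. Since each chunk is sent with its markup, the model sees the same kind of input as in the original whole-document approach, just in smaller pieces.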