I’m working on a project that needs to translate HTML content using the OpenAI API in PHP. The tricky part is keeping all the formatting intact. Here’s what I’m dealing with:
<p><strong data-custom="key">Hello</strong> and welcome to our <span class="special">awesome site</span>.
Check out our <a href="#">rules</a> for more info.</p>
Right now, I’m sending the whole HTML chunk to OpenAI for translation. That works for small pieces, but once the content gets too large the request goes over the model’s token limit. I’ve tried stripping out the HTML and sending only the plain text, but then I can’t reliably put the tags back around the right words in the translated text, so the formatting breaks.
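To make the question concrete, here is a minimal sketch of the kind of splitting I mean, not a working solution. It assumes the content can be cut between top-level elements and uses character count as a rough stand-in for tokens. The function name `chunkHtmlByTopLevelNodes` and the `chunk-root` wrapper id are made up for illustration.

```php
<?php
// Hypothetical helper: group whole top-level nodes of an HTML fragment into
// chunks of roughly $maxChars characters, never cutting inside an element.
// Assumption: character count is only a crude proxy for the token count.
function chunkHtmlByTopLevelNodes(string $html, int $maxChars): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    // The meta tag hints that the input is UTF-8; the wrapper div gives us
    // a known parent whose children are the fragment's top-level nodes.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div id="chunk-root">' . $html . '</div>'
    );
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $root = $xpath->query('//div[@id="chunk-root"]')->item(0);

    $chunks = [];
    $current = '';
    foreach ($root->childNodes as $node) {
        // Serialize one complete node, tags and attributes included.
        $piece = $doc->saveHTML($node);
        // Start a new chunk if adding this node would exceed the budget.
        if ($current !== '' && strlen($current) + strlen($piece) > $maxChars) {
            $chunks[] = $current;
            $current = '';
        }
        $current .= $piece;
    }
    if ($current !== '') {
        $chunks[] = $current;
    }
    return $chunks;
}

// Usage: three paragraphs with a 40-character budget.
$html = '<p>One <strong>two</strong></p><p>Three</p><p>Four</p>';
$chunks = chunkHtmlByTopLevelNodes($html, 40);
foreach ($chunks as $i => $chunk) {
    echo "Chunk $i: " . trim($chunk) . PHP_EOL;
}
```

Each chunk would then go to the API as its own request. A single element that is larger than the budget still ends up as one oversized chunk, so this alone does not cover a case like one very long `<p>`.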
Has anyone figured out a good way to split large HTML into smaller pieces for translation without breaking the tag structure? I’m stumped and could really use some advice. Thanks!