I’m facing an issue with a simple find and replace task using the Google Docs API. Specifically, I’m using the replaceText function along with some regex and running into problems.
My goal is to replace placeholders formatted like this: #replace this please#
Currently, I’m using the regex pattern: (\W|^)#replace this please#(\W|$)
This pattern works fine when there’s only one placeholder in a line. However, it fails to match when there are multiple placeholders on the same line.
For instance, when my text includes: #replace me please# and some normal text here #replace me too#
Neither of the placeholders gets replaced. I suspect my regex pattern isn’t accounting for multiple instances in one line, but I’ve had trouble finding clear documentation on this.
Can someone offer guidance or insight into this issue?
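Your capturing groups are consuming the boundary characters. When the regex matches the first placeholder, the trailing (\W|$) also consumes the character after it. If the next placeholder starts right after that character, its leading (\W|^) has nothing left to match, so that placeholder is skipped.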
I encountered this same problem while developing a contract generator that required swapping multiple variables within the same paragraph. The Google Docs API processes replacements sequentially, and your current pattern leads to a “boundary consumption” issue.
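The effect can be reproduced outside the Docs API. This is a minimal sketch that uses Python's re module as a stand-in, on the assumption that it treats consumed characters the same way; the Docs engine may differ in other details. The placeholder #name# and the helper fill are hypothetical names chosen for the illustration.

```python
import re

# Same shape as the original pattern, with a shortened hypothetical placeholder.
PATTERN = r"(\W|^)#name#(\W|$)"

def fill(text, value):
    """Replace every match, putting the captured boundary characters back."""
    return re.subn(PATTERN, lambda m: m.group(1) + value + m.group(2), text)

# Placeholders separated by several characters: both are found.
far = fill("#name# and some text #name#", "Ann")
print(far)   # ('Ann and some text Ann', 2)

# Placeholders separated by a single space: the space is consumed by the
# first match, so the second placeholder has no boundary left to match.
near = fill("#name# #name#", "Ann")
print(near)  # ('Ann #name#', 1)
```

Note that in this sketch the skipped case is the one where only a single character separates the two placeholders.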
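One alternative is \b#replace this please#\b. The \b notation is a zero-width assertion: it checks for a word boundary without consuming a character, which lets the engine find adjacent matches. Be careful, though. \b only matches between a word character and a non-word character, and # is itself a non-word character, so this pattern matches only when the placeholder directly touches a letter, digit, or underscore on both sides (as in a#replace this please#b). It will not match a placeholder surrounded by spaces. Lookahead/lookbehind support can also be inconsistent with the Docs API, so test whichever form you choose against your own documents.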
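For greater precision regarding boundaries, and if your engine supports lookarounds, you can also use (?<!\w)#replace this please#(?!\w). It rejects a match only when a word character sits directly before or after the placeholder, and it consumes nothing outside the placeholder itself.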
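To see how the two patterns above behave, here is a small sketch in Python's re module. Python is only a stand-in here, and whether the Docs API accepts lookarounds at all is something to verify separately. The placeholder #name# and the sample sentence are made up for the example.

```python
import re

text = "Dear #name#, #name# owes x#name#y"

# \b needs a word character on exactly one side. '#' is a non-word character,
# so \b#name#\b only matches where word characters touch both '#' signs.
with_word_boundary = re.sub(r"\b#name#\b", "Ann", text)
print(with_word_boundary)  # Dear #name#, #name# owes xAnny

# Negative lookarounds: reject only when a word character is adjacent.
# They are zero-width, so neighbouring placeholders are all found.
with_lookarounds = re.sub(r"(?<!\w)#name#(?!\w)", "Ann", text)
print(with_lookarounds)    # Dear Ann, Ann owes x#name#y
```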
Had the same headache with Google Docs API last month. Just use #replace this please# without the boundary stuff - their API handles word boundaries differently than regular regex engines. Sometimes simpler patterns work better with their setup.
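A rough sketch of the plain-pattern approach, again using Python's re module as a stand-in rather than the Docs API itself. The one thing to watch is that a search pattern interpreted as a regex needs its special characters escaped; the placeholder names and the helper fill_literal below are hypothetical.

```python
import re

def fill_literal(text, values):
    """Replace each placeholder literally; values maps placeholder -> text."""
    for placeholder, replacement in values.items():
        # re.escape neutralises regex metacharacters such as ( and ).
        # The lambda keeps backslashes in the replacement from being interpreted.
        text = re.sub(re.escape(placeholder), lambda m, r=replacement: r, text)
    return text

line = "#name# paid #total (USD)# to #name#"
filled = fill_literal(line, {"#name#": "Ann", "#total (USD)#": "$40"})
print(filled)  # Ann paid $40 to Ann

# Without escaping, (USD) is read as a group, so the literal parentheses
# in the text are never matched.
unescaped_hit = re.search("#total (USD)#", line)
print(unescaped_hit)  # None
```

The trade-off is that a plain pattern also matches inside longer tokens, for example #name# inside x#name#y.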
Your regex would need overlapping matches, which the engine doesn't allow, and that breaks multiple matches on the same line. The pattern (\W|^)#replace this please#(\W|$) includes the boundary characters in the match itself, so once they are consumed the engine can't reuse them for the next instance.
I hit this exact issue last year with a document templating system. Use lookahead and lookbehind instead: (?<=\W|^)#replace this please#(?=\W|$). This checks for word boundaries without including them in the actual match.
The (?<=\W|^) makes sure there’s a non-word character or line start before your placeholder. The (?=\W|$) checks for a non-word character or line end after it. Now the regex engine can find multiple instances on the same line.
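Here is a hedged sketch of the lookaround version in Python's re module, used only as a stand-in for the Docs engine. One engine difference shows up immediately: Python requires fixed-width lookbehinds and rejects (?<=\W|^), so the sketch spells the same check as (?:(?<=\W)|^). The helper names are hypothetical.

```python
import re

# Python's re rejects the variable-width lookbehind as written above.
try:
    re.compile(r"(?<=\W|^)#name#(?=\W|$)")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

def placeholder_pattern(placeholder):
    # (?:(?<=\W)|^) : a non-word character before, or start of the string.
    # (?=\W|$)      : a non-word character after, or end of the string.
    return r"(?:(?<=\W)|^)" + re.escape(placeholder) + r"(?=\W|$)"

def fill(text, values):
    """Replace each placeholder; values maps placeholder -> replacement."""
    for placeholder, replacement in values.items():
        pattern = placeholder_pattern(placeholder)
        text = re.sub(pattern, lambda m, r=replacement: r, text)
    return text

line = "#replace me please# and some normal text here #replace me too#"
filled = fill(line, {"#replace me please#": "first", "#replace me too#": "second"})
print(filled)    # first and some normal text here second

# Placeholders separated by a single space are both replaced,
# because the lookarounds consume nothing.
adjacent = fill("#a# #a#", {"#a#": "x"})
print(adjacent)  # x x
```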