I’m working with Airtable and need to pull out just the numeric part from URLs using regex. I have a field called {Website Link} that contains URLs like this:
The problem is some URLs have extra stuff after the numbers like https://target-shop.marketplace.com/category?ref=8291057462&source=email&campaign=3847. I only want the digits right after the equals sign and nothing else. What’s the right way to do this in Airtable?
You’re overcomplicating this. Ditch the IF statement and SUBSTITUTE entirely. Just use REGEX_EXTRACT({Website Link}, "=(\d+)") - it’ll grab only the digits after the equals sign.
I hit this exact issue building a campaign URL tracker. The trick is using capturing groups with parentheses around \d+. It grabs just the number and ignores everything else - the equals sign, any parameters after it, whatever.
This works great with messy URLs that have multiple parameters since the regex stops at the first non-digit character. No extra string manipulation needed.
If you want more Airtable regex patterns, this video covers the most useful ones:
Your formula gets way simpler and handles edge cases better.
Your regex needs to capture just the digits, not the whole equals-plus-digits string. Use REGEX_EXTRACT({Website Link}, "=(\d+)") instead. The parentheses grab only the numbers without the equals sign - no SUBSTITUTE function needed. This works with URLs that have extra parameters after the numbers since \d+ stops at non-digit characters like ampersands. I’ve used this exact pattern for URL parsing before and it’s way cleaner than stripping characters afterward.
Your approach is backwards - you’re matching the whole equals-digits pattern then trying to strip out the equals sign. Just use a positive lookbehind instead: REGEX_EXTRACT({Website Link}, "(?<=\=)\d+"). This finds digits that come after an equals sign without actually capturing the equals. The lookbehind does all the work, so you get clean numbers without needing SUBSTITUTE. It automatically handles URLs with trailing parameters since the regex stops at non-digit characters. I’ve used this tons of times for pulling query parameters - works great.