I’m trying to make my Google Sheets formula more flexible. Right now, I’ve got this:
=query('data set',"select E where B contains '"&B37&"'",0)
The problem is that my data set might have small differences in the text I’m searching for. I want the formula to catch these variations too.
Is there a way to add some kind of fuzzy matching or wildcard search to this query? I’ve heard about wildcards but I’m not sure how to use them here. Any ideas on how to make this work?
While the previous suggestions are good, I’ve found another approach that might work for your fuzzy search needs. You could try using the LIKE operator in your query, combined with wildcards. Here’s an example:
=QUERY(‘data set’, “SELECT E WHERE B LIKE '%” & SUBSTITUTE(B37, " ", “%”) & “%'”, 0)
This method replaces spaces in your search term with ‘%’, which acts as a wildcard in SQL-like queries. It allows for characters before, between, and after your search terms, making it more flexible. It’s not as powerful as true fuzzy matching, but it’s native to Google Sheets’ query function and can catch many variations without complex regex. Give it a try and see if it improves your results.
I’ve faced similar challenges with Google Sheets queries, and I’ve found a workaround that might help. Instead of using ‘contains’, try using ‘matches’ with a regex pattern. Something like this:
=query('data set', "select E where B matches '(?i).*" & SUBSTITUTE(B37, " ", ".*") & ".*'", 0)
This approach creates a case-insensitive regex pattern that allows for partial matches and variations. The ‘(?i)’ makes it case-insensitive, and replacing spaces with ‘.*’ allows for any characters between words.
It’s not perfect fuzzy matching, but it’s more flexible than ‘contains’. You might need to tweak it based on your specific data, but it’s a good starting point. Hope this helps!
hey Samuel, have u tried using the REGEXEXTRACT function? it’s not exactly fuzzy search, but it can handle some variations. something like this might work:
=QUERY(‘data set’, "select E where REGEXEXTRACT(B, ‘"&B37&"’) is not null", 0)
it’s not perfect, but could catch some minor differences. worth a shot!