Hey everyone! I’m working on a Google Sheets project and I’m stuck. I’ve got a column with text entries separated by commas. What I’m trying to do is figure out the fraction of occurrences for each string in the column. Here’s what I mean:
Column A
dog,cat
dog
fish,fish,bird
dog
cat
cat,dog,bird,bird
I need another column to show:
Column B
0.5
1
0.33
1
1
0.25
I’m not sure what’s the best way to do this. Should I use a QUERY function with SQL? Or is there an easier method?
Right now, I’m thinking something like:
=QUERY(IF(A1 contains ',' then (length(A1)-length(SUBSTITUTE(A1,',',''))) else 1))
But I don’t think that’s quite right. Any ideas on how to tackle this? Thanks in advance for your help!
hey ethan99, i think ur on the right track w/ QUERY but it might be overkill. try this:
=1/COUNTA(SPLIT(A1,‘,’))
this splits the text by commas, counts unique entries, then divides 1 by that count. should work for single or multiple entries. lemme know if it helps!
I’ve actually dealt with a similar issue in my work. Here’s what worked for me:
=ARRAYFORMULA(1/LEN(SUBSTITUTE(A1:A, ‘,’, ‘’))+1)
This formula counts the commas and adds 1 to get the total number of items, then calculates the fraction. It’s an array formula, so it’ll automatically fill down the entire column.
One advantage of this method is that it’s faster for large datasets compared to SPLIT, which can be resource-intensive. It also handles empty cells gracefully.
Just a heads up though - if there are any cells with double commas or trailing commas, you might need to clean your data first. Let me know if you run into any issues!
I’ve encountered this problem before in my data analysis work. Here’s a solution that might help:
=1/ARRAYFORMULA(LEN(A1:A)-LEN(SUBSTITUTE(A1:A,“,”,“”))+1)
This formula counts the number of commas in each cell, adds 1 to get the total number of items, and then calculates the fraction. It’s efficient and works across the entire column without needing to be dragged down.
One thing to watch out for: if you have any empty cells in your column, this formula will return an error for those. You might want to wrap it in an IFERROR function to handle that case.
Also, make sure your data doesn’t have any extra spaces around the commas, as that could throw off the count. Hope this helps!