Tuesday, October 11, 2011

Using =split() to create column headers: A "dripping awesome" result

Asker Anbotero asked the following question on the Docs forum yesterday:

"I happen to have columns A and B, each filled with “X”s: 
A     BX
        X
X     X
X
X        X
X     X
How do I count the number of cells filled in A that are not in B, and vice-versa? There is this undocumented Google Docs function, but I couldn’t understand how it works: COUNTUNIQUE().

For A it should be 3, and for B just 2."


Adam provides a wonderful and quick answers for finding the answer for column A:

=ArrayFormula( SUM( ( A:A = "X" ) * ( B:B <> "X" ) ) )

This formula must be applied in a column that is not A or B because his ranges use open ends. If this formula was placed in a cell in column A or B the formula would error as a circular dependency. His formula basically counts the number of times there is an X in A and not in B which is exactly what Anbotero was trying to solve. To amend this for column B it would look like:
=ArrayFormula( SUM( ( B:B = "X" ) * ( A:A <> "X" ) ) ).

So, while Adam was dutifully providing this answer I was trying to figure out how to do this with one single formula. Here is what I came up with:

Click here to make your own copy to play with.

The formula:
=arrayformula(if(row(A1:B2)=1,split("UniqueCol1xUniqueCol2","x"),split(countif(arrayformula(if(A3:A=B3:B,1,if(A3:A="X",2,3))),"2")&"x"&countif(arrayformula(if(A3:A=B3:B,1,if(A3:A="X",2,3))),"3"),"x")))

I'm more interested in explaining the method as to drilling down the function but if you want to know feel free to write in the comments and I will expand further. The function creates a 2x2 table. The first row basically takes the string UniqueCol1xUniqueCol2 and "splits" it between A1 and B1. The function knows to split when it sees "x", the delimiter, in the string. UniqueCol1xUniqueCol2 becomes UniqueCol1 | UniqueCol2.
The function knows to do this because of the first condition in the if statement. In general, if you want to create a row with headers always use =row() for the first condition in an if statement. Set this equal to the row number where you want the column headers to begin. In this example, I want the column headers to appear in row 1. Therefore, I use =if(row(A1:B2)=1...

The same is done for the second row but I don't need to define this row because I'm only using 2 rows in my range. Therefore, if its row 1 put my headers, if its not then do my calculations. For the calculations, the method of using =split() is the same. I do the calculation for column A, join it with x and then join it with the calculation for column B. Before the split, the function ends up returning 3x4. Then =split() does the same with 3x4 as it did with UniqueCol1xUniqueCol2.

There you have it! Yogi also just posted another method on his blog. All of these different solutions truly show how robust Google Spreadsheets can be (or how addicted the Docs TCs are!).


Cheers!


PS: Adam's query solution also works! See Sheet2 on my spreadsheet for that example.

No comments:

Post a Comment