Sunday, October 20, 2013

Selectively sum substring value in a cell/row given condition on substring values in cell/row

Consider this problem, you have a row of strings that you would like to sum given that the string is in a certain format. For example in cell B5 to AF5 contains the following rows of data,

0.5AL,,,,1.0BL,1.0BL,1.0BL,1.0BL,1.0BL,1.0BL,1.0BL,1.0BL,1.0BL,1.0BL

and we would like to sum up all the ALs and BLs. The limitation here is to do the sum in 1 line i.e. to have the sum ready in just one cell and not to create additional rows. We will not be able to do it with Excel provided functions like sumif" and  "countif".  Typically, I would provide the explanation before the solutions. But in this case, the solutions will be given and explained in greater detail as to why Excel provided methods are not useful in this case.

SUM(IF(RIGHT(B5:AF5,2)="BL",1,0)*(VALUE(IF(LEFT(B5:AF5,3)="",0,(LEFT(B5:AF5,3))))))

Note that there are a few issues that complicates the problem:

  1. There are empty cells in this row.We could do a VALUE(LEFT(B5:AF5,3)) to get the numerical value. But we will get an error if the cell is empty. 
  2. The cells contain strings and not numbers. The typical "sumif" applies conditions and sum on the same range. This is not what we want because we want to condition on the substring instead of the whole string.
To solve issue #1, we use the "IF" statement to assign a 0 value if the cell is empty.
VALUE(IF(LEFT(B5:AF5,3)="",0,(LEFT(B5:AF5,3))))

This will give us the numerical values. To solve issue #2, we need to apply logical conditions on summing and this can be done by using
IF(RIGHT(B5:AF5,2)="BL",1,0)

This will give us the logical condition of 1 if the condition is met and 0 otherwise.

By multiplying the two we will have a range of values conditioned on the existance of the substring "AL" or "BL" and we can get the sum by adding them up. This made me interested on including this feature in functions.

I was asking around for alternative solutions here are some alternative solutions:

1.     An improved version can be done with the following formula:
 SUM(IF(RIGHT(B5:AF5,2)="BL","BL",VALUE(IF(LEFT(B5:AF5,3))),0))

2.     As pointed out by one of my friends, he suggested using SQL to do this. The pseudocode is as follows:

select substr(column, 3,2), sum(column, 0,3) from table group by substr(column, 3,2);