Using Excel Formulas to Rank Ordered Lists
March 6, 2010 5:14 PM   Subscribe

Excel question: I have a list of values, sorted from lowest to highest. I want to rank them. But I have a specific issue with regard to ties, and I'd like to be able to craft an Excel formula which will deal with this automatically, if possible. Full explanation below the fold.

I want to be able to rank an ordered list that has n values. The sum of those ranks will, of course, be equal to n(n+1)/2. If my list has any tied values, though, the sum of the ranks will not ordinarily equal n(n+1)/2, since typically you repeat ranks when there are ties. However, I want to rank things such that my ranks still add up to n(n+1)/2, even if there are ties.

If this is unclear, please allow me to explain further. Imagine a list of the following values:

3
4
7
10
12

If you rank them, it looks like so:

3 - 1
4 - 2
7 - 3
10 - 4
12 - 5

The total ranks for this list add up to 15 (n=5). But now let's say the list looks like this:

3
4
4
10
12

One common way to rank them is as follows:

3 - 1
4 - 2
4 - 2
10 - 4
12 - 5

My problem is that I still want the total value of all the ranks to add up to 15, even though there are ties. Using the above system for ties, though, you wind up with a total of 14. One way to get around that problem is to rank them like so:

3 - 1
4 - 2.5
4 - 2.5
10 - 4
12 - 5

Then you get 15. But this is only the simplest case. What if you have three ties in a row? Such as:

3
4
4
4
12

To preserve the 15, you could rank all those 4s in the middle as 3s. (That gives you 1+3+3+3+5 = 15.) But I'm looking for a generalized answer that works in all situations (if there is one).

So let's say I have a much longer list - 100 items long, with multiple multi-way ties, and also with multiple different ties in a row, too (say the values at one point go 3, 4, 4, 4, 7, 7, 7, 7, 9 for example). Is there a general solution to this problem, regardless of list length, number of ties, and length of ties? Thanks for your help.
posted by Conrad Cornelius o'Donald o'Dell to Computers & Internet (8 answers total) 1 user marked this as a favorite
 
To preserve the 15, you could rank all those 4s in the middle as 3s. (That gives you 1+3+3+3+5 = 15.) But I'm looking for a generalized answer that works in all situations (if there is one).

Take the average value of the ties and assign that. For example:

3 - 1
4 - 2
4 - 3
4 - 4
12 -5

The average is (2+3+4)/3 = 3. So that's why you assign a 3. Also why you used 2.5 above.

Note: you re-calculate the average for every set of ties.
posted by sbutler at 5:31 PM on March 6, 2010


Best answer: Excel's RANK function help page suggests adding a "correction" factor. For example, if I put your numbers in cells A1:A5, I'd use put this formula in B1 and copy down through B5:

=(COUNT(A$1:A$5)+1-RANK(A1,A$1:A$5,0)-RANK(A1,A$1:A$5,1))/2+RANK(A1,A$1:A$5,1)

So for the case of 3,4,4,10,12, I'd get the following result for 4:

= (5 + 1 - 3 - 2) / 2 + 2 = 2.5
posted by mullacc at 5:33 PM on March 6, 2010


Using this:(say the values at one point go 3, 4, 4, 4, 7, 7, 7, 7, 9 for example), here's how I did it:

Put these values in column A (cells A1:A9).

In cell B1, put a 1.
In cell B2, put =B1+1
Copy cell B2 to B3:B9

You should now have consecutive numbers 1 through 9 in column B.

In cell C1, put =B1
In cell C2, put =IF(A2=A1,C1,B2)
Copy cell C2 to C3:C9

You should now have rank numbers in column C, with "ties" properly represented.

In cell D1, put =SUMIF(C$1:C$9,C1,B$1:B$9)/COUNTIF(C$1:C$9,C1)
Copy cell D1 to D2:D9

This should be the ranking numbers you actually want in column D.

What it's doing looking at how many rows there are with the same rank, and adding up the values from column B for all those rows to find out how many "points" need to be distributed among those rows to still get the right total. Divide total points required by number of rows with that number of points, and that's how many go onto each row.
posted by FishBike at 5:41 PM on March 6, 2010


Response by poster: Sbutler: Thanks - can't believe I didn't see that, but that's very helpful.

Mullacc: Can you explain a bit more about how the "correction" factor works? I can definitely get your formula to work (thanks!), but I'm not sure I understand it.
posted by Conrad Cornelius o'Donald o'Dell at 5:47 PM on March 6, 2010


Er... I simply don't understand. You have them sorted in order, but you want to rank them? Sorted means they're in order of rank, from beginning to end, right? If I were you, I'd just use =ROW(). That just prints the row number. And you wouldn't have any trouble with futzing around with fudging the rank that way, right?

If you are intent on doing this with rank, you could do it like this – though again, forgive me, I don't think I understand the question at all – with your values in cells A1:A100 (for example) and your ranks generated by this in column B:

=RANK(A1, A1:A100)
=IF(A2=A1, B1 + 1, (RANK(A2, $A$1:$A$100, 1)))
=IF(A3=A2, B1 + 1, (RANK(A3, $A$1:$A$100, 1)))
etc...


That just says "if the present value is equal to the last, just rank this after the last one; otherwise, just rank it normally." This works regardless of the length of the list, because it will result in a simple 1 2 3 4 5... ranking. But, of course, so would a =ROW() function.

I think you're focusing too much on how it all adds up. Of course it'll always add up that way - that's how ranked lists work. You're just wanting to rank things, right? So there ought to be an easier way to do that.
posted by koeselitz at 5:56 PM on March 6, 2010


Response by poster: Koeselitz: Row number doesn't work with ties. I want tied values to have the same rank. If I use row numbers, then tied value will have different ranks. I'm focusing on how it all adds up because if you repeat ranks (the normal solution), then it won't add up right.

I'm pretty sure that mullacc has it, though. Want to play around with it some more, though.
posted by Conrad Cornelius o'Donald o'Dell at 6:17 PM on March 6, 2010


Response by poster: Fishbike's solution works well, too, though I think mullacc's may be a bit simpler.
posted by Conrad Cornelius o'Donald o'Dell at 6:20 PM on March 6, 2010


Ah.
posted by koeselitz at 7:19 PM on March 6, 2010


« Older Publishing Old Pics & Old Toys?   |   'Metaphysics of Quality' taken seriously? Newer »
This thread is closed to new comments.