How to rank a Top100 fairly
June 21, 2005 11:24 PM

How do I create an accurate and fair Top 100 list from people voting?

A panel of 12 people is trying to rank the top 100 sportspeople. The scoring system means that if a panelist puts someone at number one on their list that sportsperson gets 100 points, if the sportsperson is ranked 2nd they get 99 etc. Each sportsperson will get a total score by adding up all of their points from each of the 12 panelists. These are then placed in order from highest score to lowest - creating a Top 100.

My worry is that if one panelist puts someone at 50 [51 points], that sportsperson ends up higher than someone whom 4 panelists put at 90 [4 x 11 = 44], which might be considered over-weighting that one person's point of view.
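A minimal sketch of the scoring rule described above, assuming rank r is worth 101 - r points (which is what "100 for first, 99 for second" implies) and that anyone left off a panelist's list gets 0 from that panelist. The names `points_for_rank`, `total_scores`, and the ballots are made up for illustration.

```python
from collections import defaultdict

def points_for_rank(rank, top=100):
    """Points for a 1-based rank when first place is worth `top`."""
    return top - (rank - 1)

def total_scores(ballots, top=100):
    """Sum each name's points over all ballots.

    `ballots` is a list of dicts mapping name -> rank on one panelist's list.
    A name missing from a ballot scores 0 for that panelist.
    """
    totals = defaultdict(int)
    for ballot in ballots:
        for name, rank in ballot.items():
            totals[name] += points_for_rank(rank, top)
    return dict(totals)

# Hypothetical ballots: one panelist ranks "A" 50th, four panelists rank "B" 90th.
ballots = [{"A": 50}] + [{"B": 90} for _ in range(4)]
scores = total_scores(ballots)
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)   # totals per name
print(ranking)  # names from highest total to lowest
```

With these ballots the lone 50th-place vote outscores the four 90th-place votes, which is the situation being asked about.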

Should I be worried? Is there a fairer way to do it? And, please, if possible no answers that look like calculus.
posted by meech to Grab Bag (12 answers total)
It's arbitrary - you can weight it to give whatever results you want. If you've decided that featuring in multiple people's lists carries more weight than featuring highly in one person's list, then alter the scoring to reflect this, e.g. a first place gets you 200 points, and a 100th place gets you 101 points.

Or start at 150 instead of 200. Just change the starting number, to select the kind of balance between individual judgement and group judgement that you decide you want.

It comes down to that - you have to decide what you consider to be the appropriate balance. Statistics can give you any result that you want, so choose a result that reflects your ideal of the right balance.
posted by -harlequin- at 12:42 AM on June 22, 2005

Actually, to be more helpful, I'd suggest starting in the range of around 110 to 120 points for first place - it means that a high placing on a list is very valuable, but it only takes about four or more low placings to trump a lone high placing, which was the scenario that concerned you.
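To see how the starting number shifts the balance, here is a rough sketch that counts how many panelists placing someone at 90th it takes to beat one panelist's 50th-place vote. It assumes each step down the list costs one point, so rank r is worth top - (r - 1); the function names are hypothetical.

```python
def points_for_rank(rank, top):
    """Points for a 1-based rank when first place is worth `top`."""
    return top - (rank - 1)

def low_votes_needed(high_rank, low_rank, top):
    """Smallest number of `low_rank` placings whose combined points
    strictly beat a single `high_rank` placing."""
    high = points_for_rank(high_rank, top)
    low = points_for_rank(low_rank, top)
    return high // low + 1

# Compare a few starting values for first place.
for top in (100, 110, 120, 200):
    print(top, low_votes_needed(50, 90, top))
```

The higher the starting number, the fewer low placings it takes to overtake a lone high one, so the choice of `top` is the dial between individual and group judgement.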

/Not a mathematician, but spent part of today programming wee Lego robots to evaluate the numbers that their sensors are sending them. :-)
posted by -harlequin- at 12:56 AM on June 22, 2005

I think you need to include a consensus factor in the scoring mechanism. The goal is to include an option in the final list when lots of people rank the option, even if they rank it low. Offhand, I'm not aware of a fair method to include consensus in group rankings but that in no way means that one doesn't exist.

(And I'm not sure if changing the scale will do anything as long as the scale remains linear. If you used a 200-101 range, in your example you have 98 vs. 72 which amounts to the same problem.)

Maybe someone can confirm or refute but I think you're running into Arrow's impossibility theorem. (Warning, not calculus but economics.) It basically says that if you have two or more voters and three or more options, no method of combining their rankings can satisfy every one of a short list of reasonable "fairness" conditions at once.
posted by sexymofo at 1:43 AM on June 22, 2005

To smooth things out more, you might want to look into discarding the highest and lowest scores (including 0) for each votee. Or even the two highest and two lowest.
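A small sketch of that trimming idea, assuming 12 panelists and that a panelist who leaves someone off their list counts as a 0 for that person. `trimmed_total` and the sample scores are invented for illustration.

```python
def trimmed_total(scores, n_panelists, trim=1):
    """Sum one person's scores after dropping the `trim` highest and
    `trim` lowest. Panelists who did not list the person count as 0."""
    padded = sorted(scores + [0] * (n_panelists - len(scores)))
    kept = padded[trim:len(padded) - trim]
    return sum(kept)

# Hypothetical: "A" has a single 51-point vote, "B" has four 11-point votes.
print(trimmed_total([51], 12))             # the lone vote is dropped as the highest
print(trimmed_total([11] * 4, 12))         # one 11 and one 0 are dropped
print(trimmed_total([11] * 4, 12, trim=2)) # two 11s and two 0s are dropped
```

Note that with trimming, anyone listed by only one panelist ends up with nothing, which may or may not be the behaviour you want.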
posted by rjt at 1:43 AM on June 22, 2005

IMDB uses the following ranking algorithm for their top 250:

weighted rank (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C

R = average for the movie (mean) = (Rating)
v = number of votes for the movie = (votes)
m = minimum votes required to be listed in the Top 250 (currently 1250)
C = the mean vote across the whole report (currently 6.8)

You might have to do some modification (for m and C) but it seems to work.
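As a rough sketch, the formula above translates directly into code. The defaults for m and C are the figures quoted above; the two sample movies are made up.

```python
def weighted_rating(R, v, m=1250, C=6.8):
    """Weighted rank: blend an item's own mean R with the overall mean C.

    R = mean rating for the item
    v = number of votes for the item
    m = minimum votes required to be listed
    C = mean vote across everything
    """
    return (v / (v + m)) * R + (m / (v + m)) * C

# Hypothetical movies: one with few votes and a high mean, one with many votes.
few_votes = weighted_rating(R=9.5, v=10)
many_votes = weighted_rating(R=8.5, v=100000)
print(round(few_votes, 2), round(many_votes, 2))
```

An item with few votes is pulled strongly toward C, while one with many votes stays close to its own mean. For a 12-person panel, m and C would have to be chosen to suit the panel's scores.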
posted by PenDevil at 1:49 AM on June 22, 2005

sexymofo:

Changing the scale to 200-101 points for 1st-100th place means:

One person votes athlete A at ~#50, thus they get 149 points
Four people vote athlete B at ~#90, thus they get 436 points (4 * 109)

Thus the low group vote easily trumps the lone high vote.

In fact, it actually goes too far and trumps the individual by too much, in my opinion, which is why I suggested starting the top score at around 110 or 120, instead of 200, to balance between the group and the individual.
posted by -harlequin- at 1:59 AM on June 22, 2005

Rank the median votes?

Also, are they working off a master list of all potential athletes or just thinking them up? Because if the latter, an athlete might be omitted (as opposed to ranked low) just because they didn't spring to mind. And the median method would pull down that person's score unfairly.

I would get the lists, put all the names together onto one giant list (alphabetize or randomize -- randomize is better because then they have to look at them all rather than seeking out the names they want -- better yet, hand them a pile of randomized cards with a name on each) and then make the 12 people rank everyone on the list. Then take the median.
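A minimal sketch of the median approach, assuming every panelist has ranked every name on the master list. The function name and the three tiny ballots are hypothetical, and ties in the median would still need some tiebreak rule that this sketch does not choose for you.

```python
from statistics import median

def median_ranking(ballots):
    """Order names by median rank (lower is better).

    `ballots` is a list of dicts mapping name -> rank, and every ballot
    is assumed to rank every name.
    """
    names = list(ballots[0])
    med = {name: median(b[name] for b in ballots) for name in names}
    order = sorted(names, key=lambda name: med[name])
    return order, med

# Hypothetical: three panelists each rank the same three names.
ballots = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 2, "B": 1, "C": 3},
    {"A": 1, "B": 3, "C": 2},
]
order, med = median_ranking(ballots)
print(order)  # names from best median rank to worst
print(med)    # median rank per name
```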
posted by duck at 5:54 AM on June 22, 2005

-harlequin-: Duh. Thanks for helping my feeble math abilities.
posted by sexymofo at 6:31 AM on June 22, 2005

PenDevil: "IMDB uses the following ranking algorithm for their top 250"

This is called a Bayesian estimation. A little more about using it. I recommend using this, especially if you expect the average number of votes to be moderately large (hundreds or thousands of votes).
posted by Plutor at 6:59 AM on June 22, 2005

it just struck me that most voting schemes are effectively non-parametric, but that the algorithm described here (which involves ranking) isn't. which is odd, since non-parametric tests normally involve ranking(!).

so is there a non-parametric approach to this? i assume you could invent one, but is there anything already in the literature?
posted by andrew cooke at 7:57 AM on June 22, 2005

What, exactly, is the problem with having someone downgraded by the group? I mean, if only one person ranks Athlete Dave at #1, but four people rank Athlete Carl at #8 and Athlete Dave doesn't get any other votes, doesn't that mean that Athlete Dave was a bit of an outlier to begin with?
posted by klangklangston at 12:07 PM on June 22, 2005

Thanks all for the very useful words - I think I'll go ahead with a 110 point system or something like that. PenDevil, your solution looks thorough but might be a bit complex for me. And thanks sexymofo for the theorem - makes me feel safer.
posted by meech at 7:27 PM on June 22, 2005
