How do I process user ratings of content fairly?
July 10, 2006 11:01 PM

How do I process user ratings of content without skewing results of items that have only been rated once?

I'm writing a Web app that allows users to rate entries made by other users from 0 to 4 stars (sort of a "hot or not" for user-created content). Users can view a list of the items with the highest average ratings. Now imagine that a new item is created, and someone rates it 4 stars. It immediately shoots to the top of the list because its average rating is 4 stars. Can you suggest a good way to address this problem? I do not want to require items to have a certain number of ratings before they can be listed, because I believe people will want to see their favorites appear in the rankings right away, and I also don't want to discriminate against including something new in the list.
posted by Orkboi to Technology (19 answers total) 2 users marked this as a favorite
Kinda thinking out loud here, but what if you sorted by average rating x number of ratings
(4x1=4) vs. (4x25=100)

If you wanted to be more in depth you could add every rating together so (3x4stars=12) + (1x4stars=4) = 16

Or just sort by rating, then by number of rankings? New stuff would jump to the top layer, but not to the top of the list.
posted by Brainy at 11:19 PM on July 10, 2006
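
A rough sketch of Brainy's two orderings, in Python. The item names and ratings are made-up sample data, and `total_score` and `average_then_count` are hypothetical helper names. Note that average rating x number of ratings is simply the sum of the ratings.

```python
from statistics import mean

# Hypothetical sample data: item name -> list of star ratings (0 to 4).
ratings = {
    "new_item": [4],                   # one enthusiastic early vote
    "veteran": [4] * 20 + [3] * 5,     # 25 votes, average 3.8
    "middling": [2] * 10,              # 10 votes, average 2.0
}

def total_score(stars):
    # average rating x number of ratings is just the sum of the ratings
    return sum(stars)

def average_then_count(stars):
    # sort key: average first, number of ratings only as a tie-breaker
    return (mean(stars), len(stars))

by_total = sorted(ratings, key=lambda name: total_score(ratings[name]), reverse=True)
by_average = sorted(ratings, key=lambda name: average_then_count(ratings[name]), reverse=True)

print(by_total)    # ordering by summed ratings
print(by_average)  # ordering by average, then by count
```

With this sample data the summed score pushes the single-vote item to the bottom, while sorting by average and then by count still puts it first, since the count only matters when averages tie exactly.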

perhaps something like this:

* = -2 points
** = -1 points
*** = +1 points
**** = +2 points

keep a running total of the points scored per entry, and go off that. items that a lot of people like will have a much higher score than items only 2 or 3 people have voted as liking.
posted by sophist at 11:37 PM on July 10, 2006

On kuro5hin, comments are not considered scored until a certain number of people have voted.
posted by grouse at 11:38 PM on July 10, 2006

he specifically said:

I do not want to require items to have a certain number of ratings before they can be listed
posted by sophist at 11:40 PM on July 10, 2006

Per grouse, most "rating" type websites wait until an item has five or ten ratings before the rating is publicly visible.
posted by evariste at 11:40 PM on July 10, 2006

Shit! I'm not the only one with reading comprehension problems, grouse.
posted by evariste at 11:40 PM on July 10, 2006

Best answer: Let A be the current average of all ratings. For any particular item, compute the displayed rating using a weighted average of A and the item's true rating:

displayed rating = (N*A + K*R)/(N + K)

where K is the number of ratings for the item, R is the item's average rating, and N is a small value that you choose. New items will initially be rated near the average, but the effect of the weighting becomes small as K gets large.
posted by Galvatron at 11:45 PM on July 10, 2006 [1 favorite]
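
A minimal sketch of Galvatron's formula in Python. The function name `displayed_rating`, the default N of 5, and the site-wide average of 2.5 in the usage lines are assumptions for illustration only. Since K*R is just the sum of the item's ratings, the code uses the sum directly, which also handles an item with no ratings yet.

```python
def displayed_rating(item_ratings, global_average, n=5):
    """Weighted average (N*A + K*R) / (N + K).

    item_ratings   -- list of star ratings for one item (K = its length)
    global_average -- A, the current average of all ratings on the site
    n              -- N, the small weight given to the global average (assumed > 0)
    """
    k = len(item_ratings)
    # K * R equals the sum of the item's ratings, so R never has to be
    # computed separately and an unrated item (K = 0) simply shows A.
    return (n * global_average + sum(item_ratings)) / (n + k)


# Assumed site-wide average of 2.5 stars.
print(displayed_rating([4], 2.5))        # one 4-star vote stays near the average
print(displayed_rating([4] * 100, 2.5))  # many 4-star votes approach 4
```

A larger N keeps new items near the average for longer; a smaller N lets early votes move them faster.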

There are lots of ways to do it. The simplest I think is to give everything a 'default' rating with a certain weight. Similar to what Galvatron proposes, but I would use

Rating = D*WD+∑(R)/|R|

Where D is the 'default' score, WD is the weight of that score (maybe three? maybe ten?), and ∑(R)/|R| is just a fancy way of saying "the average of the ratings": the sum of all the ratings divided by the number of ratings. A simplified formula would be:

Rating = D*WD+[average rating]
posted by delmoi at 11:53 PM on July 10, 2006

Hmm, now that I think about it my formula isn't quite right. It should be:

Rating = (D*WD + ∑(R))/(WD + |R|)

So basically what you're doing is giving it WD 'default' votes.
posted by delmoi at 11:57 PM on July 10, 2006

Which is exactly what Galvatron's formula does too.

Simplest way to do this is probably to preload your ratings database with a small number of "neutral" ratings for new items.
posted by flabdablet at 3:11 AM on July 11, 2006
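
A quick Python check of the claim that preloading neutral votes and the weighted formula come to the same thing. Both function names, the neutral score of 2, and the preload count of 3 are assumptions chosen for illustration.

```python
def with_preloaded_votes(real_ratings, neutral=2, preload=3):
    # Plain average over `preload` neutral ratings plus the real ones.
    all_ratings = [neutral] * preload + list(real_ratings)
    return sum(all_ratings) / len(all_ratings)

def weighted_formula(real_ratings, default=2, weight=3):
    # (default * weight + sum of ratings) / (weight + number of ratings)
    return (default * weight + sum(real_ratings)) / (weight + len(real_ratings))

for sample in ([], [4], [4, 4, 3], [0, 1, 4, 4, 4, 2]):
    print(sample, with_preloaded_votes(sample), weighted_formula(sample))
```

The practical difference is where the default lives: preloading stores fake rows in the database, while the formula applies the same weight at display time and can be tuned without touching stored ratings.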

Sophist's method fails because it ranks items that a lot of people have rated moderately ahead of items that a few people have rated highly.
posted by flabdablet at 3:14 AM on July 11, 2006
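
To illustrate the objection, here is a small Python sketch of sophist's running-total scheme. The vote counts are made up, and `POINTS` and `running_total` are hypothetical names. Sophist's table does not say what a 0-star rating is worth, so it is left out here.

```python
# Points per star rating, as listed in sophist's comment.
# 0 stars is not covered by that table, so it is omitted (an open question).
POINTS = {1: -2, 2: -1, 3: 1, 4: 2}

def running_total(stars):
    # Sum of the points for every rating an item has received.
    return sum(POINTS[s] for s in stars)

widely_liked = [3] * 30   # 30 people rated it 3 stars
few_loved = [4] * 5       # 5 people rated it 4 stars

print(running_total(widely_liked))  # total for the moderately rated item
print(running_total(few_loved))     # total for the highly rated item
```

The item with thirty 3-star votes ends up with three times the score of the item with five 4-star votes, even though its average rating is lower.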

I was going to suggest starting each entry with a certain number of middle-of-the-scale ratings. So when a new entry is created it would have one two-star rating automatically, and if it got a 4 right away it would only go up to 3 stars.

But the problem with under-rating something right off the bat is that there might be people who only look at high rated items.
posted by jefeweiss at 6:27 AM on July 11, 2006

Simple solution: show the average rating together with number of votes: that enables people to judge for themselves how much faith to put in the rating.

More complicated solution: assign a confidence rating to every rating based on the number of votes, and normalize the score based on that confidence rating. I'm ashamed to say I'm not good enough at math to work this out, but for any given item, there's a number of votes after which the score will not change significantly (the "confidence interval"). This is your "number of confidence," which we'll call NC. At one vote or a few votes, every additional vote is likely to change the score a lot. As the number of votes N approaches NC, you've got more confidence that the current average is an accurate representation of what the average will be when you reach NC. Perhaps somebody who actually knows statistics can take it from here.
posted by adamrice at 7:28 AM on July 11, 2006
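
One way to turn the confidence idea into a number, offered as a stand-in rather than as what adamrice had in mind: rank by the lower end of a normal-approximation confidence interval for the mean, mean - z * sigma / sqrt(n). The name `lower_bound`, the fixed `assumed_stdev` of 1 star, and z = 1.96 (roughly 95%) are all assumptions for this sketch.

```python
import math

def lower_bound(stars, assumed_stdev=1.0, z=1.96):
    """Lower end of a z-interval for the mean rating.

    assumed_stdev is a guess at how much individual ratings vary; it is
    held fixed so that a single vote still yields a usable (wide) interval.
    """
    n = len(stars)
    if n == 0:
        return 0.0  # no votes: no evidence, so the bound sits at the floor
    mean = sum(stars) / n
    # The interval narrows as 1/sqrt(n), so more votes raise the bound.
    return max(0.0, mean - z * assumed_stdev / math.sqrt(n))

print(lower_bound([4]))       # one 4-star vote: wide interval, low bound
print(lower_bound([4] * 25))  # 25 4-star votes: bound much closer to 4
```

Items with few votes are not hidden, but they need more agreement before they outrank well-established ones.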

evariste: it's not that we have reading comprehension problems (which, at least, is a literate insult (pun entirely accidental) :-), it's that there's no really good way to have both behaviors. If you don't weight, and you don't delay, then things rated highly by their first raters will jump to the top; it's the nature of the beast.

You have to compromise *something* if you want to avoid that behavior.
posted by baylink at 9:49 AM on July 11, 2006

posted by grouse at 10:29 AM on July 11, 2006

From memory I think Amazon ratings include an element of time, so new ratings (positive or negative) have less influence than older ratings, the theory being that over time more ratings will appear and balance things out.
posted by Lanark at 1:21 PM on July 11, 2006

as adamrice has said, there's no way to do this without adding some other factor to weight the rankings.

To really do it right, any rating given would be weighted by the average rating of the items the rater has himself submitted previously. To bootstrap this in, perhaps people who have rated more items get more weight assigned to their ratings, or perhaps it would be done by number of posts or length of membership.
posted by Mr. Gunn at 3:37 PM on July 11, 2006
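
A sketch of the rater-weighting idea in Python. How each rater's weight is derived (average rating of their own submissions, number of items rated, length of membership) is left open in the comment above, so the weights here are supplied as made-up numbers. `weighted_rating`, `rater_weight`, and `default_weight` are hypothetical names.

```python
def weighted_rating(votes, rater_weight, default_weight=1.0):
    """Average of star ratings, each weighted by its rater's weight.

    votes        -- list of (rater_id, stars) pairs for one item
    rater_weight -- dict mapping rater_id to a positive weight
    Raters missing from rater_weight get default_weight.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for rater, stars in votes:
        w = rater_weight.get(rater, default_weight)
        total_weight += w
        weighted_sum += w * stars
    if total_weight == 0:
        return None  # no votes (or only zero-weight raters)
    return weighted_sum / total_weight

# Made-up weights: alice's own submissions average 3.5 stars, bob is new.
weights = {"alice": 3.5}
print(weighted_rating([("alice", 4), ("bob", 0)], weights))
```

An unweighted average of those two votes would be 2.0; giving alice more weight pulls the result toward her 4.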

I saw this a few months ago but never got it going with a DB to retain the settings, I just checked back today and there is a DB & PHP tutorial added on.

Hope this helps, I will be working on it for my blog tomorrow as I keep seeing it at sites all over the place. G/L.
posted by BillyG at 8:00 PM on July 11, 2006 [1 favorite]

Response by poster: Thanks everyone who answered! This has helped me understand the problem and possible solutions (plus trade offs) much more clearly.
posted by Orkboi at 8:18 PM on July 12, 2006
