I'm a rank amateur
July 26, 2011 10:20 AM

What are some basic math concepts I can use to understand and develop a ranking system?

I have a bunch of internet-things that have different levels of the same attributes: followers, views...stuff like that. How do I boil these down into some kind of comparative rank? Complicating factor: an optional SuperWeight that should float certain things to the top of the list.

It's been about 20 years since I've had any formal math training, so my statistics are weak. This means that for all the reading I've done on Elo, Reddit/Digg/Amazon, etc., I can't seem to fit the variables in those equations into my brain in a meaningful way. Furthermore, I'm thinking of views and followers as "positive votes," but this removes a lot of the "negative vote" offsets that I see in, say, Bayesian classifications, which renders the equations confusing to me.

What I'm looking for is perhaps a primer on choosing A, B, and C to be Important Qualities, while also understanding why they would be multiplied together, logarithmatized, or divided by e.g. age of the thing being ranked (or time since last view...stuff like that). It seems to me like there's a fundamental I have forgotten since my schooling that makes all this a short jump from "meters per second" to "view-minutes per follower per day of age." Any tips? I use Ruby-the-programming-language if it matters, and I can provide more explanation if necessary.
posted by rhizome to Education (7 answers total)
 
Predictive analytics?
posted by Consult The Oracle at 11:30 AM on July 26, 2011


It seems to me like there's a fundamental I have forgotten since my schooling that makes all this a short jump

Possibly, but this would be a little surprising to me. If I understand your situation, you've got a bunch of different attributes that are mostly not comparable in a quantifiable way (e.g., how many views is one follower worth?), so they can't easily be rolled up into one score. In my limited experience, this kind of multi-dimensional ranking tends to be highly situation-specific. A lot of times, people just assign more or less arbitrary weights to the different attributes (e.g., one view is worth 0.1, one follower is worth 10, etc.) and sum the weighted values into a single score. Of course, the result will depend heavily on the arbitrary weights you choose and on how your different attributes interact (e.g., should views per follower be scored as a separate attribute?). And all of this is without even considering the possibly time-varying nature of these attributes.
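A minimal Ruby sketch of that weighted-sum idea, for illustration only. The weights (0.1 per view, 10 per follower) are the arbitrary ones from the example above; the item data, the WEIGHTS constant, and the weighted_score method are made up for this sketch.

```ruby
# Arbitrary example weights: one view is worth 0.1, one follower is worth 10.
WEIGHTS = { views: 0.1, followers: 10.0 }.freeze

# Weighted sum of an item's attributes; a missing attribute counts as zero.
def weighted_score(item, weights = WEIGHTS)
  weights.sum { |attr, weight| weight * item.fetch(attr, 0) }
end

# Made-up sample data.
items = [
  { name: "a", views: 5000, followers: 10 },
  { name: "b", views: 200,  followers: 80 },
  { name: "c", views: 900,  followers: 0 }
]

# Highest score first.
ranked = items.sort_by { |item| -weighted_score(item) }
puts ranked.map { |item| item[:name] }.join(", ")
```

Changing a number in WEIGHTS is the whole tuning interface here, which is also why the resulting order is only as meaningful as the weights you picked.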

If you can, I'd suggest providing a lot more detail about what your situation is and what you're trying to achieve with your ranking.
posted by mhum at 12:14 PM on July 26, 2011


Seconding mhum, more detail seems needed. Maybe define the problem by comparing it to some existing service (e.g., I need a system like Netflix's recommendations, I need a system like the x-out-of-100 system used to rate beer and wine, I need a system that predicts if you'll like X based on your history, like Amazon's). Then we can get into the specifics of that kind of system.

But offhand, machine learning, statistics, fuzzy logic, multivariable calculus, and Euclidean geometry all sound like decent concepts to go over when trying to develop a ranking system. Wolfram Alpha is a great starting point for any concept you want to dig into more.
posted by oblio_one at 12:21 PM on July 26, 2011


The O'Reilly book on building reputation systems might be helpful. Tangentially related, at least.
posted by wam at 2:13 PM on July 26, 2011


Response by poster: If you can, I'd suggest providing a lot more detail about what your situation is and what you're trying to achieve with your ranking.

Sure, yeah.

These are just events with followers and views, like what you might have on Facebook or YouTube. Since there's no way to have a negative view or follower, they are just upvotes (for the purposes of the ranking algos I've found on the web). So I have two kinds of upvotes, each of which I can weight by multiplying it by some factor, e.g. 0.1*views to make followers 10x as powerful.

I think the question I have is...when would I divide them by another factor? As above, I figure I can add or multiply (how to decide?) each of those classes of "votes" together, and then log(x) them in order to put them on a 10 or 100 point scale? I suppose division can be used to age them out, but still, the whys and wherefores escape me and I'm flying a bit blind here. Perhaps wam's O'Reilly book will fit the bill, but I still think I'm just not thinking about it properly.

I don't need machine learning so much as a way to look at even a small sample and derive some way of comparing them. I'll have a boss saying he wants views to count more than followers and I want to know how to create that. Anyway, thanks so far!
posted by rhizome at 3:55 PM on July 26, 2011


rhizome: "I think the question I have is...when would I divide them by another factor? As above, I figure I can add or multiply (how to decide?) each of those classes of "votes" together, and then log(x) them in order to put them on a 10 or 100 point scale? "

If it's just for ordinal ranking, you don't need to put them on a fixed scale - just put them in order.

rhizome: "I'll have a boss saying he wants views to count more than followers and I want to know how to create that."

Just multiply views by a bigger number than followers. You might also want to take logs of some numbers so that, for example, 1000 views counts only 1.5x as much as 100 views (base-10 logs give 3 vs. 2) instead of 10x as much. You could also consider making 1000 views and 500 follows count more than 10x as much as 1000 views and only 50 follows.
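Here is a rough Ruby sketch of the log idea, under some assumptions: base-10 logs, a "+ 1" inside the log so a count of zero scores 0 instead of blowing up, and made-up weights that favor views. The names damped and log_score are just for this example.

```ruby
# Made-up weights; views count twice as much as followers here.
VIEW_WEIGHT = 2.0
FOLLOWER_WEIGHT = 1.0

# log10(1 + n): compresses big counts, and maps a count of 0 to 0.0
# (plain log10(0) would be -Infinity).
def damped(count)
  Math.log10(1 + count)
end

# Weighted sum of the damped counts.
def log_score(views, followers)
  VIEW_WEIGHT * damped(views) + FOLLOWER_WEIGHT * damped(followers)
end

puts damped(100).round(2)   # about 2.0
puts damped(1000).round(2)  # about 3.0
puts log_score(1000, 50).round(2)
```

With this damping, ten times the views adds roughly one point to the damped value rather than multiplying it by ten, so a single runaway number is less likely to dominate the ranking.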

Other than that, I think it's mostly trial and error to get something that seems reasonable.
posted by turkeyphant at 4:13 PM on July 26, 2011


You want to add the individually weighted attribute scores, by the way, not multiply them. If you multiply them, any attribute whose weighted value comes to zero would zero its item's whole score, which is probably not what you want.
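A quick Ruby illustration of the difference, using a made-up item with views but no followers and made-up weights:

```ruby
# Hypothetical item: plenty of views, no followers yet.
item = { views: 1200, followers: 0 }
# Hypothetical weights.
weights = { views: 0.1, followers: 10.0 }

# One weighted term per attribute.
terms = weights.map { |attr, weight| weight * item[attr] }

additive = terms.sum                  # the views still contribute
multiplicative = terms.reduce(1, :*)  # the zero follower term wipes out the score

puts "additive: #{additive}, multiplicative: #{multiplicative}"
```

Every brand-new item with zero followers would tie at the bottom under multiplication, no matter how many views it has.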

Easiest way to do the SuperWeight thing is just to use a weighting multiplier for any HasSuperWeight attributes that's a couple orders of magnitude higher than any possible result from multiplying the highest possible raw attribute value by that attribute's weighting. That way, when your boss chews you out for getting the (unspecified of course!) conflict-resolution order for SuperWeight items wrong, all you have to do is change a few SuperWeight weighting values instead of finding and patching a bunch of special-case code.
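One way that could look in Ruby, as a sketch only. It assumes you can put an upper bound on each raw attribute (the MAX_RAW caps below are invented), and that SuperWeight is stored as an optional numeric super_weight field on the item; all names and numbers here are hypothetical.

```ruby
# Hypothetical weights and assumed upper bounds on the raw values.
BASE_WEIGHTS = { views: 0.1, followers: 10.0 }.freeze
MAX_RAW      = { views: 1_000_000, followers: 100_000 }.freeze

# Largest score an item can reach without any SuperWeight.
MAX_BASE_SCORE = BASE_WEIGHTS.sum { |attr, w| w * MAX_RAW[attr] }

# Two orders of magnitude above the largest ordinary score.
SUPER_MULTIPLIER = 100 * MAX_BASE_SCORE

def score_with_super(item)
  # Clamp raw values to the assumed caps so the bound above really holds.
  base = BASE_WEIGHTS.sum { |attr, w| w * [item.fetch(attr, 0), MAX_RAW[attr]].min }
  # Items without a super_weight get no bonus.
  base + SUPER_MULTIPLIER * item.fetch(:super_weight, 0)
end

# Made-up sample data.
items = [
  { name: "huge",   views: 900_000, followers: 90_000 },
  { name: "pinned", views: 10,      followers: 1, super_weight: 1 },
  { name: "small",  views: 50,      followers: 2 }
]

ranked = items.sort_by { |item| -score_with_super(item) }
puts ranked.map { |item| item[:name] }.join(", ")
```

Because super_weight is a number rather than a yes/no flag, giving one item 2 and another 1 settles the order among SuperWeight items without touching the scoring code.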
posted by flabdablet at 6:44 PM on July 26, 2011

