Question about Elo ratings
October 3, 2007 3:14 PM

How to implement an Elo-based rating system for Go?

As an exercise in keeping my programming skills up to date, I recently wrote the game of Go as a Facebook application. Right now, the game has a few hundred members who can play against their friends or practice against a GnuGo bot.

I'd like to come up with a way of rating how well the players do against each other. After a bit of research, it seems that some kind of Elo system is the way to go.

I have a number of questions:

The wiki article on Elo gives a formula for calculating a new rating based on the result of a contest, but it assumes both players had a rating to begin with. If I'm starting in a vacuum, should I just assume a player has an Elo of X at their first game? What's a good value for X? Will this affect what the K-factor should be?

Also, the players may be playing each other with handicaps. How do you calculate an Elo change when one player has elected to play with a handicap (which will be common)?

If someone can point me towards some pseudo-code (or better yet, php code) implementing Elo, that'd be great, too.
posted by justkevin to Sports, Hobbies, & Recreation (12 answers total)
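
For reference, here is a minimal Python sketch of the standard Elo update the question refers to. The starting rating of 1500 and the K-factor of 32 are assumptions picked for illustration, not values recommended anywhere in this thread for Go, and the function names are hypothetical. It does not attempt to account for handicap stones.

```python
START_RATING = 1500  # assumed starting value for a player with no games yet


def expected_score(rating_a, rating_b):
    """Expected score for A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k=32 is an assumed K-factor; a larger k makes ratings move faster.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating in the pool stays constant.
    return rating_a + delta, rating_b - delta


# Two brand-new players; the first one wins.
print(update_elo(START_RATING, START_RATING, 1.0))  # (1516.0, 1484.0)
```

Because only the difference between two ratings enters the formula, the choice of starting value mostly sets where the scale sits; how quickly new players settle is governed by K.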
 
Wouldn't it make more sense to do a kyu-type rating system? The Wikipedia article on Go rankings has lots of useful info for you. You may also look at what the online Go community does; a lot of serious Go is played online these days.

Last time I looked at how the USCF did Elo ratings, your first 20 games or so were ranked under a different system as a provisional ranking, to seed your eventual Elo rating.

This subject can be as complicated as you want to make it.
posted by Nelson at 3:30 PM on October 3, 2007


This may be useful.

But I urge you in the strongest possible terms to stay with kyu steps in your rating system and not to venture into Elo territory. Kyu and dan are traditional; Elo is vulgar.
posted by Steven C. Den Beste at 3:30 PM on October 3, 2007


IIRC, the Dragon Go Server has newcomers self-designate their ranking, which is then of course adjusted as they win and lose.
posted by exogenous at 3:34 PM on October 3, 2007


Yeah, this is go, Elo blows hard. Kyu or bust!

I'd be somewhat interested in this app, link?
posted by phrontist at 3:34 PM on October 3, 2007


Hmm... I may have to recant my previous statement: I can't find any algorithms for kyu/dan ranks that don't use Elo equivalence.
posted by phrontist at 3:40 PM on October 3, 2007


What algorithm? We don't need no stinkin' algorithm.

The easiest one is to feather: every time you lose your Kyu rating goes up, every time you win your Kyu rating goes down. When you play another player whose rating is different than yours, the handicap is a function of the difference between your ratings.

(If you want more stability, make it two losses in a row for your rating to rise, two wins in a row for it to fall.)
posted by Steven C. Den Beste at 4:00 PM on October 3, 2007
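
The feathering rule above is simple enough to sketch in a few lines of Python. This is only an illustration under some assumptions: the class and function names are made up, a larger kyu number means a weaker player, the handicap is taken to be one stone per kyu step of difference (the comment only says it is "a function of the difference"), and the kyu/dan boundary is ignored.

```python
class FeatherRating:
    """Step rating in kyu; a larger kyu number means a weaker player."""

    def __init__(self, kyu, streak_needed=1):
        self.kyu = kyu
        # 1 = move after every game, 2 = the more stable two-in-a-row variant
        self.streak_needed = streak_needed
        self.streak = 0  # positive: wins in a row, negative: losses in a row

    def record(self, won):
        """Record one result and move the rating a step if the streak is long enough."""
        if won:
            self.streak = self.streak + 1 if self.streak > 0 else 1
        else:
            self.streak = self.streak - 1 if self.streak < 0 else -1
        if abs(self.streak) >= self.streak_needed:
            self.kyu += -1 if won else 1  # win: kyu number falls; loss: it rises
            self.streak = 0  # start counting again after a step


def handicap_stones(kyu_a, kyu_b):
    """Assumed mapping: one stone per kyu step, given to the weaker player."""
    return abs(kyu_a - kyu_b)
```

With `streak_needed=2`, a win followed by a loss leaves the rating where it was, which is where the extra stability comes from.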


Response by poster: Okay, I'm not hearing a lot of love for Elo, and am happy to tell people their Kyu/Dan, but I need a way to figure that out.

If you're familiar with Facebook, you may realize that there isn't a lot of structure here-- people add an application they think sounds interesting and if they like, they invite their friends. So I've got a few hundred players, and a gradually increasing database of game match-ups. I'm looking for a formula to say, "Based on the games you've played with others, your skill level is X" where X is a number, a Kyu/Dan rating, whatever.

Phrontist, the app URL is http://apps.facebook.com/gothegame/ - you must be on Facebook to use it, though.
posted by justkevin at 4:12 PM on October 3, 2007


The way you do it is to seed the system with a number of players who know what their rating is. Anyone who doesn't know has to play against someone whose rating is known, and then the handicap they received plus the score gets used to make a provisional rating. Basically, if your first game is against a 7 kyu and you got 4 stones, then your provisional rating is 11 kyu. Then that gets adjusted up or down one step for every 25 points of score mismatch. If you won by 52, your provisional rating would actually be 9 kyu. If you lost by 52, your provisional rating would be 13 kyu. If you get totally creamed, you don't get a rating and have to seek out a rated player with a higher score to compare yourself against.

That is the starting point for the score fluctuation as described above. All ratings would be considered provisional until the player had completed ten games.

Anyone joining the system has to go through the same thing. The adjustment algorithm only counts games against players with non-provisional ratings, which means the seed players don't get tossed around wildly while they're calibrating non-rated players.
posted by Steven C. Den Beste at 4:28 PM on October 3, 2007
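
The provisional-rating arithmetic above can be sketched in Python as follows. The function names and the signed `margin` parameter are hypothetical. The comment gives no cutoff for "totally creamed", so that case is left out rather than guessed at.

```python
PROVISIONAL_GAMES = 10  # ratings stay provisional until this many games are played


def provisional_kyu(opponent_kyu, stones_received, margin, points_per_step=25):
    """Provisional kyu from one game against a player with a known rating.

    margin is the new player's score minus the opponent's (negative for a loss).
    """
    base = opponent_kyu + stones_received  # e.g. 7 kyu + 4 stones = 11 kyu
    steps = abs(margin) // points_per_step  # whole steps only
    # Winning makes you stronger (smaller kyu number); losing makes you weaker.
    return base - steps if margin > 0 else base + steps


def is_provisional(games_played):
    """True while the player has not yet completed ten games."""
    return games_played < PROVISIONAL_GAMES


print(provisional_kyu(7, 4, 52))   # 9
print(provisional_kyu(7, 4, -52))  # 13
```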


A different way to do the feathering algorithm is to only count games with lopsided scores. If the game is within 20, no one's ratings change. If you win two games in a row by more than 20 points, your rating goes down one step. If you lose two games in a row by more than 20 points, your rating goes up one step.

Or some variation on that idea.
posted by Steven C. Den Beste at 4:30 PM on October 3, 2007
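
A rough Python sketch of this lopsided-games variant, with a hypothetical function name. One detail is an assumption: the comment does not say whether a close game interrupts a run of big wins or losses, so here close games are skipped entirely and do not break a streak.

```python
def apply_lopsided_results(kyu, margins, threshold=20):
    """Replay signed margins (positive = win) and return the resulting kyu."""
    streak = 0  # +n: n lopsided wins in a row, -n: n lopsided losses in a row
    for margin in margins:
        if abs(margin) <= threshold:
            continue  # close game: skipped (assumed not to break a streak)
        if margin > 0:
            streak = streak + 1 if streak > 0 else 1
        else:
            streak = streak - 1 if streak < 0 else -1
        if streak == 2:
            kyu -= 1  # two big wins in a row: one step stronger
            streak = 0
        elif streak == -2:
            kyu += 1  # two big losses in a row: one step weaker
            streak = 0
    return kyu
```

For example, starting at 10 kyu, margins of `[30, 5, 25]` would end at 9 kyu under this reading, since the 5-point game is ignored.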


Here's an explanation of the system used on my favorite server. May or may not be helpful.
posted by squidlarkin at 6:17 PM on October 3, 2007


Not sure why everyone is hating on Elo... it's a perfectly good ranking system for games.

I've instituted Elo twice, once on an international level for a Collectible Card Game. Our baseline was 1500, and everyone went up/down from there. Worked well for us (3000 ranked players or so at the height of the game).
posted by griffey at 7:41 PM on October 3, 2007


Griffey, we're down on Elo because it's not traditional. There's already a perfectly good ranking system for Go and there's no good reason to graft a new, different one on top of it.
posted by Steven C. Den Beste at 8:10 PM on October 3, 2007

