Help me solve problems with a peer judging contest!
December 17, 2009 7:07 AM

Help me devise a peer judging system for a contest with approximately 135 participants.

Key Details:

-The event will last five hours with a short break for lunch.

-The presenters are required to be at their exhibits except during the short lunch break and while they are judging others' exhibits, during which time those "others" must be present to explain their exhibits.

-The event must have peer judging. Other forms of judging cannot be substituted.

Last year the event was rigidly structured beforehand. Each registrant was assigned to rate 11 predetermined exhibits, spending ten minutes at each. However, this caused a myriad of problems: on the day of the event, people who were not registered showed up with exhibits and some registrants did not show up at all, throwing the entire system out of whack.

Scoring was based on three fields. In each of these fields, peer judges rated the exhibits they saw on a scale of 1 to 11, and each exhibit's overall score was derived from all the ratings it received.

Assigning numbers and creating a system on the day of the event has been considered too problematic given the amount of other activities going on.

Does anyone have a creative solution to this problem? If you need any other information, just ask! The problem isn't so much the scoring system as making sure the judging assignments are doled out in an efficient manner before the day of the event.
posted by Modus Pwnens to Grab Bag (5 answers total) 1 user marked this as a favorite
Can you set up the spots for the exhibits beforehand and number them, but not assign numbers until people check in? Then you could fill them up in order of attendance, so #1-#12 are one group, and they all judge each other's exhibits, then #13-#24 are the next group, and so on. By filling the exhibit spaces as people arrive, you ensure that there are no empty spots in groups and everyone will be judged. You could also pre-set the times that people need to be at their spots and note them on the judges' assignment lists.

You might also want to leave empty spots between groups as overflow/extra spaces (e.g. skip #13 as you assign slots), so that if you get an awkward turnout, say 138, you could slot the extras in there (#13, #25, etc.). Then instead of twelve groups of 11 and one group of 6 at the end, you have six groups of 12 and six groups of 11. This depends on whether having groups that are close in size is important to your ranking system.
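The arithmetic behind that example can be checked with a rough sketch. The function name `group_sizes` and the default base size of 11 are assumptions made for illustration; the idea is simply to form as many groups of the base size as the turnout allows and hand the leftovers out one per group.

```python
def group_sizes(turnout, base=11):
    """Split `turnout` exhibits into groups of roughly `base`.

    Hypothetical helper: leftovers are spread one per group (the
    "overflow slot" idea) instead of forming a small group at the end.
    """
    groups = max(1, turnout // base)      # number of groups to form
    size, extra = divmod(turnout, groups) # `extra` groups get one more exhibit
    return [size + 1] * extra + [size] * (groups - extra)


print(group_sizes(138))  # six groups of 12 followed by six groups of 11
```

The sketch only makes sense when the turnout is at least a few times the base size; with a very small turnout it collapses everything into one group.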

I also envision that in the 5 hours, you could require each exhibitor to be at their place for, say, 1 hour, and then stagger these one-hour time slots among each group of 12 (or 13). That would probably be the hardest logistical piece to work out, but you could make it so that no one completely overlaps and then write it out so that you note the times when they should/could be judging each of their group members.

E.g: "#1, you are required to be at your exhibit from 1-2pm, you are required to judge #s 2-12, and #13 if they are present. They will be at their exhibits for these times: #2: 1:15-2:15pm, #3: 1:30-2:30pm, #4: 2:45-3:45pm, .... #10: 4:15-5:15pm, #11: 4:30-5:30pm, #12: 4:45-5:45pm, and #13 (if necessary): 5:00-6:00pm"

That seems to solve all the problems that I can see, allows you to do most of the logistical legwork ahead of time, and really only means that the judges have to hustle to mark their immediate neighbours within the short times where they don't overlap (15 mins before and after). For everyone else, they can go at a somewhat leisurely pace and have time to look over the other exhibits that aren't in their group if they want. I wasn't sure whether the five hours includes your lunch break; this plan assumes it does not, but it should be adaptable depending on the style of your lunch.
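A minimal sketch of the stagger, to make the "15 mins" figure concrete. It assumes a uniform 15-minute offset between neighbours and one-hour slots starting at 1pm; the quoted example times are not evenly spaced, so this is an illustration of the idea rather than that exact timetable. The names `staggered_windows` and `free_minutes` are hypothetical.

```python
def staggered_windows(group_size, start=13 * 60, stagger=15, duration=60):
    """Return {exhibit number: (begin, end)} in minutes since midnight.

    Assumption: every exhibitor starts `stagger` minutes after the
    previous one and stays for `duration` minutes.
    """
    return {
        n: (start + (n - 1) * stagger, start + (n - 1) * stagger + duration)
        for n in range(1, group_size + 1)
    }


def free_minutes(judge, target, windows):
    """Minutes of the target's slot during which the judge is not tied to their own exhibit."""
    judge_begin, judge_end = windows[judge]
    target_begin, target_end = windows[target]
    overlap = max(0, min(judge_end, target_end) - max(judge_begin, target_begin))
    return (target_end - target_begin) - overlap


windows = staggered_windows(12)
print(free_minutes(1, 2, windows))  # immediate neighbour: only 15 minutes
print(free_minutes(1, 5, windows))  # no overlap at all: the full 60 minutes
```

Under these assumptions, only exhibitors within three slots of each other overlap at all, which is why the tight windows are limited to near neighbours.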
posted by dnesan at 8:57 AM on December 17, 2009

Which did you have more of--late entries, or no-shows? Is there a reason why an unregistered exhibit can't just take a no-show's slot? (i.e. either take the no-show's physical space, or have a sign in the no-show's space redirecting judges to the new entry) Of course, if you have more late entries than no-shows, then there's trouble.

Maybe you could put extra space on the judging lists--give each list 13 exhibits, for example, where 2 out of the first 11 on each list are "late entry #17" or some such. Tell each judge to visit the first 11 exhibits on the list that actually exist. So if there is no late entry #17, the judges who have that exhibit just skip right over it; if there is such an entry, they judge it, and they skip the last exhibit on their list instead. Each late entry that shows up is going to bump 11 other entries off the ends of the existing judges' lists, so those are the 11 that the late entrant should judge. It'd be complicated to plan all that out, but it can all be done in advance, at least.
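The "first 11 that actually exist" rule is easy to state as a filter. The sketch below is only an illustration: the exhibit labels, the placeholder names, and the function `exhibits_to_judge` are all made up for the example.

```python
def exhibits_to_judge(assigned, present, quota=11):
    """Return the first `quota` exhibits on a judge's padded list that actually showed up."""
    return [exhibit for exhibit in assigned if exhibit in present][:quota]


# Hypothetical padded list: 11 registered exhibits plus 2 late-entry placeholders.
registered = [f"E{i}" for i in range(1, 12)]
assigned = registered[:5] + ["late-17"] + registered[5:9] + ["late-18"] + registered[9:]

# No late entries: the judge sees the 11 registered exhibits.
print(exhibits_to_judge(assigned, present=set(registered)))

# "late-17" shows up: it is judged, and the last exhibit on the list (E11) is bumped.
print(exhibits_to_judge(assigned, present=set(registered) | {"late-17"}))
```

Collecting the bumped exhibits across all judges who carry the same placeholder would give the list that the late entrant should judge.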
posted by equalpants at 8:58 AM on December 17, 2009

There is no meaningful way to solve this problem.

The very fact that you say the event "must have peer judging" makes me question motives. Peer judging of this kind is open to collusion and there is no foolproof way you can prevent it.

For example, when you say "The problem isn't so much the scoring system" I wonder how well you have thought this through. Assuming you use the method you describe, if I wish to increase my own exhibit's relative score, all I have to do is give every exhibit I judge a somewhat low score. This would increase my own relative aggregate score even though I do not judge my own exhibit. If I got a buddy in on it then we can collude to give ourselves an even bigger advantage. Even if there is no tangible benefit from doing so, status or ego boost is enough to encourage people to meddle.

Next, even if people are purely altruistic, since everyone isn't judging every exhibit, some people will be judged by more pessimistic judges who give lower average scores, and some will be judged by more optimistic judges who give higher average scores. Thus, an exhibit's score is more a measure of the personalities of its judges than of the exhibit itself.

However, if you choose to follow this path anyway, here is a suggestion:

Each exhibit keeps a piece of paper with a tally of how many judges have judged it so far. Have each judge write their own name on the piece of paper at each exhibit they judge.

Each judge has to judge 10 exhibits. They add their name to the tally at each table they judge. They can pick which exhibits they judge at random, but they cannot judge an exhibit that has reached 10 names; they must move on to an exhibit that has fewer than 10 tallies.
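The tally-sheet rule amounts to a small check at each table. In this sketch the function name `sign_in` is hypothetical, and rejecting a judge who has already signed the same sheet is an added assumption, not something stated above.

```python
def sign_in(tally, judge, cap=10):
    """Add `judge` to an exhibit's tally sheet if allowed; return True on success.

    Assumptions: the sheet is a plain list of names, and a judge may not
    sign the same sheet twice.
    """
    if len(tally) >= cap or judge in tally:
        return False  # sheet is full (or already signed): move to another exhibit
    tally.append(judge)
    return True


sheet = []
for number in range(1, 11):
    sign_in(sheet, f"judge-{number}")
print(sign_in(sheet, "judge-11"))  # False: this exhibit already has 10 names
```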

If you like your rating system you can use it, but I suggest:

If you want one that is more 'fair and robust' [to collusion and bias], judges can keep a private score (say from 1-100) for each field of each exhibit they judge. After they have finished judging 10 exhibits, they then sort the exhibits (for each field) from highest to lowest based on their recorded opinions. The highest gets 10 points, the next 9 points, and so on down to 1. [In the event of a tie, they must judge which is better, or if they cannot do so, give them the same number of points.] The purpose of this is to normalize the scores so that scoring biases are reduced.
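A minimal sketch of that conversion for one judge and one field. The function name `rank_points` is hypothetical, and so is the tie rule used here: exhibits the judge cannot separate share the higher point value, and the next exhibit gets the points for its position in the sorted list.

```python
def rank_points(private_scores, top=10):
    """Convert {exhibit: private score} into {exhibit: points}; the best exhibit gets `top`."""
    ordered = sorted(private_scores.items(), key=lambda item: item[1], reverse=True)
    points = {}
    for position, (exhibit, score) in enumerate(ordered):
        if position > 0 and score == ordered[position - 1][1]:
            # Unresolved tie: share the points of the exhibit just above.
            points[exhibit] = points[ordered[position - 1][0]]
        else:
            points[exhibit] = top - position
    return points


# A harsh judge and a generous judge who agree on the ordering award identical points.
harsh = rank_points({"A": 30, "B": 20, "C": 10})
generous = rank_points({"A": 95, "B": 90, "C": 85})
print(harsh == generous)  # True
```

Because only the ordering survives, a judge who scores everything low no longer drags down the exhibits they happened to visit.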

In either case, at the end, rather than just adding up the numbers, add them and divide by the number of judges who judged that table. This accounts for cases where the number of judges at each booth is not the same (because someone didn't do all 10 like they were supposed to).
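A small illustration of why the division matters, with made-up numbers and a hypothetical helper `average_score`:

```python
def average_score(points_received):
    """Mean of the points an exhibit received, or None if nobody judged it."""
    if not points_received:
        return None
    return sum(points_received) / len(points_received)


full_table = [8] * 10   # judged by all 10 judges, 8 points each
short_table = [9] * 8   # two judges never showed, 9 points each

print(sum(full_table), sum(short_table))                      # raw totals favour the full table
print(average_score(full_table), average_score(short_table))  # averages favour the better-rated one
```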
posted by Osmanthus at 9:13 AM on December 17, 2009 [1 favorite]

Just wanted to note that there is a solution in my comment that does not require assigning numbers, so read the whole thing! :D
posted by Osmanthus at 9:26 AM on December 17, 2009

The very fact that you say the event "must have peer judging" makes me question motives. Peer judging of this kind is open to collusion and there is no foolproof way you can prevent it.

The peer judging aspect is mandated by those higher up the command chain than myself. I have to incorporate peer judging despite there being more efficient (and probably equitable) solutions.
posted by Modus Pwnens at 10:18 AM on December 17, 2009
