How do computer adaptive tests work?
January 8, 2004 12:44 PM

How do computer adaptive tests work? What do the algorithms look like (more inside)?

I can find very general information about how tests like the GMAT CAT work, but I'm looking for something with a little more detail. Do the questions usually come from two or three difficulty "bins", or are they more analog? Is there academic theory out there about how many questions it takes to find a competency level or percentile?
Are any of the algorithms public?
posted by trharlan to Computers & Internet (6 answers total)
Having taken a number of adaptive tests, I Googled around a bit. They seem to be based on something called Item Response Theory. This site might answer some of your questions. Way too much math for me, though.
posted by Cyrano at 1:03 PM on January 8, 2004
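
For anyone curious what the Item Response Theory math looks like, here is a minimal sketch of the textbook three-parameter logistic (3PL) item response function. Nothing in this thread confirms that the GMAT CAT uses this exact model; the function name `prob_correct` and the parameter names `a`, `b`, `c` are just the conventional labels, chosen here for illustration. In this model difficulty `b` is a continuous number, not a bin.

```python
import math

def prob_correct(theta, a=1.0, b=0.0, c=0.0):
    """Textbook 3PL model: chance that a test taker with ability `theta`
    answers an item correctly.

    a: discrimination (how sharply the item separates abilities)
    b: difficulty (on the same continuous scale as theta)
    c: guessing floor (chance of a correct answer by pure guessing)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Ability equal to difficulty, no guessing: a coin flip.
p_matched = prob_correct(theta=0.0, b=0.0)
# Same item with a 20% guessing floor (e.g. five answer choices).
p_with_guessing = prob_correct(theta=0.0, b=0.0, c=0.2)
```

With `c=0.0` and `a=1.0` this reduces to the simpler one-parameter (Rasch) form, where the only item property is its difficulty.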

I worked at the Princeton Review when the GMAT CAT was starting up (around 1997-98), and they devoted a lot of brainpower to reverse-engineering the CAT methodology. Believe me, it's highly advanced mathematics. As for the "bins", there are definitely more than three, but I remember the manual "grading" of each question not being too complex. There is a self-adjusting component to the algorithm as well, which evaluates the difficulty of questions based on how many people answer them correctly (and, in turn, what other questions those people answer correctly). This component may, in fact, be the primary one.

And no, the algorithm isn't public.
posted by mkultra at 1:31 PM on January 8, 2004
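
The self-adjusting idea described above (rating a question by how many people get it right) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the actual algorithm, which isn't public: the function name `estimate_difficulty` is hypothetical, the half-count smoothing is an arbitrary choice to avoid dividing by zero, and the sketch ignores the "what other questions they answer correctly" part, which a real calibration would model jointly.

```python
import math

def estimate_difficulty(responses):
    """Rough difficulty for one question from a list of True/False answers.

    Returns the negative log-odds of a correct answer, so a question that
    half the test takers get right scores 0.0 and harder questions score higher.
    """
    n = len(responses)
    correct = sum(1 for r in responses if r)
    # Add half a pseudo-answer on each side so 0% or 100% correct stays finite.
    p = (correct + 0.5) / (n + 1.0)
    return math.log((1.0 - p) / p)

even_item = estimate_difficulty([True, False])
easy_item = estimate_difficulty([True, True, True, False])
hard_item = estimate_difficulty([False, False, False, True])
```

The output lands on a continuous scale, which is one way a test could have far more than three difficulty levels without anyone hand-sorting questions into bins.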

mkultra: did it ever occur to PR to reverse-engineer the PowerPrep software?
posted by trharlan at 1:40 PM on January 8, 2004

I'm not sure exactly what got reverse-engineered, but I vaguely recall that this initiative had been going on for a while, before Powerprep was released.
posted by mkultra at 1:44 PM on January 8, 2004

I worked at Princeton Review for a while as well. Things may have changed since I was there but, knowing ETS, they probably have not. The self-adjustment part was really important. The idea we were working with at the time was that how you did on each question determined, to some extent, which question you would be fed next. So we used to tell students that if they only wanted a nice median score, the useful tip was to blow the first few questions [which are generally easier]; that seemed to place the student on a path with many more easy questions than if they got the first three right. Of course, once you got on this "easy track" you couldn't wind up with a top score, but if you were an ESL student or someone else who just wanted to finish and do okay, this technique worked well. My old boss at The Princeton Review [contact info here] still spends a lot of time in the trenches with this stuff; it might be worth dropping him a quick email to see if he knows anything else.
posted by jessamyn at 1:51 PM on January 8, 2004
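
The "each answer determines the next question" behavior described above can be illustrated with a toy loop. This is a sketch under stated assumptions, not ETS's method: `next_item` and `run_test` are hypothetical names, the next question is simply the unused one closest in difficulty to the current estimate, and the estimate moves by a step that halves after every answer (a simple heuristic standing in for a proper statistical update). The shrinking step is what makes the first few answers matter most in this toy version.

```python
def next_item(theta, difficulties, used):
    """Pick the unused question whose difficulty is closest to the estimate."""
    candidates = [i for i in range(len(difficulties)) if i not in used]
    return min(candidates, key=lambda i: abs(difficulties[i] - theta))

def run_test(difficulties, answer_fn, n_questions, theta=0.0, step=1.0):
    """Administer n_questions adaptively; answer_fn(i) returns True if correct."""
    if n_questions > len(difficulties):
        raise ValueError("not enough questions in the pool")
    used, path = set(), []
    for _ in range(n_questions):
        i = next_item(theta, difficulties, used)
        used.add(i)
        correct = answer_fn(i)
        # Move the estimate up or down, then shrink the step for next time.
        theta += step if correct else -step
        step = max(step / 2.0, 0.1)
        path.append(i)
    return theta, path

pool = [-2.0, -1.0, 0.0, 1.0, 2.0]  # made-up difficulties, easy to hard
high, high_path = run_test(pool, lambda i: True, n_questions=3)
low, low_path = run_test(pool, lambda i: False, n_questions=3)
```

In this toy run, a test taker who misses the opening questions is routed to the easy end of the pool and one who gets them right is routed to the hard end, which matches the "easy track" effect described above.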

I work at a company that assists in the development and administration of a number of computer-adaptive tests.

Specific information regarding the algorithms that govern CAT administration isn't available to the general public. All employees of companies like the one at which I work sign non-disclosure agreements.

However, it is public knowledge that, in any given administration of the test, some (unidentified) sets of questions are "pretest" questions that don't count toward your final score. The statistics gathered for those "pretest" questions are then used to gauge their difficulty (among a number of other variables) if and when they are administered to subsequent candidates as actual test questions, with scores that count.
posted by Prospero at 2:27 PM on January 8, 2004
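
The pretest mechanism described above can be pictured with a small sketch. Everything here is hypothetical: the function name `score_and_collect`, the question IDs, and the use of a raw count of correct answers as the "score" (a real CAT would score quite differently). The point is only the split between answers that count and answers that are set aside as statistics on new questions.

```python
def score_and_collect(responses, pretest_ids):
    """Split one candidate's answers into scored and pretest questions.

    responses: dict mapping question ID to True/False (answered correctly)
    pretest_ids: set of question IDs that are unscored pretest questions
    Returns (number of scored questions answered correctly, pretest answers).
    """
    scored = {q: ok for q, ok in responses.items() if q not in pretest_ids}
    pretest = {q: ok for q, ok in responses.items() if q in pretest_ids}
    return sum(1 for ok in scored.values() if ok), pretest

answers = {"q1": True, "q2": False, "q3": True}
score, pretest_data = score_and_collect(answers, pretest_ids={"q3"})
```

The candidate never learns that `q3` was unscored; its answer only feeds the statistics used to rate that question for later candidates.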
