Testing bias; how can it be so?
April 3, 2004 8:12 AM   Subscribe

How can tests be biased? Isn't it simply a matter of understanding concepts related to the material and the language the test is given in?
posted by @homer to Education (10 answers total)
I think it's a case of some tests taking advantage of the "commonly known" and building on that to test other material. If you're out of your element on the "common" factor, the test will be harder.

For example:

Mary has written an average-length feature screenplay. John has written an average-length MOW screenplay. Which of the following is true?

a) Mary's screenplay is longer than John's.
b) John's screenplay is 120 pages.
c) Mary and John's screenplays are the same length.
d) Both (b) and (c)
e) None of the above.

Technically, it's a logic problem (if A has quality X and B has quality Z, which statement is true?), but unless you know more than jack diddley about screenplay lengths, you can't answer it with confidence. You probably understand the concept of an "A with quality X, B with quality Z" question, but if it's expressed through a "common" fact that's unfamiliar to you, it's biased. (In case you're curious, the answer is (e).)
posted by headspace at 8:34 AM on April 3, 2004

There's always that classic analogy that once appeared on the SAT -- oarsman::regatta.

However, the standardized testing industry seems to have gotten better: they're more careful to select concepts that are universal rather than specific to certain socioeconomic groups. They've modified the SAT three times in the last ten or so years, which speaks both to the problems with the test and to the effort to correct them.
posted by herc at 8:52 AM on April 3, 2004

Mod note: This is exactly right. Plus, when the "common" knowledge happens to favor one gender, racial, or class group, that's where bias becomes a problem. The example above would be fine in some sort of film class, but if the question is trying to test elusive "general" knowledge, then it's biased towards screenwriters. The attempt to smooth out these biases can result in some odd avoidance tendencies, like eliminating questions about polo [class bias] or attractiveness [culture bias] but probably drawing the line at talking about snow [regional bias, but people have seen snow on TV, so it's okay]. Here's some more info on these biases from the Chronicle of Higher Ed, the Harvard Political Review, and the Journal of Blacks in Higher Education [with that regatta question].

The issue coming up lately is not that the tests may be biased in some way, but that the creators of the tests, in trying to make the test produce a wide score range [the classic bell curve, where some do very well, most do okay, and some do very badly], unknowingly have to build certain biases into the tests so that some people do well and some don't; otherwise everyone who knew math would get the math questions right. These biases tend to fall along ethnic lines, though more accurately along class and family-income lines. No one thinks this is some sort of intentional conspiracy; it's just an unfortunate side effect of the standardized testing process and the pressure to find one test that can be given to all students and produce the same bell curve each time.
disclaimer: I have worked both scoring these tests for ETS and coaching for The Princeton Review [a decade apart], so I am far from objective on this issue
posted by jessamyn (staff) at 8:56 AM on April 3, 2004

Having recently taken the PSAT, the SAT, and the SAT II Writing, I've noticed little bias in these tests -- they do seem to be doing their best to keep the ones that "really matter" fair. However, I had my IQ tested as part of a series of learning-disability tests at the beginning of last year, and there the situation was somewhat different. Several questions struck me as unfair, particularly one where I was asked, "Who wrote Faust?" Even as a sophomore at a private school, I'd only recently learned that it was written by Goethe (and what on earth does literary trivia have to do with intellectual competency?). At least the College Board seems to be doing things a bit more fairly.
posted by bubukaba at 9:11 AM on April 3, 2004

There were also problems in earlier intelligence tests, where only one answer was accepted as correct. For example:

Rearrange these letters to spell a common word: CTOA.

If children answered "coat," they were correct; "taco" was not included as a possible answer.
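The anagram example is easy to check mechanically. Here's a minimal sketch (the word list and function name are my own, not from the thread) that enumerates every rearrangement of the letters and keeps only those in a small hypothetical dictionary of "common words," showing that both answers are equally valid:

```python
from itertools import permutations

# Hypothetical mini-dictionary; a real test scorer would need a full word list.
COMMON_WORDS = {"coat", "taco", "cat", "act", "oat"}

def anagram_answers(letters):
    """Return every common word spelled by rearranging ALL of the letters."""
    candidates = {"".join(p) for p in permutations(letters.lower())}
    return sorted(candidates & COMMON_WORDS)

print(anagram_answers("CTOA"))  # ['coat', 'taco']
```

A scoring key built this way would have accepted "taco" alongside "coat"; the original test's key simply hadn't enumerated the alternatives.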

There's also the idea that different people are better at different types of questions (multiple choice vs. T/F vs. short answer, etc.).
posted by gramcracker at 9:12 AM on April 3, 2004

Anyone who's seriously interested in this subject should look into Howard Gardner's theory of multiple intelligences. In a nutshell, he argued -- over twenty years ago now -- that the idea of being able to define a single figure for a human "Intelligence Quotient" is a deeply flawed concept. (My IQ was determined to be around 145 on the Stanford-Binet scale when I was 11, and I was certainly no genius, just a middle-class geek with highly educated and cultured parents.) In my view, all standardised, multiple-choice testing is not just suspect, but almost completely useless in quantifying and scoring even a tiny proportion of the full range of human abilities.
posted by cbrody at 9:33 AM on April 3, 2004

Sorry to get into a philosophical rant, but all testing is biased. Whether it's considered a bias depends on the consensus around the material. If for some reason you don't hold some axioms to be "self-evident truths," then you'll consider deductions based on those premises flawed. This extends to all axioms, including ZFC in math, or wherever. A test setter has her own conception of what she intends to communicate and evaluate. You have your own perception of what the test is asking. If yours and hers match, primarily with each other, and secondarily with the general consensual knowledge, then you call the test unbiased.
posted by Gyan at 10:00 AM on April 3, 2004

In my view, all standardised, multiple-choice testing is not just suspect, but almost completely useless in quantifying and scoring even a tiny proportion of the full range of human abilities.

Hear, hear. I wish this were more commonly recognized.
posted by languagehat at 11:57 AM on April 3, 2004

To take it one step further, I believe that any attempt to quantify human intelligence or emotion is bound to run into limits pretty quickly. I've been reading countless such quantitative studies in my Speech Communication graduate program. Every one of them contains leaps of faith, such as assuming that when the subject performs action A, it means this certain thing or that. The reason these kinds of evaluations continue to this day is that the practitioners are loath to adopt less-precise but more accurate qualitative methods.
posted by squirrel at 1:12 PM on April 3, 2004

I've taken many tests in which, while I was able to ascertain the answer the test-writer was looking for (and choose that answer, and score well), I disagreed with the test-writer's interpretation. This was especially true of the "reading comprehension" sections on the SAT. One's general worldview tends to affect what one gets out of literature; my friend and I have extremely different interpretations of Hesse's Glass Bead Game not because of any significant difference in intelligence but simply because we see our own divergent worldviews reflected in it.
posted by IshmaelGraves at 9:43 PM on April 3, 2004
