There's a good reason I went into music instead of engineering
May 8, 2006 7:50 PM
Grading dilemma: Can the median be a fair measure of a set of quiz scores? Here's the situation ...
Over 9 weeks, students are given a weekly 5 point quiz (total of 45 points.) A score of 60% (a D) is required to pass the course (graded pass/fail).
That's not a lot of points and several students (who have otherwise performed well) tanked one quiz (1 or 0 out of 5), drastically reducing their scores. It seems in this instance that the average does not reflect their performance very well. I tried calculating the scores using the median and found a much better distribution: rather than 25% of the class failing, now down to a few students.
My goal is not to pass as many students as possible, but to avoid failing students who have performed well overall (with a minor slip-up here and there). The scores based on averages will improve as more points are accumulated, but I'd rather decide now if I need to change the way their final grade is determined.
Response by poster: I thought about that, but it would only take more points out of the final total (and there aren't many to begin with.) I'd rather not just give everyone a free 5 points. Is there another measure besides median that might work?
posted by imposster at 8:02 PM on May 8, 2006
So, somebody who scored a 0, 60, 60, 0, 60, 60, 0, 60, 0 would pass the class with a median grade of 60% and an average of about 33%? Throwing out the lowest score or offering some kind of extra credit to get them 5 extra points seems like the better choice.
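The gap between the two measures in that example can be checked with a short Python sketch. The variable names are illustrative; the nine scores are the ones listed above.

```python
from statistics import mean, median

# Quiz percentages from the example above: four zeros and five 60s.
scores = [0, 60, 60, 0, 60, 60, 0, 60, 0]

middle = median(scores)            # 5th of the 9 sorted values
average = round(mean(scores), 1)   # 300 / 9, rounded to one decimal

print(f"median = {middle}%, mean = {average}%")
```

With nine scores the median is simply the fifth-highest one, so up to four zeros have no effect on it at all.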
posted by willnot at 8:05 PM on May 8, 2006
Response by poster: Hmmm... As the 0s pile up, the median becomes a problem. Dumping the lowest score is looking more appealing.
posted by imposster at 8:14 PM on May 8, 2006
In my classes, I use reading quizzes for 15% of the semester grade. There are usually 6-10 quizzes in a semester. Dropping the lowest score has always worked for me, though you will have some students who manage to miss all or most of the quizzes.
posted by wheat at 8:20 PM on May 8, 2006
Aren't you the decider? Why can't you just pass the students you think deserve to pass? Am I being obtuse here?
posted by iconjack at 9:15 PM on May 8, 2006
I agree with the curve. Flat point raise given to everyone based on the average (when attempting to reach an average of 72-78, depending upon how lenient you are).
In the future, and I hope you understand that I'm only saying this from the perspective of the student, I think your method for testing seems very unfair. 5 point quizzes usually require the test-taker to know very specific and finite knowledge and if you don't know it, you lose everything. Further, since there are so few points (no tests, no homework assignments, no final exam), there is no way to recover from a couple of bad quizzes, even though you may ace 6/9 quizzes, which IMHO is inherently unfair and undermines the classroom experience.
posted by SeizeTheDay at 9:16 PM on May 8, 2006
Response by poster: STD (sorry): "I think your method for testing seems very unfair."
I completely agree, which is why I'm trying to find a better way to determine the final score. The quizzes cover only the big ideas covered in lecture and take only a few minutes, with the expectation that little or no time outside of class is spent studying. (Yes, it is a bizarre arrangement, but that is the format I have to deal with.) So far, the students have responded well to the course (against all odds) and in my mind have demonstrated sufficient mastery of the material (for the most part).
Of course there can be only one Decider©, and unfortunately that job is currently filled. Also, I'd like to have the patina of mathematical rigor in justifying the grades.
[Still waiting to be verbally bitch-slapped by a sophomore in Algebra I]
posted by imposster at 9:35 PM on May 8, 2006
This isn't a deeply mathematical question. Just average the grades, and drop the lowest. If you think there should be a greater spread in scores, but are stuck with a 5 point limit, give partial credit, like 3.2 or 4.5. This makes the tests effectively 50 points each instead of 5, but without changing the absolute scale.
posted by Mr. Gunn at 9:44 PM on May 8, 2006
Why are you using a measure of central tendency to measure individual accomplishment?
In a stunning display of coincidence, that link points to a page about people who take a 5-question quiz. But the instructor in the example is interested in determining the performance of the class in aggregate, so he finds the average score of the entire class. Averaging an individual student's scores doesn't tell you much, especially since the scores of any given student are likely to be highly correlated with each other.
The way teachers deal with the issue you're looking at is to discard the worst test score. Then sum the rest of them, and establish cutoffs for each grade. (You can divide the sums and the cutoffs by 8 if you like; it makes no difference.)
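A minimal Python sketch of that drop-the-worst-score approach follows. The function name, the sample scores, and the choice to apply the thread's 60% cutoff to the remaining quizzes are illustrative assumptions, not details of the actual course.

```python
def passes_after_drop(scores, max_per_quiz=5, pass_percent=60):
    """Drop the single lowest quiz, sum the rest, compare to a cutoff."""
    kept = sorted(scores)[1:]            # discard the worst score
    total = sum(kept)
    possible = max_per_quiz * len(kept)  # 40 points for 8 kept quizzes
    # Integer comparison avoids floating-point edge cases at the cutoff.
    return 100 * total >= pass_percent * possible


# Hypothetical students, nine 5-point quizzes each.
one_bad_week = [4, 3, 3, 4, 3, 3, 3, 3, 0]    # 26/45 before the drop
weak_all_term = [2, 3, 2, 3, 2, 3, 2, 3, 0]   # 20/45 before the drop

print(passes_after_drop(one_bad_week))    # 26/40 after the drop
print(passes_after_drop(weak_all_term))   # 20/40 after the drop
```

Comparing sums against a summed cutoff, rather than averages against a percentage, gives the same answer, which is the point made above about dividing by 8.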
posted by ikkyu2 at 11:37 PM on May 8, 2006
What if you threw out the lowest test score?
Given that logic, shouldn't you throw out the highest score as well, considering them both outliers? Then rescale along a Gaussian.
posted by meehawl at 5:45 AM on May 9, 2006
I think you need to clarify. You start with "Over 9 weeks, students are given a weekly 5 point quiz (total of 45 points.) A score of 60% (a D) is required to pass the course (graded pass/fail)." Then you say "My goal is not to pass as many students as possible."
Your first statement delineates who gets a pass and who gets a fail. Your second statement makes it clear that you aren't looking to justify passing or failing any students unjustly. If everyone in that class ends up with 26 points at the end of 9 weeks, everyone fails. If everyone ends up with 27 points, everyone passes. That's math. There's no maybe to it.
If you gave one big test with 45 questions on it, would you still have this problem? Or is the problem that the difference between a 26 and a 27 is 2.22%, whereas the difference between a 59 and a 60 on a 100-point scale is only 1.00%? Maybe you should ask 1000 questions over the course of the class.
I know you want to be "fair", but it seems to me that you're expressly looking for a way to pass a certain number of students, if not as many as possible. Or maybe you want to pass a particular student. I'm not certain which would be worse.
Fail everyone who didn't get 60%. There's your patina of mathematical rigor. It was staring you in the face all along.
posted by clearlynuts at 6:36 AM on May 9, 2006
meehawl and ikkyu2 are making things harder than they need to be.
Why are you using a measure of central tendency to measure individual accomplishment?
In order to ascertain the central tendency of their accomplishment (a summary statistic of their performance) and relate it to a course grade (another summary statistic of their performance).
Trimming top and bottom and rescaling to normal is way too much work to solve this problem. The best way to solve this problem is to simply pass the people who deserve to pass. If imposster is a TA or otherwise can't come up with his/her own grading, then the best that can be done is to fail everyone with less than a 60 and tell the professor that it's a dumb way to grade because you had to fail X students who were actually performing well.
posted by ROU_Xenophobe at 6:53 AM on May 9, 2006 [1 favorite]
When in doubt, give a passing grade.
When you can adequately prove (via quiz grades, attendance, participation, whatever) that they did not master the course objectives, then give a failing grade.
posted by elisabeth r at 7:32 AM on May 9, 2006
Don't know if this would work here, but here's what one prof did: He was known to be "hard but fair," so at the end of the course, most people's averages were in the 60s, 70s and low 80s. He offered an extra credit assignment (that only the motivated people would do), and whatever grade you got on the assignment, he'd add 10% of the points to your final grade. So if you were carrying a 78 average, and got a 78 on the extra credit, you'd get 7.8 points added to your average, bringing you to 85.8.
posted by xo at 9:50 AM on May 9, 2006
Why not grade based on a percentile? Take the top 80% of their scores and use those to complete their grade average. This does mean that (likely) one of the scores will be a pain in the ass to compute, but at least it's fair to everyone.
It's similar to how internet traffic is done, actually... ISPs think it's fair. Why not your students? :-D
A bit like throwing out the lowest score, but a bit more fair as well...
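One possible reading of the "top 80%" idea, sketched in Python: with 9 quizzes, 80% is 7.2 quizzes, so the best 7 count in full and the 8th-best counts at a weight of 0.2. That fractional weighting is an assumption about what was meant, and the function name and sample scores are hypothetical.

```python
from fractions import Fraction


def top_fraction_percent(scores, max_per_quiz=5, keep=Fraction(4, 5)):
    """Percentage over the best `keep` share of quizzes (assumed reading)."""
    ranked = sorted(scores, reverse=True)
    quota = keep * len(ranked)   # e.g. 4/5 of 9 quizzes = 7.2 quizzes
    whole = int(quota)           # quizzes counted in full
    part = quota - whole         # weight on the next-best quiz
    earned = sum(ranked[:whole])
    if part and whole < len(ranked):
        earned += part * ranked[whole]
    return float(100 * earned / (quota * max_per_quiz))


print(top_fraction_percent([5] * 8 + [0]))   # best 7.2 quizzes are all 5s
print(top_fraction_percent([3] * 8 + [0]))   # best 7.2 quizzes are all 3s
```

Exact fractions are used so that the 7.2-quiz quota does not pick up floating-point error. The partially weighted quiz is the one that is awkward to compute by hand.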
posted by shepd at 10:21 AM on May 9, 2006
I have a few professors who give reading quizzes. They often throw out as many as 3 grades!
I agree a little bit with ROU_Xenophobe. If the professor is saying "I'll fail anyone with less than 60%," and that's half the class, then there's really nothing you can do that's NOT mathematical trickery to get them above 60% - you should just talk to the professor and develop a better scaling system for the reading quizzes.
If you ARE the professor, then geez, take control over your own grading system already! You control it, not the other way around!
posted by muddgirl at 10:26 AM on May 9, 2006
Response by poster: I recalculated some grades with the low outlier eliminated, which produced satisfactory results. Tossing out the low score is perhaps the best solution. The otherwise good students who bombed one quiz are saved from failing and the consistently underperforming ones will still be motivated to improve their scores.
posted by imposster at 10:38 AM on May 9, 2006
I see it the other way. The burden of proof is frequently on the person creating the assessment that it really qualifies for parametric measures.
1: items must be carefully constructed in order to be treated as interval or ratio measures. In other words, is one point on this item equivalent to one point on that item? If you can't support this (and supporting this is hard, and more work than you probably want to do) then you really can't claim that the mean is the ideal or "most fair" measure of central tendency.
2: for skewed distributions, the median is frequently a more viable measure of central tendency than the mean. In many cases, you want a skewed or even bi-modal distribution in educational testing.
Much of it depends on whether you want to go with a norm-referenced test or a criterion-referenced test. If you want to go norm-referenced, then you find a way to get a nice and smooth distribution where you can fail a certain percentage of the population. If you want to go criterion-referenced, the ideal would be a bi-modal distribution with a hump of scores at the mastery level, and a hump of scores at the failure level, and minimize the number of scores in between.
Excluding outliers and giving students the benefit of a single off week during the semester is also a valid way of dealing with this kind of issue. Give the student a free pass for one clinker.
But these quizzes should also be a way to evaluate not only the students' performance, but also the instructor's performance. If the class as a whole has a dip in scores, then you really need to examine what happened during that week.
posted by KirkJobSluder at 11:22 AM on May 9, 2006
Or just to sum it up: statistical measures are only as good as the data you put into them. If your data is less than ideal (and it almost always is), then you are free to choose the measure that is the least evil in your justified opinion.
posted by KirkJobSluder at 11:35 AM on May 9, 2006
I'm sorry. I'm going to have to call shenanigans on you not wanting to simply pass as many students as you possibly can.
The otherwise good students who bombed one quiz are saved from failing and the consistently underperforming ones will still be motivated to improve their scores.
If no quiz is worth more than 5 out of 45 points, then a good student couldn't possibly fail simply on the basis of one quiz score.
A student that averages 5 points for eight quizzes and gets one zero during the semester safely passes (88.89%). A student that averages 4 points for eight quizzes and gets one zero during the semester safely passes (71.11%). These are the good students, and they are in no danger of failing. In fact, any of the above mentioned students could get zeroes on two tests and still pass.
A student that averages 3 points for eight quizzes and gets one zero during the semester fails (53.33%). Any student of this caliber is consistently underperforming or worse. Your solution for getting this student to improve their scores is making it easier on them by dropping their worst grade?
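The percentages in this comment, and what dropping the lowest score does to each of the three cases, can be reproduced with a short Python sketch. The helper name `percent` is illustrative, and "dropping" is assumed to mean removing the single lowest quiz from both the points earned and the points possible.

```python
def percent(scores, max_per_quiz=5):
    """Points earned as a percentage of points possible, to 2 decimals."""
    return round(100 * sum(scores) / (max_per_quiz * len(scores)), 2)


# Eight quizzes at a steady score plus a single zero, as in the comment.
for typical in (5, 4, 3):
    scores = [typical] * 8 + [0]
    raw = percent(scores)                  # all nine quizzes, out of 45
    dropped = percent(sorted(scores)[1:])  # lowest removed, out of 40
    print(f"{typical} pts/quiz: raw {raw}%, lowest dropped {dropped}%")
```

Under that assumption the student averaging 3 points lands exactly on the 60% line once the zero is dropped, which is the case in dispute here.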
posted by clearlynuts at 12:09 PM on May 9, 2006
Response by poster: KJS: "If you can't support this (and supporting this is hard, and more work than you probably want to do) then you really can't claim that the mean is the ideal or "most fair" measure of central tendency.
...for skewed distributions, the median is frequently a more viable measure of central tendency than the mean. In many cases, you want a skewed or even bi-modal distribution in educational testing."
This is interesting to me, but I don't completely understand. Ignoring what my eventual solution will be, would it have been better to say that a median score of 3 for the quizzes was necessary to pass rather than a percentage of 60% (considering the limited number of points)?
The quizzes basically ask similar questions about a new topic introduced each week so it seems like scores should be consistent across time. In fact, scores have generally improved from the first couple of weeks, potentially after the students learned what type of information they were responsible for.
posted by imposster at 1:09 PM on May 9, 2006
imposster: This is interesting to me, but I don't completely understand. Ignoring what my eventual solution will be, would it have been better to say that a median score of 3 for the quizzes was necessary to pass rather than a percentage of 60% (considering the limited number of points)?
To make a long story short: Educational assessment is a huge and very complex subject. The kinds of statistics used will vary by quite a bit depending on who, what, why, where and how the test is going to be performed. Unless you've invested a ton of work into proving that your quiz scores can be treated as parametric data, debates about the statistical rigor of mean vs. median are moot. You can't build a house on sand and all that.
The bottom line: IMHO, you can go either way depending on how you wish to set the standards and make an argument that it's "good enough."
posted by KirkJobSluder at 4:40 PM on May 9, 2006
This thread is closed to new comments.
posted by MeetMegan at 7:57 PM on May 8, 2006