# Help me explain nominal, ordinal, and interval data, and means and modes

January 5, 2017 7:51 AM

I'm on a project that is collecting ordinal data through surveys - for instance 'How satisfied are you with product X on a scale of 1-10?' We are analyzing and producing charts etc. using the mean as the reporting unit. I think this is wrong, and we should be using the mode. So: What are accessible and authoritative sources that I can read up on and also point to that show why we should be using the mode and not the mean?

I'd like to explain all of this in authoritative terms, but I don't have the background. We also have a number of occurrences of the mode consistently being higher than the mean. I think this means that the distribution is skewed towards the higher responses, right? Again, any "Idiots Guide to ..." resources that look at this would be most welcome. Thank you!

From a technical standpoint, ordinal data doesn't have a mean, just a mode and median. In practice, though, many people do use the mean. Even more confusing, some argue that it should not be analyzed with parametric statistics, but others have argued pretty decisively that it should. Here is one article about using this kind of information in medical research that sums things up.

In the end, it comes down to what question you're trying to answer with the data. That will guide how to best analyze it.

posted by goggie at 8:43 AM on January 5, 2017

FWIW, "mode higher than the mean" could mean a bunch of different things:

- Most people are reasonably happy and give medium or high ratings, but a few pissed-off assholes give extremely low ratings that pull the mean down.
- Everyone either loves it (10) or hates it (1) but the 10s slightly outnumber the 1s.
- Everyone's completely indifferent. It just so happens that one or two people expressed their indifference with a rating of 4 and the rest expressed it with a rating of 5, but this is a total coincidence and could easily have gone the other way.
- Everyone thinks your survey is a waste of time and people are choosing answers "at random," but people suck at "random" and when they try to pick random numbers they pick 7 disproportionately often, so hey what do you know you've got a mean of 5.5 or just above and a mode of 7.
- This is a situation where people are feeling pressure to be nice or agreeable, or trying not to make trouble for someone else, and so they're cramming their answers into the very top end of your rating scale and you're running into a ceiling effect.
- Probably a bunch of other stuff, this is just examples off the top of my head.
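
Two of the scenarios above can be sketched with a quick text histogram. The numbers are invented purely for illustration and `text_histogram` is a hypothetical helper, not part of any library. In both data sets the mode sits above the mean, yet the shapes are very different:

```python
from collections import Counter
from statistics import mean, median, mode

def text_histogram(responses, low=1, high=10):
    """Return one line per scale point, with one '#' per response."""
    counts = Counter(responses)
    return "\n".join(f"{v:>2} | {'#' * counts[v]}" for v in range(low, high + 1))

# Hypothetical survey results (assumed data, not from the thread).
few_unhappy = [8] * 6 + [9] * 5 + [7] * 4 + [1] * 3   # mostly happy, a few very low
love_or_hate = [10] * 10 + [1] * 9                     # polarized, 10s slightly ahead

for name, data in [("few unhappy", few_unhappy), ("love or hate", love_or_hate)]:
    print(f"{name}: mean={mean(data):.2f} median={median(data)} mode={mode(data)}")
    print(text_histogram(data))
    print()
```

The summary line looks similar for both sets ("mode above mean"), but the bars make the difference obvious at a glance.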

You can only tell which of these applies to *your* data by looking at actual histograms of the answers on specific questions.

posted by nebulawindphone at 8:48 AM on January 5, 2017 [6 favorites]

Here's a perspective from a computer scientist who got much more involved with puzzling out good statistical methods after graduation.

I don't think there is any perfect single reporting value for any sample. I share your suspicion of using the mean for ordinal data, but the fact is that lots of people do it. That doesn't mean it's the Right way, of course, but if you are fighting against someone else who is taking that position, they could come up with their own sources.

Stevens wrote the work that is most often referred to and he takes a cautious position:

> As a matter of fact, most of the scales used widely and effectively by psychologists are ordinal scales. In the strictest propriety the ordinary statistics involving means and standard deviations ought not to be used with these scales, for these statistics imply a knowledge of something more than the relative rank-order of data. On the other hand, for this 'illegal' statisticizing there can be invoked a kind of pragmatic sanction: In numerous instances it leads to fruitful results.

My advice is to visualize your data, visualize the distributions for each question if you can, and take note of any oddities, like bimodal distributions. If all the distributions look normal, well, taking the mean is probably not too bad. A further problem is the interpretation of comparisons between means. Even if you see that Product Y has a higher mean on the satisfaction question than Product X, that doesn't straightforwardly mean "People liked Product Y better than Product X".
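One way to see why comparing means is shaky on a purely ordinal scale: the ranking by mean can flip if the scale points are relabeled in a way that keeps their order, while the ranking by median cannot. A minimal sketch, assuming invented scores and an arbitrarily chosen relabeling (square root):

```python
from math import sqrt
from statistics import mean, median

# Hypothetical 1-10 satisfaction scores for two products (assumed data).
product_x = [5, 5, 5, 5, 5]
product_y = [1, 1, 4, 10, 10]

def relabel(scores):
    """Order-preserving relabeling of the scale; sqrt is an arbitrary choice."""
    return [sqrt(s) for s in scores]

# Positive gap means Y ranks above X on that statistic.
raw_mean_gap = mean(product_y) - mean(product_x)
relabeled_mean_gap = mean(relabel(product_y)) - mean(relabel(product_x))
raw_median_gap = median(product_y) - median(product_x)
relabeled_median_gap = median(relabel(product_y)) - median(relabel(product_x))

print("mean favors Y on the raw scale:      ", raw_mean_gap > 0)
print("mean favors Y on the relabeled scale:", relabeled_mean_gap > 0)
print("median favors Y on the raw scale:      ", raw_median_gap > 0)
print("median favors Y on the relabeled scale:", relabeled_median_gap > 0)
```

If only the rank order of the answers is meaningful, both labelings are equally defensible, so a conclusion that depends on which one you picked deserves some caution.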

posted by demiurge at 8:49 AM on January 5, 2017 [3 favorites]

If your parent or grandparent got in an Uber and had a fine ride, they might think that a 3 star review is perfectly fine since the ride was just average (and if you took the mean of the 1-5 scale, 3 is a reasonable approximation of average).

Of course, we know that Uber boots drivers with under 4.5s (?) since they expect every rider to rate every ride a 5 unless there are problems.

If the 3 got averaged (mean'd) into a driver's daily total, it would substantially over-weight the one rider who just did not get the scale (since they assumed that each jump was equal from 1-2 and 3-4 etc., when the actual scale is designed to have 4 and 5 much closer to each other and 1-3 are all equally trash). Taking the mode would be a better/fairer approach in this situation.
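As a small illustration of that over-weighting, here is a sketch with an invented day of ratings (the numbers are assumptions, not real Uber data):

```python
from statistics import mean, mode

# Hypothetical day of 1-5 star ratings (assumed data): four riders treat 5 as
# "fine", one rider treats 3 as "fine".
ratings = [5, 5, 5, 5, 3]

daily_mean = mean(ratings)   # pulled below 5 by the single 3
daily_mode = mode(ratings)   # the most common rating, unaffected by the single 3

print(f"mean: {daily_mean:.1f}, mode: {daily_mode}")
```

This only shows the mechanics for one lopsided case; it says nothing about how the mode behaves when ratings are more evenly spread.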

posted by Exceptional_Hubris at 8:53 AM on January 5, 2017 [1 favorite]

*Everyone either loves it (10) or hates it (1) but the 10s slightly outnumber the 1s.*

Just to expand on this: this is a case where there is more than one mode. It is rather common with ordinal-scale surveys, and it means that "the" mode is actually a pretty terrible descriptive statistic to report on its own; within this family, the median is much better. But you *definitely* need to look at a histogram in order to know how to describe the raw results, and in general you should not assume that any single descriptive statistic is going to tell the whole story.
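A quick way to check whether "the" mode is even well defined is to ask for every value tied for most frequent. A minimal sketch using `statistics.multimode` from the Python standard library (Python 3.8 or later), on invented responses:

```python
from statistics import median, multimode

# Hypothetical responses (assumed data): equal peaks at both ends of the scale.
responses = [10] * 6 + [1] * 6 + [5, 6, 7]

modes = multimode(responses)      # all values tied for most frequent
middle = median(responses)        # the middle value of the sorted responses

print("modes:", sorted(modes))
print("median:", middle)
if len(modes) > 1:
    print("More than one mode: reporting a single mode would hide this.")
```

Note that this only catches exact ties. A 51/49 split has a single mode by count, so the histogram is still the more reliable check.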

(As is typical in this area, if you really want to know how analysis of ordinal data ought to be done, the answer is going to be a lot more complicated than you expected. You need to do something about the fact that participants will use the scale in different ways (normalization), the order of questions may matter, etc. My group typically uses mixed effects models with cumulative link functions via the R package ordinal as a starting point, these days. I'm unfortunately not aware of any basic tutorial or primer that will quickly bring you up to speed on doing this, but here's a tutorial that assumes familiarity with mixed effects models. This isn't _necessarily_ exactly right for what you're doing, and probably isn't easy to present in a corporate setting, but it should give you some idea of how non-trivial interpreting ordinal survey data is.)

posted by advil at 9:27 AM on January 5, 2017 [2 favorites]

A text on non-parametric statistics might help.

Mean, median, and mode are called estimates of location, meaning where on the number line the values are located. Standard deviation is a measure of dispersion, that is, of how wide the bell curve is.

In the case where all the answers are either 0 or 10, none of these is very helpful. I don't know if there is a statistic that fits the bill. Standard deviation might work, but it takes a bit of insight. For example, an SD of about 5, the maximum possible on a 0-10 scale (the sample SD runs slightly higher, e.g. about 5.2 for 14 responses), means that all answers are 0 or 10, with an equal number of each.
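To check the standard deviation figure for an evenly split 0/10 sample, here is a small sketch. It assumes 14 responses, which is an invented sample size chosen for illustration:

```python
from statistics import pstdev, stdev

# Hypothetical responses (assumed data): 7 zeros and 7 tens.
even_split = [0] * 7 + [10] * 7
# For contrast, an uneven split between the two extremes.
uneven_split = [0] * 10 + [10] * 4

print("population SD, even split:  ", pstdev(even_split))
print("sample SD, even split:      ", round(stdev(even_split), 1))
print("population SD, uneven split:", round(pstdev(uneven_split), 2))
```

The population SD divides by n and the sample SD by n - 1, which is why the second figure comes out a little above 5 for a small sample.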

You could make up your own test of polarity. For example, the degree of polarity might be the number of scale points at each end that together hold, say, 80% of the values. So if 0 & 10 have 80% of the values, the polarity index is 1. If 0, 1, 9, 10 have 80%, then the index is 2. Use labels like Highly Polarized down to Not Polarized.
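Here is one possible reading of that homemade index, as a rough sketch. The function name `polarity_index`, the 0-10 default scale, and the choice to return `None` when the threshold is never reached are all assumptions made for illustration:

```python
from collections import Counter

def polarity_index(responses, low=0, high=10, share=0.8):
    """Smallest k such that the k lowest and k highest scale points together
    hold at least `share` of the responses; None if no such k exists."""
    if not responses:
        raise ValueError("no responses")
    counts = Counter(responses)
    total = len(responses)
    n_points = high - low + 1
    for k in range(1, n_points // 2 + 1):
        ends = set(range(low, low + k)) | set(range(high - k + 1, high + 1))
        covered = sum(counts[v] for v in ends)
        if covered >= share * total:
            return k
    return None

# Hypothetical data sets (assumed for illustration).
print(polarity_index([0] * 5 + [10] * 4 + [5]))    # extremes dominate
print(polarity_index([0, 1, 9, 10] * 2 + [5]))     # spread over two points per end
print(polarity_index([5] * 10))                    # everything in the middle
```

Lower values mean more polarized. Mapping the numbers to labels such as Highly Polarized or Not Polarized is left open here, since any cutoffs would be a judgment call.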

posted by SemiSalt at 9:58 AM on January 5, 2017

I would actually just graph the data points in this case so you can look at the distribution. I don't think the mode is particularly more informative than the mean here -- imagine a case where 51% of responses say their satisfaction is a 10 and 49% of responses say their satisfaction is a 1. If you go with the mode, you'd be reporting satisfaction as a 10, but obviously your business probably has some problems it needs to solve if almost half the people are extremely dissatisfied! So, reporting as 10 would be highly misleading.

If you MUST go with one summary number, I think the median would be the way to go (less influence from outliers), but really the best way to report would, I think, just be to do a histogram (bar graph) of all the data.

Another way to look at it might be to decide what levels of satisfaction are "acceptable" to you as a business -- perhaps you're happy if someone rates you as an 8 or above. Then report what percentage of responses fall in that acceptable category, so: "In 2015 only 45% of respondents rated us as 8 or above, while in 2016 we improved so that 60% of responses rated us as 8 or above." etc.
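A minimal sketch of that "percentage at or above a cutoff" report. The helper name `share_at_or_above` is hypothetical, and the responses are invented so that the shares match the 45% and 60% in the example sentence:

```python
def share_at_or_above(responses, threshold=8):
    """Fraction of responses that are at or above `threshold`."""
    if not responses:
        raise ValueError("no responses")
    return sum(1 for r in responses if r >= threshold) / len(responses)

# Hypothetical responses (assumed data), 20 per year.
responses_2015 = [9] * 5 + [8] * 4 + [6] * 6 + [3] * 5
responses_2016 = [10] * 4 + [9] * 4 + [8] * 4 + [6] * 5 + [2] * 3

for year, data in [(2015, responses_2015), (2016, responses_2016)]:
    print(f"{year}: {share_at_or_above(data):.0%} rated us 8 or above")
```

The cutoff of 8 is a business decision rather than a statistical one, so it is worth fixing it before looking at the results.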

posted by rainbowbrite at 10:15 AM on January 5, 2017

The book *How to Lie with Statistics* is a great resource that you might find helpful.

posted by Michele in California at 11:10 AM on January 5, 2017

One vote against mode. Here I'd group the data and use proportions: 64% rated an 8 or higher, 90% 5 or higher, 5% rated 3 or lower.

posted by Valancy Rachel at 3:49 PM on January 5, 2017 [1 favorite]

As the answers pretty much make clear, there's not going to be an authoritative source that backs you up because at best the mode would be better only occasionally.

Statistics are a summary of the data. They simplify the world in order to describe it. The best description depends on both the question you're asking and the data. Describing something with statistics isn't actually objective, any more than summarizing the plot of a book in a paragraph is.

With a set like this my first instinct would be to (for each set of data you want to compare) look at the mean, median, top 80%, and something like Valancy Rachel's cutoffs (say, 8+, 6+ and 3+). Between these you're getting several measurements of "typical" customers' responses and how many people are enthusiastic in one direction or another. (I'd actually disagree that median is always better than mean for ordinal data, and could describe some data sets that work better for one or the other.)

If these all give roughly the same answer when you're doing comparisons (i.e., the top set is always the top set), it's a pretty robust analysis and it won't matter that much what your team uses. If they give different answers (e.g., product A has a higher average but more strongly negative responses than product B), you need to know what people are trying to determine from the data to communicate this fairly. You can also look more closely at the sets to see what drives that difference and understand it.
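That robustness check could be sketched roughly as follows. The helper names `summarize` and `winners` are hypothetical, the data are invented, and the "top 80%" measure is left out because the thread does not pin down how it would be computed:

```python
from statistics import mean, median

def summarize(responses, cutoffs=(8, 6, 3)):
    """Mean, median, and the share of responses at or above each cutoff."""
    summary = {"mean": mean(responses), "median": median(responses)}
    for c in cutoffs:
        summary[f"share_{c}_plus"] = sum(r >= c for r in responses) / len(responses)
    return summary

def winners(a, b):
    """For each summary statistic, report which set scores higher."""
    sa, sb = summarize(a), summarize(b)
    return {
        key: "A" if sa[key] > sb[key] else "B" if sb[key] > sa[key] else "tie"
        for key in sa
    }

# Hypothetical data (assumed): A is polarized, B is lukewarm across the board.
set_a = [9] * 6 + [1] * 4
set_b = [6] * 5 + [5] * 5

result = winners(set_a, set_b)
for key, who in result.items():
    print(f"{key}: {who}")
print("robust" if len(set(result.values())) == 1 else "summaries disagree, look closer")
```

With these invented numbers, set A comes out ahead on most measures but behind on the share rating 3 or above, which is the kind of disagreement that calls for a closer look at the underlying distributions.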

The only problem is that when you approach it this way you obviously have a lot of leeway in what you conclude, so be aware that it's easier to fool yourself.

posted by mark k at 9:11 PM on January 5, 2017

This thread is closed to new comments.

posted by theodolite at 8:43 AM on January 5, 2017