# What is normal?
May 4, 2010 11:01 AM

Statistics-filter: Two relatively (I think) simple statistics questions.

It's been a while since my last statistics class, and I have a couple of questions I need to get my head around. I'm dealing with a small sample of participants and data collected through a survey with Likert-scale items.

For a small (n < 30) sample like this, what would be a good test of normality to show that the data are appropriately distributed for a t-test? Since the sample is so small, just laying out the scores and comparing them to a normal histogram doesn't seem ideal. What would you recommend? I'm losing myself in reading about the more formal tests right now and need a quick cue as to which ones may or may not be appropriate. I don't have my data yet; I'm just planning ahead.

Second question, for the same small-n study: is there a particular test I should be aware of for comparing changes in a single participant's scores across sets of identical surveys administered at different times? One set would be a pre- and post-test; another would be four interstitial surveys with different items than the pre/post.

posted by activitystory to Science & Nature (8 answers total) 3 users marked this as a favorite

Likert scales are, unfortunately, by definition not normally distributed, and the greater the variance in your data set and the closer the central tendency is to the edges of the scale, the more likely you are to violate the normality assumption of the t-test. That doesn't mean a t-test is inappropriate in every case (t-tests are used all the time for this sort of data, for better or worse), and one way to assess this is to test the skewness of your data, either by eyeballing it or by testing it formally in a statistics package. This does become problematic for small sample sizes, though, and the best way to handle small samples really is to look at the data and compute some descriptive statistics to figure out how they are distributed. For example, if everyone answers either a 2 or a 3 on the scale, a Fisher's exact test might be more appropriate.
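A quick sketch of the checks described above, using scipy and entirely made-up Likert responses (the group data and the 1–3 vs. 4–5 split are illustrative assumptions, not anyone's real survey):

```python
import numpy as np
from scipy import stats

# Hypothetical 1-5 Likert responses for two groups -- made-up data.
group_a = np.array([2, 3, 3, 4, 3, 2, 4, 3, 3, 2])
group_b = np.array([4, 4, 5, 3, 4, 5, 4, 3, 4, 5])

# Skewness: values far from 0 suggest the normality assumption is shaky.
print("skew A:", stats.skew(group_a))
print("skew B:", stats.skew(group_b))

# If responses effectively collapse into two categories (say "low" = 1-3,
# "high" = 4-5), a Fisher's exact test on the 2x2 table may be more apt.
table = [
    [np.sum(group_a <= 3), np.sum(group_a >= 4)],
    [np.sum(group_b <= 3), np.sum(group_b >= 4)],
]
odds, p = stats.fisher_exact(table)
print("Fisher's exact p:", p)
```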

Testing for change in scores between two time points can be done with a paired t-test. With scale data this too can lead to violated assumptions, but differences tend to be less skewed than absolute scores. For serial data, you're probably going to need a more complex approach (generalized estimating equations or mixed-effects models).
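The paired case is a one-liner in scipy; a sketch with made-up pre/post scores (the test works on within-person differences, which is why it tolerates skewed raw scores better):

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post Likert scores for the same 8 participants -- made up.
pre  = np.array([3, 2, 4, 3, 3, 2, 4, 3])
post = np.array([4, 3, 4, 4, 3, 3, 5, 4])

# Paired t-test operates on the differences post - pre, which are often
# less skewed than the raw scores themselves.
t, p = stats.ttest_rel(post, pre)
print(f"t = {t:.2f}, p = {p:.4f}")
```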
posted by drpynchon at 11:18 AM on May 4, 2010

Why not use a non-parametric method?
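For reference, the usual non-parametric counterparts are also one-liners in most packages; a sketch with made-up data (Mann-Whitney for independent groups, Wilcoxon signed-rank for paired scores):

```python
import numpy as np
from scipy import stats

# Made-up Likert data: two independent groups plus one paired pre/post set.
group_a = np.array([2, 3, 3, 4, 3, 2, 4, 3])
group_b = np.array([4, 4, 5, 3, 4, 5, 4, 3])
pre  = np.array([3, 2, 4, 3, 3, 2, 4, 3])
post = np.array([4, 3, 4, 4, 3, 3, 5, 4])

# Mann-Whitney U: rank-based analogue of the independent-samples t-test.
u, p_u = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Wilcoxon signed-rank: rank-based analogue of the paired t-test
# (zero differences are dropped by default).
w, p_w = stats.wilcoxon(post, pre)
print(f"Mann-Whitney p = {p_u:.3f}, Wilcoxon p = {p_w:.3f}")
```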
posted by benzenedream at 11:29 AM on May 4, 2010 [3 favorites]

After checkin' out some descriptives and plots, I'd do me a nested RM-ANOVA if it didn't look so screwy that I couldn't.
posted by solipsophistocracy at 12:06 PM on May 4, 2010

For question 1, I think what you want is a Normal Quantile Plot. It is tedious to do by hand but quite easy with software. This site gives you a step-by-step method for using Excel to calculate this plot. You can even do a linear regression to get an r value that quantifies how close to normal your sample is.
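If you'd rather not do this in Excel, scipy's `probplot` builds the same plot and hands back the r value directly; a sketch with a made-up sample of 20 scores:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Made-up sample of n = 20 composite scores.
sample = rng.normal(loc=3.0, scale=0.8, size=20)

# probplot returns the quantile pairs plus a least-squares line fit;
# r close to 1 means the points hug the line, i.e. approximately normal.
(osm, osr), (slope, intercept, r) = stats.probplot(sample)
print(f"normal quantile plot r = {r:.3f}")
```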

I don't have a good answer for question 2.
posted by El_Marto at 1:41 PM on May 4, 2010

There's a little-known empirical trick for spotting violations of the assumptions of a t-test/ANOVA, at least in the straightforward case. El_Marto's explanation is fine too. Here's the rationale:

When you do a t-test or ANOVA you treat the data as continuous: the distance between a value of 1 and 2 is the same as the distance between 2 and 3, and so on throughout the scale. A non-parametric test makes no such assumption, so you discard the information about scale and keep only the information about order. This results in information loss, which is justifiable when the data really are only ordinal, and not when they aren't.

Given that, and given that the Mann-Whitney and Kruskal-Wallis tests are essentially the equivalents for ordinal data of what the t-test and ANOVA are for continuous data, we expect that information loss to leave the non-parametric tests with less power.

The upshot is that if your data really satisfy the parametric assumptions, the p value for the t-test will always be lower than the p value for the Mann-Whitney test, because the information loss from ranking reduces the statistical power available to you. If they don't, the reverse is true: you've violated the assumptions of the t-test too much, the ranking costs you nothing, and it's the spurious use of the data as continuous that decreases power.
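A sketch of that side-by-side comparison with made-up, roughly normal data (note this is a heuristic for eyeballing assumption violations, not a formal test, and the p values here are not guaranteed to order one way for any particular sample):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Made-up, roughly normal data with a moderate group difference.
a = rng.normal(0.0, 1.0, 25)
b = rng.normal(0.6, 1.0, 25)

# Run the parametric and rank-based tests on the same data and compare p's.
t_p = stats.ttest_ind(a, b).pvalue
u_p = stats.mannwhitneyu(a, b, alternative="two-sided").pvalue
print(f"t-test p = {t_p:.4f}, Mann-Whitney p = {u_p:.4f}")
```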
posted by singingfish at 3:47 PM on May 4, 2010 [1 favorite]

I assume that you mean some composite score rather than a single item. What you want is called a QQ plot or normal quantile plot. Googling those terms and your stat package of choice will find an implementation. You want to look at such a plot of the residuals from your analysis, not the raw data. Imagine that one group was exposed to something with a big effect but was otherwise the same as the other group: the raw data would be bimodal and not normal-looking at all. Formal tests of normality are available, and most of them have a geometric interpretation in terms of the difference between the empirical and expected cumulative distribution plots.
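A sketch of that residuals-not-raw-data point, with made-up two-group data where the effect is large (so the pooled raw data are bimodal even though each group is normal):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Two made-up groups with a big effect: pooled raw scores are bimodal,
# but the residuals (deviations from each group's own mean) are not.
group_a = rng.normal(2.0, 0.5, 15)
group_b = rng.normal(5.0, 0.5, 15)

residuals = np.concatenate([group_a - group_a.mean(),
                            group_b - group_b.mean()])

# Normal quantile plot of the residuals; r near 1 indicates near-normality.
(_, _), (_, _, r) = stats.probplot(residuals)
print(f"normal quantile plot r for residuals: {r:.3f}")
```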

I almost always ask people to do a linear-regression presentation instead of an ANOVA presentation. They're almost the same model, and regression-type output is much easier for most people to understand. For a repeated-measures analysis, you can put a fixed effect on person and on before/after or position in the sequence. You can also use a random-effects analysis, which is similar but places some restrictions on the "person" effects. Doing these in Stata, SAS, and R isn't too hard, but I don't know about other packages.
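A minimal numpy-only sketch of that regression presentation for a pre/post design with a fixed effect per person (the data are made up, and a real analysis would use a package's regression routine rather than a hand-built design matrix):

```python
import numpy as np

# Made-up pre/post scores for 5 participants.
pre  = np.array([3.0, 2.0, 4.0, 3.0, 3.0])
post = np.array([4.0, 3.0, 4.0, 4.0, 4.0])
y = np.concatenate([pre, post])

n = len(pre)
# Design matrix: intercept, a post-vs-pre indicator, and person dummies
# (person 0 is the reference level).
X = np.column_stack([
    np.ones(2 * n),
    np.r_[np.zeros(n), np.ones(n)],      # 0 = pre, 1 = post
    np.vstack([np.eye(n)[:, 1:]] * 2),   # person fixed effects
])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# With a balanced paired design, the time coefficient equals the mean
# within-person change, here mean(post - pre) = 0.8.
print("pre/post effect estimate:", beta[1])
```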
posted by a robot made out of meat at 6:59 PM on May 4, 2010

*the p value for the t test will always be lower than the p value for the mann-whitney test.*

Thanks for that, Singingfish, I would never have considered that!
posted by Sutekh at 5:18 PM on May 7, 2010

The t-test is extremely robust to violations of its implicit assumptions (there's a nice quasi-likelihood literature explaining this), but be careful about p-value shopping, because some parametric methods do give over-optimistic results when their assumptions are violated.
posted by a robot made out of meat at 11:58 AM on May 12, 2010
