Two-way ANOVA and Type III SoS analysis
May 25, 2020 10:38 PM

[Explain like I'm five filter] I'm currently doing a sensory evaluation assignment for uni with imaginary data (because COVID-19) but I am struggling with the data analysis. We had to download the trial version of XLSTAT, and follow minimal instructions including what outputs to report. Got through all that but now I'm left with results that I have to discuss that I do not understand. I Googled the issue and it all goes well above my head. I emailed the lecturer and got told to look at the course material. I can tell you, it didn't help. I would love it if someone could explain the difference between two-way ANOVA and Type III analysis, or could point me to a resource that explains it. More detail within!

So, I have three different attributes I am analysing. I did separate one-way ANOVA for consumer acceptance of the three different attributes. All returned a significant p-value. Did two-way ANOVA, plus Type III Sum of Squares analysis for the interaction of age and gender on acceptance of these three different attributes. All three two-way ANOVA tables give me p-values of <0.0001 for all three attributes. However, the Type III tables give me something quite different, with only one of these p-values coming in under the alpha threshold (interaction of age on one particular attribute). I understand this has something to do with main effects, but I don't know what that really means. I have been told by a friend that we are supposed to look at the Type III results for reporting significance but she couldn't explain why, because she'd been told that by the lecturer without any context or explanation. I assume it's because when you look at the ANOVA it's clearly not specific to the interaction, whereas the Type III analysis is interaction specific.

I only have a very basic grasp of statistics, to be very honest with you. Previous stats papers didn't cover this stuff. Under normal circumstances, we would have been heavily guided through this in our lab sessions but with everything being 100% distance learning right now, there's not a lot of support for interpreting data.

Thank you in advance!
posted by BeeJiddy to Science & Nature (8 answers total) 2 users marked this as a favorite
 
Are you absolutely sure you've got the right input variables for the ANOVA analysis? When I've run ANOVA before (with different software, sorry) it's separated out the impact of each individual factor and of the two together. So rather than "model: p<.001" I've gotten results like "age p<.1, gender p<.05, age x gender p < .5". Because they subtract out the main effects before looking for crossed effects.
posted by Lady Li at 12:08 AM on May 26, 2020


I would suggest you look for a tutorial on using XLSTAT for two-way ANOVA, because I strongly suspect you're supposed to be doing a correction for the main effects of age and gender first, before you do the analysis for the "interaction". Otherwise your ANOVA data right now looks like it's answering the question "do age and gender have predictive value?" *Including* the main effects.
posted by Lady Li at 12:14 AM on May 26, 2020 [1 favorite]


Response by poster: Yes, I understand exactly what you mean (well, mostly) because when I google the issue I notice that my tables don't look right compared to what I find for other two-way ANOVA analyses. I do know that my tables look exactly like those of my cohort though, so it would be an issue with the instructions (they were terrible and for an older version of the software) and not necessarily with what I've personally done. In any case, my free trial has expired so even if I wanted to I couldn't go back and fix them.

So I guess the problem is just that the ANOVA wasn't properly done to give me the right information, so as they stand, there is no strong connection between them and the Type III tables. If the ultimate answer is that the analysis is borked, that's fine. The lecturer hasn't been receptive to feedback regarding the instructions, so it be what it be.
posted by BeeJiddy at 12:55 AM on May 26, 2020


It's been a while since I did this sort of analysis, but I found this from the University of Toronto which pretty much agrees with the comments above.

The Type III Sum of squares is the more demanding test hence the less definitive results.
posted by SemiSalt at 6:05 AM on May 26, 2020


I'm a little confused. To be fair, I don't use this software, but I don't see anywhere where you have a test of the interaction between age and gender.

Type I, II, and III ANOVA/ANCOVA are all just different ways of calculating an ANOVA with multiple predictors. Type III means you are controlling for every other factor when you calculate the statistic for a given factor. This means that the order you enter the variables into the equation doesn't matter. In this case, Type III would give you the effects of age controlling for gender, and the effects of gender controlling for age, and then the interaction of age and gender (e.g., does the effect of age on your outcome variable differ by gender). Type I ANOVA adds terms sequentially, so it depends on what order you put your variables in. For example, if gender is entered first, it would calculate the effect of age controlling for gender, but not the other way around. Which you use depends on your research question, though in many fields Type III is generally used by convention.
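To make the Type I versus Type III difference concrete, here is a minimal sketch in plain Python. The scores, the two age groups, and the helper names (`rss`, `type1`, `type3`) are all made up for illustration, and this is not a description of how XLSTAT computes anything internally. It just fits small regressions with +1/-1 (sum-to-zero) coding and compares residual sums of squares. The design is deliberately unbalanced, because with equal cell sizes the different types give the same answer.

```python
# Made-up liking scores for an unbalanced 2x2 design (hypothetical data).
cells = {
    ("young", "F"): [7, 8, 6, 7],
    ("young", "M"): [6, 5],
    ("old", "F"): [5, 6],
    ("old", "M"): [4, 3, 5, 4],
}

# Effect (sum-to-zero) coding: +1 / -1 for each two-level factor.
y, one, age, gender = [], [], [], []
for (a, g), scores in cells.items():
    for s in scores:
        y.append(float(s))
        one.append(1.0)
        age.append(1.0 if a == "young" else -1.0)
        gender.append(1.0 if g == "F" else -1.0)
inter = [a * g for a, g in zip(age, gender)]
terms = {"age": age, "gender": gender, "age*gender": inter}


def rss(columns, y):
    """Residual sum of squares from a least-squares fit of y on the columns."""
    k = len(columns)
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [
        [sum(ci * cj for ci, cj in zip(columns[i], columns[j])) for j in range(k)]
        + [sum(ci * yi for ci, yi in zip(columns[i], y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * p for x, p in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * c[n] for b, c in zip(beta, columns)) for n in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def type1(order):
    """Sequential SS: each term is adjusted only for the terms entered before it."""
    out, cols = {}, [one]
    prev = rss(cols, y)
    for name in order:
        cols = cols + [terms[name]]
        cur = rss(cols, y)
        out[name] = prev - cur
        prev = cur
    return out


def type3():
    """Each term is adjusted for every other term in the full model."""
    full = rss([one] + list(terms.values()), y)
    return {
        name: rss([one] + [c for n, c in terms.items() if n != name], y) - full
        for name in terms
    }


type1_age_first = type1(["age", "gender", "age*gender"])
type1_gender_first = type1(["gender", "age", "age*gender"])
type3_ss = type3()

for label, table in [
    ("Type I, age entered first", type1_age_first),
    ("Type I, gender entered first", type1_gender_first),
    ("Type III", type3_ss),
]:
    print(label, {k: round(v, 2) for k, v in table.items()})
```

With these made-up numbers, age gets a sequential (Type I) sum of squares of 12 when it is entered first but only 6 when it is entered after gender, while its Type III value is 6 regardless of order.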

It looks to me like the ANOVA table as you've shared it is giving you an omnibus statistic. It means there is something significant in your model, but doesn't tell you what it is.

I'm confused about what you are calling your Type III tables. Because you say you are testing the interaction of gender and age, but what I see here is a separate interaction for age and chocolate milk sample, and then an interaction between gender and chocolate milk sample.

What your table would normally look like is a line for age (main effect), a line for gender (main effect) and a line for the interaction of age and gender, which typically appears as age*gender.
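As a rough illustration of the difference between an omnibus line and the per-factor lines, here is a small sketch in plain Python using made-up, balanced data (three scores per age-by-gender cell). The cell-mean shortcut used here only works for balanced designs, the degrees of freedom are written out for a 2x2 layout, and the critical F values are copied from a standard F table at alpha = 0.05 rather than computed, so treat it as a sketch and not a substitute for your software's output.

```python
from statistics import mean

# Made-up liking scores, 3 per cell (balanced). Keys are (age group, gender).
cells = {
    ("young", "F"): [8, 9, 7],
    ("young", "M"): [7, 8, 6],
    ("old", "F"): [5, 6, 4],
    ("old", "M"): [4, 5, 3],
}
all_scores = [s for scores in cells.values() for s in scores]
grand = mean(all_scores)


def factor_ss(index):
    """Between-level sum of squares for one factor (valid for balanced data)."""
    levels = {}
    for key, scores in cells.items():
        levels.setdefault(key[index], []).extend(scores)
    return sum(len(v) * (mean(v) - grand) ** 2 for v in levels.values())


ss_age = factor_ss(0)
ss_gender = factor_ss(1)
# Omnibus "model" SS: all the variation between the four cell means.
ss_model = sum(len(v) * (mean(v) - grand) ** 2 for v in cells.values())
# In a balanced design, what the main effects do not explain is interaction.
ss_inter = ss_model - ss_age - ss_gender
ss_error = sum((s - mean(v)) ** 2 for v in cells.values() for s in v)
df_error = len(all_scores) - len(cells)  # 12 scores - 4 cells = 8
mse = ss_error / df_error

# Critical F at alpha = 0.05 with 8 error df, copied from a standard F table.
F_CRIT = {1: 5.32, 3: 4.07}

rows = [
    ("Model (omnibus)", 3, ss_model),
    ("age", 1, ss_age),
    ("gender", 1, ss_gender),
    ("age*gender", 1, ss_inter),
]
table = {}
for name, df, ss in rows:
    f_stat = (ss / df) / mse
    significant = f_stat > F_CRIT[df]
    table[name] = (df, ss, f_stat, significant)
    print(f"{name:16s} df={df}  SS={ss:6.2f}  F={f_stat:6.2f}  significant={significant}")
```

In this toy data the omnibus line is significant, but the breakdown shows that only age is carrying it. That is the kind of information a single model-level p-value cannot give you.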

It's fine to run three different models for three outcomes, but I would double check how you've input this into the software. Because what I think you want to know is: 1) does age affect overall/flavour/odour liking; 2) does gender affect overall/flavour/odour liking; 3) is the effect of age on overall/flavour/odour liking different between men and women. Is that right? If so, I don't think your models are formatted correctly.
posted by Lutoslawski at 9:15 AM on May 26, 2020 [1 favorite]


I would agree with everything Lutoslawski and others above have said.

But I would add something else to think about:

---> You have outlined the data and shown results from various ANOVA runs, but you have not clearly identified to us what question(s) the assignment is asking you to address.

If you could share the exact question asked, or what the teacher is expecting you to analyze, and/or the data tables I think we could work through what is going on & what is going wrong.

I believe what is going wrong is you are identifying the wrong variables/categories as dependent & independent variables. As Lady Li said, I strongly believe you are supposed to be doing an analysis of AGE and GENDER and then AGE*GENDER. So in your Type III results table (instructions/example here) the rows you see should be:
  • Age
  • Gender
  • Age*Gender
Another possibility is the type of test is supposed to be in the mix as well (overall, odor, taste) and in that case you would have something like these rows in the Type III results table:
  • Test Type (overall, odor, taste)
  • Age
  • Gender
  • Test Type*Age
  • Test Type*Gender
  • Age*Gender
Now as to your actual question: How to explain or discuss your results.

First off, your results are quite confused and don't seem to match up with each other or what we expect to see. That is why we are not just going right into "Here is what your results mean." Because they are just confusing and don't quite make sense.

But in general: The tables you are showing so far show whether there is anything of statistical significance in the results.

So if there is no statistical significance you can pretty much just say that. "Table D2 indicates there is no significant difference in the flavour liking test by age or by gender." etc.

But if there IS an area where significant statistical difference pops up, then the ANOVA tables such as you have shown are not the final step! They only tell you, "Yup, there is something interesting to look for here!" Now you have to go looking for it.

Basically how you go looking is you figure out the averages of the scores for each group (where a statistically significant difference was found). For example it looks like there is a difference in the odor scores by age. So when you look at that maybe you will find that 20-30 year olds gave higher scores on average than 60-70 year olds, or maybe there is a general trend by age (young people give higher scores than older people, or the reverse).

So you can and probably should do a little generalized discussion of whatever you find from those mean scores by age.

But there is another technical step here, that your teacher (probably?!) covered--post-hoc testing. Basically that is that your Type III tables tell you XYZ category had a significant result! Well hooray, but let's say XYZ Category is age and maybe there are 10 different age ranges that we measured. So where does the actual difference between those lie?

Maybe 20-30, 31-40, 41-50 age groups are all basically the same. But 51-60 is significantly higher and 61-100 is significantly higher yet.

Your post-hoc tests will give you this information: Exactly which groups are significantly different from which others.

So (assuming you have covered this in class) the usual thing would be to go on and show the post-hoc analysis table for any significant results and then discuss the table.
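For a feel of what a post-hoc test is doing under the hood, here is a minimal sketch of Tukey's HSD in plain Python for three made-up age groups of equal size. The scores and group labels are hypothetical, and the critical value `Q_CRIT` is copied from a printed studentized range table for alpha = 0.05, 3 groups, and 12 error degrees of freedom, so you would need to look up the right value for your own design. In practice your stats package produces this table for you.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Made-up odour liking scores by age group (hypothetical data, 5 per group).
groups = {
    "18-30": [6, 7, 7, 8, 7],
    "31-50": [7, 6, 8, 7, 7],
    "51+": [4, 5, 5, 6, 5],
}
n = 5  # scores per group; this simple formula assumes equal group sizes
df_error = sum(len(v) for v in groups.values()) - len(groups)  # 15 - 3 = 12
# Mean squared error: pooled within-group variation from the one-way ANOVA.
mse = sum((s - mean(v)) ** 2 for v in groups.values() for s in v) / df_error

# Studentized range critical value (alpha = 0.05, 3 groups, 12 error df),
# taken from a table; an assumption to re-check for your own design.
Q_CRIT = 3.77
hsd = Q_CRIT * sqrt(mse / n)  # smallest gap between means that counts as significant

different = {}
for a, b in combinations(groups, 2):
    gap = abs(mean(groups[a]) - mean(groups[b]))
    different[(a, b)] = gap > hsd
    print(f"{a} vs {b}: gap={gap:.2f}, HSD={hsd:.2f}, significant={gap > hsd}")
```

Here the two younger groups are indistinguishable from each other, and both differ from the oldest group. That is the "which groups differ from which" information the overall ANOVA line cannot provide.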

Another typical thing is to then show a table with the Name, N, Mean, and Standard Deviation for each group within this category.

That gives you the data you need to launch into a discussion of which groups (age groups in your case) have higher or lower results, any trends you see, etc.
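A descriptive table like that is easy to build by hand if your software is no longer available. The sketch below uses only Python's standard library and made-up scores for three hypothetical age groups; swap in your own groups and data.

```python
from statistics import mean, stdev

# Made-up odour liking scores by age group (hypothetical data).
scores_by_age = {
    "18-30": [6, 7, 7, 8, 7],
    "31-50": [7, 6, 8, 7, 7],
    "51+": [4, 5, 5, 6, 5],
}

summary = []
print(f"{'Group':8s}{'N':>4s}{'Mean':>8s}{'SD':>8s}")
for name, scores in scores_by_age.items():
    # stdev() is the sample standard deviation (divides by N - 1).
    row = (name, len(scores), mean(scores), stdev(scores))
    summary.append(row)
    print(f"{row[0]:8s}{row[1]:>4d}{row[2]:>8.2f}{row[3]:>8.2f}")
```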

This will all make a lot more sense if you read through this article on Statistics by Jim regarding the use of post-hoc test results.

If none of the above about "post-hoc tests" rings a bell at all then I might just present a table showing mean & standard deviation for the statistically significant categories (ie, mean & stdev of odor scored by age group) and then discuss as I outlined above.

Helpful resources:

- How to do One-Way ANOVA, Statistics by Jim
- How to do two-way ANOVA, Statistics by Jim
- How to do Post-Hoc Tests
- Two-Way ANOVA chapter from Howard Seltman's statistics book. He uses the SPSS stats package rather than Excel, so the tables etc. look just a little different. But he works through a couple of examples very similar to yours.
posted by flug at 7:41 PM on May 26, 2020


Also possibly helpful:

* You can do two-way ANOVA in plain old Excel, no extra pay packages needed. Two-way ANOVA chapter in Statistics by Jim tells you how to load what you need & generally how to do it. I would strongly suggest working through the one-way, two-way, and post-hoc ANOVA examples on Jim's pages; that will help you get a grasp of how the whole system works. Then try doing your own examples by following Jim's examples & steps but substituting in your data.

* You can also do two-way ANOVA calculations in LibreOffice (a completely free MS Office clone). Example/instructions.

* You can also do two-way ANOVA calculations using Google Sheets. Instructions.

I mention those options because it seems like you need to play around with the tables and calculations to get them right. Above are three different ways to do so.
posted by flug at 7:47 PM on May 26, 2020


Response by poster: Thank you so much, everyone, for the help. I will comb through this tonight and see what I can do.

The Type III tables are exactly what XLSTAT spat out, with the heading 'Type III Sum of Squares analysis' so if that's wrong as well, then I don't even know what to say about that, hah. I guess this is the issue with being given instructions to do something, without actually being told what it all means.

The hypotheses we are testing are if there are significant differences between the samples for overall liking, flavour liking, odour liking (that was done with one-way ANOVA plus Tukey's HSD and that seems to have gone fine, which is why I didn't ask about that in this question), then if age interacts with those three attributes, then if gender interacts with those three attributes. Sorry if I confused anyone, I myself am very confused about this assignment. I do understand what you're all getting at though. It's a mess, basically.
posted by BeeJiddy at 11:07 PM on May 26, 2020

