# Anova p-value without tables

July 12, 2008 11:56 AM

I'm trying to streamline my Mum's business processes, and part of that is moving her statistical analysis online. Based on her notes I've created the ANOVA table. Now I need to find the exact p-value at which my F is significant.

It's been 7 years since I did this at school, so I don't remember all the exact terms, but even then we just picked a significance level, looked up the value in the table, and then said it was significant or not significant, which isn't enough for what my mum does.

I've googled around and all I can find are online calculators that will calculate it for you (and sadly none of them were JavaScript), so I know it's possible and feasible, but I just can't find any algorithms or methods described anywhere. The standard answer seems to be to look it up in a table or use some ready-made software.

You haven't specified the significance level. Statistics aren't merely "significant" or "non-significant." You need to pre-specify the level of significance that will satisfy you. An alpha of .05 is pretty standard, as is .01. You can find out what value F would need to take to be significant at each of those levels using a standard F distribution table.

posted by proj at 12:23 PM on July 12, 2008

I might add here that if she's doing her analysis with any kind of statistical software at all, it will tell you the significance level of the F-statistic when it produces it. Reporting the significance level of a statistic is enough for most academic journals; I can't imagine why that wouldn't "be enough" for your mom's business.

For instance, most journals report statistics in the following way: "F = 42.33 **", with one * indicating that the statistic is significant at the .05 level, two *s indicating significance at the .01 level, three *s at the .001 level, and so on. I don't know that (in the social sciences) I've ever seen "F = 42.33, F would be significant at 41.01, therefore reject the null and conclude significant at the .05 level" or something to that effect.
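The star convention described above can be sketched in a few lines of Python. This is only an illustration: the function name `significance_stars` is made up, and the thresholds are the .05, .01 and .001 levels mentioned in the comment, not a rule taken from any particular journal.

```python
def significance_stars(p_value):
    """Map a p-value to the star notation described above.

    Assumed thresholds: * for p < .05, ** for p < .01, *** for p < .001.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not significant at any of the listed levels


def format_f(f_stat, p_value):
    """Format an F statistic with its stars, e.g. 'F = 42.33 **'."""
    return f"F = {f_stat:.2f} {significance_stars(p_value)}".rstrip()
```

Note that this still needs the p-value as input, which is exactly the number the original question is asking how to compute.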

posted by proj at 12:29 PM on July 12, 2008

Best answer: The p-values that statistical software reports along with F statistics are computed by integrating the probability density function of what is known as the F distribution (which requires two parameters, the degrees of freedom, that depend on the nature of your data and the statistical test) from 0 up to the observed F-statistic, and subtracting that value from 1. The integral of the F distribution pdf does not have a simple closed-form solution in general, so numerical integration is required in practice. Numerical integration can be tricky to do properly and is certainly beyond the abilities of a student with a textbook and a calculator, which is why in classes they have you look up values in a table.

I don't have a cite for the proper way to do the integration. I would recommend trying to find a library where this is already implemented rather than reinventing the wheel yourself.
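For anyone curious what such a library does internally: the upper tail of the F distribution can be written as a regularized incomplete beta function, p = I_x(d2/2, d1/2) with x = d2 / (d2 + d1·F), which avoids integrating the pdf directly. Below is a minimal Python sketch under that identity, using a textbook-style continued-fraction evaluation. The names `f_p_value`, `reg_inc_beta` and `_betacf` are invented for illustration, and this is not the code of any particular library.

```python
import math

TINY = 1e-300  # guard against division by zero in the continued fraction


def _betacf(a, b, x, max_iter=500, eps=1e-13):
    """Continued-fraction part of the incomplete beta function (modified Lentz)."""
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > TINY else TINY)
    h = d
    for m in range(1, max_iter + 1):
        # Each iteration applies one even and one odd term of the fraction.
        even = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        for term in (even, odd):
            d = 1.0 + term * d
            d = 1.0 / (d if abs(d) > TINY else TINY)
            c = 1.0 + term / c
            if abs(c) < TINY:
                c = TINY
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            return h
    raise RuntimeError("continued fraction did not converge")


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for a, b > 0."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # x^a (1-x)^b / B(a, b), computed in log space for stability.
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where it converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_p_value(f_stat, df1, df2):
    """Upper-tail probability P(F > f_stat) for an F(df1, df2) distribution."""
    if f_stat <= 0.0:
        return 1.0
    x = df2 / (df2 + df1 * f_stat)
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, x)
```

As a usage example, `f_p_value(37.5, 1, 4)` (the F statistic and degrees of freedom that come up later in this thread) evaluates to roughly 0.0036. A well-tested library is still the safer choice for production use.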

posted by epugachev at 1:18 PM on July 12, 2008

Response by poster: Found a library at phpmaths.com - after looking at the source code I can see why all the articles/tutorials that include an exact p-value skip over that part and just magically come up with the number ;)

posted by missmagenta at 3:29 PM on July 12, 2008

Your p value is in the table at http://www.hazelryan.co.uk/anova.php

F(df=1,4) = 37.5, p = 0.3, which means that the ANOVA is non-significant and there is no difference between means (well, variance actually) across conditions. Actually you seem to have shifted the decimal places over - I had a quick look in Excel with =fdist(37.5,1,4), which indicates that the p-value is 0.003, which makes more sense given the size of the F statistic.
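As a sanity check on that Excel figure: for the specific case of 1 and 4 degrees of freedom, the upper tail of the F distribution happens to reduce to a closed form, p = 1 - 1.5u + 0.5u³ with u = sqrt(F / (F + 4)). The short Python sketch below evaluates it. The function name is invented for illustration, and the formula only applies to df = (1, 4), not to F distributions in general.

```python
import math


def p_value_f_1_4(f_stat):
    """Upper-tail p-value for an F statistic with exactly (1, 4) degrees of freedom.

    Closed form for this special case only: p = 1 - 1.5*u + 0.5*u**3,
    where u = sqrt(F / (F + 4)).
    """
    if f_stat <= 0.0:
        return 1.0
    u = math.sqrt(f_stat / (f_stat + 4.0))
    return 1.0 - 1.5 * u + 0.5 * u ** 3


p = p_value_f_1_4(37.5)
print(f"F(1,4) = 37.5, p = {p:.4f}")
```

For F = 37.5 this gives roughly 0.0036, in line with the Excel result once rounding is taken into account.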

I have no idea what the chi squared test is for - your thinking sounds quite confused about this.

Once you have an idea of the hypotheses you want to test, you might want to look at R along with this document.

posted by singingfish at 3:34 PM on July 12, 2008

Response by poster: *Your p value is in the table at http://www.hazelryan.co.uk/anova.php*

It is *now*. Also, the decimal place wasn't shifted; the number you saw was just plain wrong (just coincidentally about 100x the real value). You happened to look at it in the 30 seconds where I'd put the wrong numbers into the formula ;)

posted by missmagenta at 4:05 PM on July 12, 2008

This thread is closed to new comments.

posted by missmagenta at 12:04 PM on July 12, 2008