Too poor to buy a stats package!
May 21, 2010 10:08 AM
Free stand-alone stats programs for Mac and/or Windows that have (at a minimum) two-way ANOVA and linear regression capabilities? Barring that, good resources for learning R?
I'm a grad student working in a lab where most of our experiments are super-simple in design and require no test more complicated than a t-test (or a chi-squared test). Because I'm apparently the black sheep of the group, I set up several slightly more complicated experiments, all of which need to be analyzed using two-way ANOVAs. I have another tentative series of experiments in the pipeline that will require regression analysis.
Unfortunately, the only program on the lab's computers with any statistical functionality whatsoever is Excel. Extra-unfortunately, my advisor "upgraded" all the lab computers to Excel 2008, so I lost Excel's rudimentary statistical functionality with that switch. I don't mind doing t-tests and one-way ANOVAs/post-hoc tests by hand, but I'm probably going to screw up if I try to do four two-way ANOVAs and three linear regressions by hand. Not to mention that it will take a lot of time and effort doing what is essentially the sort of busywork for which computers were invented in the first place.
So... I'm trying to either find an easy-to-learn free program that will let me quickly and simply run my data analyses, or find good resources for learning R. I've been toying with the idea of learning R anyway, because there are existing packages for pretty much any analysis that I would ever need to do -- having no statistics package in my lab might just be the kick in the pants that I need.
Here are my limitations:
1) I don't have any money, at all, to spend on anything whatsoever. I know that I could get pretty much any commercially-available stats package from the campus computing center for cheap, but even a $5 license fee is too expensive for me right now. My lab is also broke.
2) Lab computers are split down the middle between PCs running Windows XP and Macs running whatever the latest version of OS X is. I have a senile MacBook and a fairly recent Dell Latitude (the Dell is running Windows Vista, which came pre-loaded and which I have been too lazy to change). Software doesn't need to be available for all these systems, but it should work in at least one of them.
3) I have extremely limited programming experience. (I spent the time that I *should* have spent taking comp sci and/or math courses taking extra biology electives, where I learned how to count rats humping each other in a dimly-lit room.) I don't need to learn enough to write new extensions in R, but I do need to learn enough to understand how the language works, and be able to use the packages that other people have written.
FYI: your problem with Excel may be that you need to have the Analysis ToolPak installed. See this for more info: http://www.google.com/search?q=excel+2007+anova&ie=UTF-8&oe=UTF-8&hl=en&client=safari
posted by dfriedman at 10:18 AM on May 21, 2010
Response by poster: Excel 2007 will do what I want. Excel 2008 will not do what I want. Microsoft got rid of the VBA Analysis Toolpak when they released Office 2008.
posted by kataclysm at 10:20 AM on May 21, 2010
Best answer: If you're in graduate school in the sciences, I'd recommend jumping into R. I'm sure there's a way to jury rig Excel into running 2-way ANOVAs (I think I took a class once where it was an assignment), but is there any downside to moving to a real stats platform? I can't imagine that having extra stats knowledge is ever a bad thing in the sciences.
As for learning R, there were some great answers regarding previous asks of that question.
posted by eisenkr at 10:21 AM on May 21, 2010 [1 favorite]
There are third party addins for xl 08: http://www.officeformac.com/blog/Now-Available--Data-Analysis-For-Excel-2008
another thought: try open office if you want a spreadsheet environment.
posted by dfriedman at 10:23 AM on May 21, 2010
Torrent SPSS 11. It'll do what you need with no need to learn programming.
posted by k8t at 10:26 AM on May 21, 2010
On the PC side the ToolPak is built in: enabling add-ins is done through the Office Button | Excel Options | Add-Ins, which is where all add-in management takes place. Only the Mac is different: Excel 2008 (Mac) can have the same or similar tools as previous editions, but you need to download third-party nagware (i.e., MS didn't want to deal with it anymore) that should cover the basic analysis you are looking for. It's called StatPlus:mac LE (a preliminary Google search found no working link), but I would just go with the PC version.
This is what R is for. It's free. You likely don't need special R packages for what you are doing (assumption!) so you should try it out - go with the PC version.
If you can, don't use Excel - it reportedly has problems (PDF) and perhaps more accessible here although MS does claim it has fixed the issue.
posted by zenon at 10:39 AM on May 21, 2010
Best answer: I'm teaching myself R right now--fuck those overpriced licenses for other programs! It was a little bit wtf this is a whole new language I can't deal with this at first, but that goes away faster than you think. I rely on this guide and this one the most. I also keep a text document with a running list of my most frequently needed commands for easy reference.
posted by mandymanwasregistered at 10:42 AM on May 21, 2010 [3 favorites]
Duuude use R! I promise it's super easy to get to the point where you will feel comfortable doing the kinds of tests you need. If you'd like, I can email you some of the material we used in my intro stats class. If you don't need to do anything very complicated, you don't really need a high-level understanding of R, just a few key commands that you can copy and paste every time you need them.
posted by MadamM at 11:20 AM on May 21, 2010
Also, in my opinion the R interface is much better on OSX than Windows.
posted by MadamM at 11:21 AM on May 21, 2010
Check out gretl; it was the stats package used in my econometrics class last semester. It has an ANOVA model, but I don't know about two-way ANOVA. It's open source, so it's at least worth a look.
posted by thissideofdead at 11:22 AM on May 21, 2010
Best answer: For the R route, this is a direct link to a PDF of Harald Baayen's book Analyzing Linguistic Data, which is the go-to guide around my research world. I am working my way through it now.
posted by knile at 11:23 AM on May 21, 2010
Best answer: Yes, R is going to be your friend on this one. Some quick R pointers:
The function lm (linear model) is going to be really useful for you. Regression, ANOVA, and ANCOVA are all basically just instances of linear models.
yourmodel <- lm(Y ~ a + b + c + 1) fits the linear model Y = Aa + Bb + Cc + intercept (the + 1 is optional, since R includes an intercept by default). Change the + 1 to a - 1 for no intercept. Type summary(yourmodel) to get coefficient estimates, t-statistics, and the associated p-values for each coefficient. If a, b, and c are continuous variables, you've just done a multiple regression; if they are factors you have an ANOVA; if you have some of each you have an ANCOVA. Bam!
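A minimal sketch of the regression case, assuming simulated data: the data frame dat, its values, and the model names are made up for illustration, and only the column names Y, a, b, c follow the formula above. Passing data = dat tells lm where to look up the variables.

```r
# Hypothetical simulated data; true coefficients are 2, -1, 0.5, intercept 3.
set.seed(1)
n <- 50
dat <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
dat$Y <- 2 * dat$a - 1 * dat$b + 0.5 * dat$c + 3 + rnorm(n, sd = 0.1)

# Fit with an intercept (the default) and without one (- 1).
with_intercept <- lm(Y ~ a + b + c, data = dat)
no_intercept   <- lm(Y ~ a + b + c - 1, data = dat)

# Coefficient estimates, t-statistics, and p-values for each term.
print(summary(with_intercept))

# coef() returns just the estimates as a named vector.
print(coef(with_intercept))
```

The fit without an intercept has one fewer coefficient, which is a quick way to confirm that the - 1 did what you expected.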
For interactions, you would use the asterisk, e.g., lm(Y ~ a*b + c + 1), which fits a model with main effects for a, b, and c as well as an interaction effect for a and b. To get just the interaction and no main effects, use the colon: lm(Y ~ a:b + c + 1) would give you just the interaction between a and b with no main effects, and the main effect of c.
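For the two-way ANOVA case, here is a sketch under stated assumptions: the 2 x 3 design, the factor levels, and the simulated response are all hypothetical. It uses anova() on the fitted model to get the usual ANOVA table rather than per-coefficient t-tests. Note that anova() reports sequential (Type I) sums of squares, so term order can matter when the design is unbalanced; this example is balanced.

```r
# Hypothetical balanced 2 x 3 design with 5 replicates per cell.
set.seed(42)
dat <- expand.grid(rep = 1:5,
                   a = c("control", "treated"),
                   b = c("low", "mid", "high"))
dat$a <- factor(dat$a)
dat$b <- factor(dat$b)

# Simulated response: "treated" shifts the mean by 2; b has no real effect.
dat$Y <- rnorm(nrow(dat), mean = 10) + ifelse(dat$a == "treated", 2, 0)

# a * b expands to a + b + a:b (both main effects plus the interaction).
fit <- lm(Y ~ a * b, data = dat)

# One row each for a, b, a:b, and the residuals.
tab <- anova(fit)
print(tab)
```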
A few other useful functions:
read.table and read.csv for reading spreadsheets saved as delimited text (e.g., CSV)
write.table and write.csv for saving them
factor makes a vector of characters into a factor that you can use for lm
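A small round trip through the functions listed above might look like this. The file name, column names, and values are hypothetical, and the file goes into a temporary directory so nothing on disk is overwritten.

```r
# Hypothetical file and columns for illustration.
path <- file.path(tempdir(), "example_data.csv")
original <- data.frame(group = c("x", "y", "x", "y"),
                       score = c(1.5, 2.0, 2.5, 3.0))

# row.names = FALSE keeps R's row numbers out of the file.
write.csv(original, path, row.names = FALSE)

# Read it back. In R >= 4.0 the group column arrives as character;
# older versions convert strings to factors by default.
loaded <- read.csv(path)

# Make the grouping column a factor so lm treats it as categorical.
loaded$group <- factor(loaded$group)
print(levels(loaded$group))
str(loaded)
```

Calling factor() explicitly is harmless if the column is already a factor, so the same lines should work across R versions.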
Here's a potentially helpful book. Previously there was an excellent FPP about R.
posted by en forme de poire at 11:35 PM on May 22, 2010 [1 favorite]
This thread is closed to new comments.
I believe, but am not certain, that it also does ANOVA.
Further I know scientists and statisticians don't like Excel's statistical analysis tools for reasons that seem fairly abstruse to me.
Short answer: Excel will do what you want.
Long answer: There's probably something better out there for your uses. Have you experimented with Wolfram Alpha's web site?
posted by dfriedman at 10:16 AM on May 21, 2010