Statistics for softies
January 4, 2018 6:07 PM
I want to learn about statistics but have no idea where to start, and am very easily overwhelmed.
I've worked as a research assistant in a neuroimaging lab for the past year and a half. From early on it's been painfully obvious that I've never taken a course in statistics and it HURTS. I found the perfect course, but now it's most likely going to be cancelled due to low enrollment. Can you help me figure out how to approach this within my very picky parameters?
My background:
I studied computer science in undergrad, hated it, graduated, and ended up doing web development. I now do some scripting and programming at my job but nothing that feels very deep. The only classes I've taken in the past 8 years have been art classes.
The course I wanted to take was perfect in every way: One night a week for 3 hours, free for me to take using my employment benefits, and seemed to check every box I want for a stats course.
Here is what I think I want:
I want to do things in R. I don't care about theory or proofs or solving equations or doing things with formulas (unless I really need to in order to understand things - do I?) I want to know what various things mean and why I want to do them and what it means when I do them to data. I'd like to know what to do to answer questions about data that I have. I'd like to get advanced - I hear a lot of stats-related stuff at work and don't have any sense of what is at what level. I'd like to be able to do my own analyses and know why I am doing them and what they mean, and understand other people's analyses and why they did them and what they mean. Ideally I'd like to know what to do with all this data I have access to, to be able to ask questions and answer them with statistics.
Problems:
The community college is inconveniently timed, inconvenient to get to and would cost me money so I'm not going to do that. I'm not sure I'd be consistently self-motivated enough for a regular online course. I might want a book to work through but I've never exactly been thrilled to do heavy textbook reading. I get stressed out really easily at not knowing things or feeling "behind" at things. I'm in tears writing this question because I am overwhelmed and not feeling good about my ability to learn things (even signing up for the original course was a big step for me).
Questions:
What do I do? Do I want a particular book? an online course? which one? What do I need to know? Budget is, say, $500. I'm really more limited by my time, overwhelmed-ness, and the potential to get distracted by other things.
Thanks in advance.
I want to do things in R.
I don't care about theory or proofs or solving equations or doing things with formulas (unless I really need to in order to understand things - do I?)
If you want to know whether you're doing the correct things in R and have any hope of detecting errors, yes, you do.
I want to know what various things mean and why I want to do them and what it means when I do them to data. I'd like to know what to do to answer questions about data that I have.
And these goals definitely require theory and doing things with formulas (you can skip the proofs I guess but the formulas won't make much sense if you do).
Cartoon Guide To Statistics is a decent resource.
posted by PMdixon at 6:52 PM on January 4, 2018 [5 favorites]
I'd recommend Introductory Econometrics: A Modern Approach (Wooldridge), if you don't mind that most of the examples are social science related. I like it because it 1) starts from square one, 2) explains things in English instead of math proofs and linear algebra, 3) walks through a lot of examples with real-world data, 4) provides that data free online in files for R, Stata, Excel, etc., and 5) covers all the bases including cross sections, time series, and panel data.
However, you might have better luck with a series of lectures on Udemy, Coursera, or some of the MIT courses (sorry I don't have any specific recommendations). For me, statistics doesn't always "click" naturally especially when studying stuff the first time around, and having an actual lecture to follow and visualize things helps.
posted by hexaflexagon at 7:15 PM on January 4, 2018
Response by poster: I'm less concerned with learning R than I am with learning statistics. I want to understand what functions in R do but actually knowing R stuff is secondary.
Do I need to learn linear algebra to do what I want to do?
posted by ghostbikes at 7:36 PM on January 4, 2018
Unfortunately, I don't think I can offer help in any immediate way, but I can try to help you whittle down what you want to something you can more easily find yourself.
If you end up looking for a book, I'll mostly recommend the Wonnacott brothers' Introductory Statistics, unless your field is dominated by Bayesians. Mostly because it (mostly) follows what I think is a pedagogically useful pathway. They start with basic probability and use it to build up to what a distribution is, where it comes from, and how it works. They use distributions to get to sampling, and use sampling to get to confidence intervals, at which point you're reasoning statistically about the real world. From there, they get to measures of association and regression in a context where you're always assumed to be using sample data. Anyway, the book builds concepts in a way that seems useful to me for someone who intends to do statistical inference about the real world(s).
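To make the sampling-to-confidence-intervals step concrete, here is a small simulation sketch. It is written in Python with only the standard library (the same experiment is easy to repeat in R), and everything in it is an assumption chosen for illustration: the population mean and standard deviation, the sample size, and the use of the normal-approximation multiplier 1.96. It draws many samples from a known population, builds a 95% interval from each, and counts how often the interval contains the true mean.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(0)  # fixed seed so reruns use the same draws

# Hypothetical population, chosen only for illustration.
TRUE_MEAN, TRUE_SD = 100.0, 15.0
N, REPS = 50, 2000  # sample size and number of repeated samples

covered = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m = fmean(sample)
    se = stdev(sample) / sqrt(N)  # estimated standard error of the mean
    # 1.96 is the normal-approximation multiplier for a 95% interval.
    if m - 1.96 * se <= TRUE_MEAN <= m + 1.96 * se:
        covered += 1

coverage = covered / REPS
print(f"{coverage:.3f} of the intervals contain the true mean")
```

The share of intervals that capture the true mean should land near 0.95, which is what "95% confidence" refers to: a property of the procedure across repeated samples, not of any single interval.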
I want to do things in R.
The usual discipline-agnostic advice I give is that you should use and learn whatever the smart young things in your field (or what you think your field will be) are using.
Likewise, you should get comfortable with the techniques used in your field or future field. Most obviously, there is a clear divide between experimentally-oriented fields and regression-oriented fields, though of course both kinds of field include a little bit of the other.
I don't care about theory or proofs or solving equations or doing things with formulas (unless I really need to in order to understand things - do I?)
I will maybe differ a bit from PMdixon here and say that there's no particular reason you should need to go through high-level formal proofs. Like, you can get by perfectly well without ever having slogged through a proof of the central limit theorem.
BUT you should get comfortable working with formulas (or, more accurately, algorithms), be able to manipulate them, and be able to see relationships between them. The usual example I offer is that you might see a formula for a difference of means test and a different formula for a difference of proportions test, but a difference of proportions test IS a difference of means test and you should be able to see how and why. If you're in a regression-oriented field, you should become comfortable with the manipulations required to understand contrasts (or sets of dummies) and interactive effects.
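A quick numerical check of the proportions-as-means point, sketched in Python with only the standard library (the arithmetic carries over to R unchanged). The data are made up, and the sketch assumes the simple unpooled z statistic; textbook tests often pool the two groups or use the n - 1 variance, which changes the numbers slightly but not the idea. Code each outcome as 0 or 1: the mean of the 0/1 values is the proportion, and their variance (dividing by n) is p(1 - p), so the two formulas are the same formula.

```python
from math import sqrt
from statistics import fmean, pvariance

# Hypothetical 0/1 outcomes (1 = "success") for two groups; made-up data.
group_a = [1] * 30 + [0] * 20  # 30 successes out of 50
group_b = [1] * 18 + [0] * 22  # 18 successes out of 40

def z_from_proportions(a, b):
    """Unpooled z statistic written with the proportions formula."""
    p_a, p_b = sum(a) / len(a), sum(b) / len(b)
    se = sqrt(p_a * (1 - p_a) / len(a) + p_b * (1 - p_b) / len(b))
    return (p_a - p_b) / se

def z_from_means(a, b):
    """Same statistic written with the means formula on the 0/1 data."""
    # pvariance divides by n, which for 0/1 data equals p * (1 - p).
    se = sqrt(pvariance(a) / len(a) + pvariance(b) / len(b))
    return (fmean(a) - fmean(b)) / se

z_prop = z_from_proportions(group_a, group_b)
z_mean = z_from_means(group_a, group_b)
print(z_prop, z_mean)  # the two values agree up to floating-point rounding
```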
I want to know what various things mean and why I want to do them and what it means when I do them to data.
What they cover would vary from field to field, but this would normally be a two-semester sequence of courses.
I'd like to be able to do my own analyses and know why I am doing them and what they mean, and understand other people's analyses and why they did them and what they mean.
Just to note that this involves two very different things -- the statistical techniques used in your field or from related fields that could be brought to bear, but also research design. Research design is very much its own beast with its own joys and infuriating problems.
Do I need to learn linear algebra to do what I want to do?
It's helpful but not required. You certainly wouldn't ever need to do the equivalent of passing a course in linear algebra but lots of things in statistics are simpler in a matrix world. Believe me when I tell you that getting enough to muddle through as a practitioner is not that bad.
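As one hedged illustration of the "matrix world" remark: the sketch below fits a one-predictor regression twice, in Python with only the standard library and made-up data. The first fit uses the usual scalar slope and intercept formulas; the second solves the normal equations (X'X) b = X'y, where X is assumed to have a column of ones and a column of x values. The scalar formulas are specific to one predictor, while the matrix form stays the same however many predictors are added.

```python
from statistics import fmean

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Scalar route: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar.
x_bar, y_bar = fmean(x), fmean(y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Matrix route: with X = [1, x], X'X = [[n, sum_x], [sum_x, sum_xx]]
# and X'y = [sum_y, sum_xy]. Solve the 2x2 system with the explicit inverse.
sum_x = sum(x)
sum_xx = sum(xi * xi for xi in x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
det = n * sum_xx - sum_x * sum_x
b0 = (sum_xx * sum_y - sum_x * sum_xy) / det  # intercept
b1 = (n * sum_xy - sum_x * sum_y) / det       # slope

print(intercept, slope)
print(b0, b1)  # matches the scalar route up to floating-point rounding
```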
posted by GCU Sweet and Full of Grace at 8:08 PM on January 4, 2018 [5 favorites]
My stats prof gave me a copy of Statistics for the Terrified, and I found it very helpful. I still refer to it every once in a while.
posted by The Underpants Monster at 12:45 AM on January 5, 2018 [1 favorite]
I'm bad at math. The only reason I learned the foundation of statistics in college was Statistics for Math Haters by Elijah Lovejoy. Sometimes Amazon has used copies.
In high school and college math courses the primary purpose of teachers and authors seemed to be to spread the students' grades on a bell-shaped curve. If you weren't near the top of the curve, then math obviously wasn't for you. Statistics for Math Haters seemingly was written from the belief that almost anyone could understand stats if that was the primary teaching objective. My statistical foundation served me well in my Ph.D. program and first career as a Research Analyst with considerable data analysis duties.
posted by Homer42 at 1:50 AM on January 5, 2018 [1 favorite]
There are some Future Learn courses that might be helpful: Data to Insight: An Introduction to Data Analysis; Introduction to R for Data Science; Health Data and Analytics. All are free. Disclaimer: I've done several FL courses but not these. I can't really speak to the issue of your motivation for online courses, but I will say that the FL courses I have done have been engaging and helpfully set out in small blocks, which helps with motivation.
posted by paduasoy at 4:07 AM on January 5, 2018 [1 favorite]
Search Coursera for free, online statistics courses. I did an advanced stats course run by Johns Hopkins and found the course to be very helpful and relatively enjoyable.
posted by emd3737 at 6:40 AM on January 5, 2018
Based on your interests, I think you might be better served and maybe less overwhelmed by an applied research methods course rather than a straight Statistics 101 type course. A good research methods course is going to spend a lot more time on understanding how to use (often somewhat basic) statistics to answer questions. I haven't taken this series of courses, but maybe something like this Methods and Statistics in Social Sciences Specialization would be a good fit? The subhead says the goal is to "Critically Analyze Research and Results Using R. Learn to recognize sloppy science, perform solid research and do appropriate data analysis."
posted by mjcon at 8:03 AM on January 5, 2018
Tangentially, for linear algebra you may want to check 3Blue1Brown's YouTube channel for a nice introduction/refresher. (h/t kliuless for posting this series)
posted by Dr. Twist at 8:30 AM on January 5, 2018
Since you are focused on learning to use R to work with stats, it might be worth it to use something like Datacamp. I'm chugging through the Datacamp courses right now, and they are really good, though I haven't taken any of the stats courses specifically. They are offering a 50%-off-for-the-year deal right now, so it might be worth the 12 bucks a month just for the R-specific training.
posted by rockindata at 4:26 AM on January 6, 2018
I also did the Data Science course through Coursera a couple of years ago. I found the stats classes not particularly well taught but the rest of the series was a good grounding in R.
posted by mcduff at 6:39 PM on January 4, 2018 [1 favorite]