Best online venue for taking a free or cheap course in R
April 23, 2020 12:59 PM

I've had some casual exposure to R. I would like to become sufficiently proficient that I could, say, install R and packages, import a data set, run some statistical models, and make some graphs. I'd like to do a course online. I'd like something free or quite cheap. I do not require any sort of certification or grade etc. What's the best source/site for such a course? I'm open to everything, including MOOCs etc. I don't think I really want to cobble something together by browsing YouTube.
posted by If only I had a penguin... to Education (13 answers total) 32 users marked this as a favorite
The book R for Data Science, by Hadley Wickham, is available free online. The text mostly consists of code examples with explanations. The idea is to copy-and-paste the code into your own RStudio as you go, so it's fairly interactive. This is not a video course of the kind you would get from a MOOC, but the material is exactly what you want.

Hadley Wickham is a very influential R programmer who advocates a certain programming style and set of tools. These are what the book teaches. They are quite popular, for good reason, and many would argue they are the best place for a beginner to start, but there are other programming styles, so it's worth being aware of that.
posted by vogon_poet at 1:06 PM on April 23, 2020 [10 favorites]

I suppose it depends on what you consider "cheap."

Data Camp is roughly $25 a year now.

Coursera is roughly $50 per course per month.

Both of those are cheap to me and the certifications look impressive.
posted by Young Kullervo at 1:11 PM on April 23, 2020

I took a Johns Hopkins MOOC years ago through Coursera that was a great intro to the language.
They have since transformed it into a whole sequence of classes:
It looks like it is still free as long as you don't need a certificate of completion. (see the last question in their FAQ)
posted by stobor at 1:19 PM on April 23, 2020

Pluralsight's content has been good (in areas I am familiar with); do they have anything for R? I know that they're free for anyone in April.

(ObDisc: No beneficial relationship, just a happy customer.)
posted by wenestvedt at 1:38 PM on April 23, 2020 [1 favorite]

I teach R to my biology and environmental science students. Here are a couple of free resources that I have found useful.
1) Passion Driven Statistics is a free e-book to teach statistics. She includes information for several different statistical languages/programs. The R information is outstanding. In addition to the print chapters which include great templates for running stats and making graphs, each chapter also has videos (one for each language/program) to show you how to actually do it. My students, who really prefer videos for everything, really like this one. It includes a great intro to ggplot2, the most commonly used graphing package.
2) Swirl is a package within R to teach you basic R programming. You have a lot of choices about the different tools to learn. The last time I checked, it didn't include ggplot2, which is too bad, but the statistical and just general R basics are great, and it's nice to have your code checked as you go along.
posted by hydropsyche at 1:53 PM on April 23, 2020 [6 favorites]

Unless you really love video lectures, you're unlikely to find a paid online course better than the Hadley Wickham book vogon_poet linked. Make sure you do (or at least attempt) all the exercises! (If you're stumped on any of them, someone has put all the solutions here.)
posted by theodolite at 1:53 PM on April 23, 2020

Seconding R For Data Science! Even if you never use tidyverse again, it's a good way to practice thinking about how data is organized and presented.

I've used Data Camp for occasional lessons and liked it just fine. But yeah, start with Hadley, and then just go wild playing with datasets.
posted by kalimac at 1:54 PM on April 23, 2020 [1 favorite]

Go here.

Sign up for a free Visual Studio dev account.
Then you can get 2 months of free datacamp membership.

That should be plenty of time to decide if it's the right route for you.
posted by Just this guy, y'know at 3:15 PM on April 23, 2020

I am a statistician by profession and have been an R user for 12 years, so since the pre-tidyverse days. I’m a bit of a purist and not a huge fan of the tidyverse. It has its uses but I think it is helpful to understand R from the ground up as well. Tidyverse as a tool risks making it easy to do data science without understanding the models you might be using, so my answers come from that angle.

That said the bit of the tidyverse I do like is ggplot for statistical graphics which I think is quite brilliant.

For import/export and data manipulation in base R, as well as the first steps in statistical inference, I love Statistics: An Introduction Using R and its bigger sister The R Book. They are a bit older now, but will hold up, as R is (mostly) backwards-compatible.

For more advanced statistical techniques along the lines of “run some statistical models” I like An Introduction to Statistical Learning with Applications in R, which also has a big sister, The Elements of Statistical Learning. The former has an associated, quite brilliant and funny MOOC which I am always recommending: Statistical Learning.

For getting to grips with contemporary R (tidyverse and ggplot2) you could do worse than R in 24 hours. It was coauthored by the team at Mango Solutions (they don’t seem to offer online trainings yet but they probably will soon enough) and is quite good for the programming side of things if you want to get started quickly. It was written in 2015 though and the tidyverse has moved fast in the interim so parts of it might be a bit out of date.
posted by Erinaceus europaeus at 7:56 PM on April 23, 2020 [1 favorite]

Response by poster: hmm...ok, so I think I might use a mix of hadley and datacamp. I looked at the first little bit of both. Also, based on my previous exposure you have to install various packages to run models, right? I need to learn to do that, too.

But....both seem to start post-installation and post-data opening. Also, both seem to be "tidyverse" based and I don't even know what that is. My exposure to R was via something that was like a launcher app and could also launch Python and Jupyter (?) notebooks and some other things. Is that (whatever it was) no longer a thing?

My statistical knowledge is intermediate at least (multiple graduate level courses and years of experience in use), so I'm really looking to learn R more than to learn stats.
posted by If only I had a penguin... at 8:48 PM on April 23, 2020

R is an open source statistical programming language consisting of base R (the “core” of it) and then a series of packages, some of them developed by the R Core Team and others developed by R users. A collection of R packages (available from a repository known as CRAN) are easy to install from within R and go through a series of rudimentary quality checks before being made available this way. These days package authors also put R packages on GitHub, but these do not go through CRAN’s quality checking procedures, making them rather more caveat emptor.

The tidyverse is a growing collection of interrelated packages for doing data science operations, started by Hadley Wickham. Tidyverse packages play nicely with each other, so data imported and cleaned using tidyverse operations end up formatted in such a way that the plotting functions can be called without further data wrangling. In recent years, while Python’s data analysis functions have become more advanced, R has gone the other way and become better and better at data manipulation tasks, thanks to the tidyverse.
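
To make "play nicely with each other" a bit more concrete, here is a minimal sketch, assuming the dplyr and ggplot2 packages are already installed (it only prints a message if they are not). The data set (mtcars, which ships with R), the filter threshold and the column names are illustrative choices, not anything from the books or courses mentioned in this thread.

```r
# Only run the tidyverse part if both packages are available.
if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(dplyr)
  library(ggplot2)

  # Clean and summarise with dplyr: each step feeds the next through %>%.
  by_cyl <- mtcars %>%
    filter(hp > 60) %>%
    group_by(cyl) %>%
    summarise(mean_mpg = mean(mpg), n = n())

  # The summarised table goes straight into ggplot2 with no reshaping.
  p <- ggplot(by_cyl, aes(x = factor(cyl), y = mean_mpg)) +
    geom_col() +
    labs(x = "Cylinders", y = "Mean miles per gallon")
  print(p)
} else {
  message("Install dplyr and ggplot2 first to run this example.")
}
```

The same summary can be done in base R (for example with aggregate()), so this is a matter of style rather than capability.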

It sounds as if you have previously used R via something like an IDE (with notebooks and so on). R itself comes with its own GUI-type interface. These days a popular interface for using R (which might have been the one you used?) is RStudio, which also lets you interface to Python, Jupyter and so on as well as to R if you so choose. It comes with a package manager to make downloading, installing and loading (and writing) R packages easier.

Even if you know statistics, the Crawley book or Statistical Learning are good for learning the R language for statistical operations. From memory, the introduction covers installing R, packages, simple data import/export, how to fit and check models and so on, even if you don’t need instructing in what the models actually do. The MOOC I mentioned uses RStudio. Some of the R syntax can be idiosyncratic; it’s much less of an intuitive programming language than Python, and the tidyverse aims to improve this.
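
For a feel of what that workflow (import/export, fitting and checking a model) looks like in base R, here is a rough sketch that needs no extra packages. The data set (mtcars, which ships with R), the temporary file paths and the model formula are illustrative assumptions, not examples taken from either book.

```r
# Base R only: no extra packages are needed for this sketch.
# mtcars is a small example data set that ships with R.

# Export the data to a temporary CSV file, then import it again.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)
cars <- read.csv(csv_path)

# Inspect what was imported.
str(cars)
summary(cars$mpg)

# Fit a linear model: fuel economy as a function of weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = cars)
print(summary(fit))

# Check the model with the standard diagnostic plots, written to a PDF.
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)
par(mfrow = c(2, 2))  # arrange the four diagnostic plots on one page
plot(fit)
dev.off()
```

In RStudio you would normally skip the pdf()/dev.off() pair and let the plots appear in the Plots pane.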
posted by Erinaceus europaeus at 9:53 PM on April 23, 2020 [4 favorites]

Erinaceus europaeus has a fantastic explanation*. You may have come to the IDE from the opposite end -- I think Jupyter notebooks have the ability to run R. (They do in SageMaker, at least.) I vastly prefer RStudio.

To get up to speed for the Wickham book, download and install R and RStudio; instructions should be on their webpages, and it's quite easy to do. Open RStudio; you'll see a screen with 3 frames. Go to File -> New File -> R Script, and a fourth frame/panel will appear in the upper left-hand quadrant. This is where you'll do the work. He touches on how to install and call packages, but it's really quite easy. To install, go to the Console (lower left quadrant) and type:

install.packages('NAME OF PACKAGE')

(Helpful little suggestions will pop up as you type the name of the function, in this case install.packages.)

Once that's done, you need to load the package with:

library(NAME OF PACKAGE) [note the lack of quotes!]

I usually do this in the notebook or script, since you need to load the library every time you reopen R.
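
One way to put both steps at the top of a script, so it installs only when needed and loads every time, might look like the sketch below. The helper name use_package is made up for this example, and the CRAN mirror URL is just one common choice.

```r
# Hypothetical helper: install a package only if it is missing, then load it.
use_package <- function(pkg) {
  # Install once per machine; skipped when the package is already present.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = "https://cloud.r-project.org")
  }
  # Load once per session. library() normally takes an unquoted name;
  # character.only = TRUE lets it accept the name held in a variable.
  library(pkg, character.only = TRUE)
}

# "tools" ships with R, so this call never downloads anything.
# Swap in the package you actually want, e.g. use_package("ggplot2").
use_package("tools")
```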

Really, I'd go with what Erinaceus europaeus suggests, but that should get you started on the basics. I've taught very basic R to some (non-programmer) co-workers, so please let me know if you want a quick walk-through or something, it's no trouble to hop on Zoom or take a bunch of screenshots or something.

*Incidentally, my stats knowledge eternally needs some help so I'm probably going to do the Statistical Learning MOOC, so thank you for that!
posted by kalimac at 7:29 AM on April 24, 2020 [2 favorites]

Response by poster: thanks all...RStudio is definitely what I used before.
posted by If only I had a penguin... at 3:49 PM on April 24, 2020
