What is the best way to learn R?
June 20, 2012 8:17 PM

What is the best way for me to learn R? In particular, what are the best websites or online tutorials for learning to deal with large datasets?

I have a job where I will be expected to manipulate large amounts of data. I'm beginning to get sick of Excel and I would love to start to make the jump to R.

I was wondering if there are any good tutorials or websites that deal with this? Especially sites with data that I can directly use. I have a fair amount of programming experience and I'm familiar with the use of statistics.
posted by aleatorictelevision to Computers & Internet (13 answers total) 84 users marked this as a favorite
It doesn't have sample datasets, but the Quick-R website is detailed and comprehensive without excessive hand-holding. The author covers both data manipulation and statistical analysis.
posted by Nomyte at 8:23 PM on June 20, 2012

Here are some actual R Manuals.

Alternatively, pandas is a Python library that offers the core functions of R; if you want more, there's a Python package that lets you use R packages directly from Python. If you have programming experience, Python may be more familiar and approachable than R on the whole.

As far as data sets go, always start with the iris dataset. The other data sets on that site will do the trick as far as real-world complexity goes, though most of them aren't large, so if you just want to check on performance at scale you'll have to look elsewhere.

For keeping up to date on data and what people are doing with it, Mike Bostock's Twitter is a decent starting point. He developed d3, a data-oriented JavaScript library.

Any info about what kind of data you'll be manipulating?
posted by 23 at 8:30 PM on June 20, 2012 [1 favorite]
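
For anyone following the iris suggestion above from inside R: a minimal sketch, assuming the copy of iris bundled with base R is an acceptable stand-in for the one on the linked site. It uses only base R functions; the variable name species_means is just for illustration.

```r
# Load the iris data frame that ships with base R (datasets package)
data(iris)

# Inspect the structure and the first few rows
str(iris)
head(iris)

# Summary statistics for every column
summary(iris)

# Mean sepal length per species, using base R's aggregate()
species_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(species_means)
```

The same three steps (load, inspect, aggregate by group) carry over to larger data frames.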

I recommend the R Cookbook as very useful. You will also probably want this cheat sheet someplace handy.

I strongly recommend using RStudio as an environment; plain R is a little grim.

As always, the best way to learn is to try doing in R something you already know how to do elsewhere. So take one of your complicated spreadsheets and try to make a go of it.

One quirky thing about R is that it is all about the packages. Someone else has already done 90% of what you need to do. If you are going into time series check out xts. For some improved date and string functions check out lubridate and stringr. For nice graphics ggplot2. To slice and dice your data, plyr.

I keep an eye on r-bloggers, which combines a bunch of different R blogs.
posted by shothotbot at 8:32 PM on June 20, 2012 [2 favorites]

Response by poster: I'm going to be working with time series data, actually the data from a sensor that is at the mouth of a bay.

Not quite sure about much beyond that, but I know I will have to somehow filter out the periodic tidal data.
posted by aleatorictelevision at 8:40 PM on June 20, 2012

R has a data class (ts) for time series and does really well with adjusting periodic data. Here's a tutorial.
posted by one_bean at 8:53 PM on June 20, 2012
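
To make the ts suggestion concrete, here is a rough sketch on made-up data, not the sensor data from the question. It assumes hourly readings and a tide simplified to an exact 12-hour cycle (the real principal lunar tide has a period of about 12.42 hours, which does not fit the integer frequency that decompose() expects), so treat it as an illustration of the ts class rather than a tidal analysis method. All variable names are hypothetical.

```r
set.seed(1)

# Synthetic signal: 30 days of hourly readings (illustrative only)
n     <- 24 * 30
hours <- seq_len(n)
trend <- 0.002 * hours                    # slow drift
tide  <- 1.5 * sin(2 * pi * hours / 12)   # simplified 12-hour cycle
noise <- rnorm(n, sd = 0.2)

# Wrap the values in a ts object with 12 observations per cycle
level <- ts(trend + tide + noise, frequency = 12)

# Classical additive decomposition into trend, seasonal and random parts
parts <- decompose(level)

# Subtract the estimated periodic component to get a "de-tided" series
detided <- level - parts$seasonal

# Compare the spread before and after removing the cycle
print(c(before = sd(level), after = sd(detided)))
```

Calling plot(parts) afterwards shows the separated components. For a real tidal record with several non-integer periods, a harmonic fit would likely be a better match than this seasonal decomposition.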

You sure you can't use Matlab? There is a plethora of Matlab tools for time series analysis of marine data, which may or may not be what you are describing. T_tide is exactly what you want to use for filtering out the periodic tides.
posted by oceanjesse at 9:19 PM on June 20, 2012

Time Series Analysis and Its Applications: With R Examples.

(Not a personal recommendation - I haven't read it - but seems relevant.)
posted by 23 at 9:21 PM on June 20, 2012

Here is a big list of time series packages, but you should also poke around in the econometrics section.
posted by shothotbot at 10:19 PM on June 20, 2012

I do love R for some purposes, but if you're making the jump from Excel, I would highly recommend something simpler like Stata, which has time series applications built into the program and in general is just much more user friendly. Of course, it is not free like R, but it's not insanely expensive as a business expense.
posted by rainbowbrite at 7:04 AM on June 21, 2012

I wrote an intro to R class (for biologists) that is pretty basic...

You may also look around your local area; there are quite a few R users groups around now. Sometimes it's nice to have other people to ask in person or commiserate with. Many are listed here.
posted by mgogol at 7:22 AM on June 21, 2012 [1 favorite]

This question is nominally about R courses in NYC, but the answers there have several online and general reference links that I've found helpful.
posted by Signed Sealed Delivered at 7:24 AM on June 21, 2012

This is a great starting point and very up-to-date. Seconding the advice about ts.
posted by cromagnon at 12:30 PM on June 21, 2012

I program in R for a living (as an academic/biologist), work with a lot of time-series data, and also develop and maintain various R packages.

Although I generally do not recommend programming books (since the web has tons of tutorials and material does get dated fast), in your case I suggest you grab a copy of The Art of R Programming. The book is great (Norm is not a friend but a professional colleague) and well suited for someone with prior programming experience. If you're too poor or too cheap to buy a copy, here is an earlier draft of the book from his website. I would not recommend reading through older tutorials or books (published 2008 or earlier) since so much has changed and old methods have been vastly improved on (especially for time-series and 'big data'). For example, there is a book by Michael Crawley which was great when it first came out but I cringe every time I see someone holding one.

Explore the CRAN task views. Think of it as a curated app list of sorts. Here's one for time-series analysis.

When looking at the CRAN page for a package (example), look at the reverse depends list to get a sense of newer packages that might have functionality that you're looking for. That's one easy way to discover new packages of interest.

If you have an RSS reader, then grab the feed for r-bloggers. It's an aggregator for R-related blog posts from all kinds of people (academics like me, industry people, etc.). Although I mostly skim through posts, there are plenty of amazing tutorials, new package announcements, etc. that are extremely useful.

Once you're up and running and you have unanswered questions, head to Stack Overflow and browse through the R tag. It's a treasure trove of high quality answers and as a beginner, you will most likely find what you're looking for. If not, feel free to ask a question (but be sure to provide a clear, reproducible example).

Not sure where you live but look at meetup.com for an R users group. I live in the Bay Area and we have a large one (200+ people). The meetings are usually tutorials or talks on new and interesting stuff in R.

Finally, here is a shameless self-link for a talk I gave to people in my department who were new to learning R. There might be something useful for you in there.

The best way is just to dive into R after a brief introduction to the basics. The learning curve is steep for some (maybe not so much for you since you have prior exposure to programming) but very rewarding.
posted by special-k at 10:06 AM on June 26, 2012 [3 favorites]
