December 12, 2013 8:35 AM

I'm a daily MATLAB user for data analysis, and fairly fluent with most toolboxes, including Parallel Computing. I know I need to learn something new, though.* (MATLAB is great for prototyping but unwieldy for real data-crunching.) I'm taking a class (Bayesian stat methods) starting in January based around **R**. What's the best resource to get started with R for someone like me?

That is, someone who:

- has a background in scientific computing/MATLAB programming
- has some background knowledge in statistics (basic stuff plus machine learning methods), but not a ton of theory (other than a course in probability)
- doesn't know other programming languages other than a smattering of Python (i.e., I mostly "think in MATLAB")
- would like to *eventually* be able to do fairly fancy, high-throughput machine learning in R instead of MATLAB
- uses Macs exclusively but has access to a large Linux cluster (longshot that this would matter/be helpful)

* This sidesteps the question of whether R is the best thing to eventually land on for my goals. I don't think it is; Python (+numpy/scipy/nltk/scikit-learn) or maybe Julia would likely be better, but I **need** R for this class. I will probably be asking Metafilter this question soon.

Thanks!
posted by supercres to Computers & Internet (14 answers total) 23 users marked this as a favorite

The Art of R Programming is currently sitting on my shelf, and was recommended to me as the best introduction to R by a couple different hard-core scientific R users (many of whom also started with MATLAB) in my office.

posted by rockindata at 9:05 AM on December 12, 2013

Oh, so my main advice is to just jump in and start trying to do problems in R. Allow twice as much time as you would in MATLAB for the first few projects.

posted by shothotbot at 9:06 AM on December 12, 2013 [1 favorite]

You're going to love R. Syntax is very similar to MATLAB and you should be able to do most of what you're used to with some combination of packages and muddling through. Seconding shothotbot, books might be somewhat useful, but the main thing you should do is just jump in.

posted by downing street memo at 9:11 AM on December 12, 2013

Definitely use RStudio.

Quick-R is a good place to find the basics. For example, here is a description of data types.

I use rseek when I'm looking for something specific--it filters out a lot of unrelated stuff that slips through google.

posted by esoterrica at 9:12 AM on December 12, 2013

R doesn't have a lot of fancy abstractions or types that you need to learn about, and if you're fluent in Matlab you should pick R up just by using it and googling around a little. If you're taking a class then I wouldn't worry too much about preparing in advance. The instructor or other students should be able to direct you to the right packages to use. The hard part is knowing the statistics behind what you're doing.

Most R functions are very flexible and let you do much more than the "default" analysis, so I find it helpful to look at the documentation for a function even if I'm only using it in the "default" way or I already have example syntax.

The biggest day-to-day annoyance is that the syntax for getting, say, the (i,j)th entry of a matrix is different. R uses square brackets where Matlab uses parentheses. Minor, but this is by far the biggest issue I have when I use both languages on the same day.
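For anyone following along, here is a minimal sketch of the bracket difference. The matrix `m` and its values are made up for illustration, and the MATLAB equivalents sit in the comments.

```r
# Hypothetical 2x3 matrix; like MATLAB, R fills it column by column.
m <- matrix(1:6, nrow = 2, ncol = 3)

entry     <- m[2, 3]   # row 2, column 3    (MATLAB: m(2,3))
first_row <- m[1, ]    # whole first row    (MATLAB: m(1,:))
second_col <- m[, 2]   # whole second column (MATLAB: m(:,2))

# Parentheses are for function calls in R, so m(2, 3) would be read as
# a call to a function named m rather than as indexing.
print(entry)
```

Leaving an index blank in R plays the role of the bare colon in MATLAB.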

There are also a ton of resources for parallel computing in R. This seems to be an exhaustive listing but you're probably better off finding a tutorial. It's not something I've messed around with much so I don't have a specific recommendation, but I can ask around if you like.

FWIW, most people I know doing large-scale data analysis are using Matlab (more statistics-y) or Python (more machine learning-y) because R code is just so slow and the tools for parallel computing have lagged (but are catching up now).

posted by matildatakesovertheworld at 9:16 AM on December 12, 2013 [1 favorite]

Honestly, the hardest thing about making this switch might be R's badly inconsistent naming conventions.

There's a reason why the resources you can find are mostly function reference cheat sheets. For the most part you can "think in MATLAB" and type in R and it will work okay. The conceptual stuff that some R newcomers trip over (e.g. "What's a vectorized function?") won't be difficult for you. A lot of the idioms translate over just fine. But you will forever be scratching your head trying to remember whether it's `as.matrix(...)` or `asMatrix(...)` or `as.Matrix(...)` or what.

Autocomplete in RStudio will be helpful for this too, but doesn't completely solve the problem as you still need to know the first few characters of the function or argument name.
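To make the naming mess concrete, here is a rough sketch using only base R functions. The vector `x` is a made-up example. The last part shows `apropos()`, which searches by substring and so may help when you can't remember how a name starts.

```r
# Hypothetical named vector with one missing value.
x <- c(a = 1, b = NA, c = 3)

xm <- as.matrix(x)                  # words separated by dots
missing <- is.na(x)                 # dots again
means <- colMeans(xm, na.rm = TRUE) # camelCase
idx <- seq_len(3)                   # underscores
n <- nchar("abc")                   # words run together

# apropos() returns the names of visible objects matching a pattern,
# so a fragment from the middle of a name is enough.
hits <- apropos("matrix")
print(head(hits))
```

What `apropos()` returns depends on which packages are attached, so the list will differ from one session to the next.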

posted by Now there are two. There are two _______. at 9:21 AM on December 12, 2013

(Another thing you'll need to get used to is that MATLAB has a bunch of syntactic sugar for constructing vectors and matrices — all the stuff you do with square brackets and colons and semicolons — but R just has plain ol' functions here with normal function syntax: a concatenation function for making vectors, a sequence function for generating integer sequences, and so on. Again, that's not really a conceptual shift — just a different style of notation that you need to adjust to. But so a book isn't really going to help with that. You just need to cultivate different habits for reading and writing code.)
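A small sketch of what that looks like in practice, with the MATLAB spelling in the comments. The variable names are placeholders. One caveat to the above: R does keep the colon for simple integer ranges, so `1:5` works in both languages.

```r
v  <- c(1, 2, 3)                     # MATLAB: v = [1 2 3]
vv <- c(v, 4, 5)                     # MATLAB: [v 4 5]
s  <- 1:5                            # MATLAB: 1:5
ev <- seq(2, 10, by = 2)             # MATLAB: 2:2:10
m  <- rbind(c(1, 2), c(3, 4))        # MATLAB: [1 2; 3 4]
z  <- matrix(0, nrow = 2, ncol = 3)  # MATLAB: zeros(2,3)

print(m)
```

Here `c()` is the concatenation function and `seq()` the sequence function mentioned above; `rbind()` stacks vectors as rows, which is the closest thing to MATLAB's semicolon.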

posted by Now there are two. There are two _______. at 9:29 AM on December 12, 2013

This is what Metafilter is great for over, say, StackExchange. Really my question should have been, "What sort of shit is going to trip me up going from MATLAB to R?" Sometimes answering the question I should've asked is much more useful than the question I explicitly asked.

Sounds like jumping right in is the way to go. Thanks, all!

(Any other thoughts about resources geared in this direction would be greatly appreciated as well.)

posted by supercres at 9:37 AM on December 12, 2013

I posted a similar question previously and got a ton of helpful responses! Godspeed! :)

posted by Keter at 9:37 AM on December 12, 2013

2nd Quick-R

I found a lot of useful stuff at this blog (which I found through r-bloggers)

This is a site with a lot of helpful 2-minute videos

posted by Asparagus at 9:57 AM on December 12, 2013

I learned a lot by using the R Instructor app (there's a version for Android too). It's a very handy reference.

posted by dialetheia at 10:32 AM on December 12, 2013

Other people mentioned a bunch of useful things so I'll just add to check out R for MATLAB users, which is a cheatsheet geared towards people with your background.

posted by en forme de poire at 5:13 PM on December 12, 2013

I have to recommend The R Inferno, as there are a fair number of gotchas and odd design decisions in the R language that the uninitiated tend to get stuck on. That's got like 95% of the ones that have hit me over the years.

posted by Ms Vegetable at 8:17 PM on December 12, 2013

There is a great graphing package, ggplot2, which takes a lot to get your head around. I found this site to be good in trying to get ggplot to do more or less what I want.

Also, print out the r cheatsheet or maybe this one or that one.

Keep an eye on r-bloggers.

Needing a tool and being able to download it for free is a great part of R.

posted by shothotbot at 9:04 AM on December 12, 2013 [1 favorite]