January 13, 2014 7:22 PM

I've recently become interested in machine learning and want to know some good resources for a beginner!

My background is in CS. I took a very general AI course in college as well as numerical analysis. Though I'm a bit rusty in math at the moment, I wouldn't be afraid to brush up with the proper motivation. I have a fair amount of programming experience and I'm reasonably fluent in most common languages so I wouldn't be opposed to picking up a new one for the right platform. I'm aware of scikit-learn but I am otherwise not very familiar with scipy-related tools. What would you recommend as the ideal way for a beginner to start diving into the field?

I'm also curious how one would go from a more general programming background (web development, etc.) to working in machine learning. Is this a field of computing that is possible to enter via experience, or do entry-level positions require a specialized degree?

Thanks for your help!
posted by deathpanels to Computers & Internet (6 answers total) 24 users marked this as a favorite

I took the Coursera course that pompomtom linked above, and it worked for me. The Neural Networks for Machine Learning course also looks interesting. I'm not sure if there's any way of getting the course materials (videos and exercises) other than waiting for the next scheduled run of the course though.

posted by russm at 7:52 PM on January 13

Probability theory and basic linear algebra are necessary background if you actually want to *learn* ML and not just memorize a "cookbook" of techniques. Perhaps it's my bias having just finished a math-heavy ML course (used the Bishop text), but it really seems like it's worthwhile.

Knowing the Bayesian versus frequentist interpretations of algorithms, or recognizing the correct distribution to use in a situation, or understanding the component steps in an EM algorithm, is going to place you way ahead of marginal programmers with a little experience running libsvm, etc. Why? Because computational power is finite. If it weren't, ML would be easy. It seems to me like one needs to understand the process to know where corners can be cut, or how to solve a problem without using the "kitchen sink" approach.

Language-wise, my course was taught in Matlab, but now I'm teaching myself Julia: a dynamic language with Matlab-like syntax and the speed of C. Matlab is great for prototyping, but it's slow. Python is faster (and handles text much better), but from what I've seen so far, Julia blows them both out of the water.

posted by supercres at 9:10 PM on January 13 [2 favorites]

And if you want to see what's involved in a math-heavy ML course, let me know; I can send you some of my assignments.

posted by supercres at 9:15 PM on January 13

The Coursera course pompomtom linked above is good. The hard (math-heavy) version of that course is on YouTube. They're taught by the same professor, Andrew Ng. The Bishop text supercres mentioned is excellent.

I'm an academic, but my understanding is that data science type jobs aren't mature enough that they can ask for specific degrees yet. I see PhDs from some pretty diverse fields being hired for these types of jobs. (On the other hand, they have PhDs. I don't know what the qualifications are of people *without* advanced degrees, sorry.) Your best bet is probably to act like someone wanting to break into programming: work on some projects, maybe try some Kaggle competitions, and get some experience and something you can point to as proof that you know what you're doing.

I think lots of people (most?) are using Python right now. I use Python and R (especially for graphing) but I don't think I would recommend you learn R (it can be a major headache to learn if you already know some "real" languages). I hate Matlab with a fiery passion, but that's at least partly my own bias :)

posted by quaking fajita at 5:08 AM on January 14 [1 favorite]

Just for funsies, here's the pre-course self-test from my class (Dropbox PDF). Nothing too involved, but for a lot of people taking the course, it's stuff they hadn't seen in years.

posted by supercres at 8:26 AM on January 14 [1 favorite]

posted by pompomtom at 7:49 PM on January 13 [1 favorite]