Trapped in a local min.
November 2, 2012 3:11 PM
I sort of know Matlab, R, SPSS, and C. But how do I really learn Matlab, R, SPSS, and C?
I've been scripting math/stat things for years in the context of both work and school. I've been using Matlab and SPSS for about three years. I can duplicate in R most of what I can do in SPSS. I know enough C to do problems from K&R and mock up numerical algorithms for analysis courses. Basically, I know enough to meet my needs and can look up answers to most questions I have.
The problem is that my needs are very modest, and when I look for more in-depth learning materials, I run into my lack of area expertise.
For example, the MathWorks website has lots of free recorded webinars, but they're on topics like large-scale wind farm management, circuit simulation, fluid dynamics modeling, etc. Those things are far outside my area of expertise. I am not an engineer. I don't know what the problems are in these areas, and learning to program solutions to problems I don't understand seems backwards. (For the record, my niche is MRI, but we leave the writing of analysis software to engineers and computer scientists at NIH.)
Similarly, I am not a PhD statistician. I have 12 graduate credits' worth of statistics. I am familiar with a handful of basic statistical models and when they are best applied. This requires minimal mastery of SPSS or R. A statistician should be able to develop hybrid models to accommodate any sort of messy data from the real world, rather than complete, balanced research datasets.
I know enough C to do trivial problems and mock up simple numerical algorithms. That's not what C is really used for. But I certainly can't write a new LAPACK, and I know next to nothing about OS design principles or algorithm optimization.
How and in what directions can I make progress from my current state? How can I take my skills to the point where, for example, they might be used as a central part of a job? Or, do I absolutely need to be an expert in a domain of application before it pays off to improve software skills?
(If you are going to suggest "contributing to projects," please outline the next 2-3 steps. I have no idea what kinds of projects use these tools, or how to find ones I can contribute to meaningfully.)
Where are you trying to end up? It's not clear to me why you want to improve your programming skills, or what you would want to apply them to once you've improved them. If you want to be a full-time programmer on products, that's actually a somewhat different set of skills from being, say, a toolsmith working with specific projects or researchers.
posted by hattifattener at 3:34 PM on November 2, 2012
Response by poster: Where are you trying to end up?
I like the feeling of making progress. With all these things there's an initial learning process and then I come to a plateau. Identifying further learning goals is a big part of my question.
posted by Nomyte at 3:44 PM on November 2, 2012
You could try Rosalind or Project Euler. There's a nice sense of progress to being able to solve a given problem and move on. I haven't done Rosalind, but on Euler, at least, once you solve a problem there's a thread where people compare their solutions, and reading and understanding other people's solutions can be an eye-opener to techniques you didn't think of.
They probably start below your current knowledge level (which is good) and do become pretty difficult, although they are all still "small" problems; you won't learn large-scale or long-duration programming skills there.
posted by hattifattener at 4:03 PM on November 2, 2012 [2 favorites]
You might try joining the competitions at Kaggle to practice your skills. I've found that it's a great place to implement what I've learned, and to find out about models I haven't heard of. It'll especially help you learn "to develop hybrid models to accommodate any sort of messy data from the real world".
You can jump in headfirst and create a submission; I've also learned a lot from reading the benchmark code and the interviews with past winners.
posted by tinymegalo at 4:13 PM on November 2, 2012
Best answer: So, for a bit of Matlab experience that has to do with MRI, you could try doing the homework assignments for John Pauly's EE 469c class at Stanford.
The class is, more or less, on MRI image reconstruction. The idea of image recon is very simple: you do a 2D FFT on the raw data, and you have your image. But in practice you need to deal with things like data acquired in k-space along a non-Cartesian trajectory, magnetic field inhomogeneities, or having only half of the data, so that you need to synthesize the other half in a plausible way, and so forth. So what was as simple as a single line of Matlab code becomes a gradually escalating series of exercises that take you a step at a time from the platonic theory of the discipline to the engineering reality.
Also, one of the exercises involves reading one of Pauly's late-'90s papers and implementing it. Furthermore, many cutting-edge papers in image recon can still be implemented in a handful of lines (fewer than 100 LOC) of Matlab code.
Also, my advice would be to stay away from C. It is just way too low-level: the amount of work you would have to do before you could do anything actually relevant to current research is just way too high. Or, if you want to leverage already-written libraries, you may as well code in Matlab to begin with. Or, if you need the speed of C, you may as well prototype in Matlab and, once you have demonstrated the proof of principle, translate it to C.
posted by Maxwell_Smart at 4:25 PM on November 2, 2012
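To make the "single line" reconstruction from the comment above concrete: for fully sampled Cartesian k-space data, the image is the 2D inverse Fourier transform of the raw data. What follows is only a rough pure-Python sketch under that idealized assumption. In Matlab the same step would typically be a call to `ifft2`. The names `dft2`, `idft2`, and `phantom` are made up for illustration, the 4x4 "object" is invented test data, and the transform is a naive O(N^4) DFT rather than a real FFT.

```python
import cmath

def dft2(image):
    """Naive 2D forward DFT: simulates fully sampled Cartesian k-space."""
    rows, cols = len(image), len(image[0])
    kspace = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    phase = -2j * cmath.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(phase)
            kspace[u][v] = total
    return kspace

def idft2(kspace):
    """Naive 2D inverse DFT: the idealized reconstruction step."""
    rows, cols = len(kspace), len(kspace[0])
    image = [[0j] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0j
            for u in range(rows):
                for v in range(cols):
                    phase = 2j * cmath.pi * (u * y / rows + v * x / cols)
                    total += kspace[u][v] * cmath.exp(phase)
            image[y][x] = total / (rows * cols)
    return image

# Hypothetical 4x4 "object" (invented values, not real MRI data).
phantom = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]

raw = dft2(phantom)                                   # pretend scanner output
recon = [[abs(p) for p in row] for row in idft2(raw)] # magnitude image
```

None of the complications listed in the comment (non-Cartesian trajectories, field inhomogeneity, partial data) are handled here; those are what would turn this round trip into real exercises.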
Best answer: A sort of classic path to upping your game with C and OS internals is to re-implement significant chunks of Minix. Google for 'implement minix site:edu' to see many school assignments you can follow. Writing your own versions of malloc and free usually comes before that, so that may be a better first exercise. Again searching edu sites will turn up a gajillion explanations of the assignment. If those tasks don't sound right for you, then I agree with the suggestion to focus on the other stuff and leave C for another day.
posted by Monsieur Caution at 4:49 PM on November 2, 2012
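For a feel of what the malloc/free exercise mentioned above involves, here is a toy model of the bookkeeping only, sketched in Python rather than C. It assumes a first-fit policy over a fixed-size arena and merges adjacent free blocks on release. The class name `Arena` and its methods are hypothetical, and a real C version would have to manage headers, alignment, and raw pointers, which this sketch ignores.

```python
class Arena:
    """Toy model of a first-fit allocator over a fixed block of 'memory'."""

    def __init__(self, size):
        self.size = size
        self.free_list = [(0, size)]  # sorted list of (offset, length) holes
        self.allocated = {}           # offset -> length of each live block

    def malloc(self, nbytes):
        """Return the offset of a block of nbytes, or None if nothing fits."""
        if nbytes <= 0:
            return None
        for i, (offset, length) in enumerate(self.free_list):
            if length >= nbytes:  # first hole that is big enough
                if length == nbytes:
                    del self.free_list[i]  # hole consumed exactly
                else:
                    # Carve the block off the front of the hole.
                    self.free_list[i] = (offset + nbytes, length - nbytes)
                self.allocated[offset] = nbytes
                return offset
        return None

    def free(self, offset):
        """Return a block to the free list and merge touching holes."""
        length = self.allocated.pop(offset)  # KeyError on invalid/double free
        self.free_list.append((offset, length))
        self.free_list.sort()
        merged = [self.free_list[0]]
        for off, ln in self.free_list[1:]:
            last_off, last_ln = merged[-1]
            if last_off + last_ln == off:  # adjacent holes: coalesce
                merged[-1] = (last_off, last_ln + ln)
            else:
                merged.append((off, ln))
        self.free_list = merged

arena = Arena(64)
a = arena.malloc(16)
b = arena.malloc(16)
arena.free(a)
arena.free(b)  # the two holes and the unused tail merge back into one
```

Sorting the whole free list on every `free` is wasteful but keeps the merge step easy to follow; an address-ordered linked list is the more usual choice in C.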
One thing I have done to improve my programming skills in general, and Matlab programming specifically, is to take the problems I need to solve and then try and write code that solves that problem in a more general way, with error checking and a nice user interface. For instance one of my current projects involves doing linear unmixing of multiwavelength image stacks. The core code is a few lines of matlab, but I'm writing a bunch of objects to keep track of how the images were acquired, enforce that the reference spectra were acquired in the same way, and provide a nice interface to accessing the resulting data. This is partly because we've been bitten in the past by not having that kind of error checking and partly because it gives me a chance to push my skills at writing software objects that other people can use and deploy in their own code.
posted by pombe at 5:43 PM on November 2, 2012
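As a rough illustration of the pattern described in the comment above (a few lines of core math wrapped in objects that enforce how the data were acquired), here is a hedged Python sketch. It is not pombe's code: the `Spectrum` class, the `unmix_two` function, the restriction to two reference spectra, and the wavelength values are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Spectrum:
    """Intensities plus the acquisition channels they were measured at."""
    wavelengths_nm: tuple  # hypothetical channels, e.g. (480, 520, 560)
    intensities: tuple

    def __post_init__(self):
        if len(self.wavelengths_nm) != len(self.intensities):
            raise ValueError("one intensity per wavelength is required")

def unmix_two(pixel, ref_a, ref_b):
    """Least-squares abundances (a, b) with pixel ~ a*ref_a + b*ref_b."""
    # The error checking: refuse references acquired on other channels.
    for ref in (ref_a, ref_b):
        if ref.wavelengths_nm != pixel.wavelengths_nm:
            raise ValueError("reference and image were acquired differently")
    p, ra, rb = pixel.intensities, ref_a.intensities, ref_b.intensities
    # The core math: normal equations of a two-column least-squares fit.
    aa = sum(x * x for x in ra)
    bb = sum(x * x for x in rb)
    ab = sum(x * y for x, y in zip(ra, rb))
    ap = sum(x * y for x, y in zip(ra, p))
    bp = sum(x * y for x, y in zip(rb, p))
    det = aa * bb - ab * ab
    if abs(det) < 1e-12:  # crude absolute threshold, fine for a sketch
        raise ValueError("reference spectra are not linearly independent")
    return ((ap * bb - bp * ab) / det, (bp * aa - ap * ab) / det)

channels = (480, 520, 560)
dye_a = Spectrum(channels, (1.0, 0.5, 0.0))
dye_b = Spectrum(channels, (0.0, 0.5, 1.0))
mixed = Spectrum(channels, (2.0, 2.5, 3.0))  # constructed as 2*dye_a + 3*dye_b
amounts = unmix_two(mixed, dye_a, dye_b)
```

The point of the wrapper is that a reference measured on different channels fails loudly instead of silently producing plausible-looking numbers.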
The best way I have found to learn a language/technology is to use it in a project. Watching webinars and reading books is nice, but I find I learn far more by trying to build something beyond the scope of my current knowledge. The questions you need to ask will come naturally as you run into problems, and at the end of the day you will have something cool that you built yourself.
posted by sophist at 8:54 PM on November 2, 2012
I'll second Kaggle as an interesting way to go. Look up a fellow named Chris Raimondi who taught himself R in the context of rocking some Kaggle competitions. (I just heard him speak recently.)
posted by spbmp at 9:33 PM on November 2, 2012
Response by poster: I ended up working through problems on Rosalind. A lot of the time I feel like I'm reinventing a very small and primitive wheel, since most of the challenge comes from writing efficient algorithms rather than anything language-specific.
Now that the semester is finished and I don't have any homework to do, I'm starting Pauly's image reconstruction course. Sadly, the book for the course is now almost impossible to get, and I can only get it for a short while via interlibrary loan.
Kaggle is nice, but I need way more handholding. My background in data mining is basically nil.
posted by Nomyte at 10:08 PM on December 30, 2012
This thread is closed to new comments.
posted by Blazecock Pileon at 3:27 PM on November 2, 2012