Software recommendations for epidemiology and/or population biology?
January 21, 2010 1:03 PM

What software have you used for research in epidemiology and/or population biology?

Pros and cons? Also, what have you modeled with it?

I'm looking to implement basic epidemic models to start with, but ideally would like to use a program that can also handle more complicated models down the road.

Thanks in advance for your input.
posted by inkyroom to Technology
My wife is an epidemiologist and is constantly complaining about SAS. Apparently it's very powerful, and it's really the only tool they know of that gets them the info they need, but it's a complete bear to work with.
posted by sanka at 2:01 PM on January 21, 2010

posted by jonesor at 2:19 PM on January 21, 2010

Sorry, I hit post too soon. Pros: it's free, open source, has hundreds of add-on packages, and is extremely well supported thanks to its multitude of users. I've used it to do everything from simple statistics to simulating many generations of gene flow on an individual-based level. It's extremely versatile.
posted by jonesor at 2:22 PM on January 21, 2010

I don't know if it's sophisticated enough for what you have in mind down the road, but the CDC offers Epi Info software for free. It's extensible, and it also has built-in hooks to some other tools like GIS software.
posted by j-dawg at 2:23 PM on January 21, 2010

Most epi is not infection outbreak modeling, which is what I think you want. For general epi data, I've used Stata, SAS, R, and BUGS. For models of an outbreak, which are difference or differential equations, I've used Matlab and Mathematica.
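To make the "difference or differential equations" point concrete, here is a minimal sketch of the classic SIR (susceptible, infected, recovered) model. It is written in Python purely for illustration, not because it is one of the tools named above. The function name `simulate_sir` and the parameter values are made up for the example, and the fixed-step Euler update is the simplest possible scheme, which turns the differential equations into difference equations; a real analysis would more likely use a proper ODE solver.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Step the SIR equations forward with a fixed-step Euler scheme.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    beta is the transmission rate and gamma the recovery rate (both per day).
    Returns a list of (time, S, I, R) tuples, one per step.
    """
    n = s0 + i0 + r0  # total population, constant in this model
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative values only, not fitted to any real outbreak.
    run = simulate_sir(beta=0.5, gamma=0.2, s0=990, i0=10, r0=0, days=100)
    peak_time, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak of {peak_infected:.0f} infected around day {peak_time:.0f}")
```

More complicated models (extra compartments, age structure, stochastic transmission) keep the same shape: more state variables and more terms in the update step.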
posted by a robot made out of meat at 2:24 PM on January 21, 2010

Oh yeah.
Stata has an accessible menu-driven interface, which also shows you the equivalent command line for whatever you did. It has excellent, extensive documentation that details how to use each feature with examples, as well as books that explain what it's doing under the hood. I have gotten doctors to successfully use it, which is saying something. Stata has a programming language which I have never learned, but have heard is not too hard.

SAS is very powerful, very fast, and has unmatched handling of extremely large datasets. If you are AMEX and want to do analysis on billions of purchases, or you are dealing with the complete national birth certificate data, you want SAS. It has a macro language which I have heard less fun things about. Documentation is also good, but maybe not as copious as Stata's.

R is free. Because the language that it comes with is pretty easy, there are many useful extensions. Doing what you want may not be intuitive or easy, and finding it out might not be either. Once you learn how to do something, no problem.

BUGS is software for Gibbs sampling, which you rarely want in epi, but sometimes you do. It kinda punches you in the face.

Matlab is numerical methods and simulation with an easy language. Very popular in engineering.
posted by a robot made out of meat at 2:33 PM on January 21, 2010

I like R, too. It's statistical software, which is what I assume you are looking for. It's primarily command-line and I found the learning curve to be a bit steep, but it's free and there are several free manuals to get you up and running. There are numerous user-created packages for it. I use it primarily for statistical ecology but here is a link to somebody using it for epidemic modeling.

For a simpler-to-use program, I recommend STELLA. It is a modeling program that I used a decade ago for climatology-related modeling, but it may also fit the bill for what you are doing.
posted by surfgator at 4:17 PM on January 21, 2010

