Comments on: Bimodal or am I biased?
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased/
Comments on Ask MetaFilter post Bimodal or am I biased?
Tue, 23 Sep 2008 20:04:33 -0800
Question: Bimodal or am I biased?
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased
StatisticsFilter: how can I find out whether my data is bimodal? <br /><br /> I have no statistics skills whatsoever, and I would love to have this figured out before I return to campus tomorrow morning. Please help!<br>
<br>
<strong>Background</strong>: I have several sets of 70 numbers each (they represent the lengths of bacterial cells infected with different phages). I want to show that there is a significant difference between the two sets. While my averages look very good, adding error bars negates my findings. For example:<br>
Set 1: Average 6.83, Standard deviation 1.67.<br>
Set 2: Average 4.1, Standard deviation 1.00.<br>
<br>
<strong>Possible explanation</strong>: Let's look at one set at a time. There is a chance that I infected my bacteria with fewer phages than I planned to, so that not all the bacteria were affected--which would effectively split the populations sampled in each set into "infected" and "uninfected", and presumably they would have different length distributions. Can I test for this before I repeat my experiment (I plan to do that anyway, but still want to know if my findings are significant at this point)?<br>
<br>
Googling taught me that Hartigan's Dip Test is what I need. A stranger kindly posted <a href="http://www.nicprice.net/diptest/">Matlab functions for the test</a>.<br>
<br>
<strong>Problem</strong>: I have Matlab installed on my computer, but I have never used it and I have no idea what to do at this point. I have a column of values in Excel, and even if I manage to enter it as a set in Matlab (although my attempts so far don't look good), I don't know how to run the test. If you can show me how to run it, how would I interpret the results so that they would be meaningful to me (provided I get Matlab to print them)? Please, please, can you help me with that?<br>
<br>
Do you have any other suggestions for what to do to my data (remove outliers? how?) or how to look at it in order to see what's going on? (I can post the dataset somewhere if necessary.) Thanks!
post:ask.metafilter.com,2008:site.102500
Tue, 23 Sep 2008 19:53:08 -0800
halogen
statistics matlab bimodal datatorture
By: sergeant sandwich
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486230
i'm by no means a statistics expert, but if you want to find out if your data is bimodal, just plot a histogram? if it looks like <a href="http://en.wikipedia.org/wiki/Image:BimodalAnts.png">this</a>, it's bimodal. if there's just one hump, it's not. that dip test thing is an algorithmic way of doing what your eyes already do very well, and it seems like way too much extra work for what you want to accomplish.<br>
<br>
<a href="http://office.microsoft.com/en-us/excel/HP010983641033.aspx">histogram in excel</a><br>
<a href="http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/hist.html&http://www.google.com/search?hl=en&safe=off&client=safari&rls=en-us&q=matlab+histogram&btnG=Search">histogram in matlab</a><br>
<br>
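For anyone who would rather script the binning than do it in a spreadsheet, here is a minimal Python sketch of what a histogram does. The `lengths` list is made up purely for illustration (it is not the poster's data), and the helper name `bin_counts` and the bin width of 1.0 are assumptions, not anything from the thread.

```python
from collections import Counter

def bin_counts(values, bin_width=1.0):
    """Count values into fixed-width bins; returns {bin_start: count}."""
    # Each value falls into the bin whose index is floor(value / bin_width).
    idx = [int(v // bin_width) for v in values]
    counts = Counter(idx)
    # Include empty bins between the smallest and largest so gaps stay visible.
    return {i * bin_width: counts.get(i, 0) for i in range(min(idx), max(idx) + 1)}

# Hypothetical cell lengths, invented for this example only.
lengths = [3.2, 3.8, 4.1, 4.4, 4.6, 4.9, 5.1, 7.2, 7.6, 7.9, 8.1, 8.3, 8.8]

hist = bin_counts(lengths)
for start, count in hist.items():
    # One '#' per observation gives a rough text histogram.
    print(f"{start:4.1f} | {'#' * count}")
```

Two clumps of `#` separated by a sparse or empty bin would suggest two humps; a single clump would not. With only 70 points per set, the picture can change with the bin width, so it may be worth trying a couple of widths.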
if you're struggling with the computational aspect of it, i can't imagine it would take you longer than 10 minutes to count 70 data points into some bins by hand.
comment:ask.metafilter.com,2008:site.102500-1486230
Tue, 23 Sep 2008 20:04:33 -0800
sergeant sandwich
By: kickingtheground
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486231
Have you actually plotted out and looked at your data? Histogram it in Excel, for instance? Does it <em>look</em> bimodal?
comment:ask.metafilter.com,2008:site.102500-1486231
Tue, 23 Sep 2008 20:04:40 -0800
kickingtheground
By: halogen
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486237
Whoa! How didn't I think of that?
comment:ask.metafilter.com,2008:site.102500-1486237
Tue, 23 Sep 2008 20:08:02 -0800
halogen
By: milkrate
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486238
Significant difference in 2 sets = t-test<br>
<br>
t = (mean1 - mean2)/sqrt(var1/n1 + var2/n2)<br>
t = (6.83 - 4.1)/sqrt(1.67^2/70 + 1^2/70) = 2.73/0.23 = 11.73.<br>
<br>
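The arithmetic above can be double-checked with a few lines of Python. This is only a sketch of the unequal-variance (Welch-style) statistic as written in the formula, computed from the summary numbers in the question; the function name `t_from_summary` is made up for illustration.

```python
import math

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t = (mean1 - mean2) / sqrt(var1/n1 + var2/n2), from summary statistics."""
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se_diff

# Summary values quoted in the question: 70 cells per set.
t = t_from_summary(6.83, 1.67, 70, 4.1, 1.00, 70)
print(round(t, 2))
```

The result agrees with the hand calculation to two decimal places.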
11.73 is significant at whatever level you want.
comment:ask.metafilter.com,2008:site.102500-1486238
Tue, 23 Sep 2008 20:11:41 -0800
milkrate
By: grouse
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486255
Asking whether the data are bimodal is not really the right question, because the data are already separated into two sets. What you want is to see whether there is a significant difference between the two sets, and the way to do that is with a <em>t</em>-test, as milkrate demonstrates.<br>
<br>
<a href="http://www.cs.uiowa.edu/~jcryer/JSMTalk2001.pdf">Do not use Excel for quantitative statistics,</a> although I presume it should be okay for a histogram.
comment:ask.metafilter.com,2008:site.102500-1486255
Tue, 23 Sep 2008 20:30:05 -0800
grouse
By: halogen
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486261
Oh, yes, I already ran a t-test and p looks great (3E-23). Actually, while working on the histogram you suggested, I think I figured out my error bar problem: instead of entering the standard deviation divided by the square root of the number of samples, I just entered the standard deviation. Suddenly things look a whole lot better!<br>
<br>
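To make the error bar fix concrete, here is a small Python sketch of the standard error of the mean (standard deviation divided by the square root of the sample count), using the summary numbers from the question. The function name `standard_error` is just an illustrative choice.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

# Standard deviations and sample size quoted in the question.
sem1 = standard_error(1.67, 70)
sem2 = standard_error(1.00, 70)

print(f"Set 1: 6.83 +/- {sem1:.2f}")
print(f"Set 2: 4.10 +/- {sem2:.2f}")
```

With 70 measurements per set, these bars are roughly eight times narrower than the raw standard deviations, which is why the two means no longer appear to overlap.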
Thanks for the help--I suspected I wasn't approaching this correctly in the first place.<br>
<br>
<small>/me feels embarrassingly stupid, orders a Biostatistics book on Amazon.</small>
comment:ask.metafilter.com,2008:site.102500-1486261
Tue, 23 Sep 2008 20:40:57 -0800
halogen
By: halogen
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486264
grouse, I meant to test whether the data are bimodal <i>within</i> each set, since some of the bacteria on each slide may not have been infected at all, and would have a different length distribution (closer to my control set) than the ones that were.
comment:ask.metafilter.com,2008:site.102500-1486264
Tue, 23 Sep 2008 20:43:24 -0800
halogen
By: blahblahblah
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486274
grouse - those complaints are about Excel 2000 - but they are still there in <a href="http://www.daheiser.info/excel/frontpage.html">Excel 2007</a>.
comment:ask.metafilter.com,2008:site.102500-1486274
Tue, 23 Sep 2008 20:49:17 -0800
blahblahblah
By: chrisamiller
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486275
don't know how much statistics you do, but if you plan to do this sort of thing on a regular basis, a little <a href="http://cran.r-project.org/">R</a> goes a long way.<br>
<br>
<em>/me feels embarrassingly stupid, orders a Biostatistics book on Amazon.</em><br>
<br>
No reason to feel stupid. I do this sort of number crunching all the time and still find it hard to wrap my head around some days. Statistics are hard and often non-intuitive, and there's no shame in asking for help.
comment:ask.metafilter.com,2008:site.102500-1486275
Tue, 23 Sep 2008 20:50:32 -0800
chrisamiller
By: grouse
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486284
Oh, sorry about that, halogen; I should have read your question more carefully.<br>
<br>
In my work, I would use a histogram or density plot to try to find bimodality. Seems like you are already done, but if you really want to do a dip test, there is an <a href="http://cran.r-project.org/web/packages/diptest/index.html">R package</a> that allegedly does one.
comment:ask.metafilter.com,2008:site.102500-1486284
Tue, 23 Sep 2008 21:06:30 -0800
grouse
By: halogen
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486304
I have, in the past, written my own simple scripts in python when it came to statistics, but thankfully, I don't have to do this sort of data crunching very often: I am more used to having to answer questions along the lines of "did I get a mutant or not?". I will definitely look into R--anything that keeps me from having to deal with Matlab, which is hideous on Linux anyway. Thanks everybody!
comment:ask.metafilter.com,2008:site.102500-1486304
Tue, 23 Sep 2008 21:25:16 -0800
halogen
By: i_am_a_Jedi
http://ask.metafilter.com/102500/Bimodal-or-am-I-biased#1486494
You've probably already found it, but there are <a href="http://rpy.sourceforge.net/">python bindings</a> for R.
comment:ask.metafilter.com,2008:site.102500-1486494
Wed, 24 Sep 2008 05:38:29 -0800
i_am_a_Jedi