<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: Binomial distribution comparison</title>
	<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison/</link>
	<description>Comments on Ask MetaFilter post Binomial distribution comparison</description>
	<pubDate>Thu, 25 May 2006 16:48:26 -0800</pubDate>
	<lastBuildDate>Thu, 25 May 2006 16:48:26 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: Binomial distribution comparison</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison</link>	
		<description>Probability of one binomially-distributed variable being greater than another one - is there a closed form for this? &lt;br /&gt;&lt;br /&gt; Specifically, the problem is this. (No, this isn&apos;t for homework)&lt;br&gt;
&lt;br&gt;
I&apos;ve got two random variables, X and Y, both drawn from binomial distributions. p is the same for each, but n differs; essentially, both X and Y are the number of heads that come up in a single instance of n coin flips if the coins are weighted such that there&apos;s a p probability of heads. X and Y are independent, and n can differ from one to the other (so call the two Ns n_x and n_y).&lt;br&gt;
&lt;br&gt;
Is there a way to get a closed form in terms of p, n_x, and n_y for Pr[X&#8805;Y]? I&apos;ve been banging my head against this for hours and I thought I&apos;d gotten somewhere but belatedly realized I&apos;d made a mistake. I&apos;m hoping some MeFi combinatorics whiz can help me out here.</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2006:site.38909</guid>
		<pubDate>Thu, 25 May 2006 15:54:16 -0800</pubDate>
		<dc:creator>wanderingmind</dc:creator>
		
			<category>binomial</category>
		
			<category>probability</category>
		
			<category>combanitorics</category>
		
			<category>math</category>
		
	</item> <item>
		<title>By: Galvatron</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601195</link>	
		<description>I don&apos;t have time to look hard at this right now, but it seems like the most straightforward approach is to condition on particular values of Y and use the law of total probability:&lt;br&gt;
&lt;br&gt;
\sum_{y=0}^{n_y} P(X \ge Y | Y = y) P(Y = y)&lt;br&gt;
&lt;br&gt;
Both of the probability terms in this expression can be computed in closed form.  It looks like you&apos;d get a double summation for a result, not sure if that can be simplified...&lt;br&gt;
&lt;br&gt;
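For concreteness, here is a minimal Python sketch of that double summation. The function names binom_pmf and prob_x_ge_y are made up for illustration, and it assumes Python 3.8 or later for math.comb:&lt;br&gt;
&lt;br&gt;
```python
from math import comb


def binom_pmf(k, n, p):
    # P(K = k) for K distributed as Binomial(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)


def prob_x_ge_y(n_x, n_y, p):
    # Law of total probability: condition on each value y of Y.
    total = 0.0
    for y in range(n_y + 1):
        # Probability that X is at least y; the range is empty
        # (probability 0) whenever y exceeds n_x.
        tail = sum(binom_pmf(x, n_x, p) for x in range(y, n_x + 1))
        total += tail * binom_pmf(y, n_y, p)
    return total
```
&lt;br&gt;
This is a brute-force evaluation whose cost grows like n_x times n_y, not a closed form, but it gives exact values (up to floating point) to check any candidate formula against.&lt;br&gt;
&lt;br&gt;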
I&apos;ll be back later if no one has addressed this in more detail.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601195</guid>
		<pubDate>Thu, 25 May 2006 16:48:26 -0800</pubDate>
		<dc:creator>Galvatron</dc:creator>
	</item><item>
		<title>By: Civil_Disobedient</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601200</link>	
		<description>&lt;i&gt;I don&apos;t have time to look hard at this right now...&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
Lemme guess, you found a truly marvelous answer for this, but don&apos;t have enough margin space to write it all down?</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601200</guid>
		<pubDate>Thu, 25 May 2006 16:56:15 -0800</pubDate>
		<dc:creator>Civil_Disobedient</dc:creator>
	</item><item>
		<title>By: Wolfdog</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601225</link>	
		<description>I will welcome information to the contrary but I don&apos;t think you&apos;re going to get an answer as simple as you&apos;re hoping. The probability is a polynomial in p, of degree nx+ny. I mean, you&apos;ve no doubt discovered that for yourself already, and you know it&apos;s not hard to express as a double sum, but all those terms in the polynomial really matter - in fact, most of the action is in the middle coefficients. So the bigger nx and ny are, the more complicated the expression will be.&lt;br&gt;
&lt;br&gt;
If nx and ny are large you may be better off replacing it with a normal approximation.  Then look at the difference (Y-X) - that will be normal with -- well, you probably know what to do, but I&apos;ll elaborate if you want.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601225</guid>
		<pubDate>Thu, 25 May 2006 17:14:02 -0800</pubDate>
		<dc:creator>Wolfdog</dc:creator>
	</item><item>
		<title>By: Mr. Six</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601257</link>	
		<description>When you&apos;re comparing two distributions, you might think about looking at the probability distribution of their difference  P&lt;sub&gt;X-Y&lt;/sub&gt;.&lt;br&gt;
&lt;br&gt;
The difference of two random variables is itself a random variable and has its own probability distribution.&lt;br&gt;
&lt;br&gt;
If Nx and Ny are large, you can try a normal approximation to the binomial distribution. &lt;a href=&quot;http://mathworld.wolfram.com/NormalDifferenceDistribution.html&quot;&gt;Mathworld&lt;/a&gt; lists the probability distribution of the difference of two normal random variables.&lt;br&gt;
&lt;br&gt;
Using the binomial means (&amp;mu; = Np) and variances (&amp;sigma;&lt;sup&gt;2&lt;/sup&gt; = Np(1-p)) of the distributions of X and Y, you can estimate P&lt;sub&gt;X-Y&lt;/sub&gt;.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601257</guid>
		<pubDate>Thu, 25 May 2006 17:30:51 -0800</pubDate>
		<dc:creator>Mr. Six</dc:creator>
	</item><item>
		<title>By: hooves</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601551</link>	
		<description>I&apos;m celebrating my last chem lab of the year, so I don&apos;t want to (can&apos;t?) work out the details right now, but here is something to think about. BTW, since you are hoping for a &quot;combinatorics whiz,&quot; I&apos;ll assume that a normal approximation of the binomial is not acceptable. You probably want exact values; the fine line between discrete and continuous gets blurred in this post, however.&lt;br&gt;
&lt;br&gt;
So, we (wikipedia does too) know the cumulative distribution function (cdf) for a binomial random variable. &lt;br&gt;
&lt;br&gt;
For some constant k, we want the probability that binomial variable 1 is &amp;lt; k and binomial variable 2 is &amp;gt; k. This translates to:&lt;br&gt;
&lt;br&gt;
Z(k) = cdf_1(k) * (1 - cdf_2(k))&lt;br&gt;
&lt;br&gt;
We can do this without any covariance factors messing up the equation because X and Y are independent. &lt;br&gt;
&lt;br&gt;
Then, we want to know the cdf of that value over all possible values of k, 0 to infinity. So, we would integrate the above pdf (Z) from 0 to infinity. Unfortunately, I&apos;m not sure that k is rendered insignificant in this process. If k does disappear, great. If it doesn&apos;t, and you are still dying to know the answer to your question, I&apos;d be glad to run this one by my PSTAT prof; just remind me to ask after the 3-day weekend.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601551</guid>
		<pubDate>Fri, 26 May 2006 00:35:32 -0800</pubDate>
		<dc:creator>hooves</dc:creator>
	</item><item>
		<title>By: Galvatron</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601555</link>	
		<description>So, how are you defining &quot;closed form?&quot;</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601555</guid>
		<pubDate>Fri, 26 May 2006 00:44:55 -0800</pubDate>
		<dc:creator>Galvatron</dc:creator>
	</item><item>
		<title>By: louigi</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601808</link>	
		<description>It&apos;s not easy to get a nice expression, much easier to approximate. &lt;br&gt;
&lt;br&gt;
Let&apos;s suppose X=Bin(n,p) and Y=Bin(cn,p) for an integer c &amp;gt;= 1, and study P(X &amp;gt;= Y). We can express Y as the sum of c independent Bin(n,p) random variables. By symmetry, X is bigger than each of those with probability exactly 1/2, and if X is smaller than any of them then it is smaller than Y. &lt;br&gt;
&lt;br&gt;
Thus P(X &amp;gt;= Y) &amp;lt; 2^{-c}.&lt;br&gt;
&lt;br&gt;
If p=a/n for some constant a, then you can get a lower bound of order e^{-ac} by bounding P(X &amp;gt;= Y) below by the probability that X &amp;gt;= 1 and Y=0, which is easy to express explicitly, and then using the approximation (1-t/n)^{un} ~ e^{-tu}.&lt;br&gt;
&lt;br&gt;
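Written out, as a sketch under the same assumptions (X = Bin(n,p), Y = Bin(cn,p), X and Y independent, p = a/n with n large), that lower bound reads:&lt;br&gt;
&lt;br&gt;
```latex
P(X \ge Y) \;\ge\; P(X \ge 1,\ Y = 0)
  = \bigl(1 - (1-p)^{n}\bigr)\,(1-p)^{cn}
  \approx \bigl(1 - e^{-a}\bigr)\, e^{-ac}
  \qquad (p = a/n,\ n \text{ large}).
```
&lt;br&gt;
The product form uses independence, and the last step applies the approximation above once with (t,u) = (a,1) and once with (t,u) = (a,c).&lt;br&gt;
&lt;br&gt;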
If p is larger, say p &amp;gt;= a*log n / n for some constant a, then you&apos;re best off using &lt;a href=&quot;http://en.wikipedia.org/wiki/Chernoff%27s_inequality&quot;&gt;Chernoff&apos;s Inequality&lt;/a&gt;.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601808</guid>
		<pubDate>Fri, 26 May 2006 07:32:12 -0800</pubDate>
		<dc:creator>louigi</dc:creator>
	</item><item>
		<title>By: Wolfdog</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601843</link>	
		<description>&lt;i&gt;We can express Y as the sum of c independent Bin(n,p) random variables. By symmetry, X is bigger than each of those with probability exactly 1/2&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
Your probability exactly 1/2 assertion is not correct.  If X1 and X2 are identically distributed, independent random variables then P(X1&amp;gt;=X2) is not 1/2.  P(X1 &amp;gt; X2) isn&apos;t 1/2 either.  That&apos;s the thing with discreteness - those pesky little points having mass.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601843</guid>
		<pubDate>Fri, 26 May 2006 08:09:20 -0800</pubDate>
		<dc:creator>Wolfdog</dc:creator>
	</item><item>
		<title>By: Wolfdog</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601862</link>	
		<description>I mean to say, &quot;If X1 and X2 are identically distributed, independent &lt;em&gt;binomial&lt;/em&gt; random variables&quot; there, of course.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601862</guid>
		<pubDate>Fri, 26 May 2006 08:24:38 -0800</pubDate>
		<dc:creator>Wolfdog</dc:creator>
	</item><item>
		<title>By: Wolfdog</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#601883</link>	
		<description>If you do the normal approximation, you get a closed-form answer in terms of Erf:&lt;br&gt;
&lt;tt&gt;&lt;br&gt;
P(X&amp;gt;=Y) = (1/2)(1 + Erf((nx*p - ny*p)/Sqrt(2*(nx + ny)*(1 - p)*p)))&lt;br&gt;
&lt;/tt&gt;&lt;br&gt;
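&lt;br&gt;
As a rough sketch, here is the same formula in Python. The function name normal_approx_x_ge_y is hypothetical, and no continuity correction is applied, so treat it as an approximation under those assumptions rather than an exact answer:&lt;br&gt;
&lt;br&gt;
```python
from math import erf, sqrt


def normal_approx_x_ge_y(n_x, n_y, p):
    # Treat X - Y as approximately normal with
    # mean (n_x - n_y) * p and variance (n_x + n_y) * p * (1 - p).
    mean = (n_x - n_y) * p
    sd = sqrt((n_x + n_y) * p * (1 - p))
    # Probability that a normal variable with this mean and sd is nonnegative.
    return 0.5 * (1 + erf(mean / (sd * sqrt(2))))
```
&lt;br&gt;
It needs p strictly between 0 and 1 and n_x + n_y positive, otherwise the standard deviation is zero and the division fails.&lt;br&gt;
&lt;br&gt;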
Of course, the normal approximation believes that the answer is exactly 1/2, independent of p, if nx=ny.  That is terribly wrong for small nx and ny, but nx and ny don&apos;t have to get too big before the approximation is excellent.  As usual, &quot;big&quot; is bigger if p is very close to 0 or 1, and not too big at all if p is closer to 1/2.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-601883</guid>
		<pubDate>Fri, 26 May 2006 08:38:03 -0800</pubDate>
		<dc:creator>Wolfdog</dc:creator>
	</item><item>
		<title>By: louigi</title>
		<link>http://ask.metafilter.com/38909/Binomial-distribution-comparison#602653</link>	
		<description>Sorry: I should have said P(X &amp;gt; Y) = P(Y &amp;gt; X).&lt;br&gt;
&lt;br&gt;
You can also get fairly easy upper bounds on P(X=Y) which will be pretty small (i.e. not that close to 1) unless p is much smaller than 1/n.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2006:site.38909-602653</guid>
		<pubDate>Sat, 27 May 2006 06:47:00 -0800</pubDate>
		<dc:creator>louigi</dc:creator>
	</item>
	</channel>
</rss>
