Comments on: Difference between two correlated normally distributed random variables
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables/
Comments on Ask MetaFilter post Difference between two correlated normally distributed random variables
Sun, 14 Sep 2008 22:06:03 -0800
Question: Difference between two correlated normally distributed random variables
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables
A statistics question about dependent normal random variables. <br /><br /> My stats is really rusty so forgive me if I say something that doesn't make sense.<br>
<br>
I have two random variables, X1 and X2. Both are normally distributed; let's say mean1=7000, mean2=7500, and the standard deviations are both 15. I am interested in knowing the probability that X1 < X2. <br>
<br>
The trick is that these variables are not independent. They should be strongly correlated. That is, if X1 is really big, then X2 should also be pretty big. I don't know the exact correlation; maybe it's between 0.7 and 1.0. I plan to make a bunch of graphs leaving that as a variable. So let's assume for now that it's 0.8.<br>
<br>
If they were independent I would know what to do, but I can't figure out how to handle the correlation. I've been mucking around in Matlab, but all I can figure out how to do is make a multivariate normal distribution with all the parameters above; I don't know how to go from there to figuring out the probability that X1 < X2. Help?post:ask.metafilter.com,2008:site.101706Sun, 14 Sep 2008 21:35:30 -0800PercussivePaulstatisticsmathmatlabBy: hattifattener
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476261
<a href="http://stattrek.com/AP-Statistics-3/Random-Variable-Combinations.aspx?Tutorial=AP">Combinations of random variables</a>? The probability that X1 <> 0.</>comment:ask.metafilter.com,2008:site.101706-1476261Sun, 14 Sep 2008 22:06:03 -0800hattifattenerBy: hattifattener
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476262
Er ... I meant to post, "The probability that X1 < X2 is the same as the probability that (-1*X1 + X2) > 0.".comment:ask.metafilter.com,2008:site.101706-1476262Sun, 14 Sep 2008 22:06:49 -0800hattifattenerBy: Eringatang
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476267
Easier to prove X2>=X1. Include the covariance between the two terms and that is a start.<br>
<br>
Ho: X2-X1 = 0<br>
Ha: X2-X1 not = 0<br>
<br>
Wald test: (X2-X1)/sqrt((varX1+varX2-2*CovX1X2))<br>
p-value from Z distribution<br>
<br>
Like everything, it has its limitations (and so do I) so proceed with caution. The Wald test is googleable.comment:ask.metafilter.com,2008:site.101706-1476267Sun, 14 Sep 2008 22:21:09 -0800EringatangBy: sesquipedalian
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476288
Do you believe that the two variables are jointly (multivariate) normal? (Your question seems to imply so.) If so, then any linear transformation of a multivariate normal vector is also normal. In your case, X = (X1, X2) is a two-dimensional random vector. If it's Normal(mu, Sigma), and B is a kx2 matrix, then BX is normal with mean B*mu and covariance B Sigma B^T. <br>
<br>
Pick B = [ 1 -1 ], and you get the distribution of X1 - X2, which is what you want, since P(X1 < X2) is P(X1 - X2 < 0). The new mean is mu_1 - mu_2 and the new variance is v_11 - 2*v_12 + v_22 (where the v's refer to entries of the covariance matrix), assuming that I did the algebra right at this time of night. You can get the probability you want by using the Matlab function for the normal cdf (which I don't recall offhand).comment:ask.metafilter.com,2008:site.101706-1476288Sun, 14 Sep 2008 23:43:08 -0800sesquipedalianBy: PercussivePaul
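For anyone who wants to try the linear-transformation approach, here is a minimal sketch. It is written in Python rather than Matlab purely for illustration, and the function name `prob_x1_less_than_x2` is made up for this example. It assumes (X1, X2) is jointly normal, forms D = X1 - X2 with mean mu1 - mu2 and variance v11 - 2*v12 + v22, and evaluates P(D < 0) with the standard normal cdf.

```python
import math

def prob_x1_less_than_x2(mu1, mu2, v11, v12, v22):
    """P(X1 < X2) for a jointly normal pair with the given means and covariance entries."""
    # D = X1 - X2 corresponds to the row vector B = [1, -1].
    mean_d = mu1 - mu2
    var_d = v11 - 2.0 * v12 + v22
    if var_d <= 0.0:
        # Degenerate case (e.g. correlation 1 with equal variances): D is a constant.
        return 1.0 if mean_d < 0.0 else 0.0
    # P(D < 0) = Phi((0 - mean_d) / sd_d), with Phi written in terms of erfc.
    z = (0.0 - mean_d) / math.sqrt(var_d)
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Numbers from the question: means 7000 and 7500, both sd 15, correlation 0.8.
sd = 15.0
rho = 0.8
p = prob_x1_less_than_x2(7000.0, 7500.0, sd**2, rho * sd * sd, sd**2)
print(p)
```

With the numbers from the question the gap between the means is so many standard deviations of D that the result is indistinguishable from 1 in double precision, so try a smaller gap if you want to see the correlation matter.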
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476295
I don't know if they're multivariate or not and I don't have the mental strength to wade through the equations on Wikipedia to figure out what that means. It was just the closest thing I could find in Matlab that seemed like it might work. As you might guess I am a little adrift when it comes to stats.<br>
<br>
Let me explain more about the variables. It's a communications problem. X1 and X2 represent the arrival times of two events. Nominally X1 is supposed to arrive 500 units before X2. If X1 happens to be late due to random noise or whatever, X2 should also be late, because the two processes that generate the events are subject to (roughly) the same noise. <br>
<br>
sesquipedalian, your answer feels right-ish to me, though of course I would like to check the algebra. That is, if they're multivariate. (are they?)comment:ask.metafilter.com,2008:site.101706-1476295Mon, 15 Sep 2008 00:09:04 -0800PercussivePaulBy: PercussivePaul
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476296
By the way, I should point out that this is a theoretical exercise, not one based on measurements. The mean, stdev, and distributions are my best guess as to what might occur in a real system and I want to know what the implications are.comment:ask.metafilter.com,2008:site.101706-1476296Mon, 15 Sep 2008 00:13:46 -0800PercussivePaulBy: Coventry
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476432
This question is easy to approach directly. The joint distribution (X1, X2) is a two-variable normal distribution. Draw a large sample of pairs from (X1, X2), and fit a normal distribution to the sample. (MATLAB must have a routine for this.) Then integrate the p.d.f. of that distribution over the set {X1 < X2}. (Again, MATLAB must have a routine to do this numerical integration. I don't use MATLAB, so I can't point you at the relevant routines.) This should give the same answer as sesquipedalian's so it's a good cross-check.comment:ask.metafilter.com,2008:site.101706-1476432Mon, 15 Sep 2008 06:33:29 -0800CoventryBy: a robot made out of meat
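A rough sketch of a simulation cross-check, in Python rather than Matlab. Note that this is a simpler stand-in for the fit-then-integrate procedure described above: instead of fitting a distribution and integrating its p.d.f., it just counts the fraction of simulated pairs with X1 < X2. The function name `simulate_prob` and the default sample size and seed are arbitrary choices for illustration, and the pair is assumed to be jointly normal with equal standard deviations.

```python
import math
import random

def simulate_prob(mu1, mu2, sd, rho, n=100_000, seed=0):
    """Monte Carlo estimate of P(X1 < X2) for a correlated normal pair."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        # Two independent standard normals.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Mix them so that corr(X1, X2) = rho and both have standard deviation sd.
        x1 = mu1 + sd * z1
        x2 = mu2 + sd * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        if x1 < x2:
            hits += 1
    return hits / n

# Hypothetical smaller gap of 10 units, so the estimate is not simply 1.
print(simulate_prob(7000.0, 7010.0, 15.0, 0.8))
```

With the original 500-unit gap every simulated pair satisfies X1 < X2, so simulation only becomes informative when the gap is within a few standard deviations of the difference.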
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476922
You write it like this: X2 = rho*X1 + N(m,s), where N(m,s) is a normal RV with mean m and standard deviation s that is independent of X1 [the sum of two independent normals is a normal]. That is, as X1 increases the central tendency of X2 increases. Then <br>
<br>
P(X2>X1) = P( (rho-1)*X1 + N(m,s) >0)<br>
<br>
And (rho-1)*N(7000,15) is distributed as N((rho-1)*7000, (1-rho)*15)<br>
<br>
To get the X2 standard deviation to equal 15 (variance 225), you have to require s^2 = 225*(1-rho^2)<br>
<br>
To get the second mean to equal 7500 you have to require m = 7000*(1-rho) + 500<br>
<br>
What's rho? If you work through the covariance, Cov(X1,X2) = rho*Var(X1) = 225*rho, so rho = corr(X1,X2)<br>
<br>
And (rho-1)*X1 + N(m,s) is distributed as N( 500, 15*sqrt(2-2*rho) ), since the variances add: 225*(1-rho)^2 + 225*(1-rho^2) = 225*(2-2*rho)<br>
<br>
Now P(X2>X1) = P(N(500, 15*sqrt(2-2*rho) )>0)<br>
= P(N(0,15*sqrt(2-2*rho) )>-500)<br>
= P(N(0,1)>-500/15/sqrt(2-2*rho))<br>
<br>
Which you look up in a Z-table. For example, with rho = 0.8 the probability is so close to one as to be negligibly different (about 53 standard deviations).<br>
<br>
A difference in means of 33 standard deviations (500/15) is a ton. When the smaller one being big tends to make the bigger one even bigger, the probability of the small one being larger is really, really small.comment:ask.metafilter.com,2008:site.101706-1476922Mon, 15 Sep 2008 13:15:25 -0800a robot made out of meatBy: a robot made out of meat
http://ask.metafilter.com/101706/Difference-between-two-correlated-normally-distributed-random-variables#1476930
Oh, when you're playing with other parameters MATLAB probably has the normal (Gaussian) cumulative distribution (aka Phi) built in. Your answer will be Phi(delta_mean/sd/sqrt(2-2*corr))comment:ask.metafilter.com,2008:site.101706-1476930Mon, 15 Sep 2008 13:21:30 -0800a robot made out of meat
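Since the plan was to graph the answer over a range of correlations, here is a small sketch of that sweep, again in Python rather than Matlab. It evaluates Phi(delta_mean / (sd * sqrt(2 - 2*corr))), where corr is the correlation between X1 and X2 and both variables share the same standard deviation. The helper names `z_score` and `phi` are invented for this example, and delta_mean is assumed to be positive.

```python
import math

def z_score(delta_mean, sd, corr):
    """Standard deviations separating mean(X2 - X1) from zero (assumes delta_mean > 0)."""
    spread = sd * math.sqrt(2.0 - 2.0 * corr)
    # At corr = 1 the difference X2 - X1 has no spread at all.
    return math.inf if spread == 0.0 else delta_mean / spread

def phi(z):
    """Standard normal cumulative distribution function."""
    if math.isinf(z):
        return 1.0 if z > 0 else 0.0
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Sweep the correlation over the range mentioned in the question.
for corr in (0.7, 0.8, 0.9, 0.99, 1.0):
    z = z_score(500.0, 15.0, corr)
    print(f"corr={corr:.2f}  z={z:.1f}  P(X1<X2)={phi(z)}")
```

For these parameters every row comes out as 1.0 in double precision, so the z column is the more useful thing to plot; the probability only moves visibly once delta_mean is within a few multiples of sd*sqrt(2 - 2*corr).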