Calculating a correlation
July 9, 2017 9:03 AM

Metafilter statisticians: I have a correlation coefficient (Pearson's r) reflecting the association between two continuous variables for N = 100 people. I also have a correlation coefficient for those same variables for a subsample of that sample (n = 40). Is it possible to calculate the value of r for the remaining people in the sample (n = 60) with this information?
posted by quiet coyote to Science & Nature (4 answers total)
 
Best answer: You mean that all you know is that the correlation for the full sample is 0.6 and the correlation for the subsample of 40 is 0.5. Does that tell you what the correlation in the subsample of 60 is?

No, it doesn't.
posted by ROU_Xenophobe at 11:18 AM on July 9, 2017


Best answer: No. As a thought experiment, can we build a 100-person sample with r=0 (no correlation) out of a 40-person subsample and a 60-person subsample that each have r=1? Yes: on a scatterplot of all 100 data points, we can shift the 60-person subset up or down until r=0 for the whole set. (This works as long as the two subsets have different mean x values. If we shift the 60-person subset far enough in one direction, r for the whole 100-point set becomes positive, and if we shift it far enough in the other direction, r becomes negative. Since r changes continuously with the shift, somewhere in between there is a position where r=0 for the whole set.)
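
For anyone who wants to check the shifting argument numerically, here is a minimal sketch using NumPy. The data are made up for illustration (two straight-line clusters with different mean x values), and the names pooled_r, x_a, y_a, x_b, y_b are mine, not anything from the question. It searches for the zero crossing by bisection, which is just the "somewhere in between" step done by brute force.

```python
import numpy as np

def pooled_r(shift, x_a, y_a, x_b, y_b):
    """Pearson r of the combined sample after moving group b up by `shift`."""
    x = np.concatenate([x_a, x_b])
    y = np.concatenate([y_a, y_b + shift])
    return np.corrcoef(x, y)[0, 1]

# Hypothetical data: each group lies exactly on a rising line, so r = 1 within each.
x_a = np.linspace(0.0, 1.0, 40)
y_a = x_a.copy()
x_b = np.linspace(2.0, 3.0, 60)
y_b = x_b.copy()

# Group b sits to the right of group a, so a large downward shift makes the
# pooled r negative and a large upward shift makes it positive.
lo, hi = -100.0, 100.0
for _ in range(100):
    mid = (lo + hi) / 2.0
    if pooled_r(mid, x_a, y_a, x_b, y_b) > 0:
        hi = mid
    else:
        lo = mid
shift = (lo + hi) / 2.0

r_a = np.corrcoef(x_a, y_a)[0, 1]
r_b = np.corrcoef(x_b, y_b + shift)[0, 1]
r_all = pooled_r(shift, x_a, y_a, x_b, y_b)
print(f"shift={shift:.4f}  r_a={r_a:.4f}  r_b={r_b:.4f}  r_all={r_all:.2e}")
```

A vertical shift does not change the correlation inside the 60-person group, so both subsamples stay at r=1 while the combined r is driven to (numerically) zero.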

But in the same fashion, we can also make a 100-person sample with r=0 from a 40-person subsample with r=1 and a 60-person subsample with r=–1. Thus, if we know that a 100-person sample has r=0 and a 40-person subsample has r=1, we cannot distinguish whether the remaining 60 people have r=1 or r=–1 (or something in between).
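
Here is a rough sketch of that ambiguity with NumPy, again on invented data. The helper shift_to_zero_pooled_r is a hypothetical name of my own. It relies on the fact that the pooled covariance is linear in the vertical shift, so the shift that zeroes it can be solved for directly (this assumes the two groups have different mean x values, otherwise the division fails).

```python
import numpy as np

def r(x, y):
    """Pearson correlation coefficient."""
    return np.corrcoef(x, y)[0, 1]

def shift_to_zero_pooled_r(x_a, y_a, x_b, y_b):
    """Return y_b moved vertically so that the pooled Pearson r is 0."""
    x = np.concatenate([x_a, x_b])
    y = np.concatenate([y_a, y_b])
    in_b = np.concatenate([np.zeros(len(x_a)), np.ones(len(x_b))])
    # cov(x, y + c * in_b) = cov(x, y) + c * cov(x, in_b); solve for c.
    c = -np.cov(x, y)[0, 1] / np.cov(x, in_b)[0, 1]
    return y_b + c

# Hypothetical data: the known 40-person subsample has r = 1.
x_a = np.linspace(0.0, 1.0, 40)
y_a = x_a.copy()
x_b = np.linspace(2.0, 3.0, 60)

# Two different candidates for the unknown 60 people.
y_b_pos = shift_to_zero_pooled_r(x_a, y_a, x_b, x_b)    # r = +1 within the group
y_b_neg = shift_to_zero_pooled_r(x_a, y_a, x_b, -x_b)   # r = -1 within the group

x_all = np.concatenate([x_a, x_b])
r_all_pos = r(x_all, np.concatenate([y_a, y_b_pos]))
r_all_neg = r(x_all, np.concatenate([y_a, y_b_neg]))
print(f"remainder r={r(x_b, y_b_pos):+.3f}, full-sample r={r_all_pos:.2e}")
print(f"remainder r={r(x_b, y_b_neg):+.3f}, full-sample r={r_all_neg:.2e}")
```

Both versions share the same 40-person subsample and the same full-sample r of zero, yet the remaining 60 people have opposite correlations, which is why the two known numbers can't pin down the third.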
posted by aws17576 at 11:22 AM on July 9, 2017 [2 favorites]


Sorry, I edited that a couple of times and I still don't think it's clear. I was trying to substantiate my "no" answer with a counterexample. But it's definitely "no". :-)
posted by aws17576 at 11:25 AM on July 9, 2017


Response by poster: Wow, yeah, okay. Clearly I wasn't fully awake when I asked this but I was about to email someone asking for these correlations and didn't want to ask if there was some way to calculate it on my own that I hadn't thought of. Thanks!
posted by quiet coyote at 11:28 AM on July 9, 2017

