Calculating a correlation
July 9, 2017 9:03 AM
Metafilter statisticians: I have a correlation coefficient (Pearson's r) reflecting the association between two continuous variables for N = 100 people. I also have a correlation coefficient for those same variables for a subsample of that sample (n = 40). Is it possible to calculate the value of r for the remaining people in the sample (n = 60) with this information?
Best answer: No. As a thought experiment, can we build a 100-person sample with r=0 (no correlation) out of a 40-person subsample and a 60-person subsample that each have r=1? Yes. On a plot of all 100 data points, we can shift the 60-person subset up or down until r=0 for the whole set. (This works as long as the two subsets sit at different horizontal positions: shifting the 60-person subset far enough in one direction makes r positive for the whole 100-point set, shifting it far enough in the other direction makes r negative, and since r changes continuously with the shift, there is an intermediate position where r=0.)
But in the same fashion, we can also make a 100-person sample with r=0 from a 40-person subsample with r=1 and a 60-person subsample with r=–1. Thus, if we know that a 100-person sample has r=0 and a 40-person subsample has r=1, we cannot distinguish whether the remaining 60 people have r=1 or r=–1 (or something in between).
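The counterexample can be checked numerically. What follows is only a minimal sketch on made-up data, not anything from the original question: the helper names `pearson_r` and `shift_for_zero_r` are hypothetical, and the x and y values are invented straight lines. It assumes the two subsamples have different mean x values, which is what lets a vertical shift change the pooled covariance. Because that covariance is linear in the shift, the shift that gives a pooled r of 0 can be solved for directly.

```python
def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def shift_for_zero_r(xa, ya, xb, yb):
    """Vertical shift c for subsample B that makes the pooled r equal 0.

    Adding c to every y in B changes the pooled cross-product sum by
    c * sum(x - pooled_mean_x for x in B), so c has a closed form.
    Requires the two subsamples to have different mean x.
    """
    xs = xa + xb
    ys = ya + yb
    mx = sum(xs) / len(xs)
    # Equals the pooled cross-product sum, because sum(x - mx) is 0.
    s0 = sum((x - mx) * y for x, y in zip(xs, ys))
    lever = sum(x - mx for x in xb)
    return -s0 / lever


# Invented data: 40 people lying on a perfect rising line (r = 1).
xa = [float(i) for i in range(40)]
ya = list(xa)

# Two candidate 60-person remainders: a rising line and a falling line.
xb = [float(i) for i in range(60)]
candidates = {
    "remainder with r = +1": list(xb),
    "remainder with r = -1": [-x for x in xb],
}

results = {}
for label, yb in candidates.items():
    c = shift_for_zero_r(xa, ya, xb, yb)
    yb_shifted = [y + c for y in yb]
    results[label] = {
        "r_40": pearson_r(xa, ya),
        "r_60": pearson_r(xb, yb_shifted),
        "r_100": pearson_r(xa + xb, ya + yb_shifted),
    }
    print(label, results[label])
```

In both runs the 40-person r and the 100-person r come out the same (1 and 0, up to floating-point error), while the 60-person r differs in sign, so those two numbers alone cannot pin down the remainder's correlation.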
posted by aws17576 at 11:22 AM on July 9, 2017 [2 favorites]
Sorry, I edited that a couple of times and I still don't think it's clear. I was trying to substantiate my "no" answer with a counterexample. But it's definitely "no". :-)
posted by aws17576 at 11:25 AM on July 9, 2017
Response by poster: Wow, yeah, okay. Clearly I wasn't fully awake when I asked this but I was about to email someone asking for these correlations and didn't want to ask if there was some way to calculate it on my own that I hadn't thought of. Thanks!
posted by quiet coyote at 11:28 AM on July 9, 2017
No, it doesn't.
posted by ROU_Xenophobe at 11:18 AM on July 9, 2017