Comments on: Correlation measurements of unequal vectors?
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors/
Comments on Ask MetaFilter post Correlation measurements of unequal vectors?
Thu, 01 May 2008 21:45:12 -0800
Question: Correlation measurements of unequal vectors?
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors
Can one perform Pearson's correlation or a variant with unequal numbers of rows? <br /><br /> Is there a variation of Pearson's correlation (or another correlation measurement) that I can use for two vectors X and Y, which have unequal numbers of rows? <br>
<br>
Likewise, if I have two sets of data (genomic sequences) that can be "centered", is it reasonable to throw away data at the "edges" that are in X but not in Y, if the edge data do not contribute greatly to the mean and variance? I see this option in R, for example, but am curious about the real-world side effects.
Posted Thu, 01 May 2008 20:07:04 -0800 by Blazecock Pileon
Tags: math, statistics, pearson, correlation
By: bsdfish
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors#1326526
I think you need to define more carefully what you are looking for. Pearson's correlation is the covariance of two variables divided by the product of their standard deviations. Covariance is defined in terms of pairs of observations, and doesn't really have a meaning outside of that. <br>
<br>
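To make the pairing requirement concrete, here is a minimal Python sketch (not part of the original thread) of Pearson's r computed from (x, y) pairs. The function name `pearson_r` and the height/weight numbers are invented for illustration, and a missing measurement is assumed to be represented by `None`. Any pair with a missing value is simply dropped before the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r over paired observations; pairs containing None are dropped."""
    if len(xs) != len(ys):
        # Without a one-to-one pairing there is nothing to correlate.
        raise ValueError("x and y must be paired, so they need equal length")
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of cross-products and squared deviations; the 1/n factors cancel in r.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: the last person reported a weight but no height.
heights = [150.0, 160.0, 170.0, 180.0, None]
weights = [50.0, 58.0, 69.0, 80.0, 75.0]
r = pearson_r(heights, weights)
print(round(r, 3))
```

The unmatched weight in the last position never enters the calculation, and two lists of different lengths are rejected outright because there is no way to say which value pairs with which.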
For example, if you're looking for the correlation between height and weight, you'll measure a bunch of individuals' heights and weights, and look at the relationship between the two. If you have a weight measurement without the corresponding height measurement, or vice versa, that measurement is useless for determining the correlation.
Posted Thu, 01 May 2008 21:45:12 -0800 by bsdfish
By: Crotalus
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors#1326544
No, because by definition there can be no "correlation" if data for one of the variables is missing for certain cases. There are a number of ways to estimate the values for missing data, and these are routinely employed in situations like yours.
Posted Thu, 01 May 2008 22:12:08 -0800 by Crotalus
By: Crotalus
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors#1326545
Oh, and to piggyback on bsdfish's response, if there is a non-random reason why certain people gave their weight but withheld their height, then the "real world side effects" would be a spurious relationship between the variables.
Posted Thu, 01 May 2008 22:14:25 -0800 by Crotalus
By: Blazecock Pileon
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors#1326548
<em>There are a number of ways to estimate the values for missing data, and these are routinely employed in situations like yours.</em><br>
<br>
What guides the decision to estimate or to truncate where null values exist in pairs? R's default is to omit ("truncate").
Posted Thu, 01 May 2008 22:19:39 -0800 by Blazecock Pileon
By: Crotalus
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors#1326552
<em>What guides the decision to estimate or to truncate where null values exist in pairs? R's default is to omit ("truncate").</em><br>
<br>
Your judgement. Face validity. Do you have any reason to believe that there is a systematic reason why certain cases have missing data? My decision-making process in your case would probably be along these lines: if I think an external reviewer is more likely to tank my article because of too many dropped cases than because I estimated missing data, then I'll estimate. Otherwise I won't. How's that for "real world"?
Posted Thu, 01 May 2008 22:31:53 -0800 by Crotalus
By: Blazecock Pileon
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors#1326555
That's about as real world as it gets. Thanks for the advice.
Posted Thu, 01 May 2008 22:42:25 -0800 by Blazecock Pileon
By: a robot made out of meat
http://ask.metafilter.com/90336/Correlation-measurements-of-unequal-vectors#1326670
If you have just two variables, I'd be hard-pressed to find a good reason to use imputation to estimate values for one of them.
Posted Fri, 02 May 2008 04:22:17 -0800 by a robot made out of meat
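The choice discussed in this thread, omitting incomplete pairs versus estimating the missing values, can be sketched in Python. This is an illustration and not something posted in the thread: the helper names (`drop_incomplete`, `mean_impute`) and the data are invented, missing values are assumed to be `None`, and mean imputation stands in only as the simplest possible estimation method. Real analyses would normally use something more careful.

```python
import math


def pearson_r(pairs):
    """Pearson's r for a list of complete (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def drop_incomplete(xs, ys):
    """Omit every pair in which either value is missing."""
    return [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]


def mean_impute(xs, ys):
    """Replace each missing value with the mean of that variable's observed values."""
    def fill(values):
        observed = [v for v in values if v is not None]
        mean = sum(observed) / len(observed)
        return [mean if v is None else v for v in values]
    return list(zip(fill(xs), fill(ys)))


# Hypothetical data: y is missing for the two largest x values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, None, None]

r_drop = pearson_r(drop_incomplete(xs, ys))
r_imputed = pearson_r(mean_impute(xs, ys))
print(round(r_drop, 2), round(r_imputed, 2))
```

With this toy data, omitting the incomplete pairs gives a much higher r than mean imputation. Each imputed y sits exactly at the mean of y, so it adds nothing to the covariance, while the extra x values still widen the spread of x. That is one way to see why imputation buys little when there are only two variables.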