<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

      <title>Comments on: Statistics: tell me about r-square for nonlinear models</title>
      <link>http://ask.metafilter.com/65278/Statistics-tell-me-about-rsquare-for-nonlinear-models/</link>
      <description>Comments on Ask MetaFilter post Statistics: tell me about r-square for nonlinear models</description>
	  	  <pubDate>Thu, 21 Jun 2007 04:10:31 -0800</pubDate>
      <lastBuildDate>Thu, 21 Jun 2007 04:10:31 -0800</lastBuildDate>
      <language>en-us</language>
	  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
	  <ttl>60</ttl>

<item>
  	<title>Question: Statistics: tell me about r-square for nonlinear models</title>
  	<link>http://ask.metafilter.com/65278/Statistics-tell-me-about-rsquare-for-nonlinear-models</link>	
  	<description>StatisticsFilter: Can someone explain, with examples, why R^2 (r-square) values are not appropriate for use with non-linear regression models? I am told that the reason r-square values are not recommended for non-linear regression is that they can mislead. R^2 for linear models is bounded between 0 and 1 and can be interpreted as the proportion of variance explained by the model. R^2 for non-linear models, I am told, can fall outside these bounds and therefore cannot be interpreted in the same way.&lt;br&gt;
I&apos;d like to know how this exceeding the bounding thing works.  (I am using R if that makes giving examples easier). Many thanks!!</description>
  	<guid isPermaLink="false">post:ask.metafilter.com,2008:site.65278</guid>
  	<pubDate>Thu, 21 Jun 2007 02:36:24 -0800</pubDate>
  	<dc:creator>jonesor</dc:creator>
	
	<category>statistics</category>
	
	<category>r^2</category>
	
	<category>rsquared</category>
	
	<category>rsquare</category>
	
	<category>r-squared</category>
	
	<category>r-square</category>
	
	<category>math</category>
	
	<category>mathematics</category>
	
</item>
<item>
  	<title>By: singingfish</title>
  	<link>http://ask.metafilter.com/65278/Statistics-tell-me-about-rsquare-for-nonlinear-models#981216</link>	
  	<description>OK, for one thing the assumption for a linear regression is that the variance is approximately equal at every point along the scale (homoscedasticity).  If I understand it correctly, given the mathematical procedures used to compute non-linear regression this is not possible, so you get heteroscedasticity.  I&apos;m no expert in non-linear regression, but I&apos;d expect that you have procedures in place to allow for some violations of assumptions, but this &amp;quot;screwing around with the maths&amp;quot; doesn&apos;t help produce a valid estimate for r^2.&lt;br&gt;
&lt;br&gt;
A nice simple example of non-linear regression is logistic regression, a special case of non-linear regression.  Maybe you should have a look at that to help get your head around it a bit better.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.65278-981216</guid>
  	<pubDate>Thu, 21 Jun 2007 04:10:31 -0800</pubDate>
  	<dc:creator>singingfish</dc:creator>
</item>
<item>
  	<title>By: desjardins</title>
  	<link>http://ask.metafilter.com/65278/Statistics-tell-me-about-rsquare-for-nonlinear-models#981264</link>	
  	<description>&lt;blockquote&gt;If there is no linear relationship between the two variables, the correlation coefficient [r] is close to 0. [This] does not mean that there isn&apos;t any type of relationship between two variables; it is possible for two variables to have an r of 0 and be strongly related in a non-linear way.&lt;/blockquote&gt;&lt;br&gt;
&lt;br&gt;
SPSS 12.0 Guide to Data Analysis, Marija J. Norusis&lt;br&gt;
&lt;br&gt;
I just closed the damn book so I don&apos;t have the page # for the quote.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.65278-981264</guid>
  	<pubDate>Thu, 21 Jun 2007 05:31:38 -0800</pubDate>
  	<dc:creator>desjardins</dc:creator>
</item>
<item>
  	<title>By: ROU_Xenophobe</title>
  	<link>http://ask.metafilter.com/65278/Statistics-tell-me-about-rsquare-for-nonlinear-models#981423</link>	
  	<description>If you mean R2 in models like logit or probit, then yeah.&lt;br&gt;
&lt;br&gt;
From dim memory and quick googling:&lt;br&gt;
&lt;br&gt;
R2 is 1 - (sum of squared residuals / sum of squared deviations), or 1-SSR/SSD.&lt;br&gt;
&lt;br&gt;
This works in OLS because the total sum of squared deviations (SSD) is the regression sum of squares plus the sum of squared residuals, or Total SSD = regression SSD + SSR.&lt;br&gt;
&lt;br&gt;
But this identity doesn&apos;t hold in nonlinear models, so R2 gets fucked up.  Simple as that.&lt;br&gt;
&lt;br&gt;
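To put numbers on the identity holding in OLS but not in a nonlinear fit, here is a minimal sketch in Python (standard library only). The data are made up, the nonlinear model y = exp(b*x) is an assumption chosen because it cannot reproduce a constant, and the coarse grid search only stands in for a real least-squares optimizer.&lt;br&gt;

```python
import math

def sums_of_squares(y, fitted):
    """Return (total SS, regression SS, residual SS), all about the mean of y."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    ss_reg = sum((fi - mean_y) ** 2 for fi in fitted)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sst, ss_reg, ssr

def r_squared(y, fitted):
    """R2 computed as 1 - SSR / total SS."""
    sst, _, ssr = sums_of_squares(y, fitted)
    return 1 - ssr / sst

# Made-up data: y is nearly flat around 5.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [5.0, 5.2, 4.9, 5.1, 5.0]

# OLS with an intercept, using the closed-form slope and intercept.
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
ols_fit = [intercept + slope * xi for xi in x]

# Assumed nonlinear model y = exp(b * x); pick b by least squares on a grid.
def exp_fit(b):
    return [math.exp(b * xi) for xi in x]

grid = [i / 1000 for i in range(-2000, 2001)]
best_b = min(grid, key=lambda b: sums_of_squares(y, exp_fit(b))[2])
nl_fit = exp_fit(best_b)

ols_sst, ols_reg, ols_ssr = sums_of_squares(y, ols_fit)
nl_sst, nl_reg, nl_ssr = sums_of_squares(y, nl_fit)

# Gap in the identity: total SS - (regression SS + residual SS).
ols_gap = ols_sst - (ols_reg + ols_ssr)
nl_gap = nl_sst - (nl_reg + nl_ssr)
r2_ols = r_squared(y, ols_fit)
r2_nl = r_squared(y, nl_fit)

print(f"OLS:       identity gap = {ols_gap:.2e}, R2 = {r2_ols:.4f}")
print(f"exp(b*x):  identity gap = {nl_gap:.2f}, R2 = {r2_nl:.1f}")
```

With these numbers the OLS gap is zero up to rounding and its R2 sits between 0 and 1, while the exponential fit is stuck predicting 1 at x = 0, so its residual sum of squares exceeds the total sum of squares and 1-SSR/SSD goes negative.&lt;br&gt;
&lt;br&gt;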
Don&apos;t despair.  If you want a rough goodness of fit measure for an MLE model, the chi-2 statistic (minus twice the log of the ratio of a null model&apos;s likelihood to the full model&apos;s likelihood) can serve a similar role to an OLS model&apos;s F statistic.  Or you can just use more direct measures of predictive success.  Or, MLE procedures in many statistical packages can generate pseudo-R2s that are varying degrees of crap.&lt;br&gt;
&lt;br&gt;
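As a rough illustration of the likelihood-based measures above, here is a sketch in Python (standard library only) for a one-predictor logit fitted by Newton-Raphson. The data are invented, the fixed number of Newton steps is a simplification, and the McFadden pseudo-R2 (1 minus the ratio of the two log-likelihoods) is just one common pseudo-R2, picked here as an assumption.&lt;br&gt;

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y under probabilities p."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

def predict(a, b, x):
    """Logistic probabilities for intercept a and slope b."""
    return [1 / (1 + math.exp(-(a + b * xi))) for xi in x]

def fit_logit(x, y, steps=25):
    """Maximum likelihood for a one-predictor logit via Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        p = predict(a, b, x)
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of the 2x2 information matrix.
        w = [pi * (1 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix inverse) times gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Invented data: the outcome tends to be 1 at larger x, with some overlap.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]

a_hat, b_hat = fit_logit(x, y)
ll_full = log_likelihood(y, predict(a_hat, b_hat, x))

# Null model: a constant only, so every fitted probability is the mean of y.
p_null = sum(y) / len(y)
ll_null = log_likelihood(y, [p_null] * len(y))

lr_chi2 = 2 * (ll_full - ll_null)    # likelihood-ratio chi-2, 1 df here
mcfadden_r2 = 1 - ll_full / ll_null  # one of several pseudo-R2 definitions

print(f"slope = {b_hat:.3f}")
print(f"LR chi-2 = {lr_chi2:.3f}")
print(f"McFadden pseudo-R2 = {mcfadden_r2:.3f}")
```

The chi-2 value would be compared against a chi-squared distribution with one degree of freedom (one added regressor). The pseudo-R2 only says how much the log-likelihood improved over the constant-only model, so it should not be read as a proportion of variance explained.&lt;br&gt;
&lt;br&gt;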
R2 is overrated anyway.  While the common interpretation is that it&apos;s the percent of variance that&apos;s explained (understandable given the identity), a more accurate interpretation is that it&apos;s how much more variation you&apos;re explaining than a null model with only a constant would explain.  That is, how much better is your model than a model that explains the DV with its own mean.  But there are two parts here -- how good your model is, and how well the DV&apos;s mean does at explaining the DV.  The better the mean does, the worse your R2 will look even though your model is good.&lt;br&gt;
&lt;br&gt;
Also, R2 never decreases with added terms (at least in OLS), and almost always increases.  Want a higher R2?  Add junk regressors.  Want an R2 of 1.00?  Have N-1 regressors (IIRC) and leave the dataset fully identified.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.65278-981423</guid>
  	<pubDate>Thu, 21 Jun 2007 08:27:08 -0800</pubDate>
  	<dc:creator>ROU_Xenophobe</dc:creator>
</item>
<item>
  	<title>By: jonesor</title>
  	<link>http://ask.metafilter.com/65278/Statistics-tell-me-about-rsquare-for-nonlinear-models#981560</link>	
  	<description>Thanks ROU! That&apos;s just what I wanted.&lt;br&gt;
I will use cross-validation instead of an R^2.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.65278-981560</guid>
  	<pubDate>Thu, 21 Jun 2007 10:18:37 -0800</pubDate>
  	<dc:creator>jonesor</dc:creator>
</item>
<item>
  	<title>By: math_junkie</title>
  	<link>http://ask.metafilter.com/65278/Statistics-tell-me-about-rsquare-for-nonlinear-models#1048957</link>	
  	<description>For non-linear models, the R2 can become negative, which is not exactly intuitive and makes it useless. Another possibility is to examine the standard error of any fit, linear or not, to see if additional parameters improve the model.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.65278-1048957</guid>
  	<pubDate>Tue, 28 Aug 2007 13:32:00 -0800</pubDate>
  	<dc:creator>math_junkie</dc:creator>
</item>

    </channel>
</rss>
