# Statistics: tell me about r-square for nonlinear models

June 21, 2007 2:36 AM

StatisticsFilter: Can someone explain, with examples, why R^2 (r-square) values are not appropriate for use with non-linear regression models?

I am told that the reason that r-square values are not recommended for non-linear regression is that they can mislead. R^2 for linear models is bounded between 0 and 1 and can be interpreted as the proportion of variance explained by the model. R^2 for non-linear models, I am told, can fall outside these bounds and therefore cannot be interpreted in the same way.

I'd like to know how this exceeding-the-bounds thing works. (I am using R, if that makes giving examples easier.) Many thanks!!

"If there is no linear relationship between the two variables, the correlation coefficient [r] is close to 0. [This] does not mean that there isn't any type of relationship between two variables; it is possible for two variables to have an r of 0 and be strongly related in a non-linear way." (SPSS 12.0 Guide to Data Analysis, Marija J. Norusis)

I just closed the damn book so I don't have the page # for the quote.
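You can see this in a couple of lines of code. A minimal sketch in Python/NumPy rather than R, with made-up data, y a perfect (but nonlinear) function of x:

```python
import numpy as np

# Deterministic, strongly nonlinear relationship: y is completely
# determined by x, yet the *linear* correlation between them is zero.
x = np.linspace(-3, 3, 61)   # symmetric around 0
y = x ** 2                   # perfect quadratic dependence

r = np.corrcoef(x, y)[0, 1]  # Pearson correlation coefficient
print(abs(r) < 1e-10)        # → True: r is zero up to floating-point noise
```

The symmetry of x around 0 makes the linear covariance cancel exactly, even though knowing x tells you y perfectly.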

posted by desjardins at 5:31 AM on June 21, 2007

Best answer: If you mean R2 in models like logit or probit, then yeah.

From dim memory and quick googling:

R2 is 1 minus the sum of squared residuals over the sum of squared deviations, or 1 - SSR/SSD.

This works in OLS because the total sum of squares (SSD) is the regression sum of squares plus the sum of squared residuals, or total SSD = regression SSD + SSR.

But this identity doesn't hold in nonlinear models, so R2 gets fucked up. Simple as that.
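To make the identity, and its failure, concrete: a hedged sketch in Python/NumPy with simulated data (the thread is in R, but the sums of squares are the same arithmetic; the exponential model here is just an example):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.1, 2.0, 50)
y = 2 + np.exp(1.3 * x) + rng.normal(0, 0.2, x.size)  # made-up data

# OLS with an intercept: the decomposition TSS = ESS + SSR is exact,
# because the residuals are orthogonal to the fitted values and the mean.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
tss = np.sum((y - y.mean()) ** 2)
ess = np.sum((yhat - y.mean()) ** 2)
ssr = np.sum((y - yhat) ** 2)
print(np.isclose(tss, ess + ssr))  # → True

# Nonlinear least squares: fit y = exp(b*x) by a crude grid search over b.
# No intercept, no orthogonality, so the decomposition breaks down.
bs = np.linspace(0.5, 2.0, 2001)
b = bs[np.argmin([np.sum((y - np.exp(bb * x)) ** 2) for bb in bs])]
yhat_nl = np.exp(b * x)
ess_nl = np.sum((yhat_nl - y.mean()) ** 2)
ssr_nl = np.sum((y - yhat_nl) ** 2)
gap = abs(tss - (ess_nl + ssr_nl)) / tss
print(gap)  # noticeably nonzero: ESS + SSR no longer adds up to TSS
```

Because the pieces no longer add up, 1 - SSR/SSD stops being a share of anything, which is exactly how it can escape [0, 1].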

Don't despair. If you want a rough goodness-of-fit measure for an MLE model, the likelihood-ratio chi-square statistic (minus twice the log of the ratio of a null model's likelihood to the full model's likelihood) can serve a role similar to an OLS model's F statistic. Or you can just use more direct measures of predictive success. Or, MLE procedures in many statistical packages can generate pseudo-R2s that are varying degrees of crap.
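For instance, here is a rough NumPy sketch of the likelihood-ratio statistic and McFadden's pseudo-R2 (one of many pseudo-R2s) for a logit model fit to simulated data; the data and coefficients are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(0.5 + 1.5 * x)))  # made-up logit model
y = (rng.random(n) < p_true).astype(float)
X = np.column_stack([np.ones(n), x])

def loglik(beta):
    eta = X @ beta
    return np.sum(y * eta - np.log1p(np.exp(eta)))

# Fit the full model by Newton-Raphson (the logit log-likelihood is concave).
beta = np.zeros(2)
for _ in range(25):
    mu = 1 / (1 + np.exp(-(X @ beta)))
    grad = X.T @ (y - mu)                      # score vector
    H = X.T @ (X * (mu * (1 - mu))[:, None])   # information matrix
    beta += np.linalg.solve(H, grad)
ll_full = loglik(beta)

# Null model (constant only): its MLE fitted probability is the sample mean.
p0 = y.mean()
ll_null = n * (p0 * np.log(p0) + (1 - p0) * np.log(1 - p0))

lr_chi2 = 2 * (ll_full - ll_null)  # likelihood-ratio statistic, -2*log(L0/L1)
mcfadden = 1 - ll_full / ll_null   # McFadden's pseudo-R²
print(lr_chi2 > 0, 0 < mcfadden < 1)  # → True True
```

Note that McFadden's pseudo-R2 compares log-likelihoods rather than sums of squares, so it is not a proportion of variance either, just a relative improvement over the null model.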

R2 is overrated anyway. While the common interpretation is that it's the percent of variance that's explained (understandable given the identity), a more accurate interpretation is that it's how much more variation you're explaining than a null model with only a constant would explain. That is, how much better is your model than a model that explains the DV with its own mean. But there are two parts here -- how good your model is, and how well the DV's mean does at explaining the DV. The better the mean does, the worse your R2 will look even though your model is good.
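You can see that dependence on the null model directly: below, two simulated data sets get the same added noise, but the DV with more total variation (where the mean is a worse predictor) scores a far higher R2 for an equally noisy fit. A Python/NumPy sketch with invented numbers:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 200)
noise = rng.normal(0, 1.0, x.size)  # the same noise added to both DVs

y_wide = 5.0 * x + noise    # steep slope: the mean is a terrible predictor
y_narrow = 0.2 * x + noise  # shallow slope: the mean already does fairly well

def r2(x, y):
    """Plain OLS R2 = 1 - SSR/TSS for a straight-line fit with intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1 - ssr / tss

print(round(r2(x, y_wide), 2))    # near 1: huge TSS dwarfs the same noise
print(round(r2(x, y_narrow), 2))  # much lower, despite identical noise
```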

Also, R2 never decreases when you add terms (at least in OLS). Want a higher R2? Add junk regressors. Want an R2 of 1.00? Use N-1 regressors plus a constant (IIRC), so the model is saturated and fits every data point exactly.
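Both claims are easy to check by simulation; here is a Python/NumPy sketch where the DV is pure noise and every regressor is junk:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
y = rng.normal(size=n)  # the DV is pure noise: nothing real to explain

def r2(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ beta) ** 2)
    return 1 - ssr / np.sum((y - y.mean()) ** 2)

X = np.ones((n, 1))  # start from the null model (constant only)
r2s = []
for _ in range(n - 1):
    # Each new column is a junk regressor: random noise, unrelated to y.
    X = np.column_stack([X, rng.normal(size=n)])
    r2s.append(r2(X, y))

print(all(b >= a - 1e-12 for a, b in zip(r2s, r2s[1:])))  # → True, never drops
print(round(r2s[-1], 6))  # → 1.0: constant + (n-1) junk regressors fit exactly
```

With the constant plus n-1 regressors the design matrix is square and (generically) invertible, so the "model" interpolates every observation and R2 hits 1 even though nothing was explained.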

posted by ROU_Xenophobe at 8:27 AM on June 21, 2007 [1 favorite]

Response by poster: Thanks ROU! That's just what I wanted.

I will use cross-validation instead of an R^2.
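For reference, a minimal k-fold cross-validation sketch, written in Python/NumPy with simulated data (the poster is working in R, where package equivalents exist, but the procedure is identical): fit on k-1 folds, score on the held-out fold, average the out-of-sample errors.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(-2, 2, 80))
y = np.sin(2 * x) + rng.normal(0, 0.3, x.size)  # made-up nonlinear data

def cv_mse(x, y, degree, k=5):
    """Out-of-sample MSE of a polynomial fit, estimated by k-fold CV."""
    idx = rng.permutation(x.size)
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)            # everything not held out
        coef = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coef, x[fold])
        errs.append(np.mean((y[fold] - pred) ** 2))
    return float(np.mean(errs))

# Prediction error, not R2, shows which model flexibility generalizes:
for d in (1, 3, 5, 9):
    print(d, round(cv_mse(x, y, d), 3))
```

Unlike R2, this gets worse again once the model starts chasing noise, so it penalizes junk flexibility instead of rewarding it.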

posted by jonesor at 10:18 AM on June 21, 2007 [1 favorite]

For non-linear models, the R2 can become negative - not exactly intuitive, and useless as a proportion-of-variance measure. Another possibility is to examine the standard error of any fit - linear or not - to see whether additional parameters improve the model.
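A short demonstration of the negative case, in Python/NumPy with made-up data and a deliberately bad fixed-parameter model:

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0, 5, 40)
y = 2 + 0.5 * x + rng.normal(0, 0.3, x.size)  # made-up, roughly linear data

# A deliberately bad nonlinear model with fixed (not fitted) parameters:
yhat = 10 * np.exp(-x)

ssr = np.sum((y - yhat) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r2 = 1 - ssr / tss
print(r2 < 0)  # → True: SSR exceeds TSS, so 1 - SSR/TSS goes negative
```

Whenever the model's squared error is larger than the error of just predicting the mean, SSR > TSS and the naive R2 formula drops below zero.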

posted by math_junkie at 1:32 PM on August 28, 2007

A nice simple special case of non-linear regression is logistic regression. Maybe you should have a look at that to help get your head around it a bit better.

posted by singingfish at 4:10 AM on June 21, 2007