How do I find a mean response confidence interval in nonlinear regression?
November 7, 2011 9:52 PM
How do I calculate the confidence interval for the mean response of some general nonlinear fit?
In all my time googling I have only been able to turn up documentation on what functiont to call to do this with every piece of statistical software imaginable. I want some details of what goes on inside those black boxes.
I took a course on regression and so I know how to do this with linear regression, but the prof really glossed over the nonlinear part. The main equation we were left with for calculating the variance of the mean response at some x doesn't work if the number of parameters is different from the number of regressors (since the matrices would be incorrectly sized to multiply together).
For your edification he claimed the variance was: x'*covB*x where covB is the covariance matrix of the parameters of the fit and x is the column vector at the point one wants to find the mean response at.
If, for example, I wanted to fit (as I do right now) a rate function such that y=a*x1^b*x2^c then that equation for variance doesn't work since x is 1x2 but covB will be 3x3.
In all my time googling I have only been able to turn up documentation on what functiont to call to do this with every piece of statistical software imaginable. I want some details of what goes on inside those black boxes.
I took a course on regression and so I know how to do this with linear regression, but the prof really glossed over the nonlinear part. The main equation we were left with for calculating the variance of the mean response at some x doesn't work if the number of parameters is different from the number of regressors (since the matrices would be incorrectly sized to multiply together).
For your edification he claimed the variance was: x'*covB*x where covB is the covariance matrix of the parameters of the fit and x is the column vector at the point one wants to find the mean response at.
If, for example, I wanted to fit (as I do right now) a rate function such that y=a*x1^b*x2^c then that equation for variance doesn't work since x is 1x2 but covB will be 3x3.
Statistical calculations on a nonlinear fit are a notoriously thorny subject. Monte Carlo is really the best way to go, there is good discussion of this in Numerical Recipes.
posted by Dr Dracator at 4:09 AM on November 8, 2011
posted by Dr Dracator at 4:09 AM on November 8, 2011
I'm probably misreading his notes. When he did the topic on nonlinear regression he stopped doing any calculations by hand (for obvious reasons) and instead just showed us how to do everything in matlab.
Which probaby is my answer: I'm trying to do this work for a lab course using octave because I am cheap. I should probably just buy matlab and forever live in awe at the mystery of the nlpredci() function.
posted by selenized at 8:50 AM on November 8, 2011
Which probaby is my answer: I'm trying to do this work for a lab course using octave because I am cheap. I should probably just buy matlab and forever live in awe at the mystery of the nlpredci() function.
posted by selenized at 8:50 AM on November 8, 2011
cheap!=not powerful - R can probably do whatever your course requires and then some.
posted by Dr Dracator at 10:00 AM on November 8, 2011
posted by Dr Dracator at 10:00 AM on November 8, 2011
I was thinking of just going with matlab as I have a bunch of scripts to do most of the work for me already written for matlab.
What I was hoping to do originally was just find a simple way of filling in the compatibility gaps between matlab and octave, so I wouldn't have to actually buy matlab. This was pretty easy with functions like nlinfit() as octave has a similar function leasqr(). This strategy fell apart with nlpredci() (the topic of this question) since I don't know what it actually does under the hood.
posted by selenized at 10:17 AM on November 8, 2011
What I was hoping to do originally was just find a simple way of filling in the compatibility gaps between matlab and octave, so I wouldn't have to actually buy matlab. This was pretty easy with functions like nlinfit() as octave has a similar function leasqr(). This strategy fell apart with nlpredci() (the topic of this question) since I don't know what it actually does under the hood.
posted by selenized at 10:17 AM on November 8, 2011
There are many ways to do this. The most popular are asymptotic ML arguments like Wald intervals using the observed (finite difference when necessary) hessian of -2l in place of the fisher info, or pivoted LR tests.
Bayesian methods, bootstrapping, and estimating equations are also common, but not likely the default.
posted by a robot made out of meat at 11:47 AM on November 8, 2011
Bayesian methods, bootstrapping, and estimating equations are also common, but not likely the default.
posted by a robot made out of meat at 11:47 AM on November 8, 2011
Oh, I see mean response instead of parameter. Usually delta methods calculations.
posted by a robot made out of meat at 11:48 AM on November 8, 2011
posted by a robot made out of meat at 11:48 AM on November 8, 2011
Reference from nlpredci:
[1] Seber, G.A.F, and Wild, C.J. (1989) Nonlinear Regression, Wiley.
I presume their algorithm comes form there.
posted by FrereKhan at 4:17 AM on November 9, 2011
[1] Seber, G.A.F, and Wild, C.J. (1989) Nonlinear Regression, Wiley.
I presume their algorithm comes form there.
posted by FrereKhan at 4:17 AM on November 9, 2011
I have a dumb question:
What would be wrong with estimating the variance in the mean response using error propagation techniques? If my back of the envelope calculations work it should be something like:
Jx*covB*Jx'
where Jx is the jacobian (w.r.t. the parameters) evaluated at point x, and covB is the covariance matrix of the parameters.
It at least looks fairly simple.
posted by selenized at 10:43 AM on November 9, 2011
What would be wrong with estimating the variance in the mean response using error propagation techniques? If my back of the envelope calculations work it should be something like:
Jx*covB*Jx'
where Jx is the jacobian (w.r.t. the parameters) evaluated at point x, and covB is the covariance matrix of the parameters.
It at least looks fairly simple.
posted by selenized at 10:43 AM on November 9, 2011
That is exactly the delta method.
posted by a robot made out of meat at 11:29 AM on November 9, 2011
posted by a robot made out of meat at 11:29 AM on November 9, 2011
Oh well, ha ha. I only glanced at the wikipedia page for delta method and, well, it was clearly not written with me in the intended audience.
posted by selenized at 1:38 PM on November 9, 2011
posted by selenized at 1:38 PM on November 9, 2011
This thread is closed to new comments.
I don't see how his general approach would work for an arbitrary non-linear function. In your example, the variances of a, b, and c each have very different effects on the variance of y.
I would do a bootstrap-type mote-carlo simulation to estimate the confidence interval directly, but that's just because I can't suggest a smarter way to work it out directly.
posted by FrereKhan at 1:57 AM on November 8, 2011