# Regression question

February 22, 2010 9:21 PM Subscribe

How do I estimate the coefficients of this curve using linear regression?

I have a relationship between two variables that is approximated by y = Ax^b. How do I test whether this is valid using linear regression?

I've confused myself thoroughly about this, unfortunately. For previous relationships that were exponential (y = A exp(bx)), I was able to take the natural log of both sides and then use the slope and intercept from the resulting regression to estimate the parameters. For this relationship, I presume I need to take the log of both sides, but I can't figure out how to estimate what the log base is.

Thank you in advance - I was sort of tempted to ask this anonymously, as I'm sure I'm missing something obvious here, but I'm drawing a blank.


Best answer: If you have access to R, the `lm` (linear model), `coef`, and `summary` functions will give you what you need:

```
> x <- c(0,1,2,3,4,5)
> y <- c(2,6,10,14,18,22)
> linearRegression <- lm(y ~ x)
> coef(linearRegression)
(Intercept)           x
          2           4
> summary(linearRegression)

Call:
lm(formula = y ~ x)

Residuals:
         1          2          3          4          5          6
-3.343e-16  3.201e-15 -2.578e-15 -1.800e-15  1.997e-16  1.311e-15

Coefficients:
             Estimate Std. Error   t value Pr(>|t|)
(Intercept) 2.000e+00  1.697e-15 1.178e+15   <2e-16 ***
x           4.000e+00  5.606e-16 7.135e+15   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.345e-15 on 4 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 5.09e+31 on 1 and 4 DF, p-value: < 2.2e-16
```

posted by Blazecock Pileon at 10:10 PM on February 22, 2010

Response by poster: Thank you rancidchickn and Blazecock Pileon - I do have access to R, and have figured out what was confusing me and how to fix it. I was getting confused about what the various parts of the transformed equation meant, and had spent quite a while trying to sort it out.

posted by a womble is an active kind of sloth at 10:23 PM on February 22, 2010

Box-Cox might get you the transformation you need, if I understand the question correctly.

The Minitab Box-Cox is friendlier than R, if you have access.

posted by degrees_of_freedom at 5:54 AM on February 23, 2010

In R, you can specify some transformations in the formula, such as `lm(log(y) ~ log(x), data = mydata)`, where `mydata` is a data frame with your x and y values in it.

Another question is whether you should transform the data, or use a link function (a generalized linear model). If the error or noise is multiplicative (things that make y grow faster or slower), use a transformation. If the noise is additive (many kinds of measurement error are), then you should use a log link. A question that you can ask to determine which scenario you're in is "could I observe negative values of y?". If so, you at least have an additive component. You'd fit that with

`glm(y ~ log(x), family = quasi(link = log, variance = "constant"), data = mydata)`

You can get more information about your choices of link and variance functions with `?family`.

posted by a robot made out of meat at 7:49 AM on February 23, 2010 [1 favorite]
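[Editor's note: as a tiny illustration of the "could I observe negative values of y?" test above, here is a sketch in Python with made-up values. With additive noise, observed y can dip below zero, where the log transform simply isn't defined; that is one practical reason to prefer a log link over transforming the data.]

```python
import math

# Hypothetical observations where additive noise pushes one y below zero.
ys = [2.5, 0.8, -0.3]

# The transform-then-regress approach needs log(y) for every observation,
# so it breaks down as soon as any y is non-positive.
transformable = all(y > 0 for y in ys)
print(transformable)  # False: math.log(-0.3) would raise a ValueError
```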


Response by poster: I was interested to see that so many people favorited this question. I thought I would share a useful PDF I found that also answered what I was trying to do.

Hope it is of assistance, and I did really appreciate the answers. I was working late trying to get some work finished and tied myself in a knot.

posted by a womble is an active kind of sloth at 9:36 AM on February 27, 2010 [1 favorite]

ln(y) = ln(A) + b ln(x)

If you plot this with both x and y on a logarithmic scale (log-log), it should yield a straight line, where b is the slope and ln(A) is the intercept.

posted by rancidchickn at 10:02 PM on February 22, 2010

This thread is closed to new comments.