Using Excel's regression tools to determine the 95% confidence interval
January 26, 2014 12:48 PM Subscribe
MS Excel's regression tools provide 95% lower/upper confidence results but how does one properly interpret and then express those as a single ± (plus/minus) figure?
The in-house literature I am now required to read at work includes frequent references to linear trends, normally expressed as a trend per decade with a '±' figure identified as the confidence interval at 95%. The underlying datapoints used to calculate the liner trend are monthly measurements (1 per month).
I now have been requested to write a memo summarizing the "statistics" of the latest dataset of 120 months. At our office, the statistics part is accomplished using MS Excel. I have taught myself how to use the Data Analysis Tools in Excel to calculate the linear trend and other regression statistics.
So, I now have the Excel stats for the newest dataset but I am not sure how to properly express them in a way that avoids my looking like a statistics idiot (which I am). My main issue is I am not sure what to base the ± confidence interval on. The internal memos use a single number after the '±' (e.g. ±0.9) but the Excel regression analysis does not do it that simply;
Instead, Excel provides the following:
--------Coeff.---Std.Er.---Lwr95%-Upr95%-Lwr90%-Upr90%
Intcpt -0.2394, 0.0266, -0.2918, -0.1871, -0.2833, -0.1955
Data.X 0.0013, 0.0002,, 0.0009,, 0.0017 ,, 0.0009 ,, 0.0016
I know that the 0.0013 is the slope of the linear trend of monthly observations and to calculate a 10-year trend I would multiply by 120 (months). But it is not clear what numbers I would use to calculate a single '±' that is interpreted as the 95% confidence interval.
Are Excel's 'Lower' and 'Upper' 95% results the basis for the confidence interval? If so, which are the correct numbers to use, and would I then multiply that correct number(s) by 120 as done with the slope coefficient when expressed as a 10-year trend?
Thanks for clearing up an idiot's confusion on this. (Apologize for extra commas used to force columns of data to line up.)
The in-house literature I am now required to read at work includes frequent references to linear trends, normally expressed as a trend per decade with a '±' figure identified as the confidence interval at 95%. The underlying datapoints used to calculate the liner trend are monthly measurements (1 per month).
I now have been requested to write a memo summarizing the "statistics" of the latest dataset of 120 months. At our office, the statistics part is accomplished using MS Excel. I have taught myself how to use the Data Analysis Tools in Excel to calculate the linear trend and other regression statistics.
So, I now have the Excel stats for the newest dataset but I am not sure how to properly express them in a way that avoids my looking like a statistics idiot (which I am). My main issue is I am not sure what to base the ± confidence interval on. The internal memos use a single number after the '±' (e.g. ±0.9) but the Excel regression analysis does not do it that simply;
Instead, Excel provides the following:
--------Coeff.---Std.Er.---Lwr95%-Upr95%-Lwr90%-Upr90%
Intcpt -0.2394, 0.0266, -0.2918, -0.1871, -0.2833, -0.1955
Data.X 0.0013, 0.0002,, 0.0009,, 0.0017 ,, 0.0009 ,, 0.0016
I know that the 0.0013 is the slope of the linear trend of monthly observations and to calculate a 10-year trend I would multiply by 120 (months). But it is not clear what numbers I would use to calculate a single '±' that is interpreted as the 95% confidence interval.
Are Excel's 'Lower' and 'Upper' 95% results the basis for the confidence interval? If so, which are the correct numbers to use, and would I then multiply that correct number(s) by 120 as done with the slope coefficient when expressed as a 10-year trend?
Thanks for clearing up an idiot's confusion on this. (Apologize for extra commas used to force columns of data to line up.)
For 95%= 1.96*standard error is the +/- that you are looking for.
That will change for 90% 1.645*standard error.
So for the Data X . 0013 +/- .0004. You get upper 95% by [1.96(~2)*.0002] or.004+0013=.0017,Lower 95% .0013-.0004 = .0009
posted by radsqd at 1:48 PM on January 26, 2014 [1 favorite]
That will change for 90% 1.645*standard error.
So for the Data X . 0013 +/- .0004. You get upper 95% by [1.96(~2)*.0002] or.004+0013=.0017,Lower 95% .0013-.0004 = .0009
posted by radsqd at 1:48 PM on January 26, 2014 [1 favorite]
For 95%= 1.96*standard error is the +/- that you are looking for.
Only for large N's (120 is large enough, but other work or projects might not be).
The more general way to get this from Excel output you gave is
Coeff +/- (Coeff-Lwr95%)
posted by ROU_Xenophobe at 2:10 PM on January 26, 2014
Only for large N's (120 is large enough, but other work or projects might not be).
The more general way to get this from Excel output you gave is
Coeff +/- (Coeff-Lwr95%)
posted by ROU_Xenophobe at 2:10 PM on January 26, 2014
Response by poster: Thanks, all. I was able to get my hands on an older dataset and then ran it through Excel. I then compared those regression results with previous internal memo confirming that indeed previous authors had expressed the confidence interval '±' using the same techniques suggested.
Your responses completely cleared my confusion and provided a needed comfort level.
One follow up question. If I multiply the slope coefficient by 120 to calculate an estimated 10-year trend, would I do the same for the '±' figure. Using example, ±0.004 would become ±0.48 - sound right?
Thanks again.
posted by kartguy at 9:54 AM on January 27, 2014
Your responses completely cleared my confusion and provided a needed comfort level.
One follow up question. If I multiply the slope coefficient by 120 to calculate an estimated 10-year trend, would I do the same for the '±' figure. Using example, ±0.004 would become ±0.48 - sound right?
Thanks again.
posted by kartguy at 9:54 AM on January 27, 2014
This thread is closed to new comments.
posted by Hollywood Upstairs Medical College at 12:56 PM on January 26, 2014