Help me calculate a hidden value in my data
April 3, 2006 6:18 PM
I've spent the past day or so trying to figure out how to calculate a hidden value in my data. It varies linearly with time, or closely enough that we can make that assumption within any given dataset. Where this goes beyond a simple linear regression is that each datapoint is known to be above or below the hidden value. The data is collected in a C# program and processed in Excel; an Excel function would be the ideal solution.
A chart makes the problem clear:
The Y axis is the time the reading was collected; the X axis is the value of the reading. We need the slope and offset of the narrow empty band running between the two datasets. This data is fairly clean; ideally we want something that will work even if the readings aren't clustered right up against the hidden value.
It's easy to eyeball the answer, but we need it calculated automatically. (Technically, we need to cancel it out, so that the chart above would follow a straight vertical line.) We've gotten passable results by running a linear regression across the entire dataset, but big outliers are possible and would skew the results unacceptably. We've considered filtering out the outliers as well, but I'm convinced that this can be calculated directly.
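To pin down what "calculated directly" might mean, here is a minimal sketch in Python (not our actual C#/Excel setup). It treats the problem as finding the line value = slope * time + offset that leaves the widest empty band between the "above" and "below" readings, measured along the value axis. The function name `fit_separating_line`, the `is_above` flags, the slope bracket `slope_lo`/`slope_hi`, and the sample numbers are all made up for illustration. The widest-band criterion is only one assumed way to choose among the lines that fit, and the result is only meaningful if the best slope lies inside the bracket.

```python
def fit_separating_line(times, values, is_above, slope_lo, slope_hi, iters=200):
    """Return (slope, offset, gap) for the line value = slope*time + offset
    with the widest empty band between the two groups of readings.

    gap is the band width along the value axis; a negative gap means no
    line with a slope in [slope_lo, slope_hi] separates the readings.
    """
    above = [(t, v) for t, v, a in zip(times, values, is_above) if a]
    below = [(t, v) for t, v, a in zip(times, values, is_above) if not a]
    if not above or not below:
        raise ValueError("need at least one reading on each side")

    def offset_bounds(slope):
        # For a fixed slope, the offset must sit between the highest
        # intercept of the "below" readings and the lowest intercept
        # of the "above" readings.
        lower = max(v - slope * t for t, v in below)
        upper = min(v - slope * t for t, v in above)
        return lower, upper

    def gap(slope):
        lower, upper = offset_bounds(slope)
        return upper - lower

    # gap(slope) is concave (a min of linear functions minus a max of
    # linear functions), so a ternary search finds its maximum.
    lo, hi = slope_lo, slope_hi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if gap(m1) < gap(m2):
            lo = m1
        else:
            hi = m2

    slope = (lo + hi) / 2
    lower, upper = offset_bounds(slope)
    return slope, (lower + upper) / 2, upper - lower


# Made-up readings scattered around a hidden line, with one far-off
# reading on each side to stand in for outliers.
times = [0, 1, 2, 3, 1, 0, 1, 2, 3, 2]
values = [1.5, 3.5, 5.5, 7.5, 9.0, 0.5, 2.5, 4.5, 6.5, 0.0]
is_above = [True] * 5 + [False] * 5

slope, offset, gap = fit_separating_line(times, values, is_above, -10.0, 10.0)

# "Cancelling out" the hidden value: subtract the fitted line from each reading.
residuals = [v - (slope * t + offset) for t, v in zip(times, values)]
print(slope, offset, gap)
```

Only the readings closest to the band affect the result, so a reading far from the hidden value on its correct side has no pull on the fit, unlike in a regression across the whole dataset. A mislabeled reading, on the other hand, would make the gap negative, and this sketch does not try to handle that case.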
Any statistics gurus out there?
posted by bjrubble to technology (14 comments total)
posted by unSane at 6:23 PM on April 3, 2006