# Margins of error, multiple variables, and probably overthinking stats

November 24, 2019 3:01 PM

I need to compute a margin of error for the result of f(x,y), where x and y are variables, with known errors, provided by sensors. This is something that would go on a spec sheet, so it should be applicable to any equation f(x,y) no matter the inputs. I guess?
An example and possibly a better explanation of the wall I'm hitting behind the cut.

I have a feeling this is either impossible, or so easy that I'm overlooking a simple solution.

What I would like to do is calculate an error for f(x,y) which accounts for the known errors of x and y for any given value of x and y. Literally, I want to be able to write on a spec sheet 'Foo [the result of f(x,y)] is accurate to +/- [numerical value].'

Let's say I have sensors installed in teapots, and they report the depth of the tea in the pot (variable x), and the temperature of the tea (variable y). A sensor reading is collected at regular intervals, and the function(x,y) is applied to every record collected. We know that sensor x is accurate to +/- .25, and sensor y is accurate to +/- .5.

The function also takes inputs that we can treat as constants for the moment, because we're not worried about incorporating their errors -- let's say, they're the thickness of the teapot, and whether it has a working lid or not.

Some of these constants *can* change reading to reading, and x and y *generally* change reading to reading, and definitely change over time as the tea grows cold, or the pot is refilled. The constants are also different teapot to teapot.

So! I am able to calculate the margin of error for a single given record from a given teapot (thank you online partial derivative calculator). I broadly know how to do error propagation, but again, the result seems to change as I calculate errors for each record. So, given lots of different teapots all sending their different readings, is there a way to calculate a margin of error that can then apply to any teapot with these sensors installed, and known errors for the variables x and y?

(I've been really nervous about this and doing a lot of deep dives into intensive error calculation, so if you're reading this and going 'oh, here is a stats 101 answer but I'm sure kalimac has already considered that', please offer it anyway, there's like a 50/50 chance I haven't.)

If the function has well-defined derivatives everywhere and doesn't change curvature by a large amount over the typical size of your errors, the usual propagation of uncertainty formula mentioned above should give you an analytic expression for the error as a function of x, y, delta-x, and delta-y that works everywhere.

If the function is non-differentiable, or the errors are large compared to the range over which f(x,y) changes significantly, I don't think there's any easy way to do this except numerically. One option is to generate a table of errors for typical values of (x, y, delta-x, delta-y). A compact option would be to make that table and then fit some arbitrary fitting function to it (say, a multi-dimensional polynomial) to get an approximate equation. If delta-f is a surface in a 4D coordinate space (x, y, delta-x, delta-y), and you can fit it pretty well with some tens of variables with delta-f ~ G(a1, a2, a3, ...; b1, b2, b3, ...; x, y, delta-x, delta-y), and the result looks okay and diverges from delta-f by much less than delta-f, that's not a bad option. It lets someone plug a dozen constants and their measured values in and get a reasonable answer, and it's fast to implement in software. (This might not be a terrible idea even if you do have a well-defined f(x,y), if that function happens to be really gnarly and hard to write down or slow to calculate.)

This all assumes Gaussian and uncorrelated errors. Also, I'm absolutely not a statistics expert, so take everything with a grain of salt.

posted by eotvos at 3:48 PM on November 24, 2019 [1 favorite]
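A rough sketch of the numerical route eotvos describes: simulate noisy readings around each "true" (x, y), push them through f, and tabulate the spread of the results. The function `f` here is a made-up stand-in (the thread never gives the real one), the grid values are arbitrary, and the sensor specs of +/- .25 and +/- .5 are treated as one Gaussian standard deviation each, which is an assumption rather than something the spec sheets necessarily mean.

```python
import random
import statistics

def f(x, y):
    # Hypothetical stand-in for the real function; not from the thread.
    return x * x * y

def monte_carlo_uncertainty(func, x, y, sx, sy, n=20000, seed=0):
    """Standard deviation of func under independent Gaussian noise on x and y."""
    rng = random.Random(seed)
    samples = [func(rng.gauss(x, sx), rng.gauss(y, sy)) for _ in range(n)]
    return statistics.stdev(samples)

def error_table(func, xs, ys, sx, sy):
    """Table of simulated uncertainties, keyed by the 'true' (x, y) point."""
    return {
        (x, y): monte_carlo_uncertainty(func, x, y, sx, sy)
        for x in xs
        for y in ys
    }

# Assumed sensor errors from the question: 0.25 for x, 0.5 for y.
table = error_table(f, xs=[1.0, 2.0, 3.0], ys=[1.0, 2.0, 3.0], sx=0.25, sy=0.5)
for point, sd in sorted(table.items()):
    print(point, round(sd, 3))
```

A table like this could then be fitted with a polynomial or simply scanned for its largest entry; because it never takes a derivative, it also works when f is not differentiable.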

If your goal is to produce an accuracy value for a spec sheet, realistically you just need an upper bound on the error.

The Wikipedia article on propagation of uncertainty is a pretty decent overview of the topic. In most cases, the uncertainty can be given by the equation

(sf)^2 = (df/dx)^2 (sx)^2 + (df/dy)^2 (sy)^2

(with apologies for lack of math formatting), where sf, sx, and sy are the uncertainties in f, x, and y, respectively, and df/dx and df/dy are the partial derivatives of f with respect to x and y, respectively. This is valid as long as sx and sy are unrelated to each other, and f is approximately linear within a neighborhood defined by sx and sy. In your case, sx and sy are "given" by the specifications of the sensors you're using, and you probably have to assume they're uncorrelated, so they're basically constants.
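For a single reading, the formula above can be sketched in a few lines of Python. This is only an illustration: `f` is a made-up function standing in for the real one, the partial derivatives are estimated by central finite differences instead of being worked out by hand, and the errors are assumed to be uncorrelated.

```python
import math

def f(x, y):
    # Hypothetical stand-in for the real function; not from the thread.
    return x * x * y

def partial(func, x, y, wrt, h=1e-6):
    """Central finite-difference estimate of df/dx or df/dy at (x, y)."""
    if wrt == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

def propagated_uncertainty(func, x, y, sx, sy):
    """First-order sf from (sf)^2 = (df/dx)^2 (sx)^2 + (df/dy)^2 (sy)^2."""
    dfdx = partial(func, x, y, "x")
    dfdy = partial(func, x, y, "y")
    return math.sqrt((dfdx * sx) ** 2 + (dfdy * sy) ** 2)

# One reading at x = 2, y = 3 with the sensor errors from the question.
sf = propagated_uncertainty(f, x=2.0, y=3.0, sx=0.25, sy=0.5)
print(sf)  # about 3.606 for this made-up f, i.e. sqrt(13)
```

Running this at different (x, y) gives different values of sf, which is exactly the per-record variation the question describes.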

Now, unless f is a simple linear function, df/dx and df/dy vary with x and y, so the value of sf depends on x and y. But realistically, what you want to report is probably not the *exact accuracy estimate* at each possible value, but rather an *upper bound on the error* that someone using your device can expect. That is, you want some uncertainty U ≥ sf for all x and y. The fact that someone could in fact have less uncertainty than this for some values of x and y is probably not important for most cases: people just want a number they know places a bound on the true value.

So in all likelihood, what you really want to do is calculate

U = sqrt( 0.25^2 * (max df/dx)^2 + 0.5^2 * (max df/dy)^2 )

where max denotes the maximum of the partial derivatives *only over the range of x and y values meaningful for your application*. So for example if one of your sensors measures the temperature of water in your hypothetical teapot, you might bound x to between 0 and 110 degrees Celsius: if df/dx takes a larger value outside this range, that doesn't matter for your application since your user will never see it.

Now, if you need to provide information about the maximum amount of precision provided by your device under all possible conditions, then this approach won't work. There's no single number that you can quote in this case, only give a mathematical form for the uncertainty calculated from the method above.

posted by biogeo at 4:17 PM on November 24, 2019 [5 favorites]

I think biogeo has the general approach you'll need.

In general, your uncertainty value is going to have to take into account 3 things:

- Error of x

- Error of y

- Characteristics of the function f(x,y)

You are not going to be able to produce, say, just a single number that will be applicable to all functions, because functions are just far too variable in their characteristics.

Let me just give a simple example:

f(x,y) = 0 if x<5>

f(x,y) = 1,000,000 otherwise

So in some situation the 'real' reading for both x and y is 5, and sensor x is accurate to +/- .25, and sensor y is accurate to +/- .5.

So in this situation, the values for (x,y) randomly fluctuate just above and below (5,5), like (4.9, 5.1), (4.77, 4.97), (5.12, 5.23), (4.83, 4.97), etc.

And given a series of values of that sort, f(x,y) randomly switches between 0 and 1,000,000, just based on random sensor error, even though the actual physical situation hasn't changed at all.

Now if f is everywhere differentiable in the area of interest and the derivative is always less than some certain value, then you can plug that maximal value of the derivative into biogeo's equation and there you are.

posted by flug at 4:45 PM on November 24, 2019 [3 favorites]

Small correction, and thanks to **zengargoyle** for memailing me to point it out:

What you want is actually

U = sqrt( 0.25^2 * (max(abs(df/dx)))^2 + 0.5^2 * (max(abs(df/dy)))^2 )

Or equivalently,

U = sqrt( 0.25^2 * max((df/dx)^2) + 0.5^2 * max((df/dy)^2) )

In other words, just the largest magnitude of the partial derivatives is what's important, irrespective of the sign.

posted by biogeo at 4:59 PM on November 24, 2019 [3 favorites]
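biogeo's corrected bound could be approximated numerically along these lines: scan a grid over the range of x and y that matters, keep the largest magnitude of each partial derivative, and combine them. Everything specific here is an assumption for illustration: the function `f`, the ranges (depth 0 to 10, temperature 0 to 100), and the grid density. A grid only sees the points it samples, so a sharp peak between grid points would be missed.

```python
import math

def f(x, y):
    # Hypothetical stand-in for the real function; not from the thread.
    return x * x * y

def partials(func, x, y, h=1e-6):
    """Central finite-difference estimates of (df/dx, df/dy) at (x, y)."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy

def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def upper_bound(func, x_range, y_range, sx, sy, n=101):
    """U = sqrt(sx^2 * max|df/dx|^2 + sy^2 * max|df/dy|^2) over a grid."""
    max_dx = max_dy = 0.0
    for x in linspace(x_range[0], x_range[1], n):
        for y in linspace(y_range[0], y_range[1], n):
            dfdx, dfdy = partials(func, x, y)
            max_dx = max(max_dx, abs(dfdx))
            max_dy = max(max_dy, abs(dfdy))
    return math.sqrt((sx * max_dx) ** 2 + (sy * max_dy) ** 2)

# Assumed operating ranges; sensor errors are the ones from the question.
U = upper_bound(f, x_range=(0.0, 10.0), y_range=(0.0, 100.0), sx=0.25, sy=0.5)
print(U)
```

Note that the two maxima may occur at different (x, y) points, so U can be looser than the largest sf actually reached anywhere in the range. For a spec sheet that errs on the safe side.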

Ack, sorry I forgot to check MeFi formatting. That function example should have been this:

Let me just give a simple example:

f(x,y) = 0 if x < 5 and y < 5

f(x,y) = 1,000,000 otherwise

So that is a specific example. But generally, any function with a jump discontinuity fits the bill. If your error in (x,y) is jumping across the discontinuity you'll get random large jumps in the output. You can make the jumps as large as you like simply by changing the function.

So that shows that you definitely cannot make an error expression that only takes into account the properties of *x* and *y* - you must also include *some* relevant properties or restrictions on the function *f* as well.

And . . . functions with jump discontinuities are not just irrelevant abstractions. Just one example: The whole purpose of a transistor is to make something like a jump discontinuity happen to electric current.

posted by flug at 6:35 PM on November 24, 2019 [1 favorite]
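flug's corrected example is easy to try out. The sketch below holds the "real" reading at (5, 5) and adds sensor noise; drawing the noise uniformly within +/- .25 and +/- .5 is an assumption, since the thread does not say how the sensor errors are distributed.

```python
import random

def f(x, y):
    # flug's corrected example: a jump discontinuity at x = 5, y = 5.
    return 0 if x < 5 and y < 5 else 1_000_000

rng = random.Random(0)  # fixed seed so the run is repeatable

# The physical situation never changes; only the sensor noise does.
readings = [
    (5 + rng.uniform(-0.25, 0.25), 5 + rng.uniform(-0.5, 0.5))
    for _ in range(1000)
]
outputs = [f(x, y) for x, y in readings]

zeros = outputs.count(0)
print(zeros, "readings gave 0;", len(outputs) - zeros, "gave 1,000,000")
```

The output flips between the two extremes on noise alone, and the derivative-based formula has nothing to say about it, because the partial derivatives are zero everywhere except on the jump itself.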

Thank you all, more than I can say -- this has been immensely helpful, and given me a lot to work with! I seriously can't tell you how much I appreciate everyone's time on this :)

posted by kalimac at 9:08 AM on November 25, 2019 [1 favorite]

This thread is closed to new comments.

how they co-vary. But the simplest assumption is that their errors are independent.)

posted by snowmentality at 3:06 PM on November 24, 2019 [1 favorite]