Excel arithmetic glitch explanation
February 29, 2016 4:23 AM
This isn't getting in the way of anything practical, but it's niggling that I don't understand the technicalities.
An accounting spreadsheet I've used daily for years has thrown up an Excel glitch... from some googling it seems to be related to the way floating-point arithmetic is handled, as per this article.
The specific example is:
=34606.95-31740.06-2866.89 calculates to -4.09272615797818E-12
but a re-ordering
=34606.95-2866.89-31740.06 calculates, as it should, to 0
This isn't remotely a practical problem, but can you please explain (to someone keen but with no formal cs background) the specifics of what's going on here?
Best answer: Nah, Excel is just rounding to what you expect in the second version. If you turned on full precision formatting, you'd get this: 34606.95-2866.89-31740.06 = -3.637978807091713e-12
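To see both results without Excel's display rounding, you can run the same arithmetic on IEEE 754 double-precision numbers, which is essentially what Excel uses internally. A minimal Python sketch (illustrative only; the exact trailing digits may differ slightly from what Excel shows):

a, b, c = 34606.95, 31740.06, 2866.89
print(a - b - c)   # a tiny nonzero value on the order of -4e-12
print(a - c - b)   # also tiny and nonzero (roughly -3.6e-12); Excel happens
                   # to round this one to 0 for display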
There are two things going on here. First is that when you enter a base-10 decimal, the computer stores it as the nearest number that can actually be represented in the base-2 floating point number format. It can't represent 34606.95, you're actually getting something closer to 34606.949999999997. This happens for the same reason that I can represent 1/3 exactly if I write it in base-3 (it's "0.1"), but in base-10, I can only get an approximation with any finite number of digits: 0.3333333.
Excel is lying to you when it prints the number as 34606.95. It's rounding to some number of significant digits. There's probably a formatting option somewhere that will give you closer to the truth.
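One way to see the value that's actually stored (a Python sketch rather than an Excel formatting trick, since I can only vouch for the former) is to ask for the exact decimal expansion of the double:

from decimal import Decimal

x = 34606.95
print(x)            # the default display rounds, so this still looks like 34606.95
print(Decimal(x))   # the exact stored value: a long decimal slightly below
                    # 34606.95, along the lines of 34606.949999999997...
print(f"{x:.20f}")  # formatting with extra digits shows the same thing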
The second thing happening is catastrophic cancellation. The tl;dr is that when you subtract two floating point numbers of the same magnitude, it trashes your precision, and the least significant digits burble up to the surface looking like noise.
Real numerical code that wants to test if a floating point number is zero will always bound it so that any sufficiently small number is zero. E.g., instead of "if x == 0", you'll have "if abs(x) < epsilon", where epsilon is some very small number.
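As a rough sketch of that kind of tolerance check in Python (the epsilon value here is an arbitrary illustrative choice):

import math

result = 34606.95 - 31740.06 - 2866.89   # tiny nonzero leftover, not exactly 0.0

print(result == 0)                   # False: the leftover noise is not exactly zero
epsilon = 1e-9
print(abs(result) < epsilon)         # True: treat anything this small as zero
print(math.isclose(result, 0.0, abs_tol=epsilon))   # same idea, ready-made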
posted by qxntpqbbbqxl at 5:05 AM on February 29, 2016 [9 favorites]
Best answer: in "simple" terms, it's because the way computers count isn't like you'd expect.
for example, you know that if you're working in decimals then you cannot write down "a third" exactly. you can write down 0.333 or 0.3333333333333 or 0.333333333333333333333333333333333333 but none of them are exactly a third, and if you multiply them by 3 you won't get 1.
what's weird about this (the "deep" part, which is not simple) is that this isn't "the fault" of the number 3. it's the number 10 that's to blame. we count in 10s. and if you count in 10s then "a third" is a problem. but if we had 12 fingers instead of 10, and counted in 12s, then (trust me!) we could write down a third in the 12-equivalent of decimals without any problems.
and, if you are still with me, it turns out computers have 2 fingers, and count in 2s. and that means other numbers are "broken". hence your problem.
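to make that concrete, here's a small python sketch (illustrative, not part of the comment above): 0.95 is just the fraction 19/20, and a machine counting in 2s can only store fractions whose denominator is a power of two, so what it actually stores is a very close neighbour of 19/20 rather than 19/20 itself.

from fractions import Fraction

print(Fraction(19, 20))      # 19/20, i.e. exactly 0.95
stored = Fraction(0.95)      # the exact value of the double the computer stores
print(stored)                # a huge numerator over a power-of-two denominator
print(float(stored - Fraction(19, 20)))   # the tiny representation error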
posted by andrewcooke at 5:34 AM on February 29, 2016 [4 favorites]
Fascinating. I am using the new Office 2016 and neither of these shows a value, meaning the cell remains blank regardless of formatting. And down the rabbit hole I go....
posted by fluffycreature at 6:31 AM on February 29, 2016
Best answer: This Computerphile video gives a reasonably good explanation of how floating point numbers work. (At least, I hope so; I can't verify that link right now.)
posted by yuwtze at 6:35 AM on February 29, 2016
Response by poster: Thank you all for the insight, it's fascinating to learn about the underlying complexity that's required to compute something with the same outcome that I casually think of as "simple" when run in meatspace.
posted by protorp at 11:01 AM on February 29, 2016
FWIW my first personal encounter with this kind of problem was back in about 1980, when I attended a local Apple ][ computer club meeting about an Apple ][ program used for inventory & accounting at a small business. The owner reported that the program worked perfectly except for one small glitch: dollar values of $8.12 were recorded as $8.11.
The system programmers were stumped as to the cause; the program gave correct values for every other dollar-and-cents amount but this one.
The problem turned out to be that they used a routine to store the dollar/cents amounts like this:
int (100*$amount)
In Applesoft, 100*8.12=811.9999998 or thereabouts. So when you use the int() function on that value, you get an answer of 811 - not the expected 812.
But to make matters worse, Applesoft rounded the answer for display purposes, so you get contradictory-seeming results like these:
print (8.12*100)
812
print int(8.12*100)
811
print (812-(8.12*100))
2.68220901E-07
And to make it even more confusing, reversing the order of the multiplicands changes the answer:
print (812-(100*8.12))
0
Moral of the story: expect the results of floating point operations to deviate from what you expect in the last significant digit (or more, especially if you're chaining operations), and don't count on the display to show the discrepancy.
FYI back in the day I tested this up to many millions or billions of dollars (I can’t remember exactly but I let a loop run for pretty much all day on it) and 8.12 was the only two-digit decimal to have this problem in Applesoft.
You can still see the bug at work today on Apple II emulators like this one or this one.
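For what it's worth, the same truncation trap is easy to reproduce with today's IEEE doubles; here's a quick Python sketch (not Applesoft, and with doubles the order of the multiplication no longer matters, but the int() problem is the same):

cents = 8.12 * 100    # the stored double for 8.12 is a hair below 8.12,
                      # so the product lands just under 812
print(cents)          # something like 811.9999999999999
print(int(cents))     # 811: int() truncates toward zero, so a cent goes missing
print(round(cents))   # 812: rounding to the nearest integer avoids the trap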
posted by flug at 2:29 PM on February 29, 2016 [4 favorites]
This thread is closed to new comments.