Numerical Robustness - Real-Time Collision Detection


some number bases. For example, in base 10, 1/3 is not exactly representable in a fixed

number of digits; unlike, say, 0.1. However, in a binary floating-point representation

0.1 is no longer exactly representable but is instead given by the repeating fraction

(0.0001100110011...)_2. When this number is normalized and rounded off to 24 bits

(including the one bit that is not stored) the mantissa bit pattern ends in ... 11001101,

where the least significant bit has been rounded up. The IEEE single-precision

representation of 0.1 is therefore slightly larger than 0.1. As virtually all current CPUs

use binary floating-point systems, the following code is likely to print “Greater than”

when the product is evaluated in more than single precision (as on x87-style FPUs). With

strict single-precision evaluation, the product rounds to exactly 1.0f and the code

prints “Equal” instead.

#include <stdio.h>

int main(void)
{
    float tenth = 0.1f; /* stored value is slightly larger than 0.1 */
    if (tenth * 10.0f > 1.0f)
        printf("Greater than\n");
    else if (tenth * 10.0f < 1.0f)
        printf("Less than\n");
    else if (tenth * 10.0f == 1.0f)
        printf("Equal\n");
    else
        printf("Huh?\n");
    return 0;
}
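The claim that the stored value of 0.1f is slightly larger than 0.1 can be checked directly by inspecting the bit pattern. The following is a minimal sketch, assuming a 32-bit IEEE-754 float; the helper names float_bits and stored_tenth_exceeds_decimal are illustrative and not part of the original text.

```c
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of a float.
   Assumption: float is a 32-bit IEEE-754 single-precision value. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits); /* well-defined way to reinterpret the bits */
    return bits;
}

/* Returns 1 if the float nearest to 0.1 is larger than the double nearest to 0.1.
   The double is a much closer approximation of the decimal value 0.1. */
int stored_tenth_exceeds_decimal(void)
{
    float tenth = 0.1f;
    return (double)tenth > 0.1;
}
```

Under these assumptions, float_bits(0.1f) yields 0x3DCCCCCD. The low byte 0xCD is the bit pattern 11001101 mentioned above, and stored_tenth_exceeds_decimal() returns 1.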

That some numbers are not exactly representable also means that whereas

replacing x/2.0f with x*0.5f is exact (as both 2.0 and 0.5 are exactly representable)

replacing x/10.0f with x*0.1f is not. As multiplications are generally faster than

divisions (up to about an order of magnitude, but the gap is closing on current architectures),

the latter replacement is still frequently and deliberately performed for reasons of

efficiency.
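The difference between the two replacements can be observed by counting how often division and multiplication by the reciprocal disagree. This is only a sketch: count_mismatches is a hypothetical helper, and the volatile qualifiers are an assumption made to force each result to be rounded to single precision regardless of how the compiler evaluates intermediate values.

```c
/* Hypothetical helper: counts the integers x in [1, limit] for which
   (float)x / divisor and (float)x * reciprocal give different floats. */
int count_mismatches(int limit, float divisor, float reciprocal)
{
    int mismatches = 0;
    for (int i = 1; i <= limit; i++) {
        /* volatile forces each result to be stored as a single-precision value */
        volatile float quotient = (float)i / divisor;
        volatile float product = (float)i * reciprocal;
        if (quotient != product)
            mismatches++;
    }
    return mismatches;
}
```

With IEEE single precision, count_mismatches(1000, 2.0f, 0.5f) should be zero, whereas count_mismatches(1000, 10.0f, 0.1f) should be nonzero. For instance, 9.0f/10.0f and 9.0f*0.1f differ in the least significant bit.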

It is important to realize that floating-point arithmetic does not obey ordinary

arithmetic rules. For example, round-off errors may cause a small but nonzero value

added to or subtracted from a large value to have no effect. Therefore, mathematically

equivalent expressions can produce completely different results when evaluated using

floating-point arithmetic.

Consider the expression 1.5e3 + 4.5e-6. In real arithmetic, this expression corresponds

to 1500.0 + 0.0000045, which equals 1500.0000045. Because single-precision

floats can hold only about seven decimal digits, the result is truncated to 1500.0 and

digits are lost. Thus, in floating-point arithmetic a+b can equal a even though b is

nonzero and both a and b can be expressed exactly!
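The absorption of a small addend can be checked with a short test. The sketch below assumes IEEE single precision with the default round-to-nearest mode; add_single and is_absorbed are illustrative names, not functions from the original text.

```c
/* Adds two floats and forces the sum to be rounded to single precision. */
float add_single(float a, float b)
{
    volatile float sum = a + b;
    return sum;
}

/* Returns 1 if adding `small` to `large` leaves `large` unchanged. */
int is_absorbed(float large, float small)
{
    return add_single(large, small) == large;
}
```

Here is_absorbed(1.5e3f, 4.5e-6f) should return 1, matching the example above. The same happens with is_absorbed(16777216.0f, 1.0f), a case in which both operands (2^24 and 1) are exactly representable.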

A consequence of the presence of truncation errors is that floating-point arithmetic

is not associative. In other words, (a+b)+c is not necessarily the same as

a+(b+c). Consider the following three values of a, b, and c.

float a = 9876543.0f;

float b = -9876547.0f;

float c = 3.45f;
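One way to see the effect with these three values is to evaluate both groupings while rounding every intermediate sum to single precision. This is a sketch under the assumption of IEEE single precision and round-to-nearest; the function names sum_left_first and sum_right_first are hypothetical.

```c
/* Evaluates (a + b) + c, rounding each step to single precision. */
float sum_left_first(float a, float b, float c)
{
    volatile float ab = a + b;      /* 9876543 + -9876547 is exactly -4 */
    volatile float result = ab + c;
    return result;
}

/* Evaluates a + (b + c), rounding each step to single precision. */
float sum_right_first(float a, float b, float c)
{
    volatile float bc = b + c;      /* floats of this magnitude are spaced 1 apart,
                                       so the fractional part of b + c is lost */
    volatile float result = a + bc;
    return result;
}
```

With the values above, sum_left_first should give approximately -0.55, while sum_right_first should give -1.0, because b + c is rounded to -9876544 before a is added.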
