5.7 Minimizing Numeric Errors
Key Point: Using floating-point numbers in the loop-continuation condition may cause numeric errors.
Numeric errors involving floating-point numbers are inevitable, because computers
represent floating-point numbers only approximately. This section uses an example to
show how to minimize such errors.
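To make the approximation concrete, here is a minimal sketch that is not part of the original listing. It prints the exact value stored for the literal 0.01 as a double and as a float, using java.math.BigDecimal. The class name ExactValueDemo and its method names are hypothetical, chosen only for illustration.

```java
import java.math.BigDecimal;

public class ExactValueDemo {
  // Exact decimal expansion of the double closest to 0.01
  static BigDecimal storedDouble() {
    return new BigDecimal(0.01);
  }

  // Exact decimal expansion of the float closest to 0.01
  // (the float is widened to double without changing its value)
  static BigDecimal storedFloat() {
    return new BigDecimal(0.01f);
  }

  public static void main(String[] args) {
    System.out.println("double 0.01 is stored as " + storedDouble());
    System.out.println("float  0.01 is stored as " + storedFloat());
  }
}
```

Neither printed value is exactly 0.01, so every addition in a loop that steps by 0.01 starts from a slightly inexact operand.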
Listing 5.8 presents an example that sums a series starting at 0.01 and ending at 1.0.
The numbers in the series increase by 0.01, as follows: 0.01 + 0.02 + 0.03, and so on.
LISTING 5.8 TestSum.java
 1  public class TestSum {
 2    public static void main(String[] args) {
 3      // Initialize sum
 4      float sum = 0;
 5
 6      // Add 0.01, 0.02, ..., 0.99, 1 to sum
 7      for (float i = 0.01f; i <= 1.0f; i = i + 0.01f)
 8        sum += i;
 9
10      // Display result
11      System.out.println("The sum is " + sum);
12    }
13  }
The sum is 50.499985
The for loop (lines 7-8) repeatedly adds the control variable i to sum. This variable, which
begins with 0.01, is incremented by 0.01 after each iteration. The loop terminates when i
exceeds 1.0.
The for loop initial action can be any statement, but it is often used to initialize a control
variable. From this example, you can see that a control variable can be a float type. In fact,
it can be any data type.
The exact sum should be 50.50, but the answer is 50.499985. The result is imprecise
because computers use a fixed number of bits to represent floating-point numbers, and thus
they cannot represent some floating-point numbers exactly. If you change float in the
program to double, as follows, you might expect a slight improvement in precision, because a
double variable holds 64 bits, whereas a float variable holds 32 bits.
// Initialize sum
double sum = 0;

// Add 0.01, 0.02, ..., 0.99, 1 to sum
for (double i = 0.01; i <= 1.0; i = i + 0.01)
  sum += i;
However, you will be stunned to see that the result is actually 49.50000000000003. What
went wrong? If you display i for each iteration in the loop, you will see that the last i is
slightly larger than 1 (not exactly 1). As a result, the last i is not added to sum. The
fundamental problem is that floating-point numbers are represented only approximately. To
fix the problem, use an integer count to ensure that all the numbers are added to sum. Here is
the new loop:
double currentValue = 0.01;
for (int count = 0; count < 100; count++) {
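As a minimal sketch, the following program shows how an integer-counted loop of this kind might be completed and compared with the double-controlled loop. The loop body (add currentValue to sum, then advance currentValue by 0.01) is an assumption for illustration and is not taken from the text. The class name CounterLoopDemo and the method names are likewise hypothetical.

```java
public class CounterLoopDemo {
  // Loop controlled by a double. The text reports that the last i ends up
  // slightly larger than 1, so the final term is never added.
  static double sumWithDoubleControl() {
    double sum = 0;
    for (double i = 0.01; i <= 1.0; i = i + 0.01)
      sum += i;
    return sum;
  }

  // Loop controlled by an int count, so exactly 100 terms are added.
  // ASSUMPTION: this loop body is illustrative, not from the text.
  static double sumWithIntCount() {
    double sum = 0;
    double currentValue = 0.01;
    for (int count = 0; count < 100; count++) {
      sum += currentValue;   // add the current term
      currentValue += 0.01;  // advance to the next term
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println("double-controlled loop: " + sumWithDoubleControl());
    System.out.println("int-counted loop:       " + sumWithIntCount());
  }
}
```

The integer counter only guarantees the number of iterations. Rounding error still accumulates in currentValue and sum, so the second result should be close to 50.5 but not necessarily exactly equal to it.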
 
 
 