System.out.println("Duration: " + (now - then) + " ms");
There are four separate threads here, and they are not sharing any variables: each is accessing
only a single member of the DataHolder class. From a synchronization standpoint, there is
no contention, and we might reasonably expect that this code would execute in the same
amount of time regardless of whether it runs one thread or four threads (given a four-core machine).
It doesn't turn out that way. When one particular thread writes the volatile value in its
loop, the cache line holding that value is invalidated in every other core's cache, and the
memory values must be reloaded. Table 9-8 shows the result: performance gets worse as more
threads are added.
Table 9-8. Time to sum 1,000,000 values with false sharing
Number of threads Elapsed time
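To see why four independent fields can still collide, it may help to work through where they are likely to sit in memory. The sketch below is only an illustration: it assumes a 64-byte cache line and a 16-byte object header (common values, but not guaranteed), and the class name, constants, and field offsets are hypothetical, since the JVM is free to lay out fields as it chooses.

```java
public class CacheLineSketch {
    // Assumed values for illustration; real hardware and JVMs may differ.
    public static final int ASSUMED_CACHE_LINE_BYTES = 64;
    public static final int ASSUMED_HEADER_BYTES = 16;

    // Hypothetical byte offset of the index-th long field within the object,
    // assuming the fields are laid out one after another following the header.
    public static int fieldOffset(int index) {
        return ASSUMED_HEADER_BYTES + index * Long.BYTES;
    }

    // Cache line that contains the given offset, assuming the object
    // happens to start on a cache-line boundary.
    public static int cacheLineOf(int offset) {
        return offset / ASSUMED_CACHE_LINE_BYTES;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            int offset = fieldOffset(i);
            System.out.println("field " + i + ": offset " + offset
                    + ", cache line " + cacheLineOf(offset));
        }
    }
}
```

Under these assumptions the four fields fall at offsets 16, 24, 32, and 40, all within the same 64-byte line, so a write to any one of them invalidates the line that holds the other three.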
Strictly speaking, false sharing does not have to involve synchronized (or volatile) variables:
whenever any data value in the CPU cache is written, other caches that hold the same
data range must be invalidated. However, remember that the Java memory model requires
that the data must be written to main memory only at the end of a synchronization primitive
(including CAS and volatile constructs). So that is the situation where it will be
encountered most frequently. If, in this example, the long variables are not volatile, then the
compiler will hold the values in registers, and the test will execute in about 7.1 seconds
regardless of the number of threads involved.
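The contrast between volatile and plain fields can be tried with a small harness along the following lines. This is a rough sketch, not the test used for the measurements above: the class and method names are hypothetical, the iteration count is arbitrary, and the timings it prints will vary with the processor, the JVM, and how aggressively the JIT compiler optimizes the non-volatile loops.

```java
public class VolatileVsPlain {
    // Four adjacent volatile fields, which are likely to share a cache line.
    static class VolatileHolder {
        volatile long l1, l2, l3, l4;
    }

    // The same layout without volatile; the compiler may keep values in registers.
    static class PlainHolder {
        long l1, l2, l3, l4;
    }

    // One task per field; each thread touches only its own member.
    static Runnable[] volatileTasks(VolatileHolder h, long n) {
        return new Runnable[] {
            () -> { for (long i = 0; i < n; i++) h.l1 += i; },
            () -> { for (long i = 0; i < n; i++) h.l2 += i; },
            () -> { for (long i = 0; i < n; i++) h.l3 += i; },
            () -> { for (long i = 0; i < n; i++) h.l4 += i; }
        };
    }

    static Runnable[] plainTasks(PlainHolder h, long n) {
        return new Runnable[] {
            () -> { for (long i = 0; i < n; i++) h.l1 += i; },
            () -> { for (long i = 0; i < n; i++) h.l2 += i; },
            () -> { for (long i = 0; i < n; i++) h.l3 += i; },
            () -> { for (long i = 0; i < n; i++) h.l4 += i; }
        };
    }

    // Runs every task on its own thread and returns the elapsed milliseconds.
    static long timeMillis(Runnable[] tasks) {
        Thread[] threads = new Thread[tasks.length];
        long then = System.currentTimeMillis();
        for (int i = 0; i < tasks.length; i++) {
            threads[i] = new Thread(tasks[i]);
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        long now = System.currentTimeMillis();
        return now - then;
    }

    public static void main(String[] args) {
        long n = 50_000_000L; // arbitrary; increase for steadier numbers
        long volatileMs = timeMillis(volatileTasks(new VolatileHolder(), n));
        long plainMs = timeMillis(plainTasks(new PlainHolder(), n));
        System.out.println("Duration (volatile): " + volatileMs + " ms");
        System.out.println("Duration (plain): " + plainMs + " ms");
    }
}
```

Each field has exactly one writer, so the unsynchronized `+=` is safe here, and `Thread.join` guarantees that the final values are visible to the caller. Passing a shorter array of tasks to `timeMillis` is one way to compare one, two, three, and four threads.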
This is obviously an extreme example, but it brings up the question of how false sharing can
be detected and corrected. Unfortunately, the answer is murky and incomplete. Nothing in
the standard set of tools discussed in Chapter 3 addresses false sharing, since it requires very
specific knowledge about the architecture of a processor.