5.3 Performance of Symmetric Shared-Memory Multiprocessors
In a multicore using a snooping coherence protocol, several different phenomena combine to
determine performance. In particular, the overall cache performance is a combination of the
behavior of uniprocessor cache miss traffic and the traffic caused by communication, which
results in invalidations and subsequent cache misses. Changing the processor count, cache
size, and block size can affect these two components of the miss rate in different ways, leading
to overall system behavior that is a combination of the two effects.
Appendix B breaks the uniprocessor miss rate into the three C's classification (capacity,
compulsory, and conflict) and provides insight into both application behavior and potential
improvements to the cache design. Similarly, the misses that arise from interprocessor
communication, which are often called coherence misses, can be broken into two separate sources.
The first source is the so-called true sharing misses that arise from the communication of data
through the cache coherence mechanism. In an invalidation-based protocol, the first write by a
processor to a shared cache block causes an invalidation to establish ownership of that block.
Additionally, when another processor attempts to read a modified word in that cache block, a
miss occurs and the resultant block is transferred. Both these misses are classified as true shar-
ing misses since they directly arise from the sharing of data among processors.
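The two true sharing events described above can be traced with a small simulation. This is a minimal sketch, not the protocol of any particular machine: it assumes a single cache block, a simplified three-state write-invalidate scheme (invalid, shared, modified), and a hypothetical `Block` class whose names and event labels are chosen only for illustration.

```python
# Minimal sketch of a write-invalidate protocol for ONE cache block.
# Assumed per-processor states: "I" (invalid), "S" (shared), "M" (modified).
# The class name, method names, and event labels are illustrative only.

class Block:
    def __init__(self, processors):
        # Every processor starts with a valid shared copy of the block.
        self.state = {p: "S" for p in processors}

    def write(self, p):
        """Processor p writes a word in the block; return the resulting event."""
        if self.state[p] == "M":
            return "hit"  # p already owns the block exclusively
        # A shared copy needs an upgrade; an invalid copy is a write miss.
        event = "upgrade" if self.state[p] == "S" else "write miss"
        # Establish ownership: invalidate every other copy.
        for q in self.state:
            if q != p:
                self.state[q] = "I"
        self.state[p] = "M"
        return event

    def read(self, p):
        """Processor p reads a word in the block; return the resulting event."""
        if self.state[p] in ("S", "M"):
            return "hit"
        # Read miss: a modified copy elsewhere is downgraded to shared
        # and the block is transferred to p.
        for q in self.state:
            if self.state[q] == "M":
                self.state[q] = "S"
        self.state[p] = "S"
        return "read miss"


block = Block(["P1", "P2"])
print(block.write("P1"))  # upgrade: P1 invalidates P2's copy
print(block.read("P2"))   # read miss: the modified block is transferred
print(block.read("P2"))   # hit
```

The first write corresponds to the invalidation that establishes ownership, and the read by the other processor corresponds to the miss that transfers the modified block.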
The second effect, called false sharing, arises from the use of an invalidation-based coherence
algorithm with a single valid bit per cache block. False sharing occurs when a block is
invalidated (and a subsequent reference causes a miss) because some word in the block, other than
the one being read, is written into. If the word written into is actually used by the processor
that received the invalidate, then the reference was a true sharing reference and would have
caused a miss independent of the block size. If, however, the word being written and the word
read are different and the invalidation does not cause a new value to be communicated, but
only causes an extra cache miss, then it is a false sharing miss. In a false sharing miss, the block
is shared, but no word in the block is actually shared, and the miss would not occur if the
block size were a single word. The following example makes the sharing patterns clear.
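The distinction can be stated as a simple rule for read misses. The following is a minimal sketch under simplifying assumptions: it considers only a read miss on a block that another processor has invalidated, and it assumes we already know which words that processor wrote. The function name and its parameters are hypothetical, introduced only for illustration.

```python
def classify_read_miss(remotely_written_words, word_read):
    """Classify a read miss on a block invalidated by another processor.

    remotely_written_words: set of words in the block that another processor
        wrote since this processor last held a valid copy (assumed known).
    word_read: the word this processor is now reading.
    """
    if word_read in remotely_written_words:
        # A new value is communicated, so the miss would occur
        # even if the block held only a single word.
        return "true sharing"
    # Only a different word in the block changed: the miss is an
    # artifact of the block holding more than one word.
    return "false sharing"


# Words x1 and x2 share a block; another processor has written only x1.
print(classify_read_miss({"x1"}, "x1"))  # true sharing
print(classify_read_miss({"x1"}, "x2"))  # false sharing
```

The rule mirrors the one-word test in the text: a miss is true sharing exactly when it would still occur with one-word blocks.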
Example
Assume that words x1 and x2 are in the same cache block, which is in the
shared state in the caches of both P1 and P2. Assuming the following sequence
of events, identify each miss as a true sharing miss, a false sharing miss, or a hit.
Any miss that would occur if the block size were one word is designated a true
sharing miss.
 