These are the potential data race accesses, and we call such addresses shared-modified.
Since the impact of ST instructions on accuracy is very low, we do not apply any
filtering to them. By filtering out innocuous LD instructions, we are able to bring down
the false positive rate for GUARD without any negative impact on performance or data
race detection capability.
We consider a memory hierarchy design with private L1 caches and a shared LLC,
which is common in current multicore processors. When the filtering mechanism is
enabled, SG monitors the data response message from the LLC to check the
shared-modified state. If the data was written to by another thread and is in modified
state in the LLC, the shared-modified state is set by the LLC controller. When the state
is set, SG concludes that this is a potential data race candidate and adds the address to
the RD signature. Otherwise, the address is filtered out. The filtering mechanism
considers the following three scenarios:
- L1 Hit: When a LD instruction hits the L1 data cache, the data is either private
or shared read-only. Such an access will not cause a data race, and hence it is
considered safe and the address is filtered out.
- L1 Miss & LLC Hit: When a LD instruction misses the L1 data cache and hits the
shared LLC, the LLC controller uses the coherence information to identify the state
of the address. If the address was in a modified state prior to the load request, it was
written to by another thread recently. Hence, this address is considered
shared-modified and the corresponding bit is set in the response message.
- LLC Miss: If the access misses the shared LLC, it is potentially a cold miss or an
access to the address after a long interval. Such accesses are considered safe as they
will not cause a data race. Hence, the LLC controller resets the shared-modified bit
and the address is filtered out.
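The three scenarios above can be summarized as a single decision function. The following is a minimal sketch, not the actual hardware logic: the names `LineState`, `is_shared_modified`, `last_writer`, and `requester` are hypothetical, and the coherence state is reduced to three values for illustration.

```python
from enum import Enum


class LineState(Enum):
    """Simplified LLC line states (an assumption made for this sketch)."""
    ABSENT = 0    # line is not in the LLC (LLC miss)
    CLEAN = 1     # line is in the LLC and has not been modified
    MODIFIED = 2  # line is in the LLC in modified state


def is_shared_modified(l1_hit, llc_state, last_writer, requester):
    """Return True if the LD address should be added to the RD signature.

    l1_hit      -- True if the load hit the private L1 data cache
    llc_state   -- LineState of the line in the shared LLC before the load
    last_writer -- id of the thread that last wrote the line (or None)
    requester   -- id of the thread issuing the load
    """
    # Scenario 1: L1 hit. Data is private or shared read-only, so filter out.
    if l1_hit:
        return False
    # Scenario 3: LLC miss. Cold miss or long reuse interval, so filter out.
    if llc_state is LineState.ABSENT:
        return False
    # Scenario 2: L1 miss and LLC hit. Keep the address only if the line
    # was modified by a different thread before this load.
    return llc_state is LineState.MODIFIED and last_writer != requester
```

For example, `is_shared_modified(False, LineState.MODIFIED, last_writer=1, requester=0)` keeps the address, while the same load with `l1_hit=True` is filtered out.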
These scenarios, however, can still miss an access that leads to a data race. In a
potential write-after-read (WAR) race scenario, the read instruction occurs first, and
at that point there is not enough information to make a filtering decision. However, a
later write to the same memory location by another thread in a concurrent epoch
results in a potential data race. Hence, if this LD instruction was filtered out due to
insufficient information, a potential WAR race condition could be missed.
This issue can be addressed by using temporary hardware signatures. For every thread,
the filtered LD addresses from the current epoch are compressed and stored in a
temporary RD signature. When a ST occurs in a thread (which is rather infrequent), the
LLC controller sends invalidation messages to the sharers and the cache line is set to
the modified state. The SG in each of these sharers compares the address in the
invalidation message with the addresses in its temporary RD signature, and if there is a
match, the address is added back to the thread's RD signature. However, only addresses
from the current epoch can be recovered this way, as the signatures of previous epochs
would have already been dispatched to the GPU for data race detection.
Also, the limited capacity of the LLC or a long time gap between the two instructions
could lead to the related cache line being evicted from the shared LLC. The scheme will
then filter out the LD instruction due to lack of information in the LLC. However, it
should be emphasized that the most crucial data race accesses are the ones that occur in
close proximity, and those are unlikely to be filtered out due to this limitation.
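The recovery path for filtered loads can be illustrated with a small software model. This is only a sketch under stated assumptions: the text says addresses are "compressed" into signatures but does not give the encoding, so a Bloom-filter-style bit vector with two hash positions, a 1024-bit width, and 64-byte cache lines are all assumed here, and the class and method names are hypothetical.

```python
class Signature:
    """Bloom-filter-style address signature (assumed encoding)."""

    def __init__(self, bits=1024):
        self.bits = bits
        self.vector = 0

    def _positions(self, addr):
        line = addr >> 6  # assume 64-byte cache lines
        return (line % self.bits,
                ((line * 2654435761) >> 7) % self.bits)

    def add(self, addr):
        for p in self._positions(addr):
            self.vector |= 1 << p

    def may_contain(self, addr):
        # May report false positives, never false negatives.
        return all((self.vector >> p) & 1 for p in self._positions(addr))

    def clear(self):
        self.vector = 0


class SignatureGenerator:
    """Per-thread model of SG with an RD and a temporary RD signature."""

    def __init__(self):
        self.rd = Signature()
        self.temp_rd = Signature()

    def on_load(self, addr, shared_modified):
        # Shared-modified loads go to the RD signature; filtered loads
        # are remembered in the temporary signature for this epoch.
        if shared_modified:
            self.rd.add(addr)
        else:
            self.temp_rd.add(addr)

    def on_invalidation(self, addr):
        # A ST by another thread invalidated this line: if the thread
        # read it earlier in this epoch, restore the address (WAR case).
        if self.temp_rd.may_contain(addr):
            self.rd.add(addr)

    def end_epoch(self):
        # Dispatching the RD signature for detection is not modeled;
        # both signatures are simply reset for the next epoch.
        self.rd.clear()
        self.temp_rd.clear()
```

In this model a load that was filtered out is added back to the RD signature only if the invalidation arrives before `end_epoch()` is called, which mirrors the current-epoch restriction described above.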