overhead. An appropriate checkpoint for rollback or replay is selected using information
from the data race table. Further analysis could include detailed debugging to find
the exact memory location and instructions responsible for the data race. Information
from the data race table and checkpoints could also be used to modify the thread
scheduling so that the data race conditions do not recur during re-execution.
Signature Generator. GUARD's only hardware addition, the SG, performs three key
tasks: (i) extracting load/store information from committed instructions through
extraction logic; (ii) compressing the memory access traces into signatures using Bloom
filters; and (iii) forwarding signatures to the signature table. The extraction logic
monitors the CPU application for load and store instructions and extracts the addresses
accessed by these instructions. These load/store addresses are then compressed into
the respective RD/WR signatures using Bloom filters.
The potential speed difference between the CPU application and the GUARD kernel
means that the CPU could retire instructions faster than GUARD's ability to process
them. This could lead to GUARD missing some instructions and consequently missing
data race conditions. To avoid this, we design the SG around a feedback-based architecture
in which the CPU retire stage and the SG communicate through special registers. When
GUARD is enabled, the CPU retire stage checks the SG state through the special registers,
and if the SG is stalled, the CPU pipeline is stalled as well so that no races are missed.
We evaluate the impact of this design on the performance of the CPU application in
Section 5.1.
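The feedback mechanism described above can be illustrated with a small cycle-level sketch. This is not GUARD's implementation: the bounded queue standing in for the special registers, the `queue_depth` parameter, and the `sg_period` processing rate are all assumptions made for illustration. The point it shows is that stalling the retire stage whenever the SG cannot accept more work costs CPU cycles but never drops a memory access.

```python
from collections import deque

def simulate(num_accesses, sg_period, queue_depth=4):
    """Toy model of the retire-stage/SG handshake (all parameters hypothetical).

    The CPU tries to retire one memory access per cycle; the SG finishes one
    queued access every `sg_period` cycles. Returns (processed, stall_cycles).
    """
    queue = deque()
    retired = processed = stalls = cycle = 0
    while processed < num_accesses:
        # SG side: complete one queued access every sg_period cycles.
        if queue and cycle % sg_period == 0:
            queue.popleft()
            processed += 1
        # CPU side: read the SG status before retiring (the "special register").
        sg_stalled = len(queue) >= queue_depth
        if retired < num_accesses:
            if sg_stalled:
                stalls += 1          # pipeline stalls instead of dropping work
            else:
                queue.append(retired)
                retired += 1
        cycle += 1
    return processed, stalls
```

With an SG that keeps pace (`sg_period=1`) the CPU never stalls in this model; with a slower SG every access is still processed, at the cost of stall cycles.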
Signature Selection. Signatures are long bit vectors that encapsulate the addresses
in the memory access trace in compressed form. Figure 3(a) shows the signature
creation process for a 2048-bit signature divided into eight bins. The
64-bit address in the extracted memory access instruction is divided into three sections,
these sections are passed through eight different H3 hash functions, h1 through h8, and
one bit is set in each of the signature bins. Two signatures indicate a potential
data race only when all eight bins have at least one common bit set. Here, we analyze
the effect of various signature parameters on the false positive rate. A false positive
is defined as an incorrectly flagged data race condition due to two separate addresses
mapping to the same signature bits. We observe that the false positive rates are 18.78%,
37.88%, and 89.86% for 2048-bit, 1024-bit, and 512-bit signatures, respectively. The use of
hardware signatures in data race detection has been explored in previous work [5, 6],
and the false positive rates we observe are similar to those reported there.
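The signature scheme described above can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the hardware design: each H3 hash is modeled in the textbook way (every output bit is the parity of the address ANDed with a random mask), the masks come from a seeded random generator, and the whole 64-bit address is hashed directly, whereas the design above first splits it into three sections. The names `make_h3`, `h3`, `insert`, and `may_conflict` are hypothetical.

```python
import random

ADDR_BITS = 64
NUM_BINS = 8
BIN_BITS = 256      # 2048-bit signature divided into eight bins
INDEX_BITS = 8      # log2(BIN_BITS): bits needed to pick a position in a bin

def make_h3(rng):
    """One H3 hash function: a random 64-bit mask per output bit."""
    return [rng.getrandbits(ADDR_BITS) for _ in range(INDEX_BITS)]

def h3(masks, addr):
    """Output bit i is the parity (XOR of all bits) of addr AND masks[i]."""
    out = 0
    for i, mask in enumerate(masks):
        parity = bin(addr & mask).count("1") & 1
        out |= parity << i
    return out

rng = random.Random(0)                       # fixed seed: repeatable sketch
HASHES = [make_h3(rng) for _ in range(NUM_BINS)]   # stands in for h1..h8

def empty_signature():
    return [0] * NUM_BINS                    # each int is used as a 256-bit bin

def insert(sig, addr):
    """Set one bit in every bin, chosen by that bin's hash function."""
    for b in range(NUM_BINS):
        sig[b] |= 1 << h3(HASHES[b], addr)

def may_conflict(sig_a, sig_b):
    """Potential race only if all eight bins share at least one set bit."""
    return all(a & b for a, b in zip(sig_a, sig_b))
```

Two signatures that contain the same address always satisfy `may_conflict`; two signatures with disjoint addresses may still satisfy it when their bits happen to collide in every bin, which is the false positive case defined above.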
This false positive data is based on epochs that can contain up to 2000 individual
instructions. A higher instruction count inside an epoch leads to a higher false positive
rate for signatures of the same length. Ideally, an epoch is closed by a synchronization
instruction. However, if there is no synchronization instruction within 2000 instructions,
we forcibly close the epoch and write the signature to the signature table. This is
a practical design choice as data race conditions between memory accesses that execute
close to each other in time are the most critical, while those which occur far apart in
time are potentially benign data races.
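The epoch-closing rule can be sketched as below. The trace format, the `split_epochs` name, and the use of plain address sets in place of RD/WR signatures are assumptions for illustration; whether the closing synchronization instruction counts toward the epoch is also a choice made here, not something stated above.

```python
MAX_EPOCH_INSTRUCTIONS = 2000   # forced-close threshold from the text

def split_epochs(trace, max_len=MAX_EPOCH_INSTRUCTIONS):
    """Split a trace of (kind, addr) pairs into epochs.

    kind is "ld", "st", "sync", or anything else for other instructions.
    An epoch closes on a sync instruction or after max_len instructions.
    Address sets stand in for the RD/WR signatures (an assumption).
    """
    epochs = []
    reads, writes, count = set(), set(), 0

    def close():
        nonlocal reads, writes, count
        if count:   # skip empty epochs
            epochs.append({"reads": reads, "writes": writes, "count": count})
        reads, writes, count = set(), set(), 0

    for kind, addr in trace:
        count += 1
        if kind == "ld":
            reads.add(addr)
        elif kind == "st":
            writes.add(addr)
        if kind == "sync" or count >= max_len:
            close()     # corresponds to writing the signature to the table
    close()             # flush whatever remains at the end of the trace
    return epochs
```

A trace of 4500 loads with no synchronization instruction is split into epochs of 2000, 2000, and 500 instructions, while a synchronization instruction closes an epoch early regardless of its length.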