[Figure 2: block diagram. Labels: CPU, Signature Generator, Data-parallel Core, ICNT, CPU L2 Cache, GPU L2 Cache, Signature Table, Data Race Table; signals sig and race; numbered actions 1 to 4.]
Fig. 2. The GUARD mechanism is based on a heterogeneous multicore processor with CPU cores
and data-parallel accelerator (GPU) cores. The Signature Generator, the only hardware modification
to the baseline processor, is highlighted.
When GUARD is enabled for data race detection, a library function is invoked. It creates
two data structures, the signature table and the data race table, in the GPU memory
space. The Signature Generator (SG) is configured with the starting addresses of these
tables, so that it can write generated signatures to the signature table and read flagged
data race conditions from the data race table. The library function then launches the GPU
kernel that performs the happened-before algorithm. In GUARD, the GPU cores work in tandem
with the CPU cores to detect data races. We describe the CPU-side actions (1 and 4 in
Figure 2) and GPU-side actions (2 and 3 in Figure 2) in detail in the next two sections.
3.1 CPU-Side Actions
The memory trace generated by each CPU core is partitioned into chunks called epochs.
Synchronization instructions, such as lock/unlock and barriers, define the epoch boundaries.
All the addresses belonging to an epoch are encapsulated into representative signatures using
Bloom filters and H3 hash functions [21]. For each epoch, the SG generates two signatures:
a read (RD) signature and a write (WR) signature. Once the signatures are generated, they are
written to the signature table (action 1 in Figure 2), which is stored in the GPU memory space.
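
The per-epoch encoding described above can be illustrated with a small software model. This is only a sketch, not the SG hardware: the address width, signature length, number of hash functions, and all names (`make_h3`, `h3_hash`, `Signature`, `encode_epoch`) are assumptions chosen for illustration. Each H3 function is modeled as a random binary matrix, and each output bit is the parity of the address ANDed with one matrix row.

```python
import random

# All sizes below are illustrative assumptions, not values from the text.
ADDR_BITS = 32                          # assumed address width
SIG_BITS = 256                          # assumed signature length (power of two)
NUM_HASHES = 4                          # assumed number of H3 functions
INDEX_BITS = SIG_BITS.bit_length() - 1  # bits needed to index the signature


def make_h3(rng):
    """Build one H3 function: one random ADDR_BITS-wide row mask per output bit."""
    return [rng.getrandbits(ADDR_BITS) for _ in range(INDEX_BITS)]


def h3_hash(matrix, addr):
    """Output bit i is the XOR (parity) of the address bits selected by row i."""
    index = 0
    for bit, row in enumerate(matrix):
        parity = bin(addr & row).count("1") & 1
        index |= parity << bit
    return index


class Signature:
    """Bloom-filter signature stored as a Python integer bit vector."""

    def __init__(self, hashes):
        self.hashes = hashes
        self.bits = 0

    def insert(self, addr):
        for matrix in self.hashes:
            self.bits |= 1 << h3_hash(matrix, addr)

    def may_contain(self, addr):
        # False positives are possible, false negatives are not.
        return all((self.bits >> h3_hash(m, addr)) & 1 for m in self.hashes)


def encode_epoch(accesses, hashes):
    """Fold one epoch's (kind, address) accesses into an RD and a WR signature."""
    rd, wr = Signature(hashes), Signature(hashes)
    for kind, addr in accesses:
        (wr if kind == "W" else rd).insert(addr)
    return rd, wr
```

For example, with `hashes = [make_h3(random.Random(0)) for _ in range(NUM_HASHES)]`, calling `encode_epoch([("R", 0x1000), ("W", 0x2000)], hashes)` yields an RD signature for which `may_contain(0x1000)` is true and a WR signature for which `may_contain(0x2000)` is true.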
The signature table contains signatures from all CPUs and forms the input to the
happened-before (H-B) algorithm running on the GPU. It is a circular queue structure in which
the oldest processed entry for each processor is overwritten by the latest entry. A flag is
maintained for each signature entry indicating whether the entry has been processed by the GPU.
The SG checks this flag before overwriting the entry with a new signature, and
resets the flag when a new entry is written.
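
The circular-queue behavior of the signature table can be sketched as a small software model. The class and method names (`SignatureTable`, `sg_write`, `gpu_mark_processed`) and the table dimensions are hypothetical. The text does not say what the SG does when the oldest entry has not yet been processed; this sketch assumes the write is simply refused so that the caller can retry.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    rd_sig: int = 0
    wr_sig: int = 0
    processed: bool = True   # empty slots are treated as already processed


class SignatureTable:
    """One circular queue of signature entries per CPU (illustrative model)."""

    def __init__(self, num_cpus, slots_per_cpu):
        self.rows = [[Entry() for _ in range(slots_per_cpu)]
                     for _ in range(num_cpus)]
        self.head = [0] * num_cpus   # next slot the SG writes, per CPU

    def sg_write(self, cpu, rd_sig, wr_sig):
        """SG side: check the flag, overwrite the oldest entry, reset the flag."""
        row = self.rows[cpu]
        slot = self.head[cpu]
        if not row[slot].processed:
            return False             # assumption: caller waits and retries
        row[slot] = Entry(rd_sig, wr_sig, processed=False)
        self.head[cpu] = (slot + 1) % len(row)
        return True

    def gpu_mark_processed(self, cpu, slot):
        """GPU side: mark an entry as consumed by the H-B kernel."""
        self.rows[cpu][slot].processed = True
```

In this model, a CPU with two slots can write two epochs back to back; a third write returns `False` until the GPU side calls `gpu_mark_processed` on the oldest slot.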
Once a data race is detected, the related information is written to the data race table
by the GPU kernel and a notification is sent to the CPU in the form of an exception.
An appropriate response such as rollback or replay is then initiated (action 4 in
Figure 2). GUARD can utilize existing record/replay mechanisms [22] to perform this step.
Efficient checkpointing systems such as ReVive [23] can create checkpoints with low