add 16 spare lines and we connect every 48 lines. Note that using spare lines for
tolerating yield loss is a common approach [5,22], and the area overhead due to
the extra lines is similar to that of this approach.
DVM can correct an odd number of errors if they affect only one copy of the data
(Figure 1c). However, if the faults are in different copies of the data, DVM can
only detect the bit positions of the faults without correcting them. Similarly, if
there is an even number of faults in one partition, DVM cannot correct them
either. TVM (Figure 1d), on the other hand, can correct errors easily unless they
affect the same bit position in different copies (which becomes likely at a very
high bit failure rate). In that case, after calculating the parity, TVM detects
that the result of the majority voter is not correct, and it can still correct the
errors if one of the three copies is error-free. If all three copies are erroneous
and some errors are in the same bit position, TVM cannot correct the partition.
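As an illustration, the read-side decisions described above can be sketched in
Python. This is a minimal sketch, not the circuit itself, and it rests on
assumptions that are not in the text: each partition carries a single even-parity
bit that is itself fault-free, copies are modeled as lists of 0/1 bits, and the
function names (parity, differing_positions, dvm_read, tvm_read) are
hypothetical. When the voter output fails parity, the TVM sketch falls back to a
copy that passes parity only if all passing copies agree, which is a conservative
choice made for this example.

```python
from functools import reduce
from operator import xor


def parity(bits):
    """Even parity (XOR) of a sequence of 0/1 bits."""
    return reduce(xor, bits, 0)


def differing_positions(a, b):
    """Bit positions where two copies disagree (what DVM can still report)."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def dvm_read(a, b, stored_parity):
    """Two copies plus parity. Returns (data or None, status)."""
    if a == b:
        # Copies agree: accept only if the parity also checks out.
        if parity(a) == stored_parity:
            return list(a), "ok"
        return None, "uncorrectable"
    # Copies disagree: keep the copy (if exactly one) that matches the parity.
    good = [c for c in (a, b) if parity(c) == stored_parity]
    if len(good) == 1:
        return list(good[0]), "corrected"
    return None, "uncorrectable"


def tvm_read(a, b, c, stored_parity):
    """Three copies plus parity. Returns (data or None, status)."""
    # Bitwise majority vote across the three copies.
    voted = [1 if x + y + z >= 2 else 0 for x, y, z in zip(a, b, c)]
    if parity(voted) == stored_parity:
        return voted, ("ok" if a == b == c else "corrected")
    # Voter output fails parity: fall back to a copy that passes parity,
    # provided the passing copies all agree (assumption of this sketch).
    good = {tuple(k) for k in (a, b, c) if parity(k) == stored_parity}
    if len(good) == 1:
        return list(good.pop()), "corrected"
    return None, "uncorrectable"
```

In this sketch, an odd number of flipped bits in one DVM copy is corrected
because only the clean copy matches the parity, while an even number of flips in
one copy leaves two parity-consistent candidates and is reported as
uncorrectable; differing_positions still reports where the copies disagree.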
When there is an uncorrectable partition in a line, we utilize a partition-fix
mechanism in DVM and TVM to avoid wasting the correct partitions. Partition-fix
is similar to the bit-fix scheme proposed by Wilkerson et al. [30]. It uses a
quarter of the cache ways to store the locations and the correct values of
defective partitions. This reduces both the cache size and the associativity in
the low-power mode. Thus, we apply the partition-fix mechanism only to the lines
that have uncorrectable partitions. Note that our partition-fix mechanism differs
from bit-fix in non-persistent bit failure correction. In bit-fix, the cache lines
are not protected by any other means; they rely only on memory tests and on
fixing the detected failures. In Flexicache, the fixed partitions are also
protected by DVM or TVM, which can still correct non-persistent failures. Previous
triplication schemes [9,31] write data to three cache lines and read the correct
value from the majority voter. In Flexicache, partitioning and the parity
protection of each partition provide higher error correction capability.
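The idea of storing the location and the correct value of a defective partition
can be illustrated with a short Python sketch. Everything named here (FixEntry,
data_ways_in_low_power, read_with_partition_fix) is hypothetical, and the layout
of a fix entry is an assumption made for illustration; the text only states that
a quarter of the ways hold locations and correct values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixEntry:
    partition_index: int  # location of the defective partition in the line
    value: tuple          # correct bits for that partition


def data_ways_in_low_power(total_ways):
    """Ways left for data when a quarter of the ways hold fix entries."""
    return total_ways - total_ways // 4


def read_with_partition_fix(partitions, fixes):
    """Return the line with each defective partition replaced by its fix value."""
    patched = list(partitions)  # copy, so the stored line is left untouched
    for fix in fixes:
        patched[fix.partition_index] = fix.value
    return patched
```

Under these assumptions, an 8-way cache keeps 6 ways for data, and a read of a
line with one defective partition returns the stored partitions with that one
partition replaced by the value held in its fix entry.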
Proposals that tolerate persistent faults perform BIST [17] either
post-manufacturing or at boot time to determine the uncorrectable cache lines at
each voltage level [3,30,23]. The locations of these lines are stored in on-chip
ROM or main memory and loaded before the processor transitions into
near-threshold operation. For non-persistent failures, if the system cannot
correct a fault in the L1 cache, either the correct value is re-fetched from the
L2 cache (if a write-through cache is used) or the system issues a machine check
exception, unless other means are available. Flexicache performs BIST as in
previous proposals to determine faulty partitions in order to fix them or to
disable the cache ways/lines that include them. At runtime, Flexicache can detect
and correct non-persistent failures as well. For uncorrected non-persistent
failures, Flexicache can utilize lightweight, global checkpointing such as
SafetyNet [28].
4 Circuit Design
Conventional triplication schemes either write three lines sequentially [31]
(which harms application performance) or increase the number of read/write ports
[9] (which increases energy consumption). Previously, we designed dvSRAM which includes