Flexicache: Highly Reliable and Low Power Cache under Supply Voltage Scaling - High Performance Computing

Fig. 2. (a) Block diagram of a bank in a 64KB, 4-way Flexicache. (b) Layout of a sub-bank and address decoder of Flexicache.

two values in each cell, a primary value and a secondary value [27]. These two values can be accessed, modified, and moved back and forth between the main and secondary cells within the access time of the cache. Ergin et al. [16] proposed similar work using a shadow-cell SRAM design for checkpointed register files. Similarly, Flexicache needs to access replicated data within the cache access time and with minimum energy. Armejach et al. [6] show how a reconfigurable cache using dvSRAM circuits can be designed to switch its configuration dynamically between a 64KB general-purpose data cache and a 32KB special-purpose, dual-versioning data cache. Flexicache also requires a reconfigurable cache design so that it can provide three execution modes (i.e. SVM, DVM, TVM) without sacrificing cache capacity in the high-performance execution mode.

In this section, we describe how the circuit of Flexicache can be designed for the L1 data cache so that it replicates cache lines without increasing access latency and with minimal energy overhead. Note that it is straightforward to extend the design to the instruction cache and the L2 cache. Flexicache can also be designed orthogonally to dvSRAM so that it supports both optimistic concurrency and near-threshold voltage execution; we leave this outside the scope of this study.
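As a quick sanity check on the configuration detailed in the next paragraph (a 4-way, 64KB data cache with 64-byte lines, each way organized as 2 sub-banks with 1 mat and 4 sub-arrays per mat), the sizes implied by those numbers can be worked out with a short Python sketch. The names below (CacheGeometry, split_address) are hypothetical and do not come from the paper. The sketch also assumes that cache lines are split evenly between the two sub-banks, that each line is striped evenly across the four sub-arrays, and that only the data array is counted (no tag or redundancy bits).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    """Illustrative geometry; defaults follow the configuration in the text."""
    capacity_bytes: int = 64 * 1024   # 64KB data cache
    ways: int = 4                     # 4-way set associative
    line_bytes: int = 64              # 64-byte cache lines
    sub_banks_per_way: int = 2        # CACTI result quoted in the text
    sub_arrays_per_mat: int = 4       # CACTI result quoted in the text

    @property
    def sets(self) -> int:
        return self.capacity_bytes // (self.ways * self.line_bytes)

    @property
    def offset_bits(self) -> int:
        return (self.line_bytes - 1).bit_length()

    @property
    def index_bits(self) -> int:
        return (self.sets - 1).bit_length()

    @property
    def way_bytes(self) -> int:
        return self.capacity_bytes // self.ways

    @property
    def lines_per_sub_bank(self) -> int:
        # Assumption: lines of a way are split evenly between its sub-banks.
        return self.sets // self.sub_banks_per_way

    @property
    def slice_bytes_per_sub_array(self) -> int:
        # Assumption: each line is striped evenly across the sub-arrays.
        return self.line_bytes // self.sub_arrays_per_mat


def split_address(addr: int, geom: CacheGeometry) -> tuple[int, int, int]:
    """Split a byte address into (tag, set index, line offset)."""
    offset = addr & (geom.line_bytes - 1)
    index = (addr >> geom.offset_bits) & (geom.sets - 1)
    tag = addr >> (geom.offset_bits + geom.index_bits)
    return tag, index, offset


geom = CacheGeometry()
print(geom.sets, geom.offset_bits, geom.index_bits)            # 256 6 8
print(geom.way_bytes, geom.lines_per_sub_bank,
      geom.slice_bytes_per_sub_array)                          # 16384 128 16
print(split_address(0x12345, geom))                            # (4, 141, 5)
```

Under these assumptions, each way holds 16KB, each sub-bank holds 128 lines, and each of the four sub-arrays activated during an access supplies 16 bytes of the 64-byte line.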

We present the design of Flexicache for a 4-way, 64KB data cache with 64-byte cache lines and a two-clock-cycle access time. Figure 2a presents the block diagram of one of the 4 ways. We use CACTI [29] to determine the optimal number and size of Flexicache components (e.g. number of sub-banks) and the cache architecture with optimal access time and power consumption. For a one-bank array, CACTI suggests 2 identical sub-banks, 1 mat for each sub-bank, and 4 sub-arrays in each mat (Figure 2a). We use these high-level CACTI results as inputs to the subsequent cache circuit design steps: we construct an HSPICE transistor-level netlist for one way using the 45-nm Predictive Technology Model [2]. During an access, only one of the two sub-banks (i.e. the left or the right sub-bank) and the four identical sub-arrays of its mat (each sub-array holds a part of the cache line) are activated. The address decoder and control signal generator units are placed in the middle part of the array. The necessary data and address wires and drivers are placed in the middle part of each sub-bank. Flexicache divides each
