Table 3.1 Input parameters for the benchmark problem

Category                Parameter    Description                 Values
Out of order execution  rob_depth    Reorder buffer depth        32, 48, 64, 80, 96, 112, 128
                        mreg_cnt     Rename register number      16, 32, 48, 64
                        iw_width     Instruction window width    4, 8, 16, 24, 32
Cache system            icache_size  Instruction cache size      16, 32, 64
                        dcache_size  Data cache size             16, 32, 64
                        scache_size  Secondary cache size        0, 256, 512, 1024
                        lq_size      Load queue size             16, 24, 32
                        sq_size      Store queue size            16, 24, 32
                        mshr_size    Miss holding register size  4, 8
Branch prediction       bht_size     Branch history table size   512, 1024, 2048, 4096
                        btb_size     Branch target buffer size   16, 32, 64, 128

Table 3.2 Output parameters for the benchmark problem

Category           Metric                  Description
Performance        total_cycle             Total cycle number
                   total_instr             Total instruction number
                   IPC                     Instruction per cycle
Power dissipation  total_energy            Total energy consumed
                   total_dissipation       Average power dissipation
                   peak_power_dissipation  Peak power dissipation
Area occupation    Area                    Area occupied

the algorithms with a reference Pareto front, which should be the real Pareto front of
the optimization problem, or at least a good approximation of it. The design space of
the problem outlined above consists of 1,161,216 designs. Since evaluating all of these
designs would take too long to be a practical option, a statistical study of the
configuration parameters was performed to identify parameters that could be fixed to a
constant value, reducing the size of the design space without significantly reducing
the interest of the problem for SoC designers. The study used random exploration with
Multicube Explorer over a set of 5,000 randomly selected designs.
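
The random exploration step can be illustrated with a minimal Python sketch. It is not the Multicube Explorer implementation; the names `DESIGN_SPACE` and `sample_designs` and the fixed seed are assumptions made for this example, while the parameter values are copied from Table 3.1 as listed. The function draws distinct designs uniformly at random, and each returned design would then be passed to the simulator.

```python
import math
import random

# Candidate values per parameter, copied from Table 3.1 as listed.
DESIGN_SPACE = {
    "rob_depth": [32, 48, 64, 80, 96, 112, 128],
    "mreg_cnt": [16, 32, 48, 64],
    "iw_width": [4, 8, 16, 24, 32],
    "icache_size": [16, 32, 64],
    "dcache_size": [16, 32, 64],
    "scache_size": [0, 256, 512, 1024],
    "lq_size": [16, 24, 32],
    "sq_size": [16, 24, 32],
    "mshr_size": [4, 8],
    "bht_size": [512, 1024, 2048, 4096],
    "btb_size": [16, 32, 64, 128],
}


def sample_designs(space, count, seed=0):
    """Draw `count` distinct designs uniformly at random from `space`.

    Hypothetical helper: rejection sampling on a set of value tuples,
    which is cheap while `count` is small relative to the space.
    """
    total = math.prod(len(values) for values in space.values())
    if count > total:
        raise ValueError("cannot draw more distinct designs than exist")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    names = list(space)
    seen = set()
    while len(seen) < count:
        seen.add(tuple(rng.choice(space[name]) for name in names))
    # Sort for a deterministic order, then map values back to parameter names.
    return [dict(zip(names, values)) for values in sorted(seen)]


designs = sample_designs(DESIGN_SPACE, 5000)
```

Rejection sampling is used here because 5,000 designs are a very small fraction of the space, so collisions are rare and the full space never has to be enumerated.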
To reduce the size of the design space, the following parameters with a low
contribution were identified: rob_depth, lq_size, sq_size and mshr_size. Adequate
constant values were selected for them, reducing the size of the design space to only
9,216 designs. The reduced problem was considered valid both from the point of
view of SoC designers and from the point of view of the mathematical properties of
the design space and its associated Pareto front. All designs in the reduced design
space were evaluated by performing a full factorial multi-level exploration, obtaining
the real Pareto front in a few days of execution time. This Pareto front consists of 18
points. Figure 3.5 shows if and when the considered algorithms discover them.
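
Once every design in the reduced space has been evaluated, extracting the real Pareto front amounts to discarding all dominated points. The following Python sketch shows one simple way to do this. It assumes every objective is to be minimized (a metric to be maximized, such as IPC, can be negated first); the function names and the toy objective vectors are hypothetical and are not taken from the benchmark results.

```python
def dominates(a, b):
    """Return True if point `a` dominates point `b` under minimization.

    `a` dominates `b` when it is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates.

    Pairwise comparison, quadratic in the number of points; this is
    affordable for a few thousand evaluated designs.
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy objective vectors (hypothetical values), two objectives to minimize.
evaluated = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(evaluated)
print(front)  # [(1, 5), (2, 3), (4, 1)]
```

In this toy data, (3, 4) is dominated by (2, 3) and (5, 5) is dominated by (1, 5), so three points remain on the front.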
The performance measures selected for comparing the algorithms concern both
time and quality. Since by far the most time-consuming component of the optimization
procedure is the simulator execution, the number of simulator evaluations has been
selected as a fair measure of the required algorithm execution time. Concerning the quality of the
 