3. Perfect disambiguation of memory references done dynamically. This is ambitious but
perhaps attainable for small window sizes (and hence small issue rates and load/store
buffers) or through address aliasing prediction.
4. Register renaming with 64 additional integer and 64 additional FP registers, which is
slightly less than the most aggressive processor in 2011. The Intel Core i7 has 128 entries
in its reorder buffer, although they are not split between integer and FP, while the IBM
Power7 has almost 200. Note that we assume a pipeline latency of one cycle, which significantly
reduces the need for reorder buffer entries. Both the Power7 and the i7 have latencies
of 10 cycles or greater.
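The link between pipeline latency and reorder buffer demand can be illustrated with a rough steady-state estimate: the number of instructions in flight is about the issue rate times the latency (Little's law). The sketch below is an illustration only; the function name is hypothetical, and it assumes the processor sustains its full issue width every cycle, which real machines do not.

```python
def inflight_entries(issue_width: int, latency_cycles: int) -> int:
    """Rough upper bound on instructions in flight at steady state.

    Assumes the full issue width is sustained every cycle (an idealization).
    """
    return issue_width * latency_cycles


# 64-issue machine with the one-cycle latency assumed in the text.
print(inflight_entries(64, 1))   # 64
# Same issue width with a 10-cycle latency, closer to the Power7 and i7.
print(inflight_entries(64, 10))  # 640
```

Under these assumptions, a tenfold increase in latency calls for roughly ten times as many buffer entries to keep the same issue rate busy.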
Figure 3.27 shows the result for this configuration as we vary the window size. This
configuration is more complex and expensive than any existing implementation, especially in
terms of the number of instruction issues, which is more than 10 times larger than the largest
number of issues available on any processor in 2011. Nonetheless, it gives a useful bound on
what future implementations might yield. The data in these figures are likely to be very
optimistic for another reason. There are no issue restrictions among the 64 instructions: they
may all be memory references. No one would even contemplate this capability in a processor in
the near future. Unfortunately, it is quite difficult to bound the performance of a processor
with reasonable issue restrictions; not only is the space of possibilities quite large, but the
existence of issue restrictions requires that the parallelism be evaluated with an accurate
instruction scheduler, which makes studying processors with large numbers of issues very
expensive.
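To make the effect of window size concrete, here is a toy scheduler in Python. It is a minimal sketch under stated assumptions, not the methodology behind Figure 3.27: every instruction has unit latency, the window holds the oldest unissued instructions in program order, and the only issue restriction is a total issue width. The function name `simulate`, its parameters, and the sample trace are all hypothetical.

```python
def simulate(deps, window, issue_width):
    """Return the number of cycles needed to issue every instruction.

    deps[i] lists the indices of earlier instructions whose results
    instruction i needs. All latencies are one cycle (an assumption).
    """
    n = len(deps)
    done_cycle = [None] * n   # cycle in which each instruction issued
    pending = list(range(n))  # unissued instructions, in program order
    cycle = 0
    while pending:
        issued = []
        # Only the oldest `window` unissued instructions are candidates.
        for i in pending[:window]:
            if len(issued) == issue_width:
                break
            # Ready only if every producer issued in an earlier cycle.
            if all(done_cycle[d] is not None and done_cycle[d] < cycle
                   for d in deps[i]):
                issued.append(i)
        for i in issued:
            done_cycle[i] = cycle
        pending = [i for i in pending if done_cycle[i] is None]
        cycle += 1
    return cycle


# A 4-instruction dependence chain followed by 4 independent instructions.
trace = [[], [0], [1], [2], [], [], [], []]
for w in (1, 2, 8):
    print(w, simulate(trace, window=w, issue_width=8))
# prints:
# 1 8
# 2 6
# 8 4
```

With a window of 8 the independent instructions overlap with the chain, so the run time falls to the chain length of 4 cycles. Adding per-class limits (for example, a cap on memory references per cycle) would require tracking instruction types in the inner loop, which is the kind of scheduler detail the text says makes such studies costly.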