using it completes execution, and a ROB entry is freed up when the corresponding instruction
commits.
With register renaming, deallocating registers is more complex, since before we free up a
physical register, we must know that it no longer corresponds to an architectural register and
that no further uses of the physical register are outstanding. A physical register corresponds
to an architectural register until the architectural register is rewritten, causing the renaming
table to point elsewhere. That is, if no renaming entry points to a particular physical register,
then it no longer corresponds to an architectural register. There may, however, still be uses
of the physical register outstanding. The processor can determine whether this is the case by
examining the source register specifiers of all instructions in the functional unit queues. If a
given physical register does not appear as a source and it is not designated as an architectural
register, it may be reclaimed and reallocated.
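The reclamation test just described can be sketched in a few lines of Python. This is only a software model under stated assumptions: the renaming table is represented as a dictionary, each queued instruction as a dictionary with a "sources" list, and the name reclaimable is hypothetical. Real hardware performs the comparison with dedicated logic rather than a scan.

```python
def reclaimable(allocated, rename_table, queued_instructions):
    """Return allocated physical registers that are safe to reclaim."""
    # Registers the renaming table still points at hold architectural values.
    mapped = set(rename_table.values())
    # Registers named as a source by any instruction still waiting in a queue.
    pending = {src for instr in queued_instructions for src in instr["sources"]}
    return sorted(p for p in allocated if p not in mapped and p not in pending)


# Hypothetical state: r1 -> p4, r2 -> p9; p5 and p7 are older, unmapped copies.
rename_table = {"r1": 4, "r2": 9}
queue = [{"dest": 9, "sources": [4, 5]}]  # one queued instruction reads p4 and p5
free_now = reclaimable([4, 5, 7, 9], rename_table, queue)
print(free_now)  # prints [7]
```

Here p5 is no longer mapped but is still awaited as a source, so only p7 can be reclaimed.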
Alternatively, the processor can simply wait until another instruction that writes the same
architectural register commits. At that point, there can be no further uses of the older value
outstanding. Although this method may tie up a physical register slightly longer than necessary,
it is easy to implement and is used in most recent superscalars.
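A minimal sketch of this commit-time scheme follows, assuming each renamed instruction records the physical register that its destination previously mapped to, and that instructions commit in program order. The class and method names (Renamer, rename, commit) are illustrative only, and the sketch does not model stalling when the free list is empty.

```python
from collections import deque


class Renamer:
    def __init__(self, num_arch, num_phys):
        # Assumed initial state: architectural register i lives in physical i.
        self.table = {a: a for a in range(num_arch)}
        self.free = deque(range(num_arch, num_phys))
        self.in_flight = deque()  # program order: (arch_dest, new_phys, old_phys)

    def rename(self, arch_dest):
        """Allocate a new physical register for a write to arch_dest."""
        new_phys = self.free.popleft()  # raises IndexError if none are free
        old_phys = self.table[arch_dest]
        self.table[arch_dest] = new_phys
        self.in_flight.append((arch_dest, new_phys, old_phys))
        return new_phys

    def commit(self):
        """Commit the oldest instruction and free the register it superseded."""
        _, _, old_phys = self.in_flight.popleft()
        self.free.append(old_phys)
        return old_phys


r = Renamer(num_arch=2, num_phys=4)
first = r.rename(0)   # first write to architectural register 0
second = r.rename(0)  # second write to the same architectural register
freed = [r.commit(), r.commit()]
```

When the second writer commits, the register allocated by the first writer is returned to the free list, even though it may have been unused for some time before that.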
One question you may be asking is, how do we ever know which registers are the architectural
registers if they are constantly changing? Most of the time when the program is executing,
it does not matter. There are clearly cases, however, where another process, such as the
operating system, must be able to know exactly where the contents of a certain architectural
register reside. To understand how this capability is provided, assume the processor does
not issue instructions for some period of time. Eventually all instructions in the pipeline will
commit, and the mapping between the architecturally visible registers and physical registers
will become stable. At that point, a subset of the physical registers contains the architecturally
visible registers, and the value of any physical register not associated with an architectural
register is unneeded. It is then easy to move the architectural registers to a fixed subset of
physical registers so that the values can be communicated to another process.
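As a rough illustration, once the pipeline has drained, the move to a fixed subset might look like the following. The sketch assumes the fixed subset is simply physical registers 0 through n-1 for n architectural registers, which is a choice made here for illustration; the function name canonicalize is hypothetical.

```python
def canonicalize(rename_table, phys_values):
    """Move architectural register i into physical register i.

    Assumes no instructions are in flight, so rename_table is stable.
    """
    n = len(rename_table)
    # Read every architectural value first so no copy overwrites a source.
    arch_values = [phys_values[rename_table[a]] for a in range(n)]
    new_values = list(phys_values)
    new_values[:n] = arch_values
    new_table = {a: a for a in range(n)}
    return new_table, new_values


# Hypothetical drained state: arch 0 is in p3, arch 1 is in p0.
table, values = canonicalize({0: 3, 1: 0}, [10, 20, 30, 40])
print(table, values)  # prints {0: 0, 1: 1} [40, 10, 30, 40]
```

The values left in the remaining physical registers are unneeded and can be ignored.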
Both register renaming and reorder buffers continue to be used in high-end processors,
which now feature the ability to have as many as 40 or 50 instructions (including loads and
stores waiting on the cache) in flight. Whether renaming or a reorder buffer is used, the key
complexity bottleneck for a dynamically scheduled superscalar remains issuing bundles of
instructions with dependences within the bundle. In particular, dependent instructions in an
issue bundle must be issued with the assigned virtual registers of the instructions on which they
depend. A strategy for instruction issue with register renaming similar to that used for
multiple issue with reorder buffers (see page 198) can be deployed, as follows:
1. The issue logic pre-reserves enough physical registers for the entire issue bundle (say, four
registers for a four-instruction bundle with at most one register result per instruction).
2. The issue logic determines what dependences exist within the bundle. If a dependence
does not exist within the bundle, the register renaming structure is used to determine the
physical register that holds, or will hold, the result on which the instruction depends. When
no dependence exists within the bundle, the result is from an earlier issue bundle, and the
register renaming table will have the correct register number.
3. If an instruction depends on an instruction that is earlier in the bundle, then the
pre-reserved physical register in which the result will be placed is used to update the information
for the issuing instruction.
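The three steps above can be modeled in software roughly as follows. This is a sequential sketch, whereas hardware resolves the whole bundle in parallel within one clock. The representation of an instruction as a (destination, sources) pair and the name rename_bundle are assumptions made for illustration.

```python
def rename_bundle(bundle, rename_table, free_list):
    """Rename one issue bundle; each instruction is (arch_dest, [arch_sources])."""
    # Step 1: pre-reserve one physical register per instruction in the bundle.
    if len(free_list) < len(bundle):
        raise RuntimeError("too few free physical registers; issue must stall")
    reserved = [free_list.pop(0) for _ in bundle]

    renamed = []
    local = {}  # arch register -> reserved register of its latest writer in bundle
    for (dest, sources), phys_dest in zip(bundle, reserved):
        # Step 3: a source written earlier in the bundle uses the reserved register.
        # Step 2: otherwise the renaming table already has the correct register.
        phys_sources = [local[s] if s in local else rename_table[s] for s in sources]
        renamed.append((phys_dest, phys_sources))
        local[dest] = phys_dest

    rename_table.update(local)  # later bundles see the newest mappings
    return renamed


rename_table = {"r1": 1, "r2": 2, "r3": 3}
free_list = [10, 11, 12]
bundle = [("r1", ["r2", "r3"]), ("r2", ["r1", "r3"]), ("r1", ["r1", "r2"])]
result = rename_bundle(bundle, rename_table, free_list)
print(result)  # prints [(10, [2, 3]), (11, [10, 3]), (12, [10, 11])]
```

The second and third instructions read r1 from physical register 10, reserved for the first instruction, rather than from the stale table entry. After the bundle, the table maps r1 to 12 and r2 to 11.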
Note that just as in the reorder buffer case, the issue logic must both determine dependences
within the bundle and update the renaming tables in a single clock, and, as before, the com-