Hardware Reference
In-Depth Information
MOD
R/M
00
01
10
11
000 M[EAX]
M[EAX + OFFSET8] M[EAX + OFFSET32] EAX or AL
001 M[ECX]
M[ECX + OFFSET8] M[ECX + OFFSET32] ECX or CL
010 M[EDX]
M[EDX + OFFSET8] M[EDX + OFFSET32] EDX or DL
011 M[EBX]
M[EBX + OFFSET8] M[EBX + OFFSET32] EBX or BL
100
SIB
SIB with OFFSET8
SIB with OFFSET32
ESP or AH
101
Direct
M[EBP + OFFSET8] M[EBP + OFFSET32] EBP or CH
110 M[ESI]
M[ESI + OFFSET8]
M[ESI + OFFSET32]
ESI or DH
111 M[EDI]
M[EDI + OFFSET8] M[EDI + OFFSET32]
EDI or BH
Figure 5-26. The Core i7 32-bit addressingmodes. M[ x ] is the memory word at x .
The MOD = 11 column gives a choice of two registers. For word instructions,
the first choice is used; for byte instructions, the second choice. Observe that the
table is not entirely regular. For example, there is no way to indirect through EBP
and no way to offset from ESP .
In some modes, an additional byte, called SIB ( Scale, Index, Base ), follows
the MODE byte (see Fig. 5-13). The SIB byte specifies a scale factor as well as two
registers. When a SIB byte is present, the operand address is computed by multi-
plying the index register by 1, 2, 4, or 8 (depending on SCALE ), adding it to the
base register, and finally possibly adding an 8- or 32-bit displacement, depending
on MOD . Almost all the registers can be used as either index or base.
The SIB modes are useful for accessing array elements. For example, consider
the Java statement
for(i=0;i<n;i++) a[i] = 0;
where a is an array of 4-byte integers local to the current procedure. Typically,
EBP is used to point to the base of the stack frame containing the local variables
and arrays, as shown in Fig. 5-27. The compiler might keep i in EAX . To access
a [ i ], it would use an SIB mode whose operand address was the sum of 4
×
EAX ,
EBP , and 8. This instruction could store into a [ i ] in a single instruction.
Is this mode worth the trouble? It is hard to say. Undoubtedly this instruction,
when properly used, saves a few cycles. How often it is used depends on the com-
piler and the application. The problem is that this instruction occupies a certain
amount of chip area that could have been used in a different way had this instruc-
tion not been present. For example, the level 1 cache could have been made big-
ger, or the chip could have been made smaller, possibly allowing a slightly higher
clock speed.
These are the kinds of trade-offs that designers face constantly. Usually, exten-
sive simulation runs are made before casting anything in silicon, but these simula-
tions require having a good idea of what the workload is like. It is a safe bet that
 
Search WWH ::




Custom Search