TABLE 16.1
Qualitative Comparison of Different Emulation Techniques

Technique                        Memory Requirements  Start-Up Performance  Steady-State Performance  Code Portability
Decode-and-dispatch interpreter  Low                  Fast                  Slow                      Good
Indirect-threaded interpreter    Low                  Fast                  Slow                      Good
Direct-threaded interpreter      High                 Slow                  Medium                    Medium
Binary translation               High                 Very slow             Fast                      Poor

instructions during the program execution and interpret new sections of code
incrementally as encountered by the program. This mechanism is denoted as dynamic
binary translation [27, 55].
Table 16.1 qualitatively compares the decode-and-dispatch, indirect-threaded,
direct-threaded, and binary translation emulation techniques in terms of four
metrics: memory requirements, start-up performance, steady-state performance,
and code portability (a quantitative performance evaluation can be found in [52]).
As an example, the decode-and-dispatch interpreter row reads as follows. First,
memory requirements remain low because there is only one interpreter routine per
instruction type in the target ISA. In addition, the decode-and-dispatch
interpreter does not replicate (thread) the dispatch code at the end of each
routine, which further reduces the pressure on memory capacity. Second, start-up
performance is fast because neither intermediate forms nor cached translated
blocks are used. Third, steady-state performance (i.e., the performance after the
interpreter has started up) is slow because of (1) the high number of branches
and (2) the re-interpretation of every instruction each time it is encountered.
Finally, code portability is good because the interpreter neither saves the
addresses of interpreter routines (as direct-threaded interpreters do) nor caches
ISA-dependent translated binary code.
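
The structure behind this row can be illustrated with a minimal sketch of a decode-and-dispatch loop. The three-register toy ISA, its opcode numbers, and every name below (ROUTINES, run, PROGRAM, and the op_* routines) are assumptions made purely for illustration; they do not come from the chapter or from any real emulator. The sketch shows the two properties discussed above: there is exactly one routine per instruction type, and every instruction is fetched, decoded, and dispatched again each time it is encountered.

```python
# Toy decode-and-dispatch interpreter for a hypothetical three-register ISA.
# Opcodes and instruction format (opcode, a, b) are invented for this sketch.

def op_load(state, reg, imm):
    state["regs"][reg] = imm            # reg = immediate value

def op_add(state, dst, src):
    state["regs"][dst] += state["regs"][src]

def op_dec(state, reg, _unused):
    state["regs"][reg] -= 1

def op_jnz(state, reg, target):
    if state["regs"][reg] != 0:         # branch if register is non-zero
        state["pc"] = target

def op_halt(state, _a, _b):
    state["running"] = False

# One interpreter routine per instruction type (hence the low memory footprint).
ROUTINES = {0: op_load, 1: op_add, 2: op_dec, 3: op_jnz, 4: op_halt}

def run(program):
    state = {"regs": [0, 0, 0], "pc": 0, "running": True, "dispatches": 0}
    while state["running"]:
        opcode, a, b = program[state["pc"]]   # fetch and decode
        state["pc"] += 1
        state["dispatches"] += 1
        ROUTINES[opcode](state, a, b)         # dispatch, then return to the loop
    return state

PROGRAM = [
    (0, 0, 0),   # r0 = 0   (accumulator)
    (0, 1, 5),   # r1 = 5   (loop counter)
    (0, 2, 3),   # r2 = 3   (addend)
    (1, 0, 2),   # r0 += r2
    (2, 1, 0),   # r1 -= 1
    (3, 1, 3),   # if r1 != 0, jump back to instruction 3
    (4, 0, 0),   # halt
]

result = run(PROGRAM)
print(result["regs"][0], result["dispatches"])
```

In this sketch the three-instruction loop body is decoded and dispatched on each of its five iterations, which is the repeated work that makes steady-state performance slow. Nothing is cached between iterations, which is also why the interpreter can start executing immediately.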
16.6.4 Uniprocessor and Multiprocessor VMs
As described earlier in the chapter, a virtual CPU (vCPU) acts as a proxy to a
physical CPU (pCPU). In other words, a vCPU is a representation of a pCPU to a
guest OS. A vCPU can be initiated within a VM and mapped to an underlying pCPU by
the hypervisor. In principle, a VM can have one or many vCPUs. For instance, a
VM in VMware ESX 4 can have up to 8 vCPUs [61]. The number of vCPUs in a VM is
usually referred to as the width of the VM. A VM with a width greater than 1 is
denoted as a Symmetric Multiprocessing (SMP) VM. In contrast, a VM with a width
equal to 1 is referred to as a Uniprocessor (UP) VM. Figure 16.16 demonstrates an
SMP native system VM with a width of 4 and a UP native system VM, both running on
the same hardware.
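
The width terminology can be made concrete with a small sketch. The VM class, the kind method, and the map_vcpus helper are hypothetical names introduced here for illustration, and the static round-robin assignment of vCPUs to pCPUs is only a placeholder assumption: the text states that the hypervisor performs the mapping but does not say how, and real hypervisors schedule vCPUs dynamically.

```python
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    width: int  # number of vCPUs in the VM

    def kind(self):
        # Width 1 -> Uniprocessor (UP) VM; width > 1 -> SMP VM.
        if self.width < 1:
            raise ValueError("a VM needs at least one vCPU")
        return "SMP" if self.width > 1 else "UP"

def map_vcpus(vms, num_pcpus):
    """Assign every vCPU to a pCPU round-robin (placeholder policy, an assumption)."""
    mapping = {}
    next_slot = 0
    for vm in vms:
        for vcpu in range(vm.width):
            mapping[(vm.name, vcpu)] = next_slot % num_pcpus
            next_slot += 1
    return mapping

# Mirrors the scenario described above: an SMP VM of width 4 and a UP VM
# sharing the same hardware (assumed here to have 4 pCPUs).
vms = [VM("smp_vm", 4), VM("up_vm", 1)]
mapping = map_vcpus(vms, num_pcpus=4)
for vm in vms:
    print(vm.name, vm.kind(), [mapping[(vm.name, v)] for v in range(vm.width)])
```

With five vCPUs and four pCPUs in this example, at least one pCPU backs more than one vCPU, which is why the hypervisor has to multiplex vCPUs onto pCPUs rather than dedicate one to each.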
 