Virtualizing Resources for the Cloud - Large Scale and Big Data: Processing and Management

TABLE 16.1
Qualitative Comparison of Different Emulation Techniques

Technique                         Memory Requirements   Start-Up Performance   Steady-State Performance   Code Portability
Decode-and-dispatch interpreter   Low                   Fast                   Slow                       Good
Indirect-threaded interpreter     Low                   Fast                   Slow                       Good
Direct-threaded interpreter       High                  Slow                   Medium
Binary translation                High                  Very slow              Fast                       Poor

instructions during the program execution and interpret new sections of code incrementally as encountered by the program. This mechanism is denoted as dynamic binary translation [27, 55].
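The incremental behavior described above can be illustrated with a minimal sketch. It assumes a hypothetical toy source ISA with three instruction forms (`add`, `jlt`, and `halt`) and "translates" each basic block into a Python closure the first time the program reaches it; the names `translate_block`, `run`, and `code_cache` are illustrative and do not come from the chapter or from any real translator.

```python
# Toy dynamic translation: blocks are translated only when first encountered,
# then reused from a code cache. The ISA below is a hypothetical example:
#   ("add", n)              acc += n
#   ("jlt", limit, target)  jump to target if acc < limit, else fall through
#   ("halt",)               stop execution

def translate_block(program, start):
    """Translate the basic block starting at `start` into a Python function."""
    total, pc = 0, start
    while program[pc][0] == "add":
        total += program[pc][1]      # fold straight-line adds at translation time
        pc += 1
    terminator = program[pc]
    fallthrough = pc + 1

    if terminator[0] == "jlt":
        _, limit, target = terminator

        def block(acc):
            acc += total
            return acc, (target if acc < limit else fallthrough)
    else:  # "halt"
        def block(acc):
            return acc + total, None  # None signals the end of the program
    return block


def run(program):
    code_cache = {}                   # block start address -> translated block
    translations = executions = 0
    acc, pc = 0, 0
    while pc is not None:
        if pc not in code_cache:      # new section of code: translate it now
            code_cache[pc] = translate_block(program, pc)
            translations += 1
        acc, pc = code_cache[pc](acc)
        executions += 1
    return acc, translations, executions


program = [
    ("add", 2),        # address 0
    ("add", 3),        # address 1
    ("jlt", 20, 0),    # address 2: loop back while acc < 20
    ("halt",),         # address 3
]
acc, translations, executions = run(program)
```

In this run the loop body executes four times but is translated only once, which is the reason translation cost is paid at start-up rather than in steady state.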

To this end, Table 16.1 qualitatively compares the binary translation, decode-and-dispatch, indirect-threaded, and direct-threaded emulation techniques in terms of four metrics: memory requirements, start-up performance, steady-state performance, and code portability (a quantitative performance evaluation can be found in [52]). To exemplify, the decode-and-dispatch interpreter row reads as follows. First, with decode-and-dispatch, memory requirements remain low because there is only one interpreter routine for each instruction type in the target ISA. In addition, the decode-and-dispatch interpreter does not thread the dispatch code onto the end of each routine, which further reduces the pressure on memory capacity. Second, start-up performance is fast because neither intermediate forms nor cached translated blocks are used. Third, steady-state performance (i.e., the performance after starting up the interpreter) is slow because of (1) the high number of branches and (2) the interpretation of every instruction upon every appearance. Finally, code portability is good because saving the addresses of interpreter routines (as is the case with direct-threaded interpreters) and caching ISA-dependent translated binary code are both avoided.
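A minimal sketch of a decode-and-dispatch loop may help make these properties concrete. It assumes a hypothetical three-instruction accumulator ISA with a 16-bit encoding (opcode in the high byte, operand in the low byte); the opcodes, the encoding, and all function names are illustrative assumptions rather than anything defined in the chapter.

```python
# Toy decode-and-dispatch interpreter for a hypothetical accumulator ISA.
# Encoding (an assumption for this sketch): word = (opcode << 8) | operand.

def op_load(state, operand):
    state["acc"] = operand

def op_add(state, operand):
    state["acc"] += operand

def op_halt(state, operand):
    state["running"] = False

# Exactly one interpreter routine per instruction type keeps memory use low.
ROUTINES = {0: op_load, 1: op_add, 2: op_halt}

def run(program):
    state = {"acc": 0, "pc": 0, "running": True, "decoded": 0}
    while state["running"]:
        word = program[state["pc"]]                # fetch
        opcode, operand = word >> 8, word & 0xFF   # decode, repeated on every visit
        state["pc"] += 1
        state["decoded"] += 1
        ROUTINES[opcode](state, operand)           # dispatch, then return to this loop
    return state

program = [(0 << 8) | 5, (1 << 8) | 7, (2 << 8)]   # load 5; add 7; halt
result = run(program)
```

Nothing is precomputed or cached, so the interpreter starts immediately, but every instruction pays for a decode and a branch back to the central loop each time it is reached.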

16.6.4 Uniprocessor and Multiprocessor VMs

As described earlier in the chapter, a virtual CPU (vCPU) acts as a proxy to a physical CPU (pCPU). In other words, a vCPU is a representation of a pCPU to a guest OS. A vCPU can be initiated within a VM and mapped to an underlying pCPU by the hypervisor. In principle, a VM can have one or many vCPUs. For instance, a VM in VMware ESX 4 can have up to 8 vCPUs [61]. The number of vCPUs in a VM is usually referred to as the width of the VM. A VM with a width greater than 1 is denoted as a Symmetric Multiprocessing (SMP) VM. In contrast, a VM with a width equal to 1 is referred to as a Uniprocessor (UP) VM. Figure 16.16 demonstrates an SMP native system VM with a width of 4 and a UP native system VM, both running on the same hardware.
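The width terminology can be captured in a small data model. The sketch below is illustrative only: the `VM` class, the `create_vm` helper, and the round-robin vCPU-to-pCPU mapping are assumptions made for this example and do not describe how any particular hypervisor assigns vCPUs.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    name: str
    vcpu_to_pcpu: dict = field(default_factory=dict)  # vCPU id -> pCPU id

    @property
    def width(self):
        """Number of vCPUs in the VM."""
        return len(self.vcpu_to_pcpu)

    @property
    def kind(self):
        """'SMP' for a width greater than 1, 'UP' for a width equal to 1."""
        return "SMP" if self.width > 1 else "UP"


def create_vm(name, width, pcpus):
    """Create a VM and map each vCPU to a pCPU (round-robin is an assumption)."""
    if width < 1:
        raise ValueError("a VM needs at least one vCPU")
    mapping = {v: pcpus[v % len(pcpus)] for v in range(width)}
    return VM(name, mapping)


# Two VMs sharing the same four physical CPUs.
pcpus = [0, 1, 2, 3]
smp_vm = create_vm("vm-a", 4, pcpus)
up_vm = create_vm("vm-b", 1, pcpus)
```

Here `smp_vm` has a width of 4 and is classified as SMP, while `up_vm` has a width of 1 and is classified as UP, with both mapped onto the same set of pCPUs.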
