The algorithms used to implement virtual memory can work paradoxically in a
virtual machine environment. Approximations of the LRU replacement algorithm
are used in most virtual memory systems. LRU is based on the expectation that
a page that has not been used recently is unlikely to be used in the near future;
hence it is the best candidate to “replace” by writing its contents to disk so its RAM can
be used for an active working set page. Most operating systems proactively write
LRU pages to disk so that pages will be available when an application has a page
fault and needs to have data paged in. However, when a guest selects its oldest
untouched page for replacement, the act of doing so references that page, making
it the most recently used (MRU) page from the hypervisor's point of view. That has
the effect of making the guest's actively used pages older, in terms of actual memory
access, than pages that the guest hasn't used in a long time! Depending on the
relative page pressure within the guest OS and among the hypervisor's various
guests, a nested LRU system could wind up evicting the very pages it should re-
tain. This behavior has, in fact, been observed under pathological circumstances.
VM/370 ran in extremely memory-constrained environments; thus its develop-
ers designed a scheduler that ran only those virtual machines whose working sets
fit in memory. With this approach, other guests were held in an “eligible” queue
and occupied no RAM at all. As delayed guests aged, they eventually reached
a priority high enough to evict one of the running guests. Essentially, guests
in a memory-constrained system took their turns running, but when
they did run they had enough RAM for their working sets, and so ran efficiently.
A “handshaking” mechanism let the guest and host environments communi-
cate with each other. The VM hypervisor could tell the guest that it had incurred
a page fault, so the guest could dispatch a different user process rather than being
placed into page wait. This approach proved useful for guests running multiple
processes, though it could have the undesired effect of inflating working sets, and
it was unproductive if there was no other runnable process for the guest to run.
Also, a hypercall API let a guest tell the hypervisor that it no longer needed a block
of memory, and that pages in real memory and the swap area backing it could be
released. Freeing these resources reduced the guest's working-set size and
prevented double paging. If the guest subsequently referred to those pages, the
hypervisor gave it new page frames with zeroed contents rather than retrieving
the discarded contents from a swap file.
These methods anticipated cooperative memory management and ballooning
techniques available with VMware and Xen, in which the hypervisor can cause the
guest OS to use a smaller working set during times of high load on real memory.
Typically, a guest daemon allocates a memory buffer and maps it into the guest's
RAM. When memory pressure in the hypervisor is low, the memory buffer shrinks,
so the guest has more pages it can allocate to its applications. When memory pressure
in the hypervisor is high, the daemon is told to increase the size of the buffer, leaving
the guest fewer pages for its applications and so shrinking its working set.
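The balloon mechanism can be sketched in a few lines (a simplified illustration, not VMware's or Xen's driver): the hypervisor sets a balloon target, and whatever the balloon pins is unavailable to the guest's applications.

```python
class BalloonDaemon:
    """Toy balloon daemon: pages pinned in the balloon are unavailable
    to the guest's applications."""
    def __init__(self, guest_pages):
        self.guest_pages = guest_pages
        self.balloon = 0   # pages currently pinned by the balloon

    def set_target(self, pages):
        # The hypervisor tells the daemon how far to inflate or deflate.
        self.balloon = min(pages, self.guest_pages)

    def available_to_apps(self):
        return self.guest_pages - self.balloon

d = BalloonDaemon(guest_pages=1000)
d.set_target(0)     # low host memory pressure: balloon deflated
print(d.available_to_apps())   # → 1000
d.set_target(300)   # high host memory pressure: balloon inflated
print(d.available_to_apps())   # → 700
```

Inflating the balloon forces the guest's own LRU logic to choose what to give up, which is precisely why ballooning avoids the nested-LRU inversion: the guest, not the hypervisor, picks the cold pages.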