The algorithms used to implement virtual memory can work paradoxically in a
virtual machine environment. Approximations of the LRU replacement algorithm
are used in most virtual memory systems. LRU is based on the expectation that
a page that has not been used recently is unlikely to be used in the near future;
hence it is the best candidate to “replace” by writing its contents to disk so its RAM can
be used for an active working set page. Most operating systems proactively write
LRU pages to disk so that pages will be available when an application has a page
fault and needs to have data paged in. However, when a guest selects its oldest
untouched page for replacement, the act of doing so references that page, making
it the most recently used (MRU) page from the hypervisor's point of view. That has
the effect of making the guest's actively used pages older, in terms of actual memory
access, than pages that the guest hasn't used in a long time! Depending on the
relative page pressure within the guest OS and among the hypervisor's various
guests, a nested LRU system could wind up evicting the very pages it should re-
tain. This behavior has, in fact, been observed under pathological circumstances.
VM/370 ran in extremely memory-constrained environments; thus its develop-
ers designed a scheduler that ran only those virtual machines whose working sets
fit in memory. With this approach, other guests were held in an “eligible” queue
and occupied no RAM at all. As delayed guests aged, they eventually reached
a priority high enough to evict one of the running guests. Essentially, guests
in a memory-constrained system took their turns running, but when
they did run they had enough RAM for their working sets, and so ran efficiently.
A “handshaking” mechanism let the guest and host environments communi-
cate with each other. The VM hypervisor could tell the guest that it had incurred
a page fault, so the guest could dispatch a different user process rather than being
placed into page wait. This approach proved useful for guests running multiple
processes, though it could have the undesired effect of inflating working sets, and
it was unproductive if there was no other runnable process for the guest to run.
Also, a hypercall API let a guest tell the hypervisor that it no longer needed a block
of memory, and that pages in real memory and the swap area backing it could be
released. Freeing these resources reduced the guest's working-set size and
prevented double paging. If the guest subsequently referred to those pages, the
hypervisor gave it new page frames with zeroed contents rather than retrieving
the discarded contents from a swap file.
These methods anticipated cooperative memory management and ballooning
techniques available with VMware and Xen, in which the hypervisor can cause the
guest OS to use a smaller working set during times of high load on real memory.
Typically, a guest daemon allocates a memory buffer and maps it into the guest's
RAM. When memory pressure in the hypervisor is low, the memory buffer shrinks,
so the guest has more pages it can allocate to its applications. When memory pressure
in the hypervisor is high, the daemon is told to increase the size of the buffer, leaving
the guest fewer pages for its applications and so shrinking its working set.
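The balloon mechanism can be sketched in a few lines (a simplified illustration, not VMware's or Xen's driver): the hypervisor sets a balloon target, and whatever the balloon pins is unavailable to the guest's applications.

```python
class BalloonDaemon:
    """Toy balloon daemon: pages pinned in the balloon are unavailable
    to the guest's applications."""
    def __init__(self, guest_pages):
        self.guest_pages = guest_pages
        self.balloon = 0   # pages currently pinned by the balloon

    def set_target(self, pages):
        # The hypervisor tells the daemon how far to inflate or deflate.
        self.balloon = min(pages, self.guest_pages)

    def available_to_apps(self):
        return self.guest_pages - self.balloon

d = BalloonDaemon(guest_pages=1000)
d.set_target(0)     # low host memory pressure: balloon deflated
print(d.available_to_apps())   # → 1000
d.set_target(300)   # high host memory pressure: balloon inflated
print(d.available_to_apps())   # → 700
```

Inflating the balloon forces the guest's own LRU logic to choose what to give up, which is precisely why ballooning avoids the nested-LRU inversion: the guest, not the hypervisor, picks the cold pages.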