At this point, you can still manually remedy this imbalance. Any business continuity plan in a virtual environment built on vSphere should include a contingency plan that identifies VMs to be powered off to make resources available for those VMs with higher priority because of the network services they provide. If the budget allows, construct the vSphere HA cluster to ensure that there are ample resources to cover the needs of the critical VMs, even in times of reduced computing capacity. You can enforce guaranteed resource availability for restarting VMs by setting Admission Control to Enabled, as described previously in the section “Configuring vSphere HA Admission Control.”
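The admission control idea above amounts to a capacity check: before a VM powers on, the cluster verifies that enough failover capacity would remain even after losing some hosts. The following is a minimal sketch of that reasoning only, not VMware’s actual slot-based admission control algorithm; the function name and inputs are hypothetical.

```python
def admits_power_on(host_capacities_mhz, reserved_mhz, new_vm_reservation_mhz,
                    host_failures_to_tolerate=1):
    """Sketch of an HA-style admission-control check.

    Refuse the power-on if, after losing the largest
    `host_failures_to_tolerate` hosts (the worst case), the surviving
    capacity could not cover all existing CPU reservations plus the
    new VM's reservation.
    """
    # Sort ascending and drop the largest hosts from the tail.
    surviving = sorted(host_capacities_mhz)[
        : len(host_capacities_mhz) - host_failures_to_tolerate
    ]
    return sum(surviving) >= reserved_mhz + new_vm_reservation_mhz
```

With three 10 GHz hosts tolerating one host failure, only 20 GHz of capacity can be counted on, so a power-on that would push total reservations past that is refused.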
vSphere High Availability Isolation Response
Previously, we introduced FDM as the underpinning of vSphere HA and described how it uses the ESXi
management network to communicate between the master host and all connected slave hosts.
When the vSphere HA master is no longer receiving status updates from a slave host, then the
master assumes that host has failed and instructs the other connected slave hosts to spring into
action to power on all the VMs that the missing node was running.
But what if the node with the missing heartbeat was not really missing? What if the heartbeat was missing but the node was still running? This is the scenario described in the section “Understanding vSphere HA’s Underpinnings” when we discussed the idea of network isolation. When an ESXi host in a vSphere HA-enabled cluster is isolated—that is, it cannot communicate with the master host, with any other ESXi hosts, or with any other network devices—the ESXi host triggers the isolation response configured in the dialog box shown in Figure 7.19. As you can see, the default isolation response for the entire cluster is Leave Powered On. You can change this setting (generally not recommended) either for the entire cluster here or for one or more specific VMs in the VM Overrides section.
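The cluster-default-plus-override behavior just described can be modeled as a simple lookup: each VM uses its own override if one exists and falls back to the cluster-wide setting otherwise. The names below are illustrative placeholders, not vSphere API identifiers.

```python
# Cluster-wide default, as set in the cluster's HA settings dialog.
CLUSTER_DEFAULT_ISOLATION_RESPONSE = "Leave Powered On"

# Per-VM exceptions, as set in the VM Overrides section (hypothetical VM name).
vm_overrides = {
    "critical-db": "Power Off",
}

def isolation_response_for(vm_name):
    """Return the effective isolation response for a VM:
    its override if present, else the cluster default."""
    return vm_overrides.get(vm_name, CLUSTER_DEFAULT_ISOLATION_RESPONSE)
```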
Because vSphere HA uses the ESXi management network as well as connected datastores (via datastore heartbeating) to communicate, network isolation is handled differently starting with vSphere 5.0. In previous versions of vSphere, an isolated host automatically triggered the configured isolation response. A host considered itself isolated when it was not receiving heartbeats from any other hosts and when it could not reach the isolation address (by default, the default gateway on the management network).
Starting with vSphere 5.0, the process for determining whether a host is isolated is only slightly different. A host that is the master looks for communication from its slave hosts; a host running as a slave looks for updates from the master host. In either case, if the master or slave is not receiving any vSphere HA network heartbeat information, it then attempts to contact the isolation address (by default, the default gateway on the management network). If it can reach the default gateway or any additionally configured isolation addresses, the ESXi host considers itself to be in a network partition state and reacts as described in the section titled “Understanding vSphere HA’s Underpinnings.” If the host can’t reach any isolation address, it considers itself isolated. Here is where this behavior diverges from that of previous versions.
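The decision sequence just described (no HA heartbeats, then probe the isolation addresses) can be summarized in a short sketch. The function and state names are illustrative and do not reflect FDM internals.

```python
def classify_host_state(receiving_ha_heartbeats, isolation_address_probes):
    """Sketch of the vSphere 5.0+ isolation/partition determination.

    A host still receiving HA network heartbeats is connected. With no
    heartbeats, reaching any isolation address (the default gateway by
    default) means a network partition; reaching none means the host is
    isolated and triggers its configured isolation response.
    `isolation_address_probes` is a list of booleans, one per probed
    isolation address.
    """
    if receiving_ha_heartbeats:
        return "connected"
    if any(isolation_address_probes):
        return "partitioned"
    return "isolated"
```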
At this point, an ESXi host that has determined it is network-isolated will modify a special bit in the binary host-X-poweron file on all datastores that are configured for datastore heartbeating (more on that in the section titled “Setting vSphere High Availability Datastore Heartbeating”). The master sees that this bit, used to denote isolation, has been set and is therefore notified that this slave host has been isolated. When a master sees that a slave has been