The number of private network connections for a 500-node standard cluster is 124,750, as we discussed at the beginning of this chapter. The number of storage network connections is also reduced from 500 to 25.
This significant reduction of network connections allows us to further scale out the cluster. With this hub-and-spoke topology, the Flex Cluster in Oracle 12cR1 is designed to scale up to 64 Hub nodes and many more Leaf nodes.
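You can confirm whether a cluster is running in Flex mode with the crsctl get cluster mode status command. The output below is illustrative of what a 12cR1 Flex Cluster reports:
$ crsctl get cluster mode status
Cluster is running in "flex" mode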
The Flex Cluster architecture helps to maintain the availability and reliability of the cluster even when the cluster is scaled out to a very large number of nodes. This is achieved by having the OCR and voting disks accessible only to Hub nodes and not to Leaf nodes. For example, we will get the following error messages if we query the voting disks or check the OCR from a Leaf node:
$ crsctl query css votedisk
CRS-1668: operation is not allowed on a Leaf node
$ ocrcheck
PROT-605: The 'ocrcheck' command is not supported from a Leaf node.
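By contrast, the same voting disk query succeeds when run from a Hub node. The following output is only a sketch; the file universal ID, disk path, and disk group name are placeholders:
$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
 1. ONLINE   a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6 (/dev/asm-disk1) [DATA]
Located 1 voting disk(s).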
As shown in the previous 500-node example, in a Flex Cluster there are only a small number of Hub nodes, and the majority of the cluster nodes are Leaf nodes. Because only the Hub nodes access the OCR and voting disks, scaling out a Flex Cluster does not significantly increase resource contention for the OCR and voting disks. As a result, the chance of node eviction caused by such contention does not increase as the cluster is scaled out.
Like a standard cluster, an Oracle Flex Cluster is built with a high-availability design. If a Hub node fails, that node will be evicted from the cluster in the same way as a node in a standard cluster. The services on the failed node will be failed over to another surviving Hub node in the cluster. The Leaf nodes that were connected to the failed Hub node can be reconnected to another surviving Hub node within a grace period. The private interconnect heartbeat between two Hub nodes is the same as the private interconnect heartbeat in a standard cluster. You can check the heartbeat
misscount setting between Hub nodes using the following crsctl command:
$ crsctl get css misscount
CRS-4678: Successful get misscount 30 for Cluster Synchronization Services.
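If a different threshold is required, the misscount value can also be changed with crsctl on a Hub node, given sufficient privileges. This is a sketch; the value 45 is only an example, and the output should look similar to the following:
$ crsctl set css misscount 45
CRS-4684: Successful set of parameter misscount to 45 for Cluster Synchronization Services.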
If a Leaf node fails, that node will be evicted from the cluster. The services running on the failed Leaf node are failed over to other Leaf nodes that are connected to the same Hub node. This failover mechanism keeps the failover within the group of Leaf nodes that are connected to the same Hub node. In this way, the rest of the cluster nodes are not impacted by the Leaf node's failure.
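To see which role each node currently plays in the cluster, you can query the node roles with crsctl. The node names in this sketch are placeholders:
$ crsctl get node role status -all
Node 'racnode1' active role is 'hub'
Node 'racnode2' active role is 'hub'
Node 'racnode3' active role is 'leaf'
Node 'racnode4' active role is 'leaf'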
The network heartbeat is used to maintain network connectivity between a Leaf node and the Hub node to which the Leaf node connects. Similar to the private interconnect heartbeat between the Hub nodes, the maximum time that a heartbeat failure is tolerated is defined by the leafmisscount setting, which is 30 seconds by default. If a heartbeat failure exceeds the leafmisscount setting, the Leaf node will either be reconnected to another Hub node or be evicted from the cluster. You can query this setting by running this command:
$ crsctl get css leafmisscount
CRS-4678: Successful get leafmisscount 30 for Cluster Synchronization Services
You can also manually reset this setting by running this command on a Hub node:
$ crsctl set css leafmisscount 40
CRS-4684: Successful set of parameter leafmisscount to 40 for Cluster Synchronization Services.
You cannot reset this setting from a Leaf node:
$ crsctl set css leafmisscount 40
CRS-1668: operation is not allowed on a Leaf node
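If you are not sure whether the node you are logged in to is a Hub node or a Leaf node, you can check its configured role before running such commands. The node name below is a placeholder:
$ crsctl get node role config
Node 'racnode3' configured role is 'leaf'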