Overview of Oracle RAC - Expert Oracle RAC 12c

Achieving the Benefits of Oracle RAC

In the last few sections we have examined the architecture of Oracle RAC and its two major components: Oracle

Clusterware and Oracle RAC Database. In this section, we discuss how Oracle RAC technology achieves high availability (HA) and

scalability of Oracle Database.

High Availability Against Unplanned Downtime

The Oracle RAC solution prevents unplanned downtime of the database service due to server hardware failure or

software failure. In the Oracle RAC environment, Oracle Clusterware and Oracle RAC work together to allow the

Oracle Database to run across multiple clustered servers. In the event of a database instance failure, no matter

whether the failure is caused by server hardware failure or an OS or Oracle Database software crash, this clusterware

provides the high availability and redundancy to protect the database service by failing over the user connections on

the failed instance to other database instances.

Both Oracle Clusterware and Oracle RAC contribute to this high availability database configuration. Oracle

Clusterware includes the High Availability (HA) service stack which provides the infrastructure to manage the Oracle

Database as a resource in the cluster environment. With this service, Oracle Clusterware is responsible for restarting

the database resource every time a database instance fails or after a RAC node restarts. In the Oracle RAC Database

environment, the Oracle Database along with other resources such as the virtual IP (VIP) are managed and protected

by Oracle Clusterware. In case of a node failure, Oracle Clusterware fails over resources such as the VIP to the

surviving nodes so that applications can detect the node failure quickly without waiting for a TCP/IP timeout. The

application sessions can then be failed over to the surviving nodes by using a connection pool and Transparent Application

Failover (TAF).

If a database instance fails while a session on the instance is in the middle of a DML operation such as inserting,

updating, or deleting, the DML transaction will be rolled back and the session will be reconnected to a surviving node.

The DML of the transaction would then need to be started over. Another great feature of the clusterware is the Oracle

Notification Service (ONS). ONS is responsible for publishing the Up and Down events on which the Oracle Fast

Application Notification (FAN) and Fast Connect Failover (FCF) rely to provide users with fast connection failover to

the surviving instance during a database instance failure.
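The rollback-and-restart behavior described above can be sketched on the client side as a retry loop. This is a minimal illustration, not a real driver API: `InstanceFailure`, `FakeConnection`, and `run_with_retry` are hypothetical names, and the fake connection only imitates the loss of uncommitted work when an instance fails.

```python
class InstanceFailure(Exception):
    """Hypothetical error: the instance serving this session has failed."""


class FakeConnection:
    """Stand-in for a database session (assumption, not a real driver)."""

    def __init__(self, instance, fail_after=None):
        self.instance = instance
        self.fail_after = fail_after  # fail once this many rows are pending
        self.pending = []             # uncommitted rows
        self.committed = []           # rows made durable by commit()

    def insert(self, row):
        if self.fail_after is not None and len(self.pending) >= self.fail_after:
            self.pending.clear()      # uncommitted DML is rolled back
            raise InstanceFailure(self.instance)
        self.pending.append(row)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()


def run_with_retry(connect, work, max_attempts=3):
    """Run work(conn) as one transaction, restarting it after a failure."""
    for attempt in range(1, max_attempts + 1):
        conn = connect()  # assumed to reach a surviving instance
        try:
            work(conn)
            conn.commit()
            return conn
        except InstanceFailure:
            # Nothing from the failed attempt survives, so the whole
            # transaction is started over from its first statement.
            if attempt == max_attempts:
                raise


# Demo: the first session fails mid-transaction, the second succeeds.
sessions = iter([FakeConnection("inst1", fail_after=1), FakeConnection("inst2")])


def insert_rows(conn):
    conn.insert("row-a")
    conn.insert("row-b")


final = run_with_retry(lambda: next(sessions), insert_rows)
print(final.instance, final.committed)  # inst2 ['row-a', 'row-b']
```

The point of the sketch is that the retry replays the complete transaction, not only the statement that was in flight when the instance failed.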

Oracle RAC database software is cluster-aware. It allows Oracle RAC instances to detect an instance failure.

Once an instance failure is detected, the RAC instances communicate with each other and reconfigure the cluster

accordingly. The instance failure event triggers the reconfiguration of instance resources. During the instances'

startup, these instance resources were distributed across all the instances using a hashing algorithm. When an

instance is lost, the reconfiguration assigns a new master instance for those resources that used the failed instance

as the master instance. This reconfiguration ensures that the RAC cache fusion survives the instance failure. The

reconfiguration is also needed when an instance rejoins the cluster once the failed server is back online, as this allows

further redistribution of the mastership with the newly joined instance. However, the reconfiguration that occurs

when an instance joins takes less work than the one that occurs when an instance leaves, because when an instance

leaves the cluster, the suspected resources need to be replayed and the masterships need to be re-established.
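As a rough illustration of hash-based mastership and of reconfiguration after an instance is lost, consider the toy model below. The text does not describe Oracle's actual hashing algorithm, so `zlib.crc32` is only a stand-in, and the function names and the `block-N` resource names are hypothetical.

```python
import zlib


def pick_master(resource, instances):
    """Map a resource name to one instance with a deterministic hash."""
    return instances[zlib.crc32(resource.encode()) % len(instances)]


def initial_mastership(resources, instances):
    """Distribute resource mastership across all instances at startup."""
    return {r: pick_master(r, instances) for r in resources}


def reconfigure_after_failure(masters, failed, survivors):
    """Reassign only the resources that were mastered by the failed instance."""
    return {
        r: pick_master(r, survivors) if m == failed else m
        for r, m in masters.items()
    }


instances = ["inst1", "inst2", "inst3"]
resources = [f"block-{n}" for n in range(12)]

masters = initial_mastership(resources, instances)
after = reconfigure_after_failure(masters, "inst2", ["inst1", "inst3"])

for r in resources:
    note = "  (remastered)" if masters[r] != after[r] else ""
    print(r, masters[r], "->", after[r], note)
```

In this sketch, resources mastered by a surviving instance keep their master, and only those that pointed at the failed instance are reassigned.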

Dynamic Resource Mastering (DRM) is different from reconfiguration. DRM is a feature of the Global Cache Service (GCS) that changes the master

instance of a resource based on resource affinity. When the instance is running on an affinity-based configuration,

DRM remasters the resource to another instance if the resource is accessed more often from another node. Therefore,

DRM occurs when the instance has a higher affinity to some resources than to others, whereas reconfiguration occurs

when an instance leaves or joins the cluster.
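The affinity idea behind DRM can be sketched with per-instance access counters. This is a simplified model under stated assumptions: the `min_ratio` threshold, the function name, and the sample counts are hypothetical and do not reflect Oracle's internal remastering policy.

```python
from collections import Counter


def remaster_by_affinity(masters, access_counts, min_ratio=2.0):
    """Return a new master map based on observed access affinity.

    A resource moves when another instance accesses it at least
    min_ratio times as often as its current master (assumed policy).
    """
    updated = dict(masters)
    for resource, counts in access_counts.items():
        current = masters[resource]
        busiest, hits = counts.most_common(1)[0]
        if busiest != current and hits >= min_ratio * max(counts[current], 1):
            updated[resource] = busiest
    return updated


masters = {"orders": "inst1", "customers": "inst2"}
access_counts = {
    "orders": Counter({"inst1": 10, "inst2": 90}),     # strong affinity to inst2
    "customers": Counter({"inst1": 45, "inst2": 40}),  # no clear affinity
}

print(remaster_by_affinity(masters, access_counts))
# {'orders': 'inst2', 'customers': 'inst2'}
```

Unlike reconfiguration, nothing here depends on an instance joining or leaving. Mastership moves purely because of where the accesses come from.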

In the Oracle 12c Flex Cluster configuration, a Leaf Node connects to the cluster through a Hub Node. The failure

of the Hub Node or the failure of the network between the Hub Node and the Leaf Nodes results in the node eviction of the

associated Leaf Nodes. In Oracle RAC 12cR1, because no user database sessions connect to Leaf Nodes,

the failure of a Leaf Node will not directly cause user connection failure. The failure of the Hub Node is handled in

essentially the same way as the failover mechanism of a cluster node in 11gR2.
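The Hub and Leaf relationship described above can be pictured as a simple mapping. The sketch below models only the statement in the text that a Hub Node failure evicts its associated Leaf Nodes. The node names and the function name are hypothetical.

```python
def leaves_evicted_by_hub_failure(leaf_to_hub, failed_hub):
    """Return the Leaf Nodes attached to the cluster through the failed Hub Node."""
    return sorted(leaf for leaf, hub in leaf_to_hub.items() if hub == failed_hub)


# Assumed topology: each Leaf Node is attached through exactly one Hub Node.
leaf_to_hub = {"leaf1": "hub1", "leaf2": "hub1", "leaf3": "hub2"}

print(leaves_evicted_by_hub_failure(leaf_to_hub, "hub1"))  # ['leaf1', 'leaf2']
```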
