Achieving the Benefits of Oracle RAC
In the last few sections we have examined the architecture of Oracle RAC and its two major components: Oracle
Clusterware and Oracle RAC Database. In this section, we discuss how Oracle RAC technology achieves HA and
scalability of Oracle Database.
High Availability Against Unplanned Downtime
The Oracle RAC solution prevents unplanned downtime of the database service due to server hardware failure or
software failure. In the Oracle RAC environment, Oracle Clusterware and Oracle RAC work together to allow the
Oracle Database to run across multiple clustered servers. If a database instance fails, whether because of a server
hardware failure or an OS or Oracle Database software crash, the clusterware provides the high availability and
redundancy needed to protect the database service by failing over the user connections on the failed instance to
other database instances.
Both Oracle Clusterware and Oracle RAC contribute to this high availability database configuration. Oracle
Clusterware includes the High Availability (HA) service stack which provides the infrastructure to manage the Oracle
Database as a resource in the cluster environment. With this service, Oracle Clusterware is responsible for restarting
the database resource every time a database instance fails or after a RAC node restarts. In the Oracle RAC Database
environment, the Oracle Database along with other resources such as the virtual IP (VIP) are managed and protected
by Oracle Clusterware. In case of a node failure, Oracle Clusterware fails over resources such as the VIP to the
surviving nodes so that applications can detect the node failure quickly without waiting for a TCP/IP timeout. The
application sessions can then be failed over to the surviving nodes by means of connection pooling and Transparent
Application Failover (TAF).
If a database instance fails while a session on that instance is in the middle of a DML operation such as an insert,
update, or delete, the DML transaction is rolled back and the session is reconnected to a surviving node. The DML of
the transaction then needs to be started over. Another useful feature of the clusterware is the Oracle Notification
Service (ONS). ONS is responsible for publishing the Up and Down events on which Oracle Fast Application
Notification (FAN) and Fast Connect Failover (FCF) rely to give users fast connection failover to a surviving
instance during a database instance failure.
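The point that rolled-back DML has to be started over by the application can be sketched as a small retry loop. This is only an illustration under stated assumptions: `InstanceFailure`, `connect`, and `work` are hypothetical names, not part of any Oracle driver, and a real application would catch whatever error its driver reports when the instance is lost.

```python
class InstanceFailure(Exception):
    """Hypothetical error raised when the connected instance dies mid-transaction."""


def run_transaction(connect, work, max_attempts=3):
    """Run work(conn) as one transaction, restarting it from scratch on instance failure.

    connect: hypothetical callable returning a connection to a surviving instance.
    work: callable that issues the DML statements on the given connection.
    """
    for attempt in range(1, max_attempts + 1):
        conn = connect()
        try:
            result = work(conn)
            conn.commit()
            return result
        except InstanceFailure:
            # The uncommitted DML has been rolled back by the database;
            # nothing is replayed automatically, so loop and redo all of it.
            if attempt == max_attempts:
                raise
```

The whole unit of work is passed in as a callable so that every statement of the transaction is reissued on the new connection, not just the statement that was in flight when the instance failed.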
Oracle RAC database software is cluster-aware. It allows Oracle RAC instances to detect an instance failure.
Once an instance failure is detected, the RAC instances communicate with each other and reconfigure the cluster
accordingly. The instance failure event triggers the reconfiguration of instance resources. During instance
startup, these instance resources were distributed across all the instances using a hashing algorithm. When an
instance is lost, the reconfiguration assigns a new master instance to those resources that used the failed instance
as their master. This reconfiguration ensures that RAC cache fusion survives the instance failure. A
reconfiguration is also needed when an instance rejoins the cluster once the failed server is back online, as this
allows the mastership to be redistributed to include the newly joined instance. However, the reconfiguration
triggered by a joining instance takes less work than the one triggered by a leaving instance: when an instance
leaves the cluster, the suspected resources need to be replayed and the masterships need to be re-established.
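The mastering scheme described above can be illustrated with a toy model. The sketch below assumes a simple CRC32-modulo hash and made-up resource and instance names; the text does not say which hashing algorithm Oracle actually uses, so this shows only the idea that resources are spread across instances by a hash and that only the resources mastered by the failed instance receive a new master.

```python
import zlib


def master_for(resource, instances):
    """Pick a master instance for a resource by hashing its name (illustrative only)."""
    ordered = sorted(instances)
    return ordered[zlib.crc32(resource.encode()) % len(ordered)]


def assign_masters(resources, instances):
    """Initial distribution at startup: every resource gets a hashed master."""
    return {r: master_for(r, instances) for r in resources}


def remaster_after_failure(masters, failed, survivors):
    """Reassign only the resources whose master was the failed instance."""
    return {
        r: (master_for(r, survivors) if m == failed else m)
        for r, m in masters.items()
    }


# Hypothetical three-instance cluster losing inst2.
resources = [f"block-{i}" for i in range(8)]
masters = assign_masters(resources, ["inst1", "inst2", "inst3"])
after = remaster_after_failure(masters, "inst2", ["inst1", "inst3"])
```

Resources that were mastered by a surviving instance keep their master in `after`, which mirrors the text's statement that only resources mastered on the failed instance are reassigned.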
Dynamic Resource Mastering (DRM) is different from reconfiguration. DRM is a feature of the Global Cache Service
(GCS) that changes the master instance of a resource based on resource affinity. When the instance is running in an
affinity-based configuration, DRM remasters a resource to another instance if the resource is accessed more often
from that other node. Therefore, DRM occurs when an instance has a higher affinity to some resources than to others,
whereas reconfiguration occurs when an instance leaves or joins the cluster.
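To make the contrast with reconfiguration concrete, affinity-based remastering can be modeled as a counter of accesses per instance. This is a minimal sketch, not Oracle's implementation: the `AffinityTracker` class and its `threshold` parameter are hypothetical, and the real decision criteria are not described in the text.

```python
from collections import Counter, defaultdict


class AffinityTracker:
    """Toy model of affinity-based remastering; the threshold is a made-up parameter."""

    def __init__(self, masters, threshold=3):
        self.masters = dict(masters)          # resource -> current master instance
        self.threshold = threshold            # hypothetical lead required to remaster
        self.accesses = defaultdict(Counter)  # resource -> per-instance access counts

    def record_access(self, resource, instance):
        """Count one access and remaster if another instance dominates."""
        counts = self.accesses[resource]
        counts[instance] += 1
        master = self.masters[resource]
        busiest, hits = counts.most_common(1)[0]
        if busiest != master and hits - counts[master] >= self.threshold:
            self.masters[resource] = busiest
        return self.masters[resource]
```

No instance joins or leaves in this model; the master changes purely because of the access pattern, which is the distinction the paragraph above draws.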
In the Oracle 12c Flex Cluster configuration, a Leaf node connects to the cluster through a Hub node. The failure
of a Hub node, or of the network between the Hub node and its Leaf nodes, results in the eviction of the
associated Leaf nodes. In Oracle RAC 12cR1, since no user database sessions connect to Leaf nodes,
the failure of a Leaf node does not directly cause user connection failures. The failure of a Hub node is handled in
essentially the same way as the failure of a cluster node in 11gR2.
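The eviction rule for Leaf nodes can be written down as a small function. This is a sketch under stated assumptions: the node names and the `evicted_nodes` helper are hypothetical, and the model covers only the two causes named in the text (a failed Hub node, or a failed network path between a Hub node and a Leaf node).

```python
def evicted_nodes(leaf_to_hub, failed_hubs=(), broken_links=()):
    """Return the Leaf nodes evicted when Hub nodes fail or Hub-Leaf links break.

    leaf_to_hub: mapping of Leaf node name -> Hub node it attaches through.
    failed_hubs: Hub nodes that are down.
    broken_links: (hub, leaf) pairs whose network path has failed.
    """
    failed = set(failed_hubs)
    broken = set(broken_links)
    return sorted(
        leaf
        for leaf, hub in leaf_to_hub.items()
        if hub in failed or (hub, leaf) in broken
    )
```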
 