Oracle uses a fencing technique called STONITH (Shoot The Other Node In The Head), in which the healthy
nodes evict the sick node by having the sick node reboot itself. Since 11.2.0.2, which introduced reboot-less
node eviction, the node reboot can in some cases be avoided by just shutting down and restarting the Clusterware.
While Oracle Clusterware guarantees interconnect communication among the RAC nodes, Oracle RAC provides
coordination, synchronization, and data exchange between the RAC instances over the interconnect.
In the Oracle RAC environment, all the instances of a RAC database appear to access a common global buffer
cache where the query on each instance can get the up-to-date copy of a data block, also called the “master copy,”
even though the block has been recently updated by another RAC instance. This is called global cache coherency.
In this global cache, since resources such as data blocks are shared by the database processes within a RAC instance
and across all RAC instances, coordination of access to the resources is needed across all instances. Coordination
of access to these resources within a RAC instance is done with latches and locks, which are the same as those in a
single-instance database. Oracle cache fusion technology is responsible for coordination and synchronization of
access to these shared resources between RAC instances to achieve global cache coherency:
1. Access to shared resources between instances is coordinated and protected by the global
   locks between the instances.
2. Although the actual buffer cache of each instance still remains separate, each RAC
   instance can get the master copy of the data block from another instance's cache by
   transferring the data block from the other cache through the private interconnect.
Oracle Cache Fusion has gone through several major enhancements in various versions of Oracle Database.
Before the Cache Fusion technology was introduced in Oracle 8.1.5, the shared disk was used to synchronize the
updates: one instance had to write the updated data block to storage immediately after the block was updated in
the buffer cache so that the other instance could read the latest version of the data block from the shared disk.
In Oracle 8.1.5, Cache Fusion I was introduced to allow the Consistent Read version of the data block to be
transferred across the interconnect. Oracle 9i introduced Cache Fusion II to dramatically reduce latency for the
write-write operations. With Cache Fusion II, if instance A needs to update a data block that happens to be owned
by instance B, instance A requests the block through the Global Cache Service (GCS); instance B is notified
by the GCS, releases ownership of the block, and sends the block to instance A through the interconnect.
This process avoids the disk write by instance B and the disk read by instance A that were
required prior to Oracle 9i. That write-then-read round trip through the shared disk was called a disk ping and was
highly inefficient for write operations spanning multiple instances.
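The difference between a disk ping and a Cache Fusion II transfer can be sketched with a toy model. This is only an illustration of the I/O counts described above, not Oracle's implementation: the `Instance` and `Cluster` classes, their method names, and the counters are all hypothetical, and the GCS messaging itself is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Toy RAC instance (hypothetical): a name plus a private buffer cache."""
    name: str
    cache: dict = field(default_factory=dict)  # block id -> block contents


@dataclass
class Cluster:
    """Toy cluster (hypothetical): shared disk plus simple activity counters."""
    disk: dict = field(default_factory=dict)
    disk_ios: int = 0
    interconnect_transfers: int = 0

    def disk_ping(self, block_id, holder, requester):
        """Pre-9i path: the holder writes the block, the requester reads it back."""
        self.disk[block_id] = holder.cache.pop(block_id)  # disk write by holder
        self.disk_ios += 1
        requester.cache[block_id] = self.disk[block_id]   # disk read by requester
        self.disk_ios += 1

    def cache_fusion_transfer(self, block_id, holder, requester):
        """Cache Fusion II path: the holder releases the block and ships it directly."""
        requester.cache[block_id] = holder.cache.pop(block_id)
        self.interconnect_transfers += 1


# Instance B owns block 42; instance A wants to update it.
a, b = Instance("A"), Instance("B", cache={42: "updated row data"})
cluster = Cluster()
cluster.cache_fusion_transfer(42, holder=b, requester=a)
print(cluster.disk_ios, cluster.interconnect_transfers)
```

In this model the Cache Fusion II path moves the block with no disk I/O at all, whereas `disk_ping` costs one write and one read for the same handoff.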
Since the introduction of Cache Fusion II, coordination and synchronization between
the RAC database instances have been achieved by two RAC services: the Global Cache Service (GCS) and the Global
Enqueue Service (GES), along with a central repository called the Global Resource Directory (GRD). These two
services are an integral part of Oracle RAC, and they rely on the clusterware and private interconnects for
communications between RAC instances. Both GES and GCS coordinate access to shared resources by RAC instances.
GES manages enqueue resources such as the global locks between the RAC instances, and the GCS controls global
access to data block resources to implement global cache coherency.
Let's look at how these three components work together to implement global cache coherency and to coordinate
access to resources across all the RAC instances.
In Oracle RAC, multiple database instances share access to resources such as data blocks in the buffer cache
and enqueues. Access to these shared resources between RAC instances needs to be coordinated to avoid conflict.
In order to coordinate and manage shared access to these resources, information such as data block ID, which RAC
instance holds the current version of this data block, and the lock mode in which this data block is held by each
instance is recorded in a special place called the Global Resource Directory (GRD). This information is used and
maintained by GCS and GES for global cache coherency and coordination of access to resources such as data blocks
and locks.
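As a rough mental model, the information recorded for each resource can be pictured as a small record keyed by data block ID. The sketch below is an assumption-laden illustration, not the actual GRD layout: the class names, fields, and methods are hypothetical, and the three lock modes (null, shared, exclusive) are used only as example values.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class LockMode(Enum):
    """Example lock modes, used here purely for illustration."""
    NULL = "N"
    SHARED = "S"
    EXCLUSIVE = "X"


@dataclass
class GrdEntry:
    """Hypothetical GRD record for one data block."""
    block_id: int
    current_holder: Optional[str] = None  # instance holding the current version
    modes: Dict[str, LockMode] = field(default_factory=dict)  # instance -> mode


class GlobalResourceDirectory:
    """Toy directory: maps a block ID to its GrdEntry."""

    def __init__(self):
        self._entries: Dict[int, GrdEntry] = {}

    def record(self, block_id, instance, mode, holds_current=False):
        """Record the mode in which an instance holds a block."""
        entry = self._entries.setdefault(block_id, GrdEntry(block_id))
        entry.modes[instance] = mode
        if holds_current:
            entry.current_holder = instance
        return entry

    def lookup(self, block_id):
        """Return the entry for a block, or None if the block is not tracked."""
        return self._entries.get(block_id)


grd = GlobalResourceDirectory()
grd.record(42, "B", LockMode.EXCLUSIVE, holds_current=True)
grd.record(42, "A", LockMode.NULL)
print(grd.lookup(42))
```

A lookup on block 42 then answers the two questions the text mentions: which instance holds the current version, and in which mode each instance holds the block.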
The GRD tracks the mastership of the resources, and the contents of the GRD are distributed across all the RAC
instances, with the amount being equally divided across the RAC instances using a mod function when all the nodes
of the cluster are homogeneous. The RAC instance that holds the GRD entry for a resource is the master instance of
 