The OCR and voting disk volume holds the OCR and the voting disk files. The OCR stores the metadata of the
resources that Oracle Clusterware manages, such as the Oracle RAC Database, listeners, virtual IPs, and services. The
voting disk files are used to store and manage the node membership of the cluster. Oracle Clusterware processes
such as Cluster Ready Services (CRS) and Cluster Synchronization Services (CSS) on each RAC node constantly access
the OCR and voting disks, so the OCR and voting disks need to be accessible from every RAC node at all times. If a RAC
node fails to access the voting disk files within 200 seconds, a node eviction event is triggered that causes the RAC node to
reboot itself. Therefore, the critical requirements for the OCR and voting disks are the availability and fault tolerance of the
storage volume.
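On a running cluster, the standard crsctl and ocrcheck utilities report where the voting disks and OCR live and what the CSS disk timeout is. The short Python wrapper below is only an illustrative sketch; the commands themselves are the point, and they assume a Grid Infrastructure environment (a complete OCR check also needs appropriate privileges).

```python
# Illustrative sketch: query Clusterware for voting disk and OCR status.
# Assumes crsctl and ocrcheck are on the PATH of a Grid Infrastructure node.
import subprocess

def run(cmd):
    """Run a Clusterware command and return its standard output."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

if __name__ == "__main__":
    print(run(["crsctl", "query", "css", "votedisk"]))    # voting disk file locations
    print(run(["ocrcheck"]))                              # OCR integrity and backing devices
    print(run(["crsctl", "get", "css", "disktimeout"]))   # CSS disk timeout, 200 seconds by default
```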
To understand how the Oracle RAC Database accesses shared storage, let's review the Oracle RAC Database
architecture. Figure 5-1 shows a two-node RAC Database. Each of the RAC Database instances has a set of database
background processes, such as the log writer (LGWR), database writers (DBWn), and server processes, along with
RAC-specific processes such as LMS, LMON, and LMD. Each RAC instance also has its own memory structures, including
the System Global Area (SGA), where the database buffer cache and the redo log buffer are located. The
database buffer cache is the memory area that stores copies of data blocks read from the data files. The redo log buffer is a
circular buffer in the SGA that stores redo entries describing changes made to the database.
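To make the idea of a circular buffer concrete, here is a toy Python sketch showing how a fixed-size buffer wraps around and reuses its slots. It is only an illustration of the concept, not Oracle's redo log buffer implementation; in the real buffer, LGWR writes the entries to disk before their space is reused.

```python
class ToyRedoBuffer:
    """Toy illustration of a circular buffer: writes wrap around to the
    start once the end is reached. (Not Oracle's implementation; in the
    real log buffer, LGWR writes entries out before slots are reused.)"""

    def __init__(self, size):
        self.slots = [None] * size
        self.pos = 0

    def add(self, redo_entry):
        self.slots[self.pos] = redo_entry
        self.pos = (self.pos + 1) % len(self.slots)   # wrap to the beginning

buf = ToyRedoBuffer(4)
for n in range(6):                 # the 5th and 6th entries reuse slots 0 and 1
    buf.add(f"change {n}")
print(buf.slots)                   # ['change 4', 'change 5', 'change 2', 'change 3']
```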
When a user sends a query to the database instance, a server process is spawned to execute the query.
The block request is sent to the block's master instance to check whether the block has already been read into any instance's
buffer cache. If the block cannot be found in any instance's buffer cache, the server process has to read the data block
from the data files on storage into the local buffer cache through data file read operations. If the data block is found
in the buffer cache of one or more RAC instances, the requesting instance asks the Global Cache Service (GCS),
implemented by the LMS process, for the latest copy of the block. If the latest copy is on a remote instance, the copy is
shipped from the buffer cache of the remote instance to the local buffer cache. In this way, Oracle cache fusion moves
current blocks between the RAC instances. As long as a block is in some instance's buffer cache, all other instances can
get the latest copy of the block from that buffer cache instead of reading it from storage.
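The read path just described can be summarized in pseudocode. The Python sketch below is a deliberately simplified illustration of that flow; it is not Oracle's implementation, and all of the names in it are invented.

```python
def read_block(block_id, local_instance, master_of, storage):
    """Simplified sketch of the cache-fusion read path described above.
    All names here are illustrative, not Oracle internals."""
    # 1. Already in the local buffer cache? Use it directly.
    if block_id in local_instance.buffer_cache:
        return local_instance.buffer_cache[block_id]

    # 2. Ask the block's master instance (via GCS/LMS) which instance,
    #    if any, currently holds the latest copy of the block.
    holder = master_of(block_id).lookup_current_holder(block_id)

    if holder is not None:
        # 3. Ship the latest copy from the remote instance's buffer
        #    cache across the interconnect (cache fusion) ...
        block = holder.buffer_cache[block_id]
    else:
        # 4. ... otherwise fall back to a physical read from the data
        #    files on shared storage.
        block = storage.read(block_id)

    local_instance.buffer_cache[block_id] = block
    return block
```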
Different types of application workloads determine the way in which RAC Database instances interact with the
storage. For example, data file reads can be “random reads” or “sequential reads.” For online transaction processing
(OLTP) database workloads, most queries involve small random reads of the data files by taking advantage of
index scans. For data warehouse or decision support (DSS) workloads, the queries involve large sequential reads of the
data files due to large full-table scan operations. To achieve optimal I/O performance for the OLTP workload,
fast I/O operations are critical, as measured by IOPS (I/O operations per second) and I/O latency. The
IOPS number describes I/O throughput, namely how many I/O operations can be performed per second, while I/O
latency is the time it takes to complete a single I/O operation.
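As a rough back-of-the-envelope illustration of how latency and IOPS relate (the figures below are assumptions, not measurements): a device that takes about 5 ms per random I/O can complete roughly 200 I/Os per second for a single outstanding request, so higher throughput requires either lower latency or the ability to service many requests concurrently.

```python
# Rough relationship between I/O latency, concurrency, and IOPS
# (assumed figures, for illustration only).
latency_s = 0.005      # assume 5 ms per random I/O
queue_depth = 8        # assume 8 I/Os can be serviced concurrently

iops_single = 1 / latency_s                   # ~200 IOPS with one I/O at a time
iops_concurrent = queue_depth * iops_single   # ~1,600 IOPS if requests truly overlap
print(iops_single, iops_concurrent)
```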
One way to achieve higher IOPS is to stripe the data across multiple disk drives so that the drives can be read
in parallel. Another, more promising solution is to use Solid State Drives (SSDs), which
can significantly increase IOPS and reduce I/O latency by removing the performance bottleneck created by the
mechanical parts of a traditional hard disk. For DSS workloads, it is important to be able to read large amounts of
data stored contiguously on disk into the buffer cache at high speed, as measured in MBPS (megabytes per second). The
bandwidth of the components linking the server with the storage, such as the HBAs (Host Bus Adapters), the storage network
protocol, the physical connection fabric, and the storage controllers, is the key to this type of performance. In reality,
many database applications mix these two types of workloads. When evaluating storage for the database, its IOPS,
MBPS, and I/O latency should all be checked against the database's I/O requirements.
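To make such an evaluation concrete, the sketch below estimates the aggregate IOPS of a striped volume and the MBPS it delivers for a given I/O size, which are the figures to compare against the database's requirements. All of the numbers are assumptions for illustration, not vendor specifications.

```python
# Estimate aggregate IOPS and MBPS for a striped volume
# (all figures are illustrative assumptions).
def estimate(disks, iops_per_disk, io_size_kb):
    total_iops = disks * iops_per_disk        # reads spread across all stripe members
    mbps = total_iops * io_size_kb / 1024     # throughput = IOPS x I/O size
    return total_iops, mbps

# OLTP-style small random reads: 8 KB I/Os over 8 spindles
print(estimate(disks=8, iops_per_disk=180, io_size_kb=8))     # ~1440 IOPS, ~11 MBPS
# DSS-style large sequential reads: 1 MB I/Os over the same 8 spindles
print(estimate(disks=8, iops_per_disk=100, io_size_kb=1024))  # ~800 IOPS, ~800 MBPS
```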
Besides reading data from storage, writing data to storage is also critical to RAC Database performance. This is
especially true for OLTP-type database workloads. Two important disk write operations are as follows:
writing redo entries from the redo log buffer in the SGA to the online redo log files on storage, by
the log writer process (LGWR).
writing modified blocks (dirty blocks) from the buffer cache to the data files, by the database writer
process (DBWn).
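What matters most about these two writes is their ordering: the redo describing a change must reach the online redo log before the transaction is acknowledged as committed, and before the corresponding dirty block is written to the data files. The toy Python sketch below illustrates that general write-ahead idea; it is not Oracle code, and all of the names in it are invented.

```python
class ToyInstance:
    """Toy sketch of the two write paths described above
    (not Oracle code; the names are invented)."""

    def __init__(self):
        self.redo_buffer = []    # in-memory redo entries (log buffer in the SGA)
        self.dirty_blocks = {}   # modified blocks still only in the buffer cache
        self.redo_log = []       # on-disk online redo log
        self.data_files = {}     # on-disk data blocks

    def change_block(self, block_id, new_value):
        self.redo_buffer.append((block_id, new_value))   # describe the change first
        self.dirty_blocks[block_id] = new_value          # then dirty the cached block

    def _flush_redo(self):
        # "LGWR": write pending redo entries to the online redo log files.
        self.redo_log.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def commit(self):
        # A commit is not acknowledged until its redo is on disk.
        self._flush_redo()

    def write_dirty_blocks(self):
        # "DBWn": the redo covering a block must be on disk before the
        # block itself is written to the data files.
        self._flush_redo()
        self.data_files.update(self.dirty_blocks)
        self.dirty_blocks.clear()
```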
 