The OCR and voting disk volume holds the OCR and the voting disk files. The OCR stores the metadata of the
resources that Oracle Clusterware manages, such as the Oracle RAC Database, listeners, virtual IPs, and services. The
voting disk files are used to store and manage the node membership of the cluster. Oracle Clusterware processes
such as Cluster Ready Services (CRS) and Cluster Synchronization Services (CSS) on each RAC node constantly access
the OCR and voting disks, so the OCR and voting disks need to be accessible from every RAC node at all times. If a RAC
node fails to access the voting disk files within 200 seconds, a node eviction event is triggered that causes the RAC node to
reboot itself. Therefore, the critical requirements for the OCR and voting disks are the availability and fault tolerance of the
storage volume.
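On a running cluster, the standard crsctl and ocrcheck utilities report where the voting disks and OCR live and what the CSS disk timeout is. The short Python wrapper below is only an illustrative sketch; the commands themselves are the point, and they assume a Grid Infrastructure environment (a complete OCR check also needs appropriate privileges).

```python
# Illustrative sketch: query Clusterware for voting disk and OCR status.
# Assumes crsctl and ocrcheck are on the PATH of a Grid Infrastructure node.
import subprocess

def run(cmd):
    """Run a Clusterware command and return its standard output."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

if __name__ == "__main__":
    print(run(["crsctl", "query", "css", "votedisk"]))    # voting disk file locations
    print(run(["ocrcheck"]))                              # OCR integrity and backing devices
    print(run(["crsctl", "get", "css", "disktimeout"]))   # CSS disk timeout, 200 seconds by default
```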
To understand how the Oracle RAC Database accesses shared storage, let's review the Oracle RAC Database
architecture. Figure 5-1 shows a two-node RAC Database. Each of the RAC Database instances has a set of database
background processes, such as the log writer (LGWR), database writers (DBWn), and server processes, along with
RAC-specific processes such as LMS, LMON, and LMD. Each RAC instance also has its own memory structures, including
the System Global Area (SGA), where the database buffer cache and the redo log buffer are located. The
database buffer cache is the memory area that stores copies of data blocks read from the data files. The redo log buffer is a
circular buffer in the SGA that stores redo entries describing changes made to the database.
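To make the idea of a circular buffer concrete, here is a toy Python sketch showing how a fixed-size buffer wraps around and reuses its slots. It is only an illustration of the concept, not Oracle's redo log buffer implementation; in the real buffer, LGWR writes the entries to disk before their space is reused.

```python
class ToyRedoBuffer:
    """Toy illustration of a circular buffer: writes wrap around to the
    start once the end is reached. (Not Oracle's implementation; in the
    real log buffer, LGWR writes entries out before slots are reused.)"""

    def __init__(self, size):
        self.slots = [None] * size
        self.pos = 0

    def add(self, redo_entry):
        self.slots[self.pos] = redo_entry
        self.pos = (self.pos + 1) % len(self.slots)   # wrap to the beginning

buf = ToyRedoBuffer(4)
for n in range(6):                 # the 5th and 6th entries reuse slots 0 and 1
    buf.add(f"change {n}")
print(buf.slots)                   # ['change 4', 'change 5', 'change 2', 'change 3']
```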
When a user sends a query to the database instance, a server process is spawned to execute the query.
The block request is sent to the block's master instance to check whether the block has already been read into any instance's
buffer cache. If the block cannot be found in any instance's buffer cache, the server process has to read the data block
from the data files on storage into the local buffer cache through data file read operations. If the data block is found
in the buffer cache of one or more RAC instances, the requesting instance asks the Global Cache Service (GCS),
implemented by the LMS process, for the latest copy of the block. If the latest copy is on a remote instance, the copy is
shipped from the buffer cache of the remote instance to the local buffer cache. In this way, Oracle cache fusion moves
current blocks between the RAC instances. As long as a block is in some instance's buffer cache, all other instances can
get the latest copy of the block from that buffer cache instead of reading it from storage.
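The read path just described can be summarized in pseudocode. The Python sketch below is a deliberately simplified illustration of that flow; it is not Oracle's implementation, and all of the names in it are invented.

```python
def read_block(block_id, local_instance, master_of, storage):
    """Simplified sketch of the cache-fusion read path described above.
    All names here are illustrative, not Oracle internals."""
    # 1. Already in the local buffer cache? Use it directly.
    if block_id in local_instance.buffer_cache:
        return local_instance.buffer_cache[block_id]

    # 2. Ask the block's master instance (via GCS/LMS) which instance,
    #    if any, currently holds the latest copy of the block.
    holder = master_of(block_id).lookup_current_holder(block_id)

    if holder is not None:
        # 3. Ship the latest copy from the remote instance's buffer
        #    cache across the interconnect (cache fusion) ...
        block = holder.buffer_cache[block_id]
    else:
        # 4. ... otherwise fall back to a physical read from the data
        #    files on shared storage.
        block = storage.read(block_id)

    local_instance.buffer_cache[block_id] = block
    return block
```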
Different types of application workloads determine the way in which RAC Database instances interact with the
storage. For example, data file reads can be “random reads” or “sequential reads.” For online transaction processing
(OLTP) database workloads, most queries involve small random reads of the data files by taking advantage of
index scans. For data warehouse or decision support (DSS) workloads, the queries involve large sequential reads of the
data files due to large full-table scan operations. To achieve optimal I/O performance for the OLTP workload,
fast I/O operations are critical, as measured by IOPS (I/O operations per second) and I/O latency. The
IOPS number describes I/O throughput, namely how many I/O operations can be performed per second, while I/O
latency is the time it takes to complete a single I/O operation.
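As a rough back-of-the-envelope illustration of how latency and IOPS relate (the figures below are assumptions, not measurements): a device that takes about 5 ms per random I/O can complete roughly 200 I/Os per second for a single outstanding request, so higher throughput requires either lower latency or the ability to service many requests concurrently.

```python
# Rough relationship between I/O latency, concurrency, and IOPS
# (assumed figures, for illustration only).
latency_s = 0.005      # assume 5 ms per random I/O
queue_depth = 8        # assume 8 I/Os can be serviced concurrently

iops_single = 1 / latency_s                   # ~200 IOPS with one I/O at a time
iops_concurrent = queue_depth * iops_single   # ~1,600 IOPS if requests truly overlap
print(iops_single, iops_concurrent)
```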
One way to achieve higher IOPS is to stripe the data across multiple disk drives so that the drives can be read
in parallel. Another, more promising solution is to use Solid State Drives (SSDs), which
can significantly increase IOPS and reduce I/O latency by removing the performance bottleneck created by the
mechanical parts of a traditional hard disk. For DSS workloads, it is important to be able to read large amounts of
data stored contiguously on disk into the buffer cache at high speed, as measured in MBPS (megabytes per second). The
bandwidth of the components linking the server with the storage, such as the HBAs (Host Bus Adapters), the storage network
protocol, the physical connection fabric, and the storage controllers, is the key to this type of performance. In reality,
many database applications mix these two types of workloads. When evaluating storage for the database, its IOPS,
MBPS, and I/O latency should all be checked against the database's I/O requirements.
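To make such an evaluation concrete, the sketch below estimates the aggregate IOPS of a striped volume and the MBPS it delivers for a given I/O size, which are the figures to compare against the database's requirements. All of the numbers are assumptions for illustration, not vendor specifications.

```python
# Estimate aggregate IOPS and MBPS for a striped volume
# (all figures are illustrative assumptions).
def estimate(disks, iops_per_disk, io_size_kb):
    total_iops = disks * iops_per_disk        # reads spread across all stripe members
    mbps = total_iops * io_size_kb / 1024     # throughput = IOPS x I/O size
    return total_iops, mbps

# OLTP-style small random reads: 8 KB I/Os over 8 spindles
print(estimate(disks=8, iops_per_disk=180, io_size_kb=8))     # ~1440 IOPS, ~11 MBPS
# DSS-style large sequential reads: 1 MB I/Os over the same 8 spindles
print(estimate(disks=8, iops_per_disk=100, io_size_kb=1024))  # ~800 IOPS, ~800 MBPS
```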
Besides reading data from storage, writing data to storage is also critical to RAC Database performance. This is
especially true for OLTP-type database workloads. Two important disk write operations are as follows:
writing redo entries from the redo log buffer in the SGA to the online redo log files on storage, by
the log writer process (LGWR).
writing modified blocks (dirty blocks) from the buffer cache to the data files, by the database writer
process (DBWn).
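What matters most about these two writes is their ordering: the redo describing a change must reach the online redo log before the transaction is acknowledged as committed, and before the corresponding dirty block is written to the data files. The toy Python sketch below illustrates that general write-ahead idea; it is not Oracle code, and all of the names in it are invented.

```python
class ToyInstance:
    """Toy sketch of the two write paths described above
    (not Oracle code; the names are invented)."""

    def __init__(self):
        self.redo_buffer = []    # in-memory redo entries (log buffer in the SGA)
        self.dirty_blocks = {}   # modified blocks still only in the buffer cache
        self.redo_log = []       # on-disk online redo log
        self.data_files = {}     # on-disk data blocks

    def change_block(self, block_id, new_value):
        self.redo_buffer.append((block_id, new_value))   # describe the change first
        self.dirty_blocks[block_id] = new_value          # then dirty the cached block

    def _flush_redo(self):
        # "LGWR": write pending redo entries to the online redo log files.
        self.redo_log.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def commit(self):
        # A commit is not acknowledged until its redo is on disk.
        self._flush_redo()

    def write_dirty_blocks(self):
        # "DBWn": the redo covering a block must be on disk before the
        # block itself is written to the data files.
        self._flush_redo()
        self.data_files.update(self.dirty_blocks)
        self.dirty_blocks.clear()
```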
 