High Availability Considerations
Oracle RAC provides HA of the database service by reducing unplanned and planned downtime caused by server
failure. But RAC itself doesn't protect the database against other failures, such as storage failure, data corruption,
network failure, human operation error, or even data center failure. To provide complete protection against these
failures, additional measures need to be taken. Oracle MAA (Maximum Availability Architecture) lists the guidelines
and related Oracle technologies needed to protect databases against those failures.
During the deployment of RAC, it is critical to follow HA practices to ensure the stability of the RAC. The most
important hardware components that Oracle RAC relies on are the private network and shared storage. The private
network should be based on a redundant network with two dedicated switches. Chapter 9 discusses the RAC network
in detail. The shared storage access should be based on multiple I/O paths, and the storage disk drives should be set
up with a RAID configuration and Oracle ASM disk mirroring to ensure redundancy. Chapter 5 discusses storage best
practices in detail.
In theory, Oracle RAC protects the database service against failure of up to N-1 servers (where N is the total
number of servers). In reality, if N-1 servers fail, the workload of the entire cluster falls on the
only surviving node, and performance will suffer unless each server leaves (N-1)/N headroom. For
example, a four-node RAC would need 3/4 (75%) headroom on each server, which is not realistic. A more
realistic approach is to ensure that each server in the cluster can handle the failed-over workload in case of
a single server failure. This requires each server to leave only 1/N headroom, so the bigger N is, the less
headroom is needed. The worst case is a two-node RAC, where each server needs to reserve 1/2 (50%) headroom.
For a four-node RAC, only 1/4 (25%) headroom is needed.
Note
CPU headroom is the CPU resource that must be left unused in case of server failure. The less headroom
required, the better the resource utilization on each node.
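The headroom arithmetic above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the original text: the workload is spread evenly across all nodes, and the load of failed nodes is redistributed evenly across the survivors. The function names `required_headroom` and `max_utilization` are hypothetical.

```python
from fractions import Fraction


def required_headroom(nodes: int, failures: int = 1) -> Fraction:
    """Fraction of each node's capacity to leave unused so that the
    surviving nodes can absorb the load of `failures` failed nodes.

    Assumption (illustrative): load is evenly balanced before the failure
    and evenly redistributed afterwards, which gives failures / nodes.
    """
    if nodes < 2:
        raise ValueError("a RAC cluster needs at least two nodes")
    if not 1 <= failures <= nodes - 1:
        raise ValueError("failures must be between 1 and nodes - 1")
    return Fraction(failures, nodes)


def max_utilization(nodes: int, failures: int = 1) -> Fraction:
    """Highest steady-state utilization per node that still leaves the
    required headroom."""
    return 1 - required_headroom(nodes, failures)


if __name__ == "__main__":
    for n in (2, 4, 8):
        single = required_headroom(n)          # survive one server failure
        worst = required_headroom(n, n - 1)    # survive N-1 server failures
        print(f"{n}-node RAC: single-failure headroom {single}, "
              f"N-1 failure headroom {worst}, "
              f"max utilization {float(max_utilization(n)):.0%}")
```

With `failures=1` the function reproduces the 1/N rule, and with `failures=nodes - 1` it reproduces the (N-1)/N case that the text describes as unrealistic.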
Scalability Considerations
Oracle RAC provides database scalability. With the addition of each extra RAC node, the cluster is expected to
increase database performance capability: handling larger workloads or more concurrent users, performing more TPS
(transactions per second) for OLTP, or reducing the average transaction/query response time. However, many RAC
databases may not show linear scalability when adding more RAC nodes. This is because there are many other factors
that are related to database scalability:
1. Poor database design and poorly tuned SQL queries can lead to very costly query plans
that may kill database throughput and significantly increase query response time. Poorly
tuned queries will run just as badly (or even worse) in RAC compared to a single-node
database.
2. Oracle Cache Fusion can add costly performance overhead, including excessive wait time
on data block transfers across the interconnect between RAC nodes during query execution
and database transactions. These wait events are called cluster wait events. Cache Fusion
overhead and cluster wait events may increase when multiple RAC instances access the same
data blocks more frequently. A higher number of RAC nodes also contributes to cluster
waits and adds load on the interconnect. The practical number of RAC nodes is limited by
the bandwidth of the interconnect network, which is less of an issue with the introduction
of high-speed networks such as InfiniBand and 10-40 Gb Ethernet.
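One way to quantify how far a cluster falls short of linear scalability is to compare measured throughput against the ideal of N times the single-node throughput. The sketch below is illustrative only: the function name `scaling_efficiency` is hypothetical, and the TPS figures are made-up placeholders, not measurements from any real system.

```python
def scaling_efficiency(baseline_tps: float, cluster_tps: float, nodes: int) -> float:
    """Ratio of measured cluster throughput to ideal linear throughput.

    1.0 means perfectly linear scaling; lower values indicate overhead
    such as cluster waits or poorly tuned SQL.
    """
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return cluster_tps / (baseline_tps * nodes)


if __name__ == "__main__":
    # Hypothetical TPS per node count, for illustration only.
    measured_tps = {1: 1000.0, 2: 1900.0, 4: 3400.0}
    baseline = measured_tps[1]
    for nodes, tps in measured_tps.items():
        eff = scaling_efficiency(baseline, tps, nodes)
        print(f"{nodes} node(s): {tps:.0f} TPS, efficiency {eff:.0%}")
```

An efficiency that drops as nodes are added suggests that factors like those listed above are limiting throughput.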
 
 