Chapter 3
Testing for Availability
RAC is a clustered database solution that provides two major functions, scalability and availability, in support of
business continuity. We discuss testing for scalability in Chapter 4.
Availability is the ability of the system to provide continuous service when one or more of the components in the
cluster fail. Several components outside of the RAC software and database are part of the cluster and are
prone to failure. In this chapter, we discuss the basic failure points and the best practices to follow to
avoid such failures. Subsequently, we discuss testing the hardware and application for availability.
Points of Failure (Gaps)
All application systems, including database systems, can fail. The reasons for these failures range from natural
disasters to human mistakes. Although most of these failures are beyond human control, it's important to consider
why these failures occur.
RAC is a high-availability solution. There are several potential points of failure in a RAC hardware configuration,
such as the interconnect, which is the primary backbone in a RAC configuration. Because RAC consists of several
instances of Oracle, some of these failure scenarios can also be found in the traditional stand-alone configuration,
whereas others are specific to RAC.
Figure 3-1 illustrates the various areas of the system (O/S, hardware, and Oracle components) that could fail. The
failure scenarios in a six-node configuration, as illustrated in Figure 3-1, are:
1. Interconnect failure
2. Node failure
3. Instance failure
4. Media failure
5. Oracle component failure
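
The five failure scenarios listed above can serve as the skeleton of an availability test plan. The following is a minimal sketch in Python, not part of any Oracle tooling: the class and method names (`FailureScenario`, `AvailabilityTestPlan`, `record`, `untested`, `failed`) are hypothetical and introduced only for illustration. It assumes that each scenario is tested separately and that the tester records whether the service remained available afterward.

```python
from dataclasses import dataclass, field
from enum import Enum


class FailureScenario(Enum):
    """The five failure scenarios listed in this section."""
    INTERCONNECT = "Interconnect failure"
    NODE = "Node failure"
    INSTANCE = "Instance failure"
    MEDIA = "Media failure"
    ORACLE_COMPONENT = "Oracle component failure"


@dataclass
class AvailabilityTestPlan:
    """Hypothetical checklist: maps each tested scenario to the observed outcome."""
    # scenario -> True if the service stayed available during the test
    results: dict = field(default_factory=dict)

    def record(self, scenario: FailureScenario, service_available: bool) -> None:
        """Store the outcome of one availability test."""
        self.results[scenario] = service_available

    def untested(self) -> list:
        """Scenarios that have no recorded outcome yet, in list order."""
        return [s for s in FailureScenario if s not in self.results]

    def failed(self) -> list:
        """Scenarios during which the service did not stay available."""
        return [s for s, ok in self.results.items() if not ok]


if __name__ == "__main__":
    plan = AvailabilityTestPlan()
    plan.record(FailureScenario.NODE, True)
    plan.record(FailureScenario.INSTANCE, True)
    for scenario in plan.untested():
        print(f"Still to test: {scenario.value}")
```

Tracking the untested scenarios explicitly makes it harder to skip a failure point, for example media failure, before declaring the cluster ready.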
 