Testing for Availability - Expert Oracle RAC Performance Diagnostics and Tuning

Chapter 3

Testing for Availability

RAC is a clustered database solution that provides two major functions, scalability and availability, to support business continuity. We discuss testing for scalability in Chapter 4.

Availability is the ability of the system to provide continuous service when one or more of the components in the cluster fail. Several components outside of the RAC software and database are part of the cluster and are prone to failure. In this chapter, we discuss the basic failure points and the best practices to follow to avoid such failures. Subsequently, we discuss testing the hardware and application for availability.
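The definition of availability given above can be illustrated with a toy model. The following Python sketch is not part of any Oracle tooling: the `Cluster` class, its methods, and the node names are hypothetical, and it assumes one instance per node and that service continues as long as at least one instance survives. A real availability test also has to account for shared components such as storage and the interconnect.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a cluster: one database instance per node (an assumption)."""

    nodes: set
    failed: set = field(default_factory=set)

    def fail(self, node: str) -> None:
        """Mark a node (and therefore its instance) as failed."""
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.failed.add(node)

    def surviving(self) -> set:
        """Return the nodes whose instances are still running."""
        return self.nodes - self.failed

    def is_available(self) -> bool:
        """Simplification: the service is available while any instance survives."""
        return bool(self.surviving())


if __name__ == "__main__":
    # Hypothetical six-node cluster; the node names are illustrative only.
    cluster = Cluster(nodes={f"node{i}" for i in range(1, 7)})
    cluster.fail("node2")
    cluster.fail("node5")
    print(sorted(cluster.surviving()))
    print(cluster.is_available())
```

In this model, losing two of six nodes leaves four surviving instances and the service stays available. Only the loss of every node makes `is_available()` return `False`.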

Points of Failure (Gaps)

All application systems, including database systems, can fail. The reasons for these failures range from natural disasters to human mistakes. Although most of these failures are beyond human control, it's important to consider why these failures occur.

RAC is a high availability solution. There are several potential points of failure in a RAC hardware configuration, such as the interconnect, which is the primary backbone of a RAC configuration. Because RAC consists of several instances of Oracle, some of these failure scenarios can also be found in a traditional stand-alone configuration, whereas others are specific to RAC.

Figure 3-1 illustrates the various areas of the system (O/S, hardware, and Oracle components) that could fail. The failure scenarios in the six-node configuration illustrated in Figure 3-1 are:

1. Interconnect failure
2. Node failure
3. Instance failure
4. Media failure
5. Oracle component failure
