When a server or node crashes, all database components configured on that server or node are also prone to
failure. For example, in the previous discussion (Table 3-3), if node prddb1 crashes because the interface to the
storage subsystem fails, ASM on that server fails, which in turn causes the database instance on that
server to fail. To realize the full potential of RAC's features, such failures should be transparent to the user
and should cause minimal transaction loss.
The impact of such interruptions can be minimized by implementing fast application notification (FAN) and/or
transparent application failover (TAF) functionality. OCW has been architected with built-in functionality that
provides three levels of proactive failover and notification:
1. The OCW will automatically fail over any services registered with it to another node or
   instance based on the definitions in the OCR. Services and resources can be registered
   with the OCW using Oracle Enterprise Manager (OEM) and srvctl.
2. The OCW will use the Oracle notification services (ONS) to proactively notify the
   participating client machines of any state changes by sending DOWN and UP FAN events.
   Applications that use Oracle call interface (OCI) calls interpret these events and react
   proactively by routing new connections to the new destinations.
3. Using the policy-managed configuration, rules can be defined across server pools. This
   is done by maintaining a minimum and maximum number of instances in a pool. When a
   member in a pool fails and the pool runs short of the number of members required,
   members from another pool are automatically provisioned (provided all the pool
   management rules are met), and instances are started, to support system availability and
   throughput requirements.
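The pool rule in item 3 can be illustrated with a small simulation. This is a minimal sketch, not how OCW implements provisioning: the ServerPool class, the fail_server and rebalance functions, and the pool and server names other than prddb1 are all hypothetical, and the only rule modeled is that a pool may give up a server as long as it stays at or above its own minimum.

```python
from dataclasses import dataclass, field


@dataclass
class ServerPool:
    """Toy model of a server pool (hypothetical, for illustration only)."""
    name: str
    min_size: int
    max_size: int
    servers: list = field(default_factory=list)

    def __post_init__(self):
        if self.min_size > self.max_size:
            raise ValueError("min_size must not exceed max_size")

    def shortfall(self) -> int:
        # Number of servers missing to reach the pool minimum.
        return max(0, self.min_size - len(self.servers))

    def spare(self) -> int:
        # Number of servers the pool can give up and still meet its minimum.
        return max(0, len(self.servers) - self.min_size)


def fail_server(pools, server):
    """Remove a crashed server from whichever pool holds it."""
    for pool in pools:
        if server in pool.servers:
            pool.servers.remove(server)
            return pool
    raise ValueError(f"unknown server: {server}")


def rebalance(pools):
    """Move spare servers into pools that are below their minimum.

    Returns the list of (server, from_pool, to_pool) moves performed.
    """
    moves = []
    for needy in pools:
        while needy.shortfall() > 0:
            donor = next(
                (p for p in pools if p is not needy and p.spare() > 0), None
            )
            if donor is None:
                break  # no pool can donate without breaking its own minimum
            server = donor.servers.pop()
            needy.servers.append(server)
            moves.append((server, donor.name, needy.name))
    return moves


if __name__ == "__main__":
    oltp = ServerPool("oltp", min_size=2, max_size=3, servers=["prddb1", "prddb2"])
    batch = ServerPool("batch", min_size=1, max_size=2, servers=["prddb3", "prddb4"])
    pools = [oltp, batch]
    fail_server(pools, "prddb1")
    print(rebalance(pools))
```

In this example the oltp pool drops below its minimum of two when prddb1 fails, and the batch pool, which holds one server more than its minimum, donates one. If every other pool were already at its minimum, no move would take place and the shortfall would remain.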
A service is an abstraction layer of a single system image executed against the same database with common
functionality, quality expectations, and priority relative to other services. Examples of services could be payroll,
accounts payable, order entry, and so on.
TAF allows client applications to continue working after the application loses its connection to the database.
Although users may experience a brief pause while the database service fails over to a surviving cluster node,
the session context is preserved. When TAF is configured, after the instance failover and database recovery complete,
the application can automatically reconnect to one of the surviving instances and continue operations as if no failure
had occurred.
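The reconnect-and-continue behavior can be approximated at the application level with a retry loop. The following is a minimal sketch and not TAF itself: TAF is handled inside the client driver, whereas here the ConnectionLost exception, the run_with_failover function, and the connect and work callables are all hypothetical stand-ins, and the unit of work is simply rerun on a new connection rather than resumed.

```python
import time


class ConnectionLost(Exception):
    """Raised by the hypothetical driver when its instance goes away."""


def run_with_failover(connect, work, retries=3, delay=0.0):
    """Run work(conn); on ConnectionLost, reconnect and run it again.

    connect: zero-argument callable returning a new connection
    work:    callable taking a connection and returning a result
    retries: maximum number of reconnect attempts before giving up
    delay:   seconds to wait before each reconnect attempt
    """
    attempts = 0
    conn = connect()
    while True:
        try:
            return work(conn)
        except ConnectionLost:
            attempts += 1
            if attempts > retries:
                raise  # out of retries: surface the failure to the caller
            time.sleep(delay)
            conn = connect()  # expected to land on a surviving instance


if __name__ == "__main__":
    # Simulated cluster: the first connection lands on the failed node.
    instances = iter(["prddb1", "prddb2"])

    def connect():
        return next(instances)

    def work(conn):
        if conn == "prddb1":
            raise ConnectionLost(conn)
        return f"query answered by {conn}"

    print(run_with_failover(connect, work))
```

A loop like this only hides the failure for work that is safe to repeat. Uncommitted transactions are lost when the instance fails, so the application still has to decide how to replay or report them.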
Note
FAN, FCF, and TAF are discussed in detail in Chapter 15.
RAP Phase II—Availability and Load Balancing
Once the various components of the cluster are found to be stable in RAP Phase I testing, the project can proceed as
planned with importing the database and data from the current production environment.
Once the database has been configured and the parameters set to match current production, the next
phase of RAP testing should be planned. The goal of this test is to verify the application behavior when one, several,
or all instances crash within the cluster. How will the database tier provide business continuity when one or more
components of the database fail? Is the application able to handle such failures? What happens to the user workload:
do users notice the failure? What happens when there are media failures and the application is not able to retrieve
data from or persist data to the database? All these questions are answered by business requirements. Phase III
validates whether these requirements are met.
 
 