buffer cache size. You can verify whether the DEFAULT CACHE size is adequate. If the DEFAULT CACHE size is
too low, a related error message appears in the alert.log.
Monitor and View Instance Recovery Details
Instance recovery operation details are logged in the alert log as well as in the SMON trace file. Refer to the
alert.log file to find out details such as when the recovery began and when it was completed. For additional
details on the recovery, you can also refer to the SMON trace file.
Furthermore, the ESTD_CLUSTER_AVAILABLE_TIME column in the GV$INSTANCE_RECOVERY dynamic view
shows the estimated amount of time (in seconds) that the instance will be frozen. The larger this value, the
longer the estimated instance recovery.
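As a minimal sketch of how this column might be checked, the Python snippet below holds a query against GV$INSTANCE_RECOVERY and a small helper that picks the row with the largest estimate. The query text, the helper name `longest_estimated_freeze`, and the sample rows are assumptions made for illustration. Running the query would require a database driver, which is not shown here.

```python
# Illustrative query (an assumption, not taken from the text). INST_ID is the
# instance identifier column that GV$ views expose.
QUERY = """
SELECT inst_id, estd_cluster_available_time
FROM gv$instance_recovery
ORDER BY inst_id
"""


def longest_estimated_freeze(rows):
    """Return (inst_id, seconds) for the row with the largest estimate.

    rows: iterable of (inst_id, estd_cluster_available_time) tuples, as the
    query above would return them. Rows whose estimate is None (NULL) are
    skipped. Returns None when no row carries a value.
    """
    known = [(inst, secs) for inst, secs in rows if secs is not None]
    if not known:
        return None
    return max(known, key=lambda row: row[1])


# Made-up sample rows for demonstration only; not real query output.
sample_rows = [(1, 12), (2, 30), (3, None)]
print(longest_estimated_freeze(sample_rows))
```

With the made-up rows above, the helper reports instance 2, because it carries the largest estimate.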
Instance/Crash Recovery Internals in RAC
To provide a better understanding of the way that an instance recovery is performed in a RAC database, we
constructed a small test scenario. As part of the test case, the following tasks were carried out on instance 1:
A new table created
A few records inserted and committed
All records deleted with 'delete from table_name'
A couple of records inserted
Instance 1 was aborted from the other SQL window
The following recovery actions are performed by the SMON background process of a surviving instance on
behalf of the dead instance:
1. The surviving instance acquires the instance recovery enqueue.
2. After Global Cache Services (GCS) are remastered, SMON reads the redo logs of the failed instance to
identify the resources (data blocks) that are needed for the recovery.
3. The Global Resource Directory is frozen after all necessary resources have been acquired.
4. At this stage, all data blocks, except those needed for recovery, become accessible.
5. SMON starts recovering the data blocks identified earlier.
6. Immediately after its recovery, each individual data block becomes accessible.
7. All uncommitted transactions are also rolled back to maintain consistency.
8. Once all the data blocks are recovered, the instance is fully available for end users.
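The order in which blocks become accessible in the steps above can be illustrated with a toy model. The Python sketch below is not how Oracle implements recovery. The function name `recovery_phases`, the phase labels, and the use of integers as block identifiers are assumptions for illustration. The model also assumes that no block is accessible while the Global Resource Directory is frozen.

```python
def recovery_phases(all_blocks, blocks_in_redo):
    """Yield (phase, accessible_blocks) pairs for a toy recovery run.

    all_blocks: set of block ids in the database (hypothetical integers).
    blocks_in_redo: block ids found in the failed instance's redo logs.
    """
    # Steps 1-3: enqueue acquired, redo read, Global Resource Directory frozen.
    needed = set(all_blocks) & set(blocks_in_redo)
    yield "frozen", set()

    # Step 4: everything except the blocks needing recovery becomes accessible.
    accessible = set(all_blocks) - needed
    yield "partially available", set(accessible)

    # Steps 5-6: each block becomes accessible as soon as it is recovered.
    for block in sorted(needed):
        accessible.add(block)
        yield f"recovered block {block}", set(accessible)

    # Step 8: all blocks recovered, the database is fully available again.
    yield "fully available", set(accessible)


for phase, blocks in recovery_phases({1, 2, 3, 4}, {2, 3}):
    print(phase, sorted(blocks))
```

The rollback of uncommitted transactions (step 7) is left out of the model, since it does not change which blocks are accessible.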
Here is a walkthrough of the alert.log of the surviving instance that performs the recovery for the failed instance.
Most of the instance recovery activity is recorded there, so you can refer to the log file to understand how the
instance carried out each step. All of the previously explained steps can be seen in the alert.log:
Reconfiguration started (old inc 12, new inc 14) <<<<<
List of instances:
2 (myinst: 2)
Global Resource Directory frozen <<<<<
* dead instance detected - domain 0 invalid = TRUE <<<<<
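When an alert.log is long, it can help to pull out just these markers. The following Python sketch scans log lines for the three messages shown in the excerpt above. The function name `summarize_reconfiguration` and the keys of the returned dictionary are assumptions for illustration, and the patterns are taken only from the lines quoted here, so other log formats may need different patterns.

```python
import re

# Pattern derived from the "Reconfiguration started" line in the excerpt.
RECONFIG = re.compile(r"Reconfiguration started \(old inc (\d+), new inc (\d+)\)")


def summarize_reconfiguration(lines):
    """Collect the recovery-related markers found in alert.log lines."""
    summary = {
        "old_inc": None,
        "new_inc": None,
        "grd_frozen": False,
        "dead_instance": False,
    }
    for line in lines:
        match = RECONFIG.search(line)
        if match:
            summary["old_inc"] = int(match.group(1))
            summary["new_inc"] = int(match.group(2))
        elif "Global Resource Directory frozen" in line:
            summary["grd_frozen"] = True
        elif "dead instance detected" in line:
            summary["dead_instance"] = True
    return summary


# The lines quoted in the excerpt above.
excerpt = [
    "Reconfiguration started (old inc 12, new inc 14)",
    "List of instances:",
    "2 (myinst: 2)",
    "Global Resource Directory frozen",
    "* dead instance detected - domain 0 invalid = TRUE",
]
print(summarize_reconfiguration(excerpt))
```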
 