Backup and Recovery - Oracle Exadata Recipes: A Problem-Solution Approach


name DATA_CD_05_CM01CEL01

rebalance nowait

NOTE: Assigning number (1,33) to disk (o/192.168.10.3/DATA_CD_05_cm01cel01)

NOTE: requesting all-instance membership refresh for group=1

NOTE: initializing header on grp 1 disk DATA_CD_05_CM01CEL01

NOTE: requesting all-instance disk validation for group=1

If you manually inactivate grid disks, ASM automatically offlines the corresponding ASM disks, and you have up to the amount of time specified by the disk group attribute disk_repair_time to resolve the situation without incurring an ASM disk group rebalance operation. Each ASM disk group has a disk_repair_time attribute, which defaults to 3.6 hours. This represents how long the Exadata DMA has to replace an inactivated grid disk before the ASM disk is dropped from the disk group. If the threshold is crossed and the disk is dropped from the ASM disk group, ASM will need to perform a disk group rebalance operation before the disk can be brought back online. In either case, the intervention required by the Exadata DMA is typically minimal due to Exadata's automatic disk management modules.
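The repair window described above is simple arithmetic, and a short Python sketch can make it concrete. The function names and the sample timestamps below are hypothetical, and the parser is a simplification that handles only values carrying an explicit h or m suffix, such as the 3.6h default. It does not query ASM; it only illustrates how the deadline follows from the time a disk went offline.

```python
from datetime import datetime, timedelta


def parse_repair_time(value: str) -> timedelta:
    """Convert an attribute string such as '3.6h' or '90m' to a timedelta.

    Simplification: a unit suffix (h or m) is required.
    """
    text = value.strip().lower()
    amount, unit = float(text[:-1]), text[-1]
    if unit == "h":
        return timedelta(hours=amount)
    if unit == "m":
        return timedelta(minutes=amount)
    raise ValueError(f"unsupported disk_repair_time value: {value!r}")


def drop_deadline(offlined_at: datetime, repair_time: str = "3.6h") -> datetime:
    """Time at which an offlined disk would be dropped from the disk group."""
    return offlined_at + parse_repair_time(repair_time)


def time_remaining(offlined_at: datetime, now: datetime,
                   repair_time: str = "3.6h") -> timedelta:
    """Window left to fix the disk; zero once the deadline has passed."""
    return max(drop_deadline(offlined_at, repair_time) - now, timedelta(0))


if __name__ == "__main__":
    offlined = datetime(2013, 1, 15, 9, 0)  # hypothetical offline time
    print("Dropped at:", drop_deadline(offlined))
    print("Remaining :", time_remaining(offlined, datetime(2013, 1, 15, 11, 0)))
```

With the default value, a disk offlined at 09:00 has until 12:36 to be brought back before ASM drops it and a rebalance becomes necessary.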

■ Note: If you experience a simultaneous loss of both disk drives in slots 0 and 1, your situation is a bit more complicated because these disks are where your system area and system volume reside. Please see Recipe 8-9 for details on how to recover from this scenario.

How do you protect yourself against disk drive failure? You can't protect yourself entirely from a hard drive crashing; after all, a disk drive is a mechanical and electronic device that will at some point fail. However, you can (and should) protect against loss of data or loss of access to your data by implementing ASM normal or high redundancy. With ASM redundancy on Exadata, you can afford to lose an entire storage server and all its disks (or any number of disks in a storage cell) and still maintain access to your data.
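To see why mirroring across storage cells tolerates the loss of a whole cell, consider the following toy model in Python. It is a deliberately simplified sketch, not ASM's actual allocation algorithm: it assumes each storage cell acts as its own failure group, that every extent is written to 2 (normal) or 3 (high) distinct cells, and it uses hypothetical cell names and helper functions.

```python
from itertools import combinations, cycle, islice


def place_extents(cells, n_extents, copies=2):
    """Assign each extent to `copies` distinct cells, cycling through all
    possible cell combinations (toy placement, not ASM's real algorithm)."""
    return list(islice(cycle(combinations(cells, copies)), n_extents))


def survives(placement, failed_cells):
    """True if every extent still has at least one copy on a healthy cell."""
    failed = set(failed_cells)
    return all(any(cell not in failed for cell in extent) for extent in placement)


if __name__ == "__main__":
    cells = ["cm01cel01", "cm01cel02", "cm01cel03"]  # hypothetical cell names
    normal = place_extents(cells, n_extents=6, copies=2)
    for cell in cells:
        print(f"normal redundancy, lose {cell}: data available =",
              survives(normal, [cell]))
    print("normal redundancy, lose two cells: data available =",
          survives(normal, cells[:2]))
```

Because no extent ever keeps both copies in the same cell, losing any single cell (or any subset of its disks) leaves a surviving copy of every extent. In this model, losing two cells at once under normal redundancy does not.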

Oracle does make life easier for you on Exadata by constantly monitoring your disks and reporting failure conditions to its alert repository and, optionally, by sending SNMP traps to monitoring software. To provide the best level of monitoring coverage, you should configure Automated Service Requests, monitor your systems using Enterprise Manager, or make it an operational practice to check your Exadata storage cell alerts.

8-9. Recovering Storage Cells from Loss of a System Volume Using CELLBOOT Rescue

Problem

You have either corrupted your system volume images or suffered a simultaneous loss of the first two disk drives in your Exadata storage cell, and you wish to use the internal CELLBOOT USB drive to recover from the failure.

Solution

In this section, we will outline the steps necessary to invoke the storage cell rescue procedure to restore your system volume. At a high level, these are the steps you should take:

• Understand the scope of the failure

• Contact Oracle Support and open a Service Request

• Boot your system from the internal CELLBOOT USB image
