Databases Reference
In-Depth Information
Performance disk statuses by way of its built-in monitoring. Exadata Storage Servers monitor disk drives and collect
information such as temperature, read/write errors, speed, and performance.
If a disk shows a predictive failure condition, it means that the server has experienced one or more read/write
error conditions, temperature threshold conditions, and so forth; this indicates that a disk failure could be imminent.
In this case, you should replace the disk using the same procedure outlined in the Solution section of this recipe.
When a disk reports a poor performance condition, it should also be replaced using the steps provided in this
recipe. Each Exadata cell disk should exhibit the same performance characteristics, and if one is performing poorly
based on performance metrics collected by the storage server, it could adversely impact your database performance.
In the case of a physical disk failure, Oracle automatically changes the physicaldisk and lun statuses
from normal to critical. It then drops the celldisk and each griddisk on the celldisk. When the grid disk or disks
are dropped, ASM drops the corresponding ASM disks using the FORCE option, as displayed in the ASM instance's
alert log:
SQL> /* Exadata Auto Mgmt: Proactive DROP ASM Disk */
alter diskgroup RECO_CM01 drop
disk RECO_CD_05_CM01CEL01 force
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=3
Tue Jul 05 21:48:13 2011
NOTE: Attempting voting file refresh on diskgroup DBFS_DG
GMON updating for reconfiguration, group 3 at 28 for pid 35, osid 12377
NOTE: group 3 PST updated.
NOTE: membership refresh pending for group 3/0x833f0667 (RECO_CM01)
WARNING: Disk 35 (_DROPPED_0035_RECO_CM01) in group 3 will be dropped in: (12960) secs on ASM inst 1
GMON querying group 3 at 29 for pid 19, osid 11535
SUCCESS: refreshed membership for 3/0x833f0667 (RECO_CM01)
SUCCESS: /* Exadata Auto Mgmt: Proactive DROP ASM Disk */
alter diskgroup RECO_CM01 drop
disk RECO_CD_05_CM01CEL01 force
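The alert log entries above can be scanned programmatically. The following is a minimal Python sketch, not an Oracle-supplied tool: the function names find_auto_drops and drop_timers_hours are hypothetical, and the patterns assume the log layout matches the excerpt shown here (an "Exadata Auto Mgmt" comment on the SQL> line, followed by the alter diskgroup statement on the next lines). It reports which disk was dropped, whether FORCE was used, and how long the drop timer in the WARNING line is.

```python
import re

# Sample lines copied from the alert log excerpt above.
SAMPLE_LOG = """\
SQL> /* Exadata Auto Mgmt: Proactive DROP ASM Disk */
alter diskgroup RECO_CM01 drop
disk RECO_CD_05_CM01CEL01 force
WARNING: Disk 35 (_DROPPED_0035_RECO_CM01) in group 3 will be dropped in: (12960) secs on ASM inst 1
"""

# Assumption: the automated statement always starts on a "SQL>" line and
# spans the following lines exactly as in the excerpt.
DROP_RE = re.compile(
    r"^SQL>\s*/\* Exadata Auto Mgmt: .*?DROP ASM Disk \*/\s*"
    r"alter diskgroup\s+(?P<diskgroup>\S+)\s+drop\s+"
    r"disk\s+(?P<disk>\S+)(?P<force>\s+force)?",
    re.IGNORECASE | re.MULTILINE,
)

# Matches the drop timer reported in the WARNING line.
TIMER_RE = re.compile(r"will be dropped in: \((?P<secs>\d+)\) secs")


def find_auto_drops(log_text):
    """Return one dict per automated DROP statement found in the log text."""
    return [
        {
            "diskgroup": m.group("diskgroup"),
            "disk": m.group("disk"),
            "force": m.group("force") is not None,
        }
        for m in DROP_RE.finditer(log_text)
    ]


def drop_timers_hours(log_text):
    """Return each reported drop timer converted from seconds to hours."""
    return [int(m.group("secs")) / 3600 for m in TIMER_RE.finditer(log_text)]


if __name__ == "__main__":
    for drop in find_auto_drops(SAMPLE_LOG):
        print(drop)
    print(drop_timers_hours(SAMPLE_LOG))
```

For the excerpt above, the sketch finds one drop of RECO_CD_05_CM01CEL01 from RECO_CM01 with FORCE, and the 12960-second timer works out to 3.6 hours.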
When a physical disk enters a predictive failure state, the physicaldisk and lun statuses change from
normal to predictive failure. After this, the celldisk and each griddisk on the celldisk are dropped. When this
happens, the ASM disks are dropped without the FORCE option.
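The difference between the two cases can be summarized as a small lookup. This is only an illustrative Python sketch of the behavior described in this recipe: the names ASM_DROP_USES_FORCE and asm_drop_uses_force are hypothetical, and statuses for which the text describes no automatic ASM drop are deliberately left unmapped.

```python
# Physical disk status -> whether the automatic ASM drop uses FORCE.
# Only the two cases described in the text are mapped (an assumption of
# this sketch, not an exhaustive list of Exadata statuses).
ASM_DROP_USES_FORCE = {
    "critical": True,             # physical disk failure
    "predictive failure": False,  # failure predicted but disk still readable
}


def asm_drop_uses_force(physicaldisk_status):
    """Return True/False for a mapped status, or None if the text
    describes no automatic ASM drop for that status."""
    return ASM_DROP_USES_FORCE.get(physicaldisk_status.strip().lower())


if __name__ == "__main__":
    for status in ("critical", "predictive failure", "normal"):
        print(status, asm_drop_uses_force(status))
```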
When the failed disk is replaced, the following things occur:
The firmware on the new disk is updated to reflect the same firmware version as the other disks in the cell.
The cell disk is recreated to match the disk it replaced.
The replacement celldisk is brought online and its status is set to normal.
Each griddisk on the celldisk is onlined and has its status marked active.
The grid disks are automatically added to Oracle ASM, resynchronized, and brought online in their ASM disk group.
If you look in your ASM instance's alert log after replacing a failed disk, you will see messages similar to the
following ones. These are examples of Exadata's automatic disk management capability:
SQL> /* Exadata Auto Mgmt: ADD ASM Disk in given FAILGROUP */
alter diskgroup DATA_CM01 add
failgroup CM01CEL01
disk 'o/192.168.10.3/DATA_CD_05_cm01cel01'
 