For this discussion, we will define a few terms:
• P(surv) = The probability of survival in the event of a single drive failure; in other words, the
probability of not losing access to data.
• n = The number of disks that comprise a failure set. With Oracle ASM on systems with
“many disks,” you will have eight (8) partner disks for each mirror, whether the disk group is
configured with normal or high redundancy; the difference between the two is the number of
mirrors, not the number of partner disks.
• λ = Rate of failure of a disk drive. This is the inverse of the drive's published Mean Time
Between Failures (MTBF).
• Trepair = Time to repair a failed drive.
Our formulas for the probability of survival, whose complement is the risk of data loss, can be expressed as
follows:
ASM Normal Redundancy:
P(surv) = exp(-n * λ * Trepair)
ASM High Redundancy:
P(surv) = (1 + n * λ * Trepair) * exp(-n * λ * Trepair)
If we consider independent disk drive failures and use a 1,000,000-hour MTBF (λ = 1/1,000,000 failures per hour)
and a 24-hour time to repair a failed disk, our probability of survival with ASM normal redundancy is the following:
P(surv) = exp(-n * λ * Trepair)
= exp(-8 * (1/1000000) * 24)
= 99.98%
With ASM high redundancy, our survival probability is:
P(surv) = (1 + n * λ * Trepair) * exp(-n * λ * Trepair)
= (1 + 8 * (1/1000000) * 24) * exp(-8 * (1/1000000) * 24)
> 99.99% (about 99.999998%)
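One way to read the two formulas above is sketched here. It rests on an assumption the text does not state: that failures among the n partner disks during the repair window arrive as a Poisson process with rate n times λ (the disk failure rate). Under that assumption, N is the number of partner disk failures during Trepair, normal redundancy survives only if N = 0, and high redundancy also tolerates N = 1.

```latex
\begin{align*}
\mu &= n \,\lambda\, T_{\text{repair}} \\
P_{\text{normal}}(\text{surv}) &= P(N = 0) = e^{-\mu} \\
P_{\text{high}}(\text{surv}) &= P(N \le 1) = e^{-\mu} + \mu\, e^{-\mu} = (1 + \mu)\, e^{-\mu}
\end{align*}
```

The extra factor (1 + μ) is what separates the two redundancy levels, and it matters more as μ grows.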
If you now consider a potential accelerated failure rate for disk drives, which is often a more realistic scenario
given environmental causes of failure, let's see what our probabilities of survival look like when a drive fails
once per month, that is, with an MTBF of 720 hours (λ = 1/720 failures per hour):
With ASM normal redundancy:
P(surv) = exp(-n * λ * Trepair)
= exp(-8 * (1/720) * 24)
= 76.59%
With ASM high redundancy:
P(surv) = (1 + n * λ * Trepair) * exp(-n * λ * Trepair)
= (1 + 8 * (1/720) * 24) * exp(-8 * (1/720) * 24)
= 97.02%
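The arithmetic in both scenarios can be checked with a short script. This is a minimal sketch, not part of the original text: the function names p_surv_normal and p_surv_high are hypothetical, and the inputs (n = 8 partner disks, a 24-hour repair time, and MTBF values of 1,000,000 and 720 hours) are taken from the examples above.

```python
import math


def p_surv_normal(n, mtbf_hours, t_repair_hours):
    """Survival probability for normal redundancy: exp(-n * lambda * Trepair)."""
    lam = 1.0 / mtbf_hours  # failure rate is the inverse of the MTBF
    return math.exp(-n * lam * t_repair_hours)


def p_surv_high(n, mtbf_hours, t_repair_hours):
    """Survival probability for high redundancy: (1 + x) * exp(-x), x = n * lambda * Trepair."""
    lam = 1.0 / mtbf_hours
    x = n * lam * t_repair_hours
    return (1.0 + x) * math.exp(-x)


if __name__ == "__main__":
    # Scenario labels are illustrative; the numbers come from the worked examples.
    scenarios = [("1,000,000-hour MTBF", 1_000_000), ("720-hour MTBF", 720)]
    for label, mtbf in scenarios:
        normal = p_surv_normal(8, mtbf, 24)
        high = p_surv_high(8, mtbf, 24)
        print(f"{label}: normal = {normal:.4%}, high = {high:.6%}")
```

Run as a script, it prints the normal and high redundancy survival probabilities for each MTBF, which can be compared against the hand calculations.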
As you can see, accelerated failure rates yield much lower survival probabilities than the independent failure
rates assumed earlier. Furthermore, ASM disk groups configured with high redundancy offer much better protection
in an accelerated failure rate scenario than normal redundancy ASM disk groups: roughly 97% versus 77% in this example.
 