problem arises from the random inserts to the four indexes; every new index row
may cause a disk read and a disk write, 80 random reads and 80 random writes
per second in total (4 × 20).
Let us first ignore both the read cache and the write cache. In this worst-case
scenario, the drive load caused by the four indexes is
80 × (6 ms + 24 ms) = 2400 ms/s = 2.4 = 240%
If the four indexes are striped over 112 drives, their contribution to the average
drive load is 240%/112 = 2%. This value should be compared against the level
at which a drive is considered to be overloaded. Ideally, the average drive load
should be as low as 25%. Then, according to queuing theory, the average drive
queuing time would be 3 ms. Thus adding 2% to the average drive busy is
probably tolerable but not insignificant.
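The worst-case arithmetic above can be checked with a short sketch. The figures are the ones used in the example; reading 6 ms as the drive busy per random read and 24 ms as the drive busy per random write with RAID 5 is an interpretation of the formula, and the function and variable names are illustrative only.

```python
# Worst-case drive load from random index inserts, ignoring all caches.
# Figures are taken from the example; names are hypothetical.
INDEXES = 4        # indexes receiving random inserts
INSERT_RATE = 20   # table rows inserted per second
READ_MS = 6        # assumed drive busy per random read
WRITE_MS = 24      # assumed drive busy per random write (RAID 5)
DRIVES = 112       # drives the indexes are striped over


def worst_case_drive_load(indexes, insert_rate, read_ms, write_ms):
    """Return the total drive busy as a fraction of one drive (1.0 = 100%)."""
    index_rows_per_second = indexes * insert_rate  # one read and one write each
    busy_ms_per_second = index_rows_per_second * (read_ms + write_ms)
    return busy_ms_per_second / 1000


total_load = worst_case_drive_load(INDEXES, INSERT_RATE, READ_MS, WRITE_MS)
per_drive_load = total_load / DRIVES

print(f"total: {total_load:.0%}, per drive: {per_drive_load:.1%}")
# prints "total: 240%, per drive: 2.1%"
```

Changing WRITE_MS or DRIVES shows how sensitive the per-drive figure is to the RAID level and to the number of drives in the stripe.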
How much of this drive load would be eliminated by the read cache and the
write cache?
The read cache, 64 GB, seems large compared to the total size of the four
indexes (1.2 GB), but if the access pattern is truly random, the time between
references to any one of the 75,000 leaf pages in each index is not far from the
average, which is
75,000 × 50 ms = 3750 s ≈ 1 h
when the insert rate is 20 rows per second. If the average read cache residency
time is 30 min, not many disk drive reads will be saved by the read cache. If
the write cache holding time is shorter, 10 min, for instance, its effect will be
even less significant: a write to drive would be saved only if a leaf page is
updated more than once in 10 min. The actual average drive busy caused by the
four indexes may thus be almost 2%. The caches bring much bigger drive load
savings if the access pattern is not random. Leaf pages that are updated more
than ten times per hour could stay in both caches for a long time.
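As a rough check of the cache argument, the sketch below compares the average time between references to one leaf page with the two cache holding times mentioned in the text. It assumes, as the text does, that inserts are spread evenly over the leaf pages of an index; the names are illustrative only.

```python
# Average time between references to one leaf page under random inserts,
# compared with the cache holding times from the example.
LEAF_PAGES_PER_INDEX = 75_000
INSERT_RATE = 20  # rows per second, so one insert per index every 50 ms


def mean_seconds_between_references(leaf_pages, insert_rate):
    """Average interval between two inserts that hit the same leaf page."""
    return leaf_pages / insert_rate


interval_s = mean_seconds_between_references(LEAF_PAGES_PER_INDEX, INSERT_RATE)
print(f"mean interval: {interval_s:.0f} s ({interval_s / 60:.1f} min)")

# Holding times from the text: 30 min read cache, 10 min write cache.
for cache, holding_min in (("read cache", 30), ("write cache", 10)):
    ratio = holding_min * 60 / interval_s
    print(f"{cache}: holding time is {ratio:.0%} of the mean interval")
```

Both holding times are well below the mean interval of about an hour, which is why the caches save few drive operations when the access pattern is truly random.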
RAID 10, mirroring and striping but no parity bits, would reduce the drive
busy per modified page from 24 to 12 ms, but the number of drives would
increase. The 256 drives with RAID 5 would have roughly the same effect.
A rule of thumb can be derived from this example. Indicator L predicts the
contribution of the indexes on a table to the average drive load with RAID 5:
L = N × I/D
where N = the number of indexes with random inserts
I = insert rate (table rows per second)
D = the number of disk drives (excluding spares)
If L < 1, the increase in drive load is not an issue; it is probably less than 2%.
If L is between 1 and 10, the increase in drive load may be noticeable.
If L > 10, drive load is likely to be a problem, unless cache hit ratios
are high.
In the example above,
L = 4 × 20/112 = 0.7
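The rule of thumb is simple enough to express as a small helper. This is a minimal sketch: the function names and the wording of the returned labels are illustrative, and treating a value of exactly 10 as "may be noticeable" is an assumption about the boundary.

```python
# Indicator L = N × I/D with the three bands described in the text.
def drive_load_indicator(n_indexes, insert_rate, drives):
    """N = indexes with random inserts, I = rows per second, D = drives."""
    return n_indexes * insert_rate / drives


def assess(indicator):
    """Map L to the qualitative bands of the rule of thumb."""
    if indicator < 1:
        return "not an issue"
    if indicator <= 10:  # boundary handling at exactly 10 is an assumption
        return "may be noticeable"
    return "likely to be a problem unless cache hit ratios are high"


L = drive_load_indicator(n_indexes=4, insert_rate=20, drives=112)
print(f"L = {L:.1f}: {assess(L)}")
# prints "L = 0.7: not an issue"
```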