Table 6.4 Maximum capacity of PRCR nodes (data files per node)

RA (%)     λ = 10%        λ = 5%         λ = 2%         λ = 1%
99         5 × 10^13      2.3 × 10^14    2.8 × 10^15    NA
99.9       4.5 × 10^12    1.8 × 10^13    1.2 × 10^14    5 × 10^14
99.99      4.5 × 10^11    1.8 × 10^12    1.1 × 10^13    4.5 × 10^13
99.999     4.5 × 10^10    1.8 × 10^11    1.1 × 10^12    4.5 × 10^12

RA: data reliability requirement per year; λ: average failure rate of a single replica.
Next, the maximum capacity of PRCR nodes is presented. According to the
results shown in Table 6.2, for illustration we choose 700 ns and 30 ms as the standard
execution times for the metadata scanning process and the proactive replica checking
task respectively. The micro Amazon EC2 instance (t1.micro) is chosen as the default
Cloud compute instance. Based on these standard execution times, we calculate the
maximum capacity of PRCR nodes for storing data files with data reliability
requirements of 99%, 99.9%, 99.99% and 99.999% per year under different storage unit
failure rates. Table 6.4 shows the relationships among the reliability requirement, the
average failure rate of a single replica λ and the maximum capacity of PRCR nodes.
Depending on the failure rate of a single replica and the reliability requirement, each
PRCR node is able to manage from 4.5 × 10^10 to 2.8 × 10^15 data files. Although the
maximum capacity of PRCR nodes decreases as the disk failure rate and the data
reliability requirement increase, it remains large enough to be practical for managing
the huge number of Cloud data files.
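The range and the two trends stated above can be checked directly against the table values. The following is a minimal sketch, not part of the original evaluation: the capacities are transcribed from Table 6.4, the NA cell is assumed to belong to the λ = 1% column of the 99% row, and the names CAPACITY, known_values and the two check functions are hypothetical helpers introduced only for illustration.

```python
# Capacities transcribed from Table 6.4 (data files per PRCR node).
# Outer keys: reliability requirement RA in % per year.
# Inner keys: failure rate of a single replica (lambda) in %.
# None marks the NA cell (its column is an assumption of this sketch).
CAPACITY = {
    99.0:   {10: 5e13,   5: 2.3e14, 2: 2.8e15, 1: None},
    99.9:   {10: 4.5e12, 5: 1.8e13, 2: 1.2e14, 1: 5e14},
    99.99:  {10: 4.5e11, 5: 1.8e12, 2: 1.1e13, 1: 4.5e13},
    99.999: {10: 4.5e10, 5: 1.8e11, 2: 1.1e12, 1: 4.5e12},
}


def known_values(table):
    """Return every capacity in the table, skipping the NA cell."""
    return [v for row in table.values() for v in row.values() if v is not None]


def _strictly_decreasing(values):
    """True if each value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


def decreases_with_failure_rate(table):
    """Within each RA row, capacity must drop as lambda grows."""
    for row in table.values():
        ordered = [row[lam] for lam in sorted(row) if row[lam] is not None]
        if not _strictly_decreasing(ordered):
            return False
    return True


def decreases_with_reliability(table):
    """Within each lambda column, capacity must drop as RA grows."""
    rates = next(iter(table.values())).keys()
    for lam in rates:
        ordered = [table[ra][lam] for ra in sorted(table)
                   if table[ra][lam] is not None]
        if not _strictly_decreasing(ordered):
            return False
    return True


if __name__ == "__main__":
    values = known_values(CAPACITY)
    print(f"smallest capacity: {min(values):.1e} data files")
    print(f"largest capacity:  {max(values):.1e} data files")
    print("drops with failure rate:", decreases_with_failure_rate(CAPACITY))
    print("drops with reliability: ", decreases_with_reliability(CAPACITY))
```

The check only confirms that the tabulated numbers agree with the prose; it does not reproduce how the capacities were derived from the 700 ns and 30 ms execution times.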
The total PRCR running cost is composed of the running costs of the user interface,
the PRCR nodes and the Cloud compute instances for proactive replica checking.
According to the latest Amazon EC2 prices, an Amazon EC2 micro instance costs only
$14.40 per month ($0.02/hour × 24 hours/day × 30 days/month). Therefore, for a
complete PRCR deployment running over AWS with one PRCR node, the running cost is
$43.20 ($14.40 × 3) per month. When divided by the maximum capacity of a PRCR node,
the running overhead for each data file is very small (no more than $10^-9 per data
file per month according to Table 6.4) and is negligible. For example, for a data file
of 1 GB, the PRCR running overhead is about 10^7 times smaller than the storage cost
(several cents/month).
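The arithmetic behind these figures can be reproduced in a few lines. This is a sketch under stated assumptions: the hourly price and the factor of three come from the text, the smallest capacity (4.5 × 10^10) comes from Table 6.4, and the 1 GB storage price of $0.03 per month is a hypothetical stand-in for "several cents". All function and constant names are illustrative.

```python
HOURLY_PRICE_USD = 0.02   # t1.micro price quoted in the text
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30
INSTANCES = 3             # the text multiplies the per-instance cost by 3

MIN_CAPACITY = 4.5e10     # smallest capacity in Table 6.4 (worst case per file)
ASSUMED_STORAGE_USD = 0.03  # hypothetical monthly price for storing 1 GB


def monthly_running_cost(instances=INSTANCES):
    """Monthly cost in USD of running the given number of micro instances."""
    return HOURLY_PRICE_USD * HOURS_PER_DAY * DAYS_PER_MONTH * instances


def overhead_per_file(capacity, instances=INSTANCES):
    """Monthly PRCR running overhead in USD attributed to one data file."""
    return monthly_running_cost(instances) / capacity


def storage_to_overhead_ratio(storage_cost, capacity, instances=INSTANCES):
    """How many times the storage cost exceeds the PRCR overhead per file."""
    return storage_cost / overhead_per_file(capacity, instances)


if __name__ == "__main__":
    print(f"one instance:   ${monthly_running_cost(1):.2f} per month")
    print(f"full PRCR:      ${monthly_running_cost():.2f} per month")
    print(f"worst overhead: ${overhead_per_file(MIN_CAPACITY):.2e} per file per month")
    ratio = storage_to_overhead_ratio(ASSUMED_STORAGE_USD, MIN_CAPACITY)
    print(f"storage / overhead ratio: {ratio:.1e}")
```

Using the smallest capacity gives the largest per-file overhead, so any other cell of Table 6.4 yields an even smaller figure.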
6.5.2.2 Data storage cost
Next, the data storage cost using PRCR is investigated. We simulate the data
reliability management process of PRCR on the data files generated by the pulsar
searching application presented in Section 3.1, and compare the resulting storage
costs with those of the conventional three-replica strategy, which is widely used in
current Clouds.
To compare storage with and without PRCR, three storage modes are provided. When PRCR is