a specific type of central controller for data allocation to effectively skew the
workload, but this technique cannot be directly applied to large-scale storage
systems owing to a scalability issue. The objective of this paper is to address
these issues and explore a power-saving method that is adaptable to a typical
environment of Internet hosting services, i.e., constant massive influx of data
and changes in data popularity.
Our method is based on the idea behind MAID and PDC systems, which is
the migration of frequently accessed data to a subset of the disks. However, to
enhance scalability, our method periodically exchanges data among disks in an
autonomous way such that frequently accessed disks tend to gather frequently
accessed data from neighboring disks up to their capacity, and the opposite
occurs for rarely accessed disks so as to extend their time in low-power mode.
In this paper, we also consider several types of restrictions on the exchange of
data and evaluate their effect on performance in terms of power consumption,
response time, and data migration cost.
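The pairwise exchange described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Disk` class, the `exchange` function, the use of a per-file access count as the popularity measure, and the assumption of equal-sized files with capacity counted in files are all hypothetical simplifications introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    # Hypothetical model: capacity is a file count (equal-sized files assumed).
    capacity: int
    # Maps file id -> recent access count; ids are assumed unique across disks.
    files: dict = field(default_factory=dict)

    def load(self):
        """Total recent accesses served by this disk."""
        return sum(self.files.values())


def exchange(a, b):
    """Exchange files between two neighboring disks.

    The more frequently accessed disk keeps or receives the most frequently
    accessed files of the pair, up to its capacity; the other disk holds the
    rest. Returns the number of files that changed disks (migration cost).
    """
    hot, cold = (a, b) if a.load() >= b.load() else (b, a)
    # Pool both disks' files; listing hot's files first means ties in the
    # stable sort below favor files already on the hot disk.
    pooled = {**hot.files, **cold.files}
    ranked = sorted(pooled, key=lambda f: pooled[f], reverse=True)
    hot_ids = set(ranked[:hot.capacity])

    new_hot = {f: pooled[f] for f in ranked if f in hot_ids}
    new_cold = {f: pooled[f] for f in ranked if f not in hot_ids}

    moved = sum(1 for f in new_hot if f not in hot.files)
    moved += sum(1 for f in new_cold if f not in cold.files)

    hot.files, cold.files = new_hot, new_cold
    return moved


a = Disk(capacity=2, files={"a1": 10, "a2": 1})
b = Disk(capacity=2, files={"b1": 5, "b2": 0})
moved = exchange(a, b)
print(moved, a.files, b.files)
```

Because each exchange involves only two neighboring disks, no central controller is needed in this sketch; repeating it periodically across neighbor pairs lets popular files drift toward the busier disks.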
To evaluate the effectiveness of our method in a more realistic situation, we
measured the performance both in simulation and prototype implementation
using a real access pattern of 20,000 public photos uploaded to Flickr, which are
observable outside the website. In the experiments, we observed that our method
resulted in energy savings of 14.5-39.7%, while the overall average response time
was 133 ms. According to our experimental results, our method effectively skewed
the workload even if the data migration was conducted autonomously. On the
other hand, accesses of data stored on disks in low-power mode totaled 6.8-
19.1% of all accesses, and these accesses required extra time for the spinning-up
of disks. A major cause of this problem was that, for most files, the number of
accesses decreased rapidly one week after upload, and the remaining infrequent
accesses were evenly distributed. Thus, with our method, it was difficult to
gather such unpopular data completely onto specific disks.
This results in a trade-off between the idleness threshold (i.e., the period
without accesses after which a disk spins down) and response time.
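The trade-off can be made concrete with a small single-disk sketch. It is an illustrative model only, not the simulator used in this paper: the function name `simulate_idleness`, the access times, and the simplification that spin-up and spin-down take no time are assumptions.

```python
def simulate_idleness(access_times, threshold, end_time):
    """Model one disk that spins down after `threshold` seconds without access.

    Returns (seconds spent in low-power mode, number of accesses that arrive
    while the disk is spun down and therefore pay the spin-up delay).
    Assumes the disk is active at t = 0 and that spin-up and spin-down
    themselves take no time.
    """
    low_power = 0.0
    delayed = 0
    last = 0.0  # time of the previous access
    for t in sorted(access_times):
        gap = t - last
        if gap > threshold:
            # The disk spun down `threshold` seconds after the last access
            # and stayed down until this access arrived.
            low_power += gap - threshold
            delayed += 1
        last = t
    tail = end_time - last
    if tail > threshold:
        low_power += tail - threshold
    return low_power, delayed


accesses = [10, 50, 100, 105, 400]  # hypothetical access times in seconds
for threshold in (20, 60):
    print(threshold, simulate_idleness(accesses, threshold, end_time=500))
```

In this toy trace, the shorter threshold yields more time in low-power mode but also more accesses that must wait for a spin-up, which is the tension described above.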
This paper is organized as follows. Section 2 presents related work. Section 3
describes the design of the proposed storage system. Section 4 gives the results
of preliminary investigations of access patterns of public uploaded photos on
Flickr, which are used in both simulations and in experiments with a prototype.
Sections 5 and 6 present the simulation results and the evaluation of the proto-
type implementation. Finally, Section 7 concludes the paper and presents future
work.
2 Related Work
There have been a number of studies on reducing storage power consumption.
A common feature of many of the techniques proposed in the literature is that
they skew the workload, and they can be classified into the following categories
according to variations in their approach.
The first category, which includes MAID [1] and PDC [5], focuses on the
popularity and concentrates popular data on specific disks. The second category,
 