Power-Aware Autonomous Distributed Storage Systems for Internet Hosting Service Platforms - Cloud Computing

Information Technology Reference

In-Depth Information

then it moves to Group B and a new empty disk is supplied to Group A from

the disk pool.

In Group B, to skew the workload, each disk periodically (e.g., once every

hour) exchanges part of its stored data one-for-one following four steps.

Step 1: The disk (say, D 1 ) with the smallest ID randomly chooses a disk (say,

D 2 ).

Step 2: D 1 and D 2 exchange information of their current workloads.

Step 3: If D 1 is more popular than D 2 , the least frequently accessed file (say,

f L )on D 1 is exchanged for the most frequently accessed file (say, f H )on

D 2 , and this process is repeated while the frequency of access of f L is less

than that of f H .

Step 4: D 1 issues an instruction to the next disk.

When implementing this process, the access frequency of popular disks gradually

increases, until the capacity of the disks is exceeded, by gathering popular data

from the neighboring disks, and the opposite process is carried out for unpopular

disks so as to extend their time spent in low-power mode. During this exchange

process, if a disk eventually becomes empty, it is removed from Group B and

queued in the Empty disk pool for the supply of disks to Group A.

Here, we may consider restrictions on the exchange to reduce migration cost,

such as limits on the number of exchangeable files during any one exchange, the

number of migrations for each file, and the scope of exchangeable disks. In this

study, we evaluated the effects of restrictions on the energy performance of disks,

which are described in later sections.

The underlying lookup service used by clients to access data is managed by a

distributed hash table, such as Chord [7]. However, for adaption to our system

in which data are frequently moved among disks, we use pointers in a list of

key-value pairs to indicate the current disk on which data are stored. That is, if

afile f 1 stored on disk D 1 migrates to D 2 ,thekeyof f 1 is associated with the

destination of migration (i.e., D 2 )in D 1 's list of key-value pairs. In addition,

to avoid multiple hops from pointer to pointer, each file is assumed to have

information of the ID of the disk originally storing the file as metadata, and if

f 1 originally stored on D 1 and moved to D 2 is moved further to D 3 , the pointer

of D 1 is rewritten so as to directly indicate the current position.

4 Data Access Tracing in Flickr

To evaluate the effectiveness of our method in a realistic situation, as a prelim-

inary study, we traced access patterns of photo data uploaded to Flickr, which

is one of the largest photo sharing services in the world.

In this preliminary study, we randomly selected 20,000 photos and traced the

cumulative number of accesses for each file every hour over two weeks with APIs

provided by the website. Owing to limitations of accessible observable data, all

the selected photos are public, although the website has supposedly around four

Search WWH ::

Custom Search

Home