b. [15] <6.2, 6.3> Assume that there are 40 nodes per rack and that any remote read/write
has an equal chance of going to any node. What is the expected runtime at 100 nodes?
1000 nodes?
c. [10] <6.2, 6.3> An important consideration is minimizing data movement as much as
possible. Given the significant slowdown of going from local to rack to array accesses,
software must be strongly optimized to maximize locality. Assume that there are 40
nodes per rack, and 1000 nodes are used in the MapReduce job. What is the runtime if
remote accesses are within the same rack 20% of the time? 50% of the time? 80% of the
time?
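One way to set up the calculation for parts (b) and (c) is to compute the expected latency of a single remote access as a weighted average of in-rack and cross-rack costs. The latencies below are placeholders, not the values from Figure 6.6, and `expected_remote_us` is a hypothetical helper, so substitute the figure's numbers before using the results.

```python
# Placeholder latencies in microseconds; the exercise's real values
# come from Figure 6.6 and should be substituted here.
RACK_US = 100.0   # remote access within the same rack (assumed)
ARRAY_US = 300.0  # remote access crossing racks, within the array (assumed)

def expected_remote_us(nodes_total, nodes_per_rack=40, rack_fraction=None):
    """Expected latency of one remote access.

    If rack_fraction is None, assume the access is equally likely to
    target any of the other nodes (part b); otherwise use the given
    probability that a remote access stays in the rack (part c).
    """
    if rack_fraction is None:
        # Of the nodes_total - 1 possible targets, nodes_per_rack - 1
        # sit in the same rack as the requester.
        rack_fraction = (nodes_per_rack - 1) / (nodes_total - 1)
    return rack_fraction * RACK_US + (1 - rack_fraction) * ARRAY_US

# Part (b): uniform-random remote accesses at 100 and 1000 nodes.
for n in (100, 1000):
    print(n, expected_remote_us(n))

# Part (c): 1000 nodes, with software-improved rack locality.
for f in (0.2, 0.5, 0.8):
    print(f, expected_remote_us(1000, rack_fraction=f))
```

Note how the uniform-random case degrades as the cluster grows: with more racks, a smaller share of the other nodes is local to the requester's rack, so the average access drifts toward the cross-rack cost.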
d. [10] <6.2, 6.3> Given the simple MapReduce program in Section 6.2, discuss some possible optimizations to maximize the locality of the workload.
6.15 [20/20/10/20/20/20] <6.2> WSC programmers often use data replication to overcome failures in the software. Hadoop HDFS, for example, employs three-way replication (one local copy, one remote copy in the rack, and one remote copy in a separate rack), but it's worth examining when such replication is needed.
a. [20] <6.2> A Hadoop World 2010 attendee survey showed that over half of the Hadoop clusters had 10 nodes or fewer, with dataset sizes of 10 TB or less. Using the failure frequency data in Figure 6.1, what kind of availability does a 10-node Hadoop cluster have with one-, two-, and three-way replications?
b. [20] <6.2> Assuming the failure data in Figure 6.1 and a 1000-node Hadoop cluster,
what kind of availability does it have with one-, two-, and three-way replications?
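A minimal sketch of the availability calculation behind parts (a) and (b): a block becomes unavailable only when every node holding one of its replicas has failed. The per-node failure probability below is a placeholder (the real rate must be derived from Figure 6.1), and independence of node failures is assumed for simplicity.

```python
# Probability that a given data block is unavailable, assuming each
# node is independently down with probability p_node. The 0.01 default
# is a placeholder; derive the real value from Figure 6.1.
def block_unavailability(p_node=0.01, replicas=1):
    # The block is lost only if all of its replicas are on failed nodes.
    return p_node ** replicas

for r in (1, 2, 3):
    print(r, "way replication:", 1.0 - block_unavailability(replicas=r))
```

Under the independence assumption, each additional replica multiplies the unavailability by another factor of the per-node failure probability, which is why the marginal benefit of the third copy is much smaller than that of the second.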
c. [10] <6.2> The relative overhead of replication varies with the amount of data written per local compute hour. Calculate the amount of extra I/O traffic and network traffic (within and across rack) for a 1000-node Hadoop job that sorts 1 PB of data, where the intermediate results for data shuffling are written to the HDFS.
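The arithmetic for part (c) can be sketched as follows, assuming the three-way placement described above (one local copy, one in-rack copy, one cross-rack copy) applies to every byte of intermediate output; the 1 PB figure is the sort's shuffle data from the exercise.

```python
# Extra traffic from three-way HDFS replication of 1 PB of
# intermediate shuffle output. With the placement policy above, each
# byte is stored three times: locally, once more in the rack, and
# once in a different rack.
PB = 10**15
data_written = 1 * PB                  # intermediate results, in bytes

extra_disk_io = 2 * data_written       # two extra copies hit disk
intra_rack_traffic = 1 * data_written  # second copy crosses the rack switch
cross_rack_traffic = 1 * data_written  # third copy leaves the rack

print(extra_disk_io, intra_rack_traffic, cross_rack_traffic)
```

The cross-rack copy is the expensive one: it traverses the oversubscribed array-level network rather than staying behind the rack switch.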
d. [20] <6.2> Using Figure 6.6, calculate the time overhead for two- and three-way replications. Using the failure rates shown in Figure 6.1, compare the expected execution times for no replication versus two- and three-way replications.
e. [20] <6.2> Now consider a database system applying replication on logs, assuming each transaction on average accesses the hard disk once and generates 1 KB of log data. Calculate the time overhead for two- and three-way replications. What if the transaction is executed in-memory and takes 10 μs?
f. [20] <6.2> Now consider a database system with ACID consistency that requires two network round-trips for two-phase commit. What is the time overhead for maintaining consistency as well as replications?
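For parts (e) and (f), one way to structure the estimate is to add the replication cost and the commit round-trips to the base transaction time. All component latencies below are assumptions, not values from the text's figures, and `txn_latency_ms` is a hypothetical helper.

```python
# Back-of-envelope per-transaction latency with log replication and
# two-phase commit. All values are placeholders; substitute latencies
# from Figures 6.1/6.6 as appropriate.
DISK_ACCESS_MS = 10.0  # one disk access per transaction (assumed)
NET_RTT_MS = 0.3       # one in-rack network round-trip (assumed)
LOG_XFER_MS = 0.01     # time to ship 1 KB of log data (assumed)

def txn_latency_ms(replicas, round_trips=2):
    # Shipping the log to each extra replica, plus the round-trips
    # required by the two-phase commit protocol.
    replication = (replicas - 1) * (NET_RTT_MS + LOG_XFER_MS)
    return DISK_ACCESS_MS + replication + round_trips * NET_RTT_MS

for r in (1, 2, 3):
    print(r, txn_latency_ms(r))
```

With a 10 ms disk access dominating, the replication and commit overheads look small; the in-memory variant of part (e), where the base work drops to tens of microseconds, is what makes the network costs dominate.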
6.16 [15/15/20/15] <6.1, 6.2, 6.8> Although request-level parallelism allows many machines to work on a single problem in parallel, thereby achieving greater overall performance, one of the challenges is avoiding dividing the problem too finely. If we look at this problem in the context of service level agreements (SLAs), using smaller problem sizes through greater partitioning can require increased effort to achieve the target SLA. Assume an SLA of 95% of queries responding in 0.5 sec or less, and a parallel architecture similar to MapReduce that can launch multiple redundant jobs to achieve the same result. For the following questions, assume the query-response time curve shown in Figure 6.24. The curve shows the latency of response, based on the number of queries per second, for a baseline server as well as a "small" server that uses a slower processor model.
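The redundant-job idea above can be sketched with a simple probability model: if one copy of a query meets the 0.5 sec target with probability p (read off the Figure 6.24 curve at the offered load), and the system takes the first of k independent redundant responses, the SLA is missed only when all k copies are slow. The independence assumption and the function name are ours, not the text's.

```python
# Probability of meeting the latency target when k redundant copies
# of a query are issued and the fastest response wins; p_single is
# the single-copy hit rate read from the Figure 6.24 curve.
def sla_hit_rate(p_single, k):
    # The SLA is missed only if every one of the k copies is slow
    # (copies assumed independent).
    return 1.0 - (1.0 - p_single) ** k

# e.g. a "small" server that meets the target only 80% of the time:
for k in (1, 2, 3):
    print(k, sla_hit_rate(0.80, k))
```

This shows why redundancy helps the slower servers reach a 95% SLA: two copies at an 80% single-copy hit rate already clear 95%, at the cost of doubling the offered load.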