protocols, respectively. We note, however, that Hadoop has recently undergone a
major overhaul to address several inherent technical deficiencies, among them the
reliability and availability of the JobTracker. The outcome is a new version
referred to as Yet Another Resource Negotiator (YARN) [53]. To elaborate,
YARN still adopts a master/slave topology but with various enhancements. First,
the resource management module, which is responsible for task and job scheduling
as well as resource allocation, has been entirely detached from the master (or the
JobTracker in Hadoop's parlance) and defined as a separate entity called the resource
manager (RM). The RM is further split into two main components, the scheduler
(S) and the applications manager (AsM). Second, instead of having a single master
for all applications, which was the JobTracker, YARN has defined a master per appli-
cation, referred to as application master (AM). AMs can be distributed across cluster
nodes so as to avoid application SPOFs and potential performance degradations.
Finally, the slaves (or what is known in Hadoop as TaskTrackers) have remained
effectively the same but are now called Node Managers (NMs).
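The division of responsibilities described above can be sketched as a toy model. This is purely illustrative and not the actual YARN API; the class names mirror the entities in the text (RM, AM, NM), and the scheduling logic is a deliberately naive placeholder.

```python
# Illustrative sketch of YARN's decomposition (NOT the real YARN API):
# a cluster-wide RM admits applications and grants nodes, a per-application
# AM drives its own tasks, and NMs run tasks on individual nodes.

class NodeManager:
    """Slave daemon (formerly a TaskTracker): runs tasks on one node."""
    def __init__(self, node_id):
        self.node_id = node_id

    def run_task(self, app_id, task):
        return f"app {app_id}: ran {task} on node {self.node_id}"

class ApplicationMaster:
    """One master per application, avoiding a single cluster-wide master."""
    def __init__(self, app_id, nodes):
        self.app_id = app_id
        self.nodes = nodes  # NodeManagers granted by the RM

    def execute(self, tasks):
        # Naive round-robin placement over the granted nodes.
        return [self.nodes[i % len(self.nodes)].run_task(self.app_id, t)
                for i, t in enumerate(tasks)]

class ResourceManager:
    """Cluster-wide entity holding the scheduler (S) and the
    applications manager (AsM); detached from any one application."""
    def __init__(self, nodes):
        self.nodes = nodes
        self.next_app_id = 0

    def submit_application(self, num_nodes):
        # AsM admits the application; S grants nodes; an AM is spawned.
        app_id, self.next_app_id = self.next_app_id, self.next_app_id + 1
        granted = self.nodes[:num_nodes]
        return ApplicationMaster(app_id, granted)

nms = [NodeManager(i) for i in range(4)]
rm = ResourceManager(nms)
am = rm.submit_application(num_nodes=2)
print(am.execute(["map-0", "map-1", "map-2"]))
```

The point of the sketch is structural: the RM never runs tasks itself, and each application's coordination lives in its own AM, so the failure of one AM affects only its own application.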
In a peer-to-peer organization, logic, control, and work are distributed evenly
among tasks. That is, all tasks are equal (i.e., they all have the same capability) and
no one is a boss. This makes peer-to-peer organizations symmetrical. Specifically,
each task can communicate directly with tasks around it, without having to contact
a master process (see Figure 1.12b). A master may still be adopted, but only
for purposes like monitoring the system and/or injecting administrative commands.
In other words, as opposed to a master/slave organization, the presence of a master
in a peer-to-peer organization is not requisite for the peer tasks to function cor-
rectly. Moreover, although tasks communicate with one another, their work can be
totally independent and could even be unrelated. Peer-to-peer organizations elimi-
nate the potential for SPOF and bandwidth bottlenecks, and thus typically exhibit
good scalability and robust fault tolerance. On the other hand, decisions in peer-
to-peer organizations have to be made collectively, usually via voting mechanisms.
This typically implies increased implementation complexity as well as more com-
munication overhead and latency, especially in large-scale systems such as the cloud.
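Collective decision making of this kind can be illustrated with a minimal majority-vote sketch. The function below is an assumption for exposition only; production peer-to-peer systems use fault-tolerant consensus protocols (e.g., Paxos or Raft) rather than a naive count of votes.

```python
# Minimal sketch of collective decision making among symmetric peers:
# every peer casts a vote, and no single peer can decide alone.
from collections import Counter

def majority_vote(votes):
    """Return the value backed by a strict majority of peers, or None."""
    value, count = Counter(votes).most_common(1)[0]
    return value if count > len(votes) / 2 else None

# Five symmetric peers vote on whether to commit an operation.
assert majority_vote(["commit", "commit", "commit", "abort", "commit"]) == "commit"
assert majority_vote(["a", "b", "a", "b"]) is None  # no strict majority
```

Even this toy version hints at the costs the text mentions: every decision requires a message from each peer, so communication overhead and latency grow with the number of peers.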
As a specific example, GraphLab employs a peer-to-peer organization. Specifically,
when GraphLab is launched on a cluster, one instance of its engine is started on each
machine. All engine instances in GraphLab are symmetric. Moreover, they all
communicate directly with each other using a customized asynchronous remote
procedure call (RPC) protocol over TCP/IP. The first triggered engine instance,
however, takes on the additional responsibility of acting as a monitoring/master
engine. The other
engine instances across machines will still work and communicate directly without
having to be coordinated by the master engine. Consequently, GraphLab satisfies the
criteria to be a peer-to-peer system.
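The arrangement can be sketched abstractly as follows. This is a hypothetical model, not GraphLab's actual code: symmetric engine instances exchange messages directly, while the first instance merely carries an extra monitoring role that the others do not depend on.

```python
# Hypothetical sketch (NOT GraphLab's implementation) of symmetric engine
# instances: peers message each other directly, and the first-launched
# instance holds only an auxiliary monitoring role.

class EngineInstance:
    def __init__(self, machine_id):
        self.machine_id = machine_id
        self.inbox = []

    def send(self, peer, msg):
        # Direct peer-to-peer delivery: no master mediates this exchange.
        peer.inbox.append((self.machine_id, msg))

engines = [EngineInstance(i) for i in range(3)]
monitor = engines[0]  # first triggered instance: extra monitoring role only

# Peers communicate directly without involving the monitor.
engines[1].send(engines[2], "gather")
assert engines[2].inbox == [(1, "gather")]
```

Because delivery never routes through the monitor, removing it would leave the remaining peers fully functional, which is exactly the property that qualifies the organization as peer-to-peer.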
1.6 MAIN CHALLENGES IN BUILDING CLOUD PROGRAMS
Designing and implementing a distributed program for the cloud involves more than
just sending and receiving messages and deciding upon the computational and archi-
tectural models. While all these are extremely important, they do not reflect the
whole story of developing programs for the cloud. In particular, there are various