Database Reference
In-Depth Information
data. Obviously, this has less effect on performance but implies that the correctness/
validity of the program must be assessed. In short, the distinction between synchro-
nous and asynchronous distributed programs refers to the presence or absence of
a (global) coordination mechanism that synchronizes the operations of tasks and
imposes a lock-step mode. As specific examples, MapReduce and Pregel programs
are synchronous, while GraphLab ones are asynchronous.
One synchronous model that is commonly employed for effectively implementing
distributed programs is the bulk synchronous parallel (BSP) model [74] (see Figure
1.8). The Pregel programs follow particularly the BSP model. BSP is defined as a com-
bination of three attributes, components , a router , and a synchronization method. A
component in BSP consists of a processor attached with data stored in local memory.
BSP, however, does not exclude other arrangements such as holding data in remote
memories. BSP is neutral about the number of processors, be it two or millions.
BSP programs can be written for v virtual distributed processors to run on p physi-
cal distributed processors, where v is larger than p . BSP is based on the message-
passing  programming model, whereby components can only communicate by
sending and receiving messages. This is achieved through a router which in principle
can only pass messages point to point between pairs of components (i.e., no broad-
casting facilities are available, though it can be implemented using multiple point-
to-point communications). Finally, as being a synchronous model, BSP splits every
computation into a sequence of steps called super-steps . In every super-step, S , each
component is assigned a task encompassing (local) computation. Besides, components
in super-step S are allowed to send messages to components in super-step S + 1 , and
are (implicitly) allowed to receive messages from components in super-step S − 1 .
Tasks within every super-step operate simultaneously and do not communicate with
each other. Tasks across super-steps move in a lock-step mode as suggested by any
synchronous model. Specifically, no task in super-step S + 1 is allowed to start before
every task in super-step S commits. To satisfy this condition, BSP applies a global
barrier-style synchronization mechanism as shown in Figure 1.8.
Iterations
Data
Data
Data
Data
CPU 1
CPU 1
CPU 1
Data
Data
Data
Data
Data
Data
Data
Data
CPU 2
CPU 2
CPU 2
Data
Data
Data
Data
Data
Data
Data
Data
Data
CPU 3
Data
CPU 3
CPU 3
Data
Data
Data
Data
Data
Data
Super-step 0
Super-step 1
Super-step 2
FIGURE 1.8
The bulk synchronous parallel (BSP) model.
Search WWH ::




Custom Search