BSP does not assume simultaneous accesses to the same memory location and, hence, precludes the need for any synchronization mechanism other than barriers.
Another primary concern in a distributed setting is to allocate data in a way that computation is not slowed down by non-uniform memory access latencies or by uneven loads among individual tasks. BSP promotes uniform access latencies by enforcing local data accesses. In particular, data are communicated across super-steps before actual task computations are triggered. As such, BSP carefully segregates
steps before triggering actual task computations. As such, BSP carefully segregates
computation from communication. Such a segregation entails that no particular net-
work topology is favored beyond the requirement that high throughput is delivered.
Butterfly, hypercube, and optical crossbar topologies can all be employed with BSP.
With respect to task loads, the amount of data can still vary across tasks within a super-step. This
typically depends on: (1) the responsibilities that the distributed program imposes on
its constituent tasks, and (2) the characteristics of the underlying cluster nodes (more
on this in Section 1.5.1). As a consequence, tasks that are lightly loaded (or are run
on fast machines) will potentially finish earlier than tasks that are heavily loaded
(or are run on slow machines). Consequently, the time required to finish a super-step becomes bounded by the slowest task in the super-step (i.e., a super-step cannot commit
before the slowest task commits). This presents a major challenge for the BSP model
as it might create load imbalance, which usually degrades performance. Finally, it is
worth noting that while BSP suggests several design choices, it does not make their
use obligatory. Indeed, BSP leaves many design choices open (e.g., barrier-based
synchronization can be implemented at a finer granularity or completely switched
off, if that is acceptable to the given application).
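To make the structure of a super-step concrete, the following is a minimal, illustrative sketch in Python using the standard threading module; the task function, the double-buffered inboxes, and the random sleeps that model uneven loads are assumptions made for illustration, not part of any particular BSP library. Each super-step performs local computation, deposits messages that become visible only after the barrier, and commits only when the slowest task reaches the barrier.

import random
import threading
import time

NUM_TASKS = 4
NUM_SUPERSTEPS = 3

barrier = threading.Barrier(NUM_TASKS)

# Double-buffered inboxes: messages written during super-step s are delivered
# only in super-step s + 1, after the barrier, so communication is segregated
# from computation.
current_inboxes = [[] for _ in range(NUM_TASKS)]
next_inboxes = [[] for _ in range(NUM_TASKS)]

def swap_buffers():
    global current_inboxes, next_inboxes
    current_inboxes = next_inboxes
    next_inboxes = [[] for _ in range(NUM_TASKS)]

def task(tid):
    for step in range(NUM_SUPERSTEPS):
        # 1. Local computation on messages delivered by the previous super-step.
        #    The random sleep models uneven loads across tasks (or machines).
        received = current_inboxes[tid]
        time.sleep(random.uniform(0.1, 0.5))
        msg = "task %d, step %d (saw %d messages)" % (tid, step, len(received))

        # 2. Communication: deposit a message for the right-hand neighbor; it
        #    becomes visible only in the next super-step.
        next_inboxes[(tid + 1) % NUM_TASKS].append(msg)

        # 3. Barrier synchronization: the super-step commits only when the
        #    slowest task arrives here.
        if barrier.wait() == 0:   # exactly one task swaps the message buffers
            swap_buffers()
        barrier.wait()            # make the swap visible to every task

threads = [threading.Thread(target=task, args=(t,)) for t in range(NUM_TASKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

Because every task must reach the barrier twice per iteration, a single slow task delays the commit of the whole super-step, which is exactly the load-imbalance effect discussed above.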
1.5.4 Data Parallel and Graph Parallel Computations
As distributed programs can be constructed using either the shared-memory or the message-passing programming model, and can be specified as either synchronous or asynchronous, they can be tailored to different types of parallelism. Specifically, distributed programs can incorporate either data parallelism or graph parallelism.
Data parallelism is a form of parallelizing computation as a result of distributing
data across multiple machines and running (in parallel) corresponding tasks on those
machines. Tasks across machines may involve the same code or may be totally different. Nonetheless, in both cases, tasks are applied to distinct data. If tasks involve the same code, we classify the distributed application as a single program, multiple data (SPMD) application; otherwise, we label it a multiple program, multiple data (MPMD) application. Clearly, the basic idea of data parallelism is simple: by distributing a large file across multiple machines, it becomes possible to access
and process different parts of the file in parallel. One popular technique for distrib-
uting data is file striping, by which a single file is partitioned and distributed across
multiple servers. Another form of data parallelism is to distribute whole files (i.e.,
without striping) across machines, especially if files are small and their contained
data exhibit very irregular structures. We note that data can be distributed among
tasks either explicitly using a message-passing model or implicitly using a shared-
memory model (assuming an underlying distributed system that offers a shared-
memory abstraction).
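As a simple illustration of SPMD-style data parallelism, the sketch below (in Python, using the standard multiprocessing module) applies the same word-counting function, in parallel, to different partitions of the input. The function names, the partitioning helper, and the in-memory list of lines are hypothetical stand-ins for a large file striped across multiple servers.

from multiprocessing import Pool

def word_count(partition):
    # The "single program": every task runs this same code on its own data.
    return sum(len(line.split()) for line in partition)

def partition_lines(lines, num_partitions):
    # Split the input into roughly equal partitions, loosely analogous to
    # striping a file across multiple servers.
    return [lines[i::num_partitions] for i in range(num_partitions)]

if __name__ == "__main__":
    # A stand-in for a large file; in a real deployment each partition would
    # reside on, and be processed by, a different machine.
    lines = ["the quick brown fox", "jumps over", "the lazy dog"] * 1000

    partitions = partition_lines(lines, num_partitions=4)
    with Pool(processes=4) as pool:
        # Same code, distinct data: the essence of SPMD data parallelism.
        partial_counts = pool.map(word_count, partitions)

    print(sum(partial_counts))

An MPMD variant would instead assign different code to different partitions, but the data would still be distributed among the tasks in the same way.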