according to their inherent difficulty, with particular attention to their behavior as the size of the input grows (scalability). Therefore, the only way to improve scalability is to develop a better algorithm for the problem at hand. This is often not possible, either because we are already using an optimal algorithm or because the problem at hand cannot be solved in an efficient way (i.e., its computational complexity is believed to be more than polynomial in the size of the input in the worst case). Such problems are known as NP-hard and are common in the theory of Bayesian networks, both for structure learning (Chickering 1996) and inference (Cooper 1990).
On the other hand, the software architecture can make a significant difference
in the overall performance of the program. For instance, the choice of data structures can significantly improve or degrade the computational complexity of an algorithm. Tailoring the implementation to the specific hardware
it will run on can also result in noticeable speedups.
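The impact of data structures on computational complexity can be made concrete with a small sketch. The excerpt names no programming language, so the example below is in Python, and all names in it are illustrative rather than taken from the text: the same membership query runs in linear time against a list but in expected constant time against a hash-based set.

```python
# Hypothetical illustration: the same task with two data structures.
def count_common_list(items, queries):
    # list membership is a linear scan: O(len(items)) per query
    return sum(1 for q in queries if q in items)

def count_common_set(items, queries):
    # set membership is a hash lookup: O(1) expected per query
    item_set = set(items)
    return sum(1 for q in queries if q in item_set)

items = list(range(10_000))
queries = list(range(5_000, 15_000))
# both return 5000, but the set version does far less work per query
assert count_common_list(items, queries) == count_common_set(items, queries) == 5_000
```

The two functions compute the same result; only the cost per query changes, from O(n) to expected O(1), which is exactly the kind of improvement the paragraph above refers to.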
Parallel computing, defined as the execution of several calculations simultaneously, is an application of this last idea. It was originally introduced to overcome hardware limitations in terms of computational power; large problems were divided into smaller ones, which were then solved concurrently on multiprocessor supercomputers. Special-purpose hardware architectures were then developed to take advantage of such software, thus maximizing the impact of the raw performance of the hardware for properly implemented programs. A classification of these hardware architectures was proposed by Flynn (1972), according to the nature of the data and the operations they support:
Single-Instruction, Single-Data (SISD): a single processing unit performing a single operation on a single data stream
Multiple-Instruction, Single-Data (MISD): multiple processing units performing
different operations (independently and asynchronously) on the same data
Single-Instruction, Multiple-Data (SIMD): multiple processing units performing
the same operation on multiple data
Multiple-Instruction, Multiple-Data (MIMD): multiple processing units performing different operations on multiple data
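The difference between the SIMD and MIMD classes can be pictured as a programming-model analogy, sketched below in Python (the excerpt names no language). This is not real hardware parallelism, and the variable and function names are illustrative assumptions; the point is only how operations pair with data in each model.

```python
# Programming-model analogy for two of Flynn's classes (illustrative only).
data = [1.0, 2.0, 3.0, 4.0]

# SIMD-style: one operation applied uniformly across multiple data
simd_result = [x * 2.0 for x in data]          # [2.0, 4.0, 6.0, 8.0]

# MIMD-style: different operations applied to different data
operations = [lambda x: x + 1.0, lambda x: x * x,
              lambda x: -x, lambda x: x / 2.0]
mimd_result = [op(x) for op, x in zip(operations, data)]  # [2.0, 4.0, -3.0, 2.0]
```

In the SIMD case a single instruction stream is broadcast over the data; in the MIMD case each processing unit may follow its own instruction stream on its own data.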
Nearly all general-purpose modern computers are based on the MIMD model; notable exceptions are those that offload part of their workload to graphics processing units (GPUs), as the latter are based on a SIMD model. All modern processors have more than one core, and each core supports multiple concurrent execution threads. Furthermore, in the last few years the speed of processors has peaked because some of the physical constraints mentioned above are preventing further frequency scaling. At the same time, the number of cores present in each processor is still increasing, and the same holds for the number of threads supported by each core. As a result, modern computers are remarkably similar to the MIMD parallel architectures from the 1970s and 1980s described by Flynn, albeit with vastly different capabilities.
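A minimal sketch of dispatching independent units of work across such a multicore machine, using Python's standard library (the excerpt names no language, and `score_candidate` is a hypothetical stand-in for any independent computation, such as scoring one candidate network structure):

```python
from concurrent.futures import ThreadPoolExecutor

def score_candidate(seed):
    # stand-in for one independent unit of work (hypothetical example)
    total = 0
    for i in range(1, 1000):
        total += (seed * i) % 7
    return total

seeds = list(range(8))

# dispatch the eight tasks to a pool of worker threads
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(score_candidate, seeds))

sequential = [score_candidate(s) for s in seeds]
assert parallel == sequential  # same results, work dispatched concurrently
```

Note that under CPython's global interpreter lock, pure-Python threads interleave rather than run truly in parallel; for CPU-bound work a process pool or native extensions are typically used instead, but the dispatching pattern is the same.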
This evolution has sparked a renewed interest in parallel computing. Recent research efforts have focused on two major areas: the parallelization of existing algorithms and the development of new algorithms and software libraries explicitly designed