Biomedical Engineering Reference
In-Depth Information
on computational resources and the perception that biological systems seems to be
comprised small, function-centered regulatory sub-networks, a small set of genes
was chosen based on the predictive relationships obtained from CoD analysis using
the parallel method described earlier and available biological knowledge. The focus
of the study was the steady-state behavior of the Markov chain constructed from
the multivariate relationships and the transition rules, both estimated from the data.
This modeling would not have been possible without availability of the parallel CoD
analysis.
When developing parallel methods for gene expression analysis, researchers are
presented with the opportunity provided by the availability of new parallel computer
architectures. At the level of the individual processor, taking advantage of parallelism
among instructions is critical to achieving high performance. Modern superscalar
processors provide this instruction-level parallelism by pipelining the instruction
execution cycle in addition to issuing multiple instructions per clock cycle. How-
ever, providing parallelism at the system level with multiple superscalar processors is
necessary to reduce the time needed to complete the required calculations. The calcu-
lations are divided into parts that are independently executed on different processors.
As mentioned earlier, the speedup that can be achieved by having p processors work
concurrently on the calculations is at most p times faster than a single processor.
Although attempts are made to achieve this ideal speedup, the ability to attain it, in
practice, is determined by the efficiency of the developed parallel method to exploit
the natural concurrency for the computing problem.
Modern vector architectures can offer advantages over superscalar processors for
a range of compute-intense applications. Vector processors provide high-level oper-
ations that work on vectors — linear arrays of numbers. With computer architects
developing vector microprocessors for multimedia applications, it will be possible to
implement multiple processor systems where each processor contains a number of
vector processors in addition to a superscalar processor core. For example, Kozyrakis
and Patterson [28] are developing the CODE (Clustered Organization for Decoupled
Execution) architecture, a vector processor that contains a number of functional units
with a unique clustered vector register file.
A block diagram of the CODE processor is shown in Figure 11.6. In addition to the
scalar core, the processor has N clusters. A cluster contains an integer, floating point,
Instructions
D cache
Scalar core
I cache
Vector instructions
Cluster
Local clustered vector
register file
Vector issue
logic
Memory
system
Instruction bus
Input
VFU
Output
Cluster 1
Cluster 2
Cluster 3
Cluster N
. . .
Communication network
Data
Figure 11.6
Block diagram of the CODE vector processor.
Search WWH ::




Custom Search