Parallel Computing in the Analysis of Gene Expression Relationships - Parallel Computing for Bioinformatics and Computational Biology


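For schema B, c_B(i) is given by Eq. (11.6) as the sum of two summations over j, with upper limits K_1 and K_2 and summands that depend on j(2p). K_1 and K_2 are each defined piecewise: one value applies when a condition involving 2Kp holds, and another value otherwise.

[Eq. (11.6): only the fragments c_B(i), K_1, K_2, j(2p), 2p − 1, and 2Kp survive in this copy; the full expression is not recoverable.]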

In Eq. (11.6), the first summation is for forward distribution and the second for backward distribution in each round. The workload imbalance is defined as the difference between the amount of work on the processor with the maximum workload and on the processor with the minimum workload. The workload imbalance was calculated as a function of the total number of features for 2, 4, 8, 16, 32, and 64 processors using schemata A and B. The results of this calculation are shown in Figure 11.5. In the figure, the plain lines represent the work imbalance calculated using schema A, and the lines with dots represent the work imbalance calculated using schema B. The work imbalance differs by several orders of magnitude between the two schemata. Using schema B, the imbalance increases with the number of processors. This occurs because the correction introduced by reversing the direction of work assignment is much smaller when there are many processors than when there are only a few. As the differences between schemata B and C are small compared with the differences between A and B, only schemata A and B are shown. For simplicity, we selected schema B for implementation.
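The imbalance calculation described above can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration, not the authors' code: the cost model (the work attached to leading feature index i is the number of k-feature subsets whose smallest index is i), the function names, and the two assignment rules. The cyclic rule stands in for a forward-only distribution, and the snake rule assigns forward and then backward within each round of 2p items, in the spirit of the forward and backward distribution described for schema B.

```python
from math import comb


def unit_work(n, k):
    """Assumed cost model: work for leading index i is the number of
    k-subsets of n features whose smallest index is i."""
    return [comb(n - 1 - i, k - 1) for i in range(n - k + 1)]


def loads_cyclic(work, p):
    """Forward-only assignment: item i goes to processor i mod p."""
    loads = [0] * p
    for i, w in enumerate(work):
        loads[i % p] += w
    return loads


def loads_snake(work, p):
    """Forward-then-backward assignment within each round of 2p items."""
    loads = [0] * p
    for i, w in enumerate(work):
        r = i % (2 * p)
        # First p items of a round go to 0..p-1, the next p go to p-1..0.
        proc = r if r < p else 2 * p - 1 - r
        loads[proc] += w
    return loads


def imbalance(loads):
    """Difference between the maximum and minimum processor workload."""
    return max(loads) - min(loads)


if __name__ == "__main__":
    work = unit_work(2303, 3)  # n and k chosen only as an example
    for p in (2, 4, 8, 16, 32, 64):
        print(p, imbalance(loads_cyclic(work, p)), imbalance(loads_snake(work, p)))
```

Under this assumed cost model the work per item decreases monotonically with i, so the forward-only rule always hands the heaviest item of a round to the same processor, while the reversal in the snake rule cancels the first-order difference between processors within each round.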

We implemented a parallel version of the σ-classifier using schema B and tested this implementation with a lymphoma data set [26] consisting of 30 samples with 2303 genes. The measured run times and efficiencies for 1, 2, 4, 8, 16, and 32 processors are shown in Table 11.3 with k = 3 and n = 2303. The schema B partitioning worked well: the implementation scaled to 32 processors with high efficiency. The efficiency exceeds 100% because of slight differences between the parallel and serial versions of the software. The serial version was designed to offer the user a selection of several options, whereas the parallel version was optimized to run full searches only; several conditional statements were therefore removed from the main loop of the parallel version, leading to a significant performance improvement. The parallel version does not run on a single processor. The adjusted efficiency shown in the table is the efficiency calculated assuming a serial time of double the two-processor time. When compared with the two-processor timing, the efficiency remains above 95% even for 32 processors.
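The two efficiency figures discussed above can be written out as a minimal Python sketch. It assumes the usual definition of parallel efficiency, serial time divided by the product of processor count and parallel time, which the text does not spell out; the function names are hypothetical, and the timings in the usage lines are invented placeholders, not values from Table 11.3.

```python
def efficiency(t_serial, t_parallel, p):
    """Parallel efficiency: serial time / (p * parallel time)."""
    return t_serial / (p * t_parallel)


def adjusted_efficiency(t_two_proc, t_parallel, p):
    """Efficiency with the serial time taken as twice the
    two-processor time, as in the adjusted figure described above."""
    return efficiency(2.0 * t_two_proc, t_parallel, p)


if __name__ == "__main__":
    # Placeholder timings in seconds (hypothetical, for illustration only).
    t2, t32 = 100.0, 6.5
    print(adjusted_efficiency(t2, t32, 32))
```

By construction the adjusted efficiency of the two-processor run is exactly 1, which removes the effect of the differences between the serial and parallel code paths from the comparison.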
