Applications of the SDL Global Optimization Method in DNA Microarray Data Analysis - DNA Microarray Technology and Data Analysis in Cancer Research


Each of the four possible rows, (1, 1), (1, 2), (2, 1), and (2, 2), can be seen here, and they all appear the same number of times (three times here); this is the property that makes it an orthogonal array (OA). Since only 1's and 2's appear, this is called a two-level array. There are 11 columns, which means that one can vary the levels of up to 11 different variables, and 12 rows, which means that 12 different combinations of variables can be tested in experiments. The aim is to investigate not only the effects of the individual variables on the outcome, but also how the variables interact.
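The balance property described above can be checked mechanically. The following is a minimal sketch, assuming every column uses the same set of levels; the function name `is_orthogonal_array` is hypothetical, and the small four-row, three-column two-level array used as input is a standard textbook example chosen for brevity, not the 12-row array discussed in the text.

```python
from collections import Counter
from itertools import combinations


def is_orthogonal_array(rows):
    """Return True if, for every pair of columns, each pair of levels
    appears the same number of times (assumes all columns share one
    level set)."""
    n_cols = len(rows[0])
    levels = sorted({value for row in rows for value in row})
    # Each of the len(levels)**2 level pairs must occur equally often.
    expected = len(rows) / (len(levels) ** 2)
    for a, b in combinations(range(n_cols), 2):
        counts = Counter((row[a], row[b]) for row in rows)
        for x in levels:
            for y in levels:
                if counts[(x, y)] != expected:
                    return False
    return True


# Illustrative two-level array with 4 rows and 3 columns (not from the text).
small_oa = [(1, 1, 1), (1, 2, 2), (2, 1, 2), (2, 2, 1)]
# Changing the last entry breaks the balance between columns.
not_oa = [(1, 1, 1), (1, 2, 2), (2, 1, 2), (2, 2, 2)]

print(is_orthogonal_array(small_oa))
print(is_orthogonal_array(not_oa))
```

In the small array each level pair appears exactly once in any two columns; in the 12-row array described above the same check would expect each pair three times.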

Owen (1992 and 1994) and Loh (1996) describe some uses for randomized OAs in numerical integration, computer experiments, and visualization of functions. These references point to further literature that provides more detailed explanations.

The OA used in this research is $L_{242}(11^{23})$, which is too large to be shown here. The OA $L_{242}(11^{23})$ has 242 rows (observations or tests), 23 columns (factors or variables), and 11 levels for each factor. The complete $L_{242}(11^{23})$ is available at http://www.scis.ecu.edu.au/dli/.

The OA $L_{242}(11^{23})$ was initially used in selecting a gene subset with 23 gene elements. The search space of 2000 genes in the colon data was divided equally into 11 levels. If all of the genes are assigned a unique ID number from 1 to 2000 and the initial search space ranges from 1 to 2000, then the gene IDs at the 11 levels are 1, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, and 2000, respectively. As the first row of $L_{242}(11^{23})$ reads (1, 10, 2, 3, 8, 8, 2, 4, 8, 9, 5, 4, 10, 5, 7, 1, 5, 5, 8, 1, 10, 11, 2), the constructed gene subset will read (1, 1800, 200, 400, 1400, 1400, 200, 600, 1400, 1600, 800, 600, 1800, 800, 1200, 1, 800, 800, 1400, 1, 1800, 2000, 200). Since duplicate gene IDs are not allowed in a gene subset, the repeated gene IDs are shifted slightly forward or backward. The modified 23-gene subset now reads (1, 1800, 200, 400, 1400, 1399, 199, 600, 1401, 1600, 800, 599, 1799, 799, 1199, 2, 801, 798, 1401, 3, 1798, 2000, 201).
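The mapping from an OA row to a gene subset can be sketched in a few lines. The level-to-ID table and the first OA row below are copied from the text; the function `shift_duplicates` is hypothetical, and its rule (try offsets -1, +1, -2, +2, ... until an unused ID inside the search space is found) is an assumption, since the text only says that repeated IDs are shifted a little. It is therefore not expected to reproduce the modified subset quoted above exactly.

```python
# Gene IDs at the 11 levels, as listed in the text.
LEVEL_IDS = [1, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000]

# First row of the OA, as quoted in the text (levels are 1-based).
FIRST_ROW = (1, 10, 2, 3, 8, 8, 2, 4, 8, 9, 5, 4,
             10, 5, 7, 1, 5, 5, 8, 1, 10, 11, 2)


def row_to_gene_ids(row, level_ids):
    """Translate 1-based OA levels into gene IDs."""
    return [level_ids[level - 1] for level in row]


def shift_duplicates(ids, lo, hi):
    """Replace repeated IDs with the nearest unused ID in [lo, hi].

    Assumed rule (not from the text): try offsets -1, +1, -2, +2, ...
    """
    used = set()
    result = []
    for gene_id in ids:
        candidate = gene_id
        offset = 0
        while candidate in used or not (lo <= candidate <= hi):
            # Step through the offsets -1, +1, -2, +2, ...
            offset = -offset if offset < 0 else -(offset + 1)
            candidate = gene_id + offset
        used.add(candidate)
        result.append(candidate)
    return result


raw_subset = row_to_gene_ids(FIRST_ROW, LEVEL_IDS)
unique_subset = shift_duplicates(raw_subset, lo=1, hi=2000)
print(raw_subset)
print(unique_subset)
```

The first printed list matches the constructed subset given in the text; the second contains 23 distinct IDs within the search space.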

According to $L_{242}(11^{23})$, 242 different 23-gene subsets were created and evaluated with the defined objective function. All 242 subsets were ranked by their objective function values. The top 10% of subsets, measured by how well they classified the training set, were kept, and the gene IDs included in those subsets were sorted to find the minimum ID and the maximum ID. The new, reduced search space ranged from the minimum ID to the maximum ID. The above process was
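The ranking and search-space reduction step can be sketched as follows. This is an illustration under stated assumptions: the name `reduce_search_space` is hypothetical, a larger objective value is assumed to mean a better subset, the number of kept subsets is rounded down, and the toy subsets and `toy_objective` stand in for the 242 subsets and the objective function of the text, which are not reproduced here.

```python
def reduce_search_space(subsets, objective, keep_fraction=0.10):
    """Rank subsets by objective value, keep the top fraction, and return
    the (minimum ID, maximum ID) over all gene IDs in the kept subsets.

    Assumes a larger objective value is better.
    """
    ranked = sorted(subsets, key=objective, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    kept_ids = [gene_id for subset in ranked[:n_keep] for gene_id in subset]
    return min(kept_ids), max(kept_ids)


def toy_objective(subset):
    """Placeholder score: prefers subsets whose first ID is near 10.4."""
    return -abs(subset[0] - 10.4)


# Twenty toy three-gene subsets: (1, 2, 3), (2, 3, 4), ..., (20, 21, 22).
toy_subsets = [(i, i + 1, i + 2) for i in range(1, 21)]

new_lo, new_hi = reduce_search_space(toy_subsets, toy_objective)
print(new_lo, new_hi)
```

With 242 subsets and a 10% cut, this sketch would keep 24 subsets; the returned bounds define the narrower ID range for the next round of level assignment.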
