Applications of the SDL Global Optimization Method in DNA Microarray Data Analysis - DNA Microarray Technology and Data Analysis in Cancer Research

objective function, one will have discovered the best-performing gene subset. This procedure can be made more sophisticated by introducing weighting factors to increase the importance of user-specified samples in training sets, as well as using other forms of the distance formula between one subset and another.

5.3.3. Search space reduction for global search

With local optimization, a fast method for a large number of genes, the program finds the nearest minimum and stops. In some so-called global optimization procedures, the algorithm finds not only a local minimum but also some neighboring minima. The process, however, is hit-and-miss, because starting at a different place can result in different solutions.
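
The dependence on the starting point can be seen in a minimal sketch. The landscape `VALUES` and the helper `local_descent` below are hypothetical illustrations, not part of SDL: the routine simply steps to the lower neighbor until no neighbor improves.

```python
# Made-up 1D landscape of objective values (lower is better).
# Local minima sit at index 2 (value 1), index 7 (value 0) and index 13 (value 3).
VALUES = [5, 3, 1, 3, 5, 4, 2, 0, 2, 4, 6, 5, 4, 3, 4]


def local_descent(values, start):
    """Walk downhill from `start` and return the index of the minimum reached."""
    i = start
    while True:
        # Only the immediate left and right neighbors are considered.
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(values)]
        best = min(neighbors, key=lambda j: values[j])
        if values[best] >= values[i]:
            return i  # no neighbor is lower: a local minimum
        i = best


for start in (0, 5, 14):
    end = local_descent(VALUES, start)
    print(f"start={start:2d} -> minimum at index {end} (value {VALUES[end]})")
```

Only the run started at index 5 reaches the lowest point of this toy landscape; the other two stop at nearer, shallower minima.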

The global algorithm in SDL repeatedly narrows the region where the global minimum is known to lie by using a special OA sampling that operates simultaneously in all orthogonal dimensions (one for each gene in the gene subset) to find the optimum solution. As the process runs, one can observe the range of genes for each gene variable in an n-dimensional subset being reduced.
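
The idea of shrinking the range of every gene variable at once can be sketched as follows. This is not the SDL algorithm: plain random sampling stands in for the OA sampling, which is not specified here, and `toy_objective`, `shrink_search`, and all parameter values are assumptions made for illustration. Unlike the method described above, this heuristic offers no guarantee that the global minimum stays inside the narrowed region.

```python
import random

N_GENES = 2000  # size of the gene pool, as in the colon cancer example


def toy_objective(subset):
    """Hypothetical stand-in for the classification objective (lower is better)."""
    targets = (250, 1400)  # made-up "best" gene indices
    return sum((g - t) ** 2 for g, t in zip(subset, targets))


def shrink_search(objective, n_vars, rounds=12, samples=40, shrink=0.5, seed=0):
    """Sample all dimensions together, then narrow every range around the best point."""
    rng = random.Random(seed)
    bounds = [(0, N_GENES - 1)] * n_vars
    best, best_f = None, float("inf")
    history = [list(bounds)]  # range of each gene variable after every round
    for _ in range(rounds):
        for _ in range(samples):
            cand = [rng.randint(lo, hi) for lo, hi in bounds]
            f = objective(cand)
            if f < best_f:
                best, best_f = cand, f
        new_bounds = []
        for (lo, hi), c in zip(bounds, best):
            half = max(1, int((hi - lo) * shrink / 2))
            # Clip to the previous range so the region can only get smaller.
            new_bounds.append((max(lo, c - half), min(hi, c + half)))
        bounds = new_bounds
        history.append(list(bounds))
    return best, best_f, history


best, best_f, history = shrink_search(toy_objective, n_vars=2)
for round_no, ranges in enumerate(history):
    print(round_no, [hi - lo for lo, hi in ranges])  # widths shrink round by round
```

Printing the widths in `history` mirrors what the text describes: the admissible range for each gene variable contracts as the run proceeds. The smooth toy objective is a convenience only, since real gene indices carry no such ordering.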

The SDL global optimization algorithm operates to discover the optimum solution. An analogy illustrates the principles involved: suppose one plots the objective function against the 2000 genes in the colon cancer data, with the goal of finding a gene or a gene subset corresponding to the maximum objective function value F (or, for convenience in the illustration, the minimum value of 1/F). See Fig. 5.10 for a one-dimensional (1D) analogy showing local and global optimization processes.
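
The equivalence between maximizing F and minimizing 1/F (valid when F is positive) can be checked on a toy 1D curve. The function `F` below is invented for illustration and has nothing to do with the real colon cancer scores; it has a lower peak at gene index 400 and a higher one at index 1300.

```python
import math

N_GENES = 2000


def F(g):
    """Made-up positive objective for gene index g (higher is better)."""
    global_peak = math.exp(-((g - 1300) / 60.0) ** 2)
    local_peak = 0.3 * math.exp(-((g - 400) / 80.0) ** 2)
    return 1.0 + global_peak + local_peak


scores = [F(g) for g in range(N_GENES)]
inverse = [1.0 / s for s in scores]  # safe because every score is positive

# In one dimension an exhaustive scan over all genes is cheap.
g_max = max(range(N_GENES), key=scores.__getitem__)   # maximizes F
g_min = min(range(N_GENES), key=inverse.__getitem__)  # minimizes 1/F

print(g_max, g_min)
```

Both scans select the same gene, so the problem can be stated either as a maximization of F or as a minimization of 1/F. The peak at index 400 is the kind of point where a purely local method started nearby would stop.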

As discussed earlier, a single objective function number can be used to describe the classification performance of a current gene subset. By plotting a multi-dimensional graph with the objective function as one of the axes, one can visualize the process. One requires as many orthogonal axes as the number of variables (genes), plus one for the objective function. Thus, for a two-gene problem, a three-dimensional (3D) plot is required. To see the process used in a simplified form, imagine a two-dimensional (2D) array along the x- and y-axes, which corresponds
