Computational Intelligence Techniques for Classification in Microarray Analysis - Computational Intelligence in Healthcare: Advanced Methodologies


eliminates defective samples and standardizes the data. This phase is normally divided into three sub-phases: background correction, standardization, and summarization. There is currently a limited group of algorithms that investigators use for performing these steps. The most common are MAS5.0 [30] (Microarray Affymetrix Suite 5.0), PLIER [31] (Probe Logarithmic Intensity Error), and RMA [32] (Robust Multi-array Average).

The RMA [32] algorithm is a method for normalizing and summarizing probe-level intensity measurements. It analyzes the values for the PM (Perfect-Match) probes in three steps: first, a background correction is carried out to remove the noise from the PM averages; second, the data are quantile normalized so that data from different microarrays can be compared; finally, a summarization is made and the values for each probe-set are generated.
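The quantile normalization step can be illustrated with a minimal Python sketch. This is not the RMA implementation from [32]: it covers only the normalization idea (every array is forced onto the same empirical distribution by replacing each value with the mean of the values holding the same rank across arrays), it breaks ties by position, and the function name and the toy intensities are assumptions made for illustration.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of equally long intensity lists (one list per microarray)."""
    n_probes = len(arrays[0])
    # Sort each array, then average the values found at each rank across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Indices of the probes ordered by intensity within this array
        # (ties are broken by position, a simplification).
        order = sorted(range(n_probes), key=lambda k: a[k])
        out = [0.0] * n_probes
        for rank, k in enumerate(order):
            out[k] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities for three probes measured on two microarrays.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
print(normalized)
```

After this step both arrays contain the same set of values, so probe intensities can be compared across microarrays on a common scale.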

4.1.2 Irrelevant Probes

Once the control and the erroneous probes have been eliminated, the filtering process begins. The first step consists of eliminating the probes marked as irrelevant in previous executions of the CBR cycle. This way, all probes that can pass the filtering phase, but are prone to cause erroneous results during the reuse phase, are removed.
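As a rough sketch of this step, assume the expression data is held as a dictionary from probe identifier to its list of values, and that the probes flagged in earlier CBR cycles are kept in a set. Both structures, the function name, and the probe identifiers below are hypothetical; the text does not say how the system stores them.

```python
def drop_irrelevant(expression, irrelevant_ids):
    """Return a copy of the expression data without the probes flagged as irrelevant."""
    return {
        probe_id: values
        for probe_id, values in expression.items()
        if probe_id not in irrelevant_ids
    }


# Hypothetical probe identifiers and values.
expression = {
    "probe_a": [7.1, 6.9, 7.3],
    "probe_b": [2.0, 9.5, 4.4],
    "probe_c": [5.2, 5.0, 5.9],
}
# Probes marked as irrelevant in previous executions of the cycle (assumed).
irrelevant = {"probe_b"}

filtered = drop_irrelevant(expression, irrelevant)
print(list(filtered))
```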

4.1.3 Variability

The second stage is to remove the probes that have low variability. This work is carried out according to the following steps:

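1. Calculate the standard deviation for each of the probes j

$$\sigma_{\cdot j} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\bar{\mu}_{\cdot j} - x_{ij}\right)^{2}} \qquad (1)$$

Where n is the total number of cases, $\bar{\mu}_{\cdot j}$ is the population mean of the variable j, and $x_{ij}$ is the value of the probe j for the individual i.

2. Standardize the above values

$$z_{j} = \frac{\sigma_{\cdot j} - \mu}{\sigma} \qquad (2)$$

Where $\mu = \frac{1}{m}\sum_{j=1}^{m}\sigma_{\cdot j}$ and $\sigma = \sqrt{\frac{1}{m}\sum_{j=1}^{m}\left(\sigma_{\cdot j} - \mu\right)^{2}}$, with m the number of probes, and where $z_{j} \equiv N(0,1)$.

3. Discard probes for which the value of z meets the following condition: $z < -1.0$, given that $P(z < -1.0) \approx 0.1587$. This will achieve the removal of about 16% of the probes if the variable follows a normal distribution.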
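The three steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes the population form of the standard deviation (division by the number of values), a cut-off of z < -1.0 (the value consistent with discarding about 16% of normally distributed values), and a hypothetical dictionary from probe identifier to expression values.

```python
import math


def population_std(values):
    """Population standard deviation (divides by the number of values)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((mean - x) ** 2 for x in values) / len(values))


def low_variability_filter(expression, threshold=-1.0):
    """Keep only the probes whose standardized standard deviation is >= threshold."""
    probe_ids = list(expression)
    # Step 1: standard deviation of each probe across all cases.
    sds = [population_std(expression[p]) for p in probe_ids]
    # Step 2: standardize the standard deviations across probes.
    mu = sum(sds) / len(sds)
    sigma = population_std(sds)
    if sigma == 0:
        # All probes are equally variable, so there is nothing to discard.
        return dict(expression)
    kept = {}
    for probe_id, sd in zip(probe_ids, sds):
        z = (sd - mu) / sigma
        # Step 3: discard the probes with z below the threshold.
        if z >= threshold:
            kept[probe_id] = expression[probe_id]
    return kept


# Hypothetical data: five probes measured on four cases.
expression = {
    "probe_a": [1.0, 1.0, 1.0, 1.0],  # standard deviation 0
    "probe_b": [0.0, 2.0, 0.0, 2.0],  # standard deviation 1
    "probe_c": [0.0, 4.0, 0.0, 4.0],  # standard deviation 2
    "probe_d": [0.0, 6.0, 0.0, 6.0],  # standard deviation 3
    "probe_e": [0.0, 8.0, 0.0, 8.0],  # standard deviation 4
}
kept = low_variability_filter(expression)
print(list(kept))
```

In this toy data only the constant probe has a standardized value below -1.0 (about -1.41), so it is the only one removed.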

4.1.4 Uniform Distribution

Finally, all remaining variables that follow a uniform distribution are eliminated. The variables that follow a uniform distribution will not allow the separation of individuals. Therefore, the variables that do not follow this distribution will be really useful
