as well as the effect of a single gene or a group of genes on the entire
genome (Abruzzo et al., 2005; Bergmann et al., 2003; Debouck and
Goodfellow, 1999; Marcotte et al., 1999). Recent advances in biotechnology
allow researchers to measure expression levels for thousands of genes
simultaneously, across different conditions, and over a specific time period.
Analysis of data produced by such experiments offers potential insight
into gene functions and regulatory mechanisms (Abul et al., 2005; Alizadeh
et al., 2000; Allison et al., 2006; Bowtell, 1999; Brown and Botstein,
1999; Cheung et al., 1999; Duggan et al., 1999; Lipshutz et al., 1999; Perou
et al., 1999).
Computation is required to extract meaningful information from the
large amount of data generated by expression profiling (Aittokallio et al.,
2003; Bassett et al., 1999; Zhang and Gant, 2004). Most of the algorithms
commonly applied to microarray data analysis have been correlation-based
approaches known as cluster analysis (Alon et al., 1999; Cho et al.,
2004). For example, an efficient two-way clustering algorithm was applied
to a colon cancer dataset consisting of the expression patterns of different
cell types; gene expression in 40 tumor and 22 normal colon tissue samples
was analyzed across 2000 genes (Alon et al., 1999). Cluster analysis groups genes
in microarray data that have similar expression patterns. Genes that
cluster together are likely to be functionally linked and merit closer
examination. Although cluster analysis is widely accepted for analyzing
patterns of gene expression, the methods developed may not be
able to fully extract the information from microarray data corrupted by
high-dimensional noise. If the noise from irrelevant genes is
not sufficiently reduced, samples may be classified incorrectly or the
selection of informative genes may be misled. To select informative
genes for sample classification, a neighborhood analysis method was
developed to obtain a subset of genes that successfully discriminates
between acute lymphoblastic leukemia (ALL) and acute myeloid leukemia
(AML) (Golub et al., 1999). In a microarray dataset containing 7129
genes, those genes whose expression levels differ significantly between ALL
and AML were identified and subsequently used to predict the class
membership (either ALL or AML) of new leukemia cases.
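
The idea of selecting genes whose expression differs between two classes can be illustrated with a minimal sketch. It ranks genes by a signal-to-noise score, the difference of the class means divided by the sum of the class standard deviations. This score is an assumption chosen for illustration; the text above does not specify the statistic used, and the sketch is not a reimplementation of the neighborhood analysis procedure. The function names, gene names, and expression values are all hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """Score one gene as (mean_a - mean_b) / (sd_a + sd_b).

    Each class needs at least two samples, because stdev() requires them.
    """
    spread = stdev(class_a) + stdev(class_b)
    if spread == 0:
        # The score is undefined without any variation; this sketch returns 0.0.
        return 0.0
    return (mean(class_a) - mean(class_b)) / spread


def rank_genes(expression, labels, class_a, class_b, top_n):
    """Return (top_n gene names by absolute score, dict of all scores).

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of class labels aligned with the sample order
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == class_a]
        b = [v for v, lab in zip(values, labels) if lab == class_b]
        scores[gene] = signal_to_noise(a, b)
    # A large absolute score means the gene separates the two classes well.
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return ranked[:top_n], scores


# Toy data invented for illustration; not taken from any real dataset.
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
expression = {
    "gene_1": [9.0, 10.0, 11.0, 1.0, 2.0, 3.0],
    "gene_2": [5.0, 6.0, 4.0, 5.0, 4.0, 6.0],
    "gene_3": [1.0, 2.0, 1.5, 8.0, 9.0, 8.5],
}
top, scores = rank_genes(expression, labels, "ALL", "AML", top_n=2)
print(top)  # ['gene_3', 'gene_1']
```

In this toy example gene_2 has the same mean in both classes and scores zero, so it is dropped as uninformative. The sign of the score indicates which class shows the higher expression. Note that each gene is scored independently of all the others.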
Both approaches described above (Alon et al., 1999; Golub et al.,
1999) were focused on comparing samples in each single-gene dimension,