as well as the effect of a single gene or a group of genes on the entire
genome (Abruzzo et al., 2005; Bergmann et al., 2003; Debouck and
Goodfellow, 1999; Marcotte et al., 1999). Recent advances in biotechnology
allow researchers to measure expression levels for thousands of genes
simultaneously, across different conditions, and over a specific time period.
Analysis of data produced by such experiments offers potential insight
into gene functions and regulatory mechanisms (Abul et al., 2005; Alizadeh
et al., 2000; Allison et al., 2006; Bowtell, 1999; Brown and Botstein,
1999; Cheung et al., 1999; Duggan et al., 1999; Lipshutz et al., 1999;
Perou et al., 1999).
Computation is required to extract meaningful information from the
large amount of data generated by expression profiling (Aittokallio et al.,
2003; Bassett et al., 1999; Zhang and Gant, 2004). Most of the algorithms
commonly applied to microarray data analysis have been correlation-based
approaches known as cluster analysis (Alon et al., 1999; Cho et al., 2004).
An efficient two-way clustering algorithm was applied to a colon cancer
dataset consisting of the expression patterns of different cell types;
gene expression in 40 tumor and 22 normal colon tissue samples was
analyzed across 2000 genes (Alon et al., 1999). Cluster analysis groups
genes in microarray data that have similar expression patterns. Genes that
cluster together are likely to be functionally linked and merit closer
examination. Although cluster analysis has been widely accepted for
analyzing patterns of gene expression, the methods developed may not be
able to fully extract the information from microarray data corrupted by
high-dimensional noise. If the noise from irrelevant genes is not
sufficiently reduced, samples may be classified incorrectly, or the
selection of informative genes may be misled. To select informative genes
for sample classification, a neighborhood analysis method was developed to
obtain a subset of genes that successfully discriminates between acute
lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML) (Golub
et al., 1999). In the microarray dataset containing 7129 genes, those genes
whose expression levels differ significantly between ALL and AML were
identified and subsequently used to predict the class membership (either
ALL or AML) of new leukemia cases.
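
The two ideas in this paragraph, correlation as a measure of similarity
between expression profiles and the selection of genes that separate two
classes, can be illustrated with a minimal Python sketch. It is not the
published implementation of either method. The text does not give the
scoring formula, so the sketch assumes the signal-to-noise ratio
(difference of class means divided by the sum of class standard deviations)
usually associated with this kind of gene selection. The function names,
gene names, and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def signal_to_noise(values, labels, pos="ALL", neg="AML"):
    """Assumed score: (mean_pos - mean_neg) / (sd_pos + sd_neg).

    Uses the sample standard deviation; each class needs >= 2 samples.
    """
    a = [v for v, lab in zip(values, labels) if lab == pos]
    b = [v for v, lab in zip(values, labels) if lab == neg]
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def rank_genes(expr, labels, k):
    """Return the k genes with the largest absolute score."""
    scores = {gene: signal_to_noise(vals, labels) for gene, vals in expr.items()}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)[:k]


# Toy data (made up for illustration): 3 ALL and 3 AML samples, 3 genes.
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
expr = {
    "gene_a": [9.0, 10.0, 11.0, 1.0, 2.0, 3.0],  # high in ALL, low in AML
    "gene_b": [5.0, 6.0, 4.0, 5.0, 4.0, 6.0],    # uninformative
    "gene_c": [8.0, 9.0, 10.0, 2.0, 3.0, 4.0],   # similar pattern to gene_a
}

if __name__ == "__main__":
    print(rank_genes(expr, labels, 2))
    print(round(pearson(expr["gene_a"], expr["gene_c"]), 3))
```

In this toy data, gene_a and gene_c have highly correlated profiles, so a
correlation-based clustering would tend to group them together, while
gene_b scores zero and would be dropped by the gene-selection step.
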
Both approaches described above (Alon et al., 1999; Golub et al., 1999)
were focused on comparing samples in each single-gene dimension,