could explore all the pairs and triads and perhaps tetrads of genes, and assess
their predictive power. This is the approach taken in (42), where the predictive
ability of all the sets with fewer than 4 genes is tested, and those sets that perform
the best (and meet an error threshold) in a validation-by-classification task are
selected as genes of interest for further biological study.
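The exhaustive search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of (42): the classifier (nearest centroid), the validation scheme (leave-one-out), and the names loo_error, select_gene_sets, max_size, and max_error are all hypothetical choices made for the example.

```python
from itertools import combinations


def loo_error(samples, labels, gene_idx):
    """Leave-one-out error of a nearest-centroid classifier that only
    looks at the genes listed in gene_idx (an assumed, simple classifier).
    Assumes every class keeps at least one sample after holding one out."""
    n = len(samples)
    errors = 0
    for held in range(n):
        # Build class centroids from every sample except the held-out one.
        sums, counts = {}, {}
        for i in range(n):
            if i == held:
                continue
            vec = [samples[i][g] for g in gene_idx]
            lab = labels[i]
            if lab not in sums:
                sums[lab] = [0.0] * len(gene_idx)
                counts[lab] = 0
            sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
            counts[lab] += 1
        test = [samples[held][g] for g in gene_idx]

        def sq_dist(lab):
            # Squared Euclidean distance from the held-out sample to a centroid.
            return sum((t - s / counts[lab]) ** 2 for t, s in zip(test, sums[lab]))

        predicted = min(sums, key=sq_dist)
        errors += predicted != labels[held]
    return errors / n


def select_gene_sets(samples, labels, max_size=3, max_error=0.0):
    """Test every gene set with at most max_size genes and keep those whose
    leave-one-out error does not exceed max_error."""
    n_genes = len(samples[0])
    kept = []
    for size in range(1, max_size + 1):
        for gene_idx in combinations(range(n_genes), size):
            err = loo_error(samples, labels, gene_idx)
            if err <= max_error:
                kept.append((gene_idx, err))
    # Best-performing (then smallest) sets first.
    kept.sort(key=lambda item: (item[1], len(item[0])))
    return kept
```

The number of candidate sets grows combinatorially with the number of genes, which is why a search of this kind is restricted to very small set sizes.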
Another multivariate algorithm that has been used in different applications
in the recent literature ((43-45); see also (39)) is Genes@Work, a gene expres-
sion pattern discovery approach. Genes@Work searches for patterns that differ-
entiate one particular phenotype from another phenotype chosen as a reference
or control. Each pattern consists of a group of genes observed to act consistently
over a subset of the samples in the phenotype set (formed by either the cases or
the controls). All subsets of genes and all subsets of experiments that satisfy
some given pattern parameters are searched. These patterns can be found in a
computationally efficient and exhaustive manner by algorithms that avoid
searching the complete combinatorial space of possible patterns (46). Each
pattern can be assigned a p-value, and the selected genes are the union of all the
genes that participate in at least one statistically significant pattern. Thus,
Genes@Work is an approach validated by statistical significance.
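The final selection step just described (taking the union of genes over the statistically significant patterns) can be sketched as follows. This is not the Genes@Work implementation and it does not perform pattern discovery: it assumes the patterns and their p-values have already been computed, and the function name genes_from_patterns, the (genes, p_value) representation, and the cutoff alpha are hypothetical.

```python
def genes_from_patterns(patterns, alpha=0.05):
    """Return the union of genes that appear in at least one pattern whose
    p-value is at or below alpha.

    patterns: iterable of (genes, p_value) pairs, where genes is an iterable
    of gene identifiers (an assumed representation of a discovered pattern).
    """
    selected = set()
    for genes, p_value in patterns:
        if p_value <= alpha:
            selected.update(genes)
    return selected
```

A gene that participates only in non-significant patterns is left out, while a gene that appears in several patterns is kept as soon as one of them is significant.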
All of the methods presented in this section are interesting in that they inter-
rogate the data from different perspectives. In this sense, a method can rescue as
positives those genes that may have been left off as false negatives by other
methods. We will explore the value of combining different gene selection meth-
ods in the following section.
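One simple way to picture how one method can rescue genes missed by another is to take the union of the gene lists and record which methods support each gene. The sketch below is only an illustration: the function name combine_selections and the dictionary layout are assumptions, and it is not the specific combination procedure applied later in the text.

```python
def combine_selections(selections):
    """Merge gene lists produced by several selection methods.

    selections: dict mapping a method name to an iterable of gene identifiers.
    Returns a dict mapping each selected gene to the sorted list of methods
    that selected it.
    """
    support = {}
    for method, genes in selections.items():
        for gene in genes:
            support.setdefault(gene, set()).add(method)
    return {gene: sorted(methods) for gene, methods in support.items()}
```

Genes supported by a single method are the candidates that the other methods would have left out as false negatives; genes supported by several methods are selected consistently across perspectives.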
3. COMBINING SELECTION METHODS PRODUCES A
RICHER SET OF DIFFERENTIALLY EXPRESSED GENES
In this section we describe the application of a combination of gene selection
methods to identify interesting genes in lymphoma data. In particular, we
seek genes that differentiate between two types of lymphoma: diffuse large
B-cell lymphoma (DLBCL, the most common lymphoid neoplasm) and follicular
lymphoma (FL). FL is frequently characterized by a transformation to DLBCL,
and therefore a comparative study of the gene expression profiles of these two
lymphomas has been considered in the recent literature (47,48). Here
we compare the gene expression profiles of these two cancers to exemplify the
use of a combination of gene selection techniques.
Gene expression data for FL and DLBCL have been analyzed by Whitehead
Institute (WI) researchers in (49), where the 50 genes with the largest positive
scores (genes more expressed in DLBCL than in FL) and the 50 genes with the
most negative scores (genes less expressed in DLBCL than in FL) were selected
using the signal-to-noise ratio method (SNR) described in §2.2.1. Each of
these 100 genes appears to have a statistical significance better than 1% when its