Tables such as these are called gene expression matrices. For a typical gene
expression experiment, the matrices would include the values from a
number of arrays, representing different patients, treatments, or cell
types, as depicted in Figure 12-8. Mathematically, the information can be
stored as a matrix
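$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \cdots & x_{mn}
\end{pmatrix}
\qquad (12\text{-}1)
$$
of size $m \times n$, where $m$ is the number of genes and $n$ is the number of
different tissues, with $x_{ij}$ denoting the value assigned to the $i$-th gene in
the $j$-th sample (Figure 12-8).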
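As a minimal sketch of how such a matrix might be held in software, the Python snippet below stores a small expression matrix as a NumPy array with genes as rows and tissues as columns. The gene and tissue labels, the numerical values, and the helper name `expression` are all made up for illustration and are not taken from the text.

```python
import numpy as np

# Hypothetical labels: m = 3 genes (rows) and n = 4 tissues (columns).
genes = ["gene_1", "gene_2", "gene_3"]
tissues = ["tissue_1", "tissue_2", "tissue_3", "tissue_4"]

# Hypothetical expression values; row i holds gene i, column j holds tissue j.
X = np.array([
    [2.0, 1.5, 0.5, 3.0],
    [0.8, 0.9, 1.1, 1.0],
    [4.0, 3.5, 0.2, 0.1],
])

m, n = X.shape  # m genes, n tissues


def expression(i, j):
    """Return x_ij using the 1-based indices of Equation (12-1)."""
    return X[i - 1, j - 1]


print(f"{m} genes x {n} tissues")
print("x_23 =", expression(2, 3))
```

Keeping genes in rows and samples in columns matches the layout described for Figure 12-8, so row `X[i - 1]` is the expression profile of the i-th gene across all tissues.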
FIGURE 12-8.
Gene expression matrix for a comparative tumor
gene expression study. Each row represents the
expression values for a different gene, and each
column represents the values for a different tumor
sample.
Thus far, we have only focused on the conceptual side of the
hybridization experiment, leaving the experimental details aside and
making some implicit assumptions. For example, we have assumed that
equal amounts of mRNA were obtained from both cell types. We have
also assumed that the cancer cell cDNA, incorporating Cy5, was
labeled to the same degree as the normal cell cDNA, incorporating Cy3.
Finally, we have assumed that the two fluorescently labeled cDNAs are
detected with equal efficiency. In practice, any of these assumptions may
fail, introducing systematic bias and systematic variance in the gene
expression levels across experiments. Thus, compensatory techniques are
necessary to remove the bias and make the experimental results
comparable.
One such technique, called normalization, adjusts the results to
compensate for systematic bias in the data caused by technical
variations. For instance, this technique can be used
to compare data from different arrays or different color channels.
Normalization procedures require a set of genes to be used as a basis for
comparison. The procedures may use the set of all genes on the array
and measure an aggregate characteristic, such as total fluorescence
intensity. Alternatively, normalization may look at a subset of the genes
in the experiment. These may be housekeeping genes, which should be
expressed equally in all of the cell types under study. The experiment
may also include artificially introduced controls (such as bacterial
genes introduced into a mammalian expression assay) which may be
used as a normalization set.
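The two choices of normalization set described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the spot intensities are invented, the decision to rescale the Cy5 channel rather than Cy3 is arbitrary, a single global scaling factor is assumed to be adequate, and the function names and the indices of the "housekeeping" spots are hypothetical.

```python
import numpy as np


def normalize_total_intensity(cy5, cy3):
    """Scale Cy5 so that both channels have the same total intensity."""
    cy5 = np.asarray(cy5, dtype=float)
    cy3 = np.asarray(cy3, dtype=float)
    factor = cy3.sum() / cy5.sum()
    return cy5 * factor, factor


def normalize_to_reference_set(cy5, cy3, ref_idx):
    """Scale Cy5 so the reference genes have equal mean intensity in both channels."""
    cy5 = np.asarray(cy5, dtype=float)
    cy3 = np.asarray(cy3, dtype=float)
    factor = cy3[ref_idx].mean() / cy5[ref_idx].mean()
    return cy5 * factor, factor


# Hypothetical raw intensities for five spots on one two-color array.
cy3 = np.array([100.0, 200.0, 400.0, 800.0, 500.0])    # normal cells
cy5 = np.array([200.0, 400.0, 1600.0, 800.0, 1000.0])  # cancer cells

# Hypothetical indices of housekeeping (or spiked-in control) spots.
housekeeping = [0, 1, 4]

cy5_total, f_total = normalize_total_intensity(cy5, cy3)
cy5_ref, f_ref = normalize_to_reference_set(cy5, cy3, housekeeping)

print("total-intensity factor:", f_total)
print("reference-set factor:  ", f_ref)
print("normalized Cy5/Cy3 ratios:", cy5_total / cy3)
```

Using all genes assumes that most genes are not differentially expressed, so the totals should match; using a reference subset relies instead on those particular genes being expressed equally in both samples.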
Another type of normalization, pertinent to the clustering techniques
examined below, is gene normalization across tissue samples. This is
done to adjust for different scales of expression. Assume, for example,
that the expression levels of five genes, denoted by A, C, D, E, and F,
have been measured in four different tissue types and the results plotted
in Figure 12-9(A). Notice that genes A and C are co-regulated across