Biology Reference
In-Depth Information
the assay instead of using an experimental sample. However, this merely
controls for the experimental detection process itself and not the differential
transcript data per se. A more reliable alternative is whole-array
normalization, which is typically performed using linear or logarithmic
regression techniques. 93-95 Whole-array normalization does, however, rely
on two potentially flawed assumptions: that the majority of genes on the
array are not differentially expressed between the experimental states, and
that the genes that do vary are not solely associated with one of the
fluorescent labels. The latter assumption can be checked easily by
dye-swapping paradigms, in which the fluorescent labels are reversed and the
experimental data are obtained again. As mentioned previously, the assumption
that there is only a minimal perturbation of the majority of the genes on the
array reinforces our old concept of linear, discrete signaling pathways.
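As a rough illustration of the whole-array normalization and dye-swap check
described above, the following Python sketch fits one linear regression
across all spots on log2 intensities and treats the residuals as normalized
log ratios. The function names and the two-channel (red/green) input lists
are assumptions made for this example; they are not taken from the cited
methods, and real pipelines often use more elaborate regression schemes.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def normalize_log_ratios(red, green):
    """Whole-array normalization by linear regression (illustrative only).

    Fits log2(red) = intercept + slope * log2(green) over every spot and
    returns the residuals as normalized log ratios. This is only sensible
    if most genes are not differentially expressed, which is the
    assumption discussed in the text.
    """
    log_r = [math.log2(v) for v in red]
    log_g = [math.log2(v) for v in green]
    intercept, slope = fit_line(log_g, log_r)
    return [r - (intercept + slope * g) for r, g in zip(log_r, log_g)]


def dye_swap_bias(forward, swapped):
    """Estimate per-gene dye bias from a dye-swap pair of log ratios.

    A true expression change flips sign when the labels are reversed,
    whereas a dye-specific effect does not, so half the sum of the two
    log ratios isolates the dye-specific component for each gene.
    """
    return [(f + s) / 2.0 for f, s in zip(forward, swapped)]
```

Genes whose estimated dye bias is far from zero are the ones that violate
the second assumption, namely that varying genes are not tied to one label.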
To further prepare microarray data for eventual functional analysis, it is
typical to apply a log transformation to the fluorescence data to make them
more amenable to numerical manipulation. Parametric tests are the most
commonly used for statistical analysis of transcript variation because they
are much more sensitive than nonparametric tests, but they require the data
to be normally distributed. Log transformation of the spot intensities is
the usual way to bring the data close to a Gaussian distribution. To extract
the actual differential expression profile of genetic factors from
microarray data, a ratio of intensity (as a measure of expression level:
z-ratio) between two samples is used. As with all biological experiments,
replicates of array data are required if a fold-change cut-off of z-ratios
is used as the primary data set
filter. Several model-based techniques have been developed that accommodate
the assumption of multiplicative noise and eliminate statistically
significant outliers from the data. 96 The typical parametric analytical
methods applied to primary gene array data include maximum likelihood
analysis, the F-statistic, analysis of variance, and t-tests. As an
alternative, nonparametric tests used to analyze microarray data include
Kruskal-Wallis rank analysis 97 and Mann-Whitney tests. 98 The primary goal
of the initial statistical analysis of the array data is the calculation of
significance values for gene
expression. P-value thresholds, fixed at either 0.05 or 0.01, are therefore
employed to reduce the dataset to lists of significantly regulated genes
before z-ratio/fold-change cut-offs are applied (typically 1.5), together
with provisions for false results, which are highly likely when large
transcription arrays are used.
Protocols for the elucidation of random false results calculate the overall
chance that at least one gene is a false positive or negative, that is, the