Figure . . Aligned expression of the gene TOP in the alpha, cdc and cdc data sets
complex genetic networks to be reconstructed. Therefore, the integration of microarray data sets from homogeneous or similar, but not identical, experimental conditions is of interest. Bar-Joseph et al. ( ) developed an algorithm to align data sets; for example, they aligned alpha, cdc and cdc in Spellman et al. ( ). The cDNA
microarray gene expression data were from synchronized yeast cells (red-channel
intensities) versus nonsynchronized ones (green-channel intensities that served as
background signals). Note that all data sets except for cdc were processed by normalization procedures in Yang et al. ( ); cdc data were provided in log ratios only, so normalization could not be applied. Figure . depicts the curves of a given
gene's expression (in log ratios) from the three aligned data sets. Consistency across
all three curves (data sets) supports the validity of the data, whereas any inconsistency in one curve with respect to the other curves suggests a potential outlier. For example, in Fig. . the gene expression of TOP at minutes in the cdc data set is likely to be an outlier, since it is very different from its corresponding points in the alpha and cdc data sets. In addition, the small ups and downs at the fifth and later points of the cdc data set indicate that the data quality of the cdc is worse than that of the alpha and the cdc data sets. Hence, aligning data sets with similar experimental conditions provides a route to the visual detection of outliers or noisy data; it also helps to exclude patterns suggested by contaminated data.
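The visual check described above can also be approximated numerically. The following is a minimal sketch, not the procedure used in the text: it assumes the curves have already been aligned onto a common time grid, and the function name `flag_inconsistent_points`, the data set labels "A", "B", "C", the log-ratio values, and the `threshold` cutoff are all hypothetical, chosen only for illustration. A point is flagged when one curve departs from the mean of the others while the others agree among themselves.

```python
from typing import Dict, List, Tuple


def flag_inconsistent_points(
    curves: Dict[str, List[float]],
    threshold: float = 1.0,
) -> List[Tuple[str, int, float]]:
    """Flag points where one aligned curve departs from the others.

    `curves` maps a data set label to log-ratio values that are ASSUMED to be
    already aligned onto a common time grid (same length, same ordering).
    `threshold` is a hypothetical cutoff in log-ratio units.
    Returns (label, time_index, deviation) for each flagged point.
    """
    if len(curves) < 3:
        raise ValueError("need at least three curves to single out one of them")
    lengths = {len(values) for values in curves.values()}
    if len(lengths) != 1:
        raise ValueError("all curves must share the same aligned time grid")

    flagged: List[Tuple[str, int, float]] = []
    n_points = lengths.pop()
    for i in range(n_points):
        for label, values in curves.items():
            # Values of every other data set at the same aligned time point.
            others = [v[i] for other, v in curves.items() if other != label]
            deviation = abs(values[i] - sum(others) / len(others))
            spread_of_others = max(others) - min(others)
            # Flag only if the other curves agree with each other
            # while this curve lies far from their mean.
            if deviation > threshold and spread_of_others <= threshold:
                flagged.append((label, i, deviation))
    return flagged


# Made-up log ratios for one gene in three aligned data sets (illustrative only).
example_curves = {
    "A": [0.1, 0.5, 0.9, 0.4],
    "B": [0.2, 0.4, 1.0, 0.5],
    "C": [0.0, 0.6, -1.5, 0.3],
}

for label, index, deviation in flag_inconsistent_points(example_curves):
    print(f"data set {label}, time index {index}: deviation {deviation:.2f}")
```

In this made-up example only the third point of data set "C" is flagged, because "A" and "B" agree there while "C" lies far from both. A rule of this kind can only point to candidates; whether a flagged point is truly an outlier still calls for inspection of the aligned curves.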
1.2.2 Data Augmentation
There are , , and time points without replicates in the alpha, cdc, cdc (originally from Cho et al., ), and Elu microarray data sets in Spellman et al. ( ). To augment data for inference, Xie and Bentler ( ) integrated these four