are more often adjusted to be near-isothermal, the background hybridization noise
tends to be lower relative to the targeted hybridization signal, and background
correction is sometimes neglected (Grant et al., 2007). For instance, the standard signal
correction method for Agilent expression arrays is “spatial detrending” without
background subtraction, which consists of two steps: calculating a surface fit that
captures the spatial trends of the expression signal (the “foreground”) and subtracting
the surface fit from the data in order to correct for unwanted spatial heterogeneity.
Details of the spatial detrending algorithm are available in the Agilent Feature
Extraction v10.7 Reference Guide.
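The two steps of spatial detrending (fit a surface, then subtract it) can be illustrated with a deliberately simple stand-in. The sketch below is not the Agilent algorithm, whose details are in the reference guide cited above. It assumes a plain least-squares plane as the surface, a complete rectangular grid of log-scale signals, and a hypothetical function name `detrend_plane`. It also subtracts only the spatially varying part of the fit, so that the overall signal level of the array is preserved; that choice is an assumption of this example.

```python
from statistics import mean


def detrend_plane(grid):
    """Remove a planar spatial trend from a complete rectangular grid.

    grid[r][c] is the log-scale signal of the feature at row r, column c.
    Illustrative stand-in only: a real surface fit is usually more flexible.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    row_mid, col_mid = (n_rows - 1) / 2, (n_cols - 1) / 2
    overall = mean(v for row in grid for v in row)

    # Step 1: least-squares plane. On a complete grid the centred row and
    # column coordinates are orthogonal, so each slope is estimated separately.
    row_ss = n_cols * sum((r - row_mid) ** 2 for r in range(n_rows))
    col_ss = n_rows * sum((c - col_mid) ** 2 for c in range(n_cols))
    row_cov = sum((r - row_mid) * (grid[r][c] - overall)
                  for r in range(n_rows) for c in range(n_cols))
    col_cov = sum((c - col_mid) * (grid[r][c] - overall)
                  for r in range(n_rows) for c in range(n_cols))
    row_slope = row_cov / row_ss if row_ss else 0.0
    col_slope = col_cov / col_ss if col_ss else 0.0

    # Step 2: subtract the spatially varying part of the fitted surface.
    return [[grid[r][c]
             - row_slope * (r - row_mid)
             - col_slope * (c - col_mid)
             for c in range(n_cols)]
            for r in range(n_rows)]
```

On a grid whose signal is an exact plane, the detrended values all collapse to the overall mean of the array; on real data, the residual variation left after subtraction is what is carried forward.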
Between-array normalization is often a difficult but important step in one-colour
array data preprocessing (Bolstad et al., 2003). The goal here is to filter out the obvious
differences in the distribution of the intensity levels from one array to another. These
arise from variations in the mRNA extraction, labelling, hybridization (including
labelling efficiency and amount of cDNA deposited), washing and also image acquisition
(sensitive to the scanner settings). For this purpose, we generally apply some
transformation to the log2-scale intensity levels to make them comparable between arrays. The
two most popular transformations are the adjustment by a scaling factor (for instance,
median subtraction on the log2-transformed data), which is a relatively mild
transformation, and quantile normalization, which consists of matching not only the medians
(i.e. the 50% quantile) but also the whole distributions (i.e. all the quantiles) between
samples. These two transformations are global in the sense that they depend on the whole
distribution of intensities on each array, and they will therefore tend to smooth out global
trends in the distribution of expression levels. However, this should not be interpreted as
removing most of the differences between conditions. Indeed, all genes can be differentially
expressed even when the distribution of expression levels is identical. Nevertheless,
when global trends are expected, for instance when experimentally depleting an RNase
and therefore increasing the half-lives of a large number of mRNAs, we would expect
most direct effects to be positive, and an aggressive global normalization such as
quantile normalization may be inappropriate (see Durand et al., 2012). In this case, it
can be useful to select a subset (for instance, 10%) of genes that exhibit the smallest
variations and learn the transformations that make the distribution of expression levels
of this subset of genes match across hybridizations. Such procedures are often termed
“invariant set” normalizations and can provide satisfactory results.
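The two global transformations described above, median subtraction and quantile normalization, can be sketched in a few lines. This is a minimal illustration in plain Python and not a reference implementation: it assumes each array is a list of log2 intensities with the same probes in the same order, it breaks ties by probe position instead of averaging tied ranks, and the function names `median_center` and `quantile_normalize` are chosen here for illustration.

```python
from statistics import mean, median


def median_center(arrays):
    """Mild transformation: subtract each array's own median (log2 scale)."""
    centred = []
    for values in arrays:
        m = median(values)
        centred.append([v - m for v in values])
    return centred


def quantile_normalize(arrays):
    """Aggressive transformation: give every array the same distribution.

    The shared reference distribution is the mean, rank by rank, of the
    sorted intensities of all arrays.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(values) for values in arrays]
    reference = [mean(s[k] for s in sorted_arrays) for k in range(n_probes)]

    normalized = []
    for values in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: values[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized
```

After `median_center`, only the medians of the arrays agree; after `quantile_normalize`, the sorted values of every array are identical, while the ranking of probes within each array is unchanged. This is why genes can still differ between conditions even though the distributions match.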
However, if one anticipates that global normalization will be inappropriate, the best
solution is provided by experimental spike-in of the sample before labelling, as pointed
out in Section 2.2. The spike-in signals of the corresponding probes on the array then
serve to adjust the transformation applied to the expression levels of each sample. It
should be noted, however, that spike-in normalization cannot account for differences
in the quality of mRNA, and the precision of the amount of added spike-in material may be
limiting. A final remark about between-array normalization is that it should always put
hybridizations from the same and from different conditions on an equal footing:
separate normalization for the different conditions would artificially shrink the differences
between replicates and amplify the differences between conditions, making the data
misleading.
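As an illustration of how spike-in probes can drive the adjustment, the sketch below applies the simplest possible transformation: a constant shift per array, chosen so that the median log2 signal of the spike-in probes is the same on every array. The text does not prescribe the form of the transformation, so the constant shift, the function name `spike_in_center` and the `spike_idx` argument are all assumptions of this example. All arrays go through the same call, regardless of condition, in line with the final remark above.

```python
from statistics import mean, median


def spike_in_center(arrays, spike_idx):
    """Shift each array so that its spike-in probes share a common median.

    arrays    : list of arrays, each a list of log2 intensities (same probe order)
    spike_idx : positions of the spike-in probes within each array
    """
    spike_medians = [median([values[i] for i in spike_idx]) for values in arrays]
    target = mean(spike_medians)  # common level that all spike-ins are moved to
    return [[v - (m - target) for v in values]
            for values, m in zip(arrays, spike_medians)]


# Toy data: three genes followed by two spike-in probes per array.
control = [8.0, 9.0, 10.0, 6.0, 7.0]
# Genes truly up by 1 log2 unit, plus a technical offset of 0.5 on every probe.
depleted = [9.5, 10.5, 11.5, 6.5, 7.5]
adjusted = spike_in_center([control, depleted], spike_idx=[3, 4])
```

In this toy case the technical offset is removed because it affects the spike-in probes too, while the global increase of the genes is kept. A median subtraction over all probes would instead have absorbed part of that increase.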