Biomedical Engineering Reference
In-Depth Information
a properly aligned grid. For an Affymetrix GeneChip microarray, the gene expres-
sion level is determined by calculating the hybridization signals of all the PM-MM
pairs.
Several methods have been used for the analysis of Affymetrix microarrays. The
empirical method is the first-generation method provided by Affymetrix. It simply
takes the difference between the mean PM intensity and the mean MM intensity of a
probe set. This method can produce negative values, which have no biological
meaning. The statistical method [28]
weights the probe pairs based on their performance. Affymetrix has recently
developed a model-based method, probe logarithmic intensity error (PLIER) estimation
[29, 30]. The new algorithm provides higher signal reproducibility and higher
expression sensitivity for genes expressed at low levels. Other model-based methods
[31-33] have been developed in which the weighting of the probe pairs is determined
by examining a group of samples. An approach has also been developed that uses only
the PM to determine gene expression [31]. The correlation coefficient between
microarrays, computed over all genes on the array, can be used to reveal
disparities in array quality and to identify problematic arrays [34].
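The empirical calculation and the correlation-based quality check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Affymetrix implementation: the function names, the intensity values, and the idea of flagging the array with the lowest mean correlation to the others are all assumptions made for the example.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean PM intensity minus mean MM intensity for one probe set."""
    if not pm or len(pm) != len(mm):
        raise ValueError("pm and mm must be non-empty and of equal length")
    return mean(pm) - mean(mm)


def pearson(x, y):
    """Pearson correlation coefficient between two equally long value lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_correlation_to_others(arrays):
    """For each array, average its correlation with every other array."""
    names = list(arrays)
    return {
        n: mean(pearson(arrays[n], arrays[m]) for m in names if m != n)
        for n in names
    }


# Hypothetical probe-set intensities in which MM exceeds PM on average.
pm = [120.0, 95.0, 110.0, 80.0]
mm = [130.0, 100.0, 105.0, 90.0]
signal = average_difference(pm, mm)  # negative for these values

# Hypothetical expression values for the same five genes on three arrays.
arrays = {
    "array_1": [5.1, 7.9, 3.2, 9.4, 6.0],
    "array_2": [5.0, 8.1, 3.0, 9.6, 6.2],
    "array_3": [9.0, 3.1, 7.7, 4.0, 6.1],  # deliberately discordant
}
quality = mean_correlation_to_others(arrays)
suspect = min(quality, key=quality.get)  # array least like the others
```

With these made-up numbers the signal is negative, which illustrates why the empirical method can yield values with no biological meaning, and the discordant array has the lowest mean correlation with the others.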
9.3.1.4 Data Normalization and Transformation
Several approaches have been developed to normalize microarray data so that they
can be compared. The use of housekeeping genes, such as GAPDH and β-actin,
requires additional steps and assumes that these housekeeping genes are expressed
at the same level in all samples or experimental conditions. However, users need to
keep in mind that the expression of housekeeping genes can indeed change from one
condition to another. Data normalization can also be done using spike genes. In
such a case, RNA/cDNA of known concentration is added to each sample and used as
the reference for normalization. Global normalization assumes that the majority of
the genes on the array are not differentially expressed, so that the means of all
arrays are the same. Both linear regression and mean/median methods can be used.
For arrays that do not have many genes (e.g., custom arrays that feature a very
small number of genes), global normalization is not appropriate.
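As a minimal sketch of two of the approaches above, the following Python code scales arrays to a common median (one form of global normalization) and scales a single array to a spike reference. The function names, the choice of the median of the array medians as the common target, and all intensity values are assumptions made for illustration.

```python
from statistics import mean, median


def global_median_normalize(arrays):
    """Scale every array so that its median equals a common target.

    The target used here (an assumption) is the median of the array medians.
    """
    medians = {name: median(values) for name, values in arrays.items()}
    if any(m == 0 for m in medians.values()):
        raise ValueError("cannot scale an array whose median is zero")
    target = median(medians.values())
    return {
        name: [v * target / medians[name] for v in values]
        for name, values in arrays.items()
    }


def spike_normalize(values, spike_indices, expected):
    """Scale one array so the mean of its spike probes equals `expected`."""
    observed = mean(values[i] for i in spike_indices)
    factor = expected / observed
    return [v * factor for v in values]


# Hypothetical raw intensities; array "b" is uniformly twice as bright as "a".
raw = {"a": [100.0, 200.0, 300.0], "b": [200.0, 400.0, 600.0]}
normalized = global_median_normalize(raw)

# Hypothetical array whose third probe is a spike expected to read 200.
spiked = spike_normalize([50.0, 80.0, 400.0, 20.0], spike_indices=[2], expected=200.0)
```

After global scaling the two hypothetical arrays become identical, because they differed only by an overall brightness factor. The same scaling would be misleading on a small custom array where most genes are expected to change.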
Data transformations are necessary for downstream statistical analysis because
most statistical tests assume a certain type of data distribution. The type of
statistical analysis (parametric or nonparametric) determines how the data should
be transformed. The methods include binning for nonparametric tests, and inversion,
logarithm, and z-score transformations for parametric tests. Parametric analysis is
most commonly used and requires the data to be normally distributed. Log
transformation is commonly used to achieve an approximately Gaussian distribution.
However, it should not be used if the downstream analysis relies on a distance
measure.
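The logarithm and z-score transformations mentioned above can be illustrated as follows. This is a sketch under stated assumptions: base 2 is chosen for the logarithm, the z-score uses the sample standard deviation, and the intensities are invented so that the arithmetic is easy to follow.

```python
import math
from statistics import mean, stdev


def log2_transform(values):
    """Base-2 logarithm of strictly positive intensities."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform requires strictly positive values")
    return [math.log2(v) for v in values]


def z_scores(values):
    """Center to mean 0 and scale to unit (sample) standard deviation."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("z-scores are undefined for constant data")
    return [(v - m) / s for v in values]


# Hypothetical right-skewed intensities (all powers of two).
intensities = [64.0, 128.0, 256.0, 512.0, 4096.0]
logged = log2_transform(intensities)  # 6, 7, 8, 9, 12
standardized = z_scores(logged)
```

On the raw scale the largest value is 64 times the smallest, whereas on the log scale it is only 6 units away, which is how the logarithm pulls in the long right tail typical of intensity data.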
9.3.1.5 Statistical Analysis
As mentioned earlier, finding differentially expressed genes is, in a sense, the
goal of microarray experiments. These genes are identified by a difference
(sometimes expressed as a ratio) in the expression level between experimental
groups. In some cases, the number of samples in the experimental groups is very
limited and it is not possible to validate the sample distribution. Fold-change is
considered a common way