5. Spectrum of Possible Analyses
The possibilities for combining information across studies can be viewed
as lying along a spectrum of levels of analysis, ordered roughly from the
least to the most “processed” quantities, that is, in order of decreasing
information content:
(1) pooling raw data;
(2) pooling adjusted data;
(3) combining parameter estimates;
(4) combining test statistics;
(5) combining transformed p-values;
(6) combining statistic ranks; and
(7) combining decisions (e.g. via intersecting Venn diagrams).
5.1. Pooling Raw Data
One way of combining information across studies is to pool the raw,
unadjusted data and analyze them together as a single data set. This approach
is sometimes called a “mega-analysis”. Here, a (fixed or random) covariate
indicating study origin can be included in an overall model.
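
As a minimal sketch of a pooled model with a fixed study covariate, the Python fragment below fits a common slope with one intercept per study. The toy records and the names `naive_pooled_slope` and `fixed_study_fit` are assumptions made for illustration only. The sketch relies on the fact that, in a linear model, including a study indicator is equivalent to centering the predictor and the response within each study before estimating the slope. A random study effect would require a mixed-model fit, which is not shown here.

```python
from collections import defaultdict

# Hypothetical toy records: (study label, covariate x, response y).
records = [
    ("A", 0.0, 5.0), ("A", 1.0, 6.0), ("A", 2.0, 7.0),
    ("B", 3.0, 3.0), ("B", 4.0, 4.0), ("B", 5.0, 5.0),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def naive_pooled_slope(rows):
    """Least-squares slope of y on x, ignoring study origin."""
    mx = mean(x for _, x, _ in rows)
    my = mean(y for _, _, y in rows)
    sxy = sum((x - mx) * (y - my) for _, x, y in rows)
    sxx = sum((x - mx) ** 2 for _, x, _ in rows)
    return sxy / sxx

def fixed_study_fit(rows):
    """Common slope plus one fixed intercept per study."""
    by_study = defaultdict(list)
    for study, x, y in rows:
        by_study[study].append((x, y))
    # Per-study means of x and y, used to center the data within each study.
    centers = {s: (mean(x for x, _ in obs), mean(y for _, y in obs))
               for s, obs in by_study.items()}
    sxy = sum((x - centers[s][0]) * (y - centers[s][1]) for s, x, y in rows)
    sxx = sum((x - centers[s][0]) ** 2 for s, x, _ in rows)
    slope = sxy / sxx
    intercepts = {s: my - slope * mx for s, (mx, my) in centers.items()}
    return slope, intercepts

slope, intercepts = fixed_study_fit(records)
print("naive pooled slope:", naive_pooled_slope(records))
print("slope with study covariate:", slope)
print("study intercepts:", intercepts)
```

In this toy data the two studies share the same within-study slope but differ in baseline, so ignoring the study label yields a slope of the opposite sign.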
If the different datasets are sufficiently homogeneous and all measure the
covariates needed for adjustment, this strategy might be viable. However,
even when the raw data are available, this method has a number of drawbacks,
particularly for microarray data. It is generally inappropriate to pool
raw data from heterogeneous studies, since aggregation can distort or even
reverse the associations seen within each study (e.g. Simpson's paradox 8).
If computing power is limited, pooling raw data may not even be feasible. It
is difficult to imagine a microarray study for which this would be the
method of choice. Even planned multi-site studies often exhibit site-specific
effects, for which adjustment of some type is required.
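
The reversal behind Simpson's paradox can be illustrated with a small, made-up example in Python. The expression values, the study and group labels, and the helper `group_difference` are all hypothetical. The sketch assumes one gene measured in two studies with different baselines and unbalanced designs.

```python
# Hypothetical expression values for one gene: (study, group, value).
# Study 1 has a high baseline and mostly controls; study 2 has a low
# baseline and mostly treated samples.
samples = [
    (1, "control", 10.0), (1, "control", 10.0), (1, "control", 10.0),
    (1, "treated", 11.0),
    (2, "control", 2.0),
    (2, "treated", 3.0), (2, "treated", 3.0), (2, "treated", 3.0),
]

def group_difference(rows):
    """Mean of treated values minus mean of control values."""
    treated = [v for _, g, v in rows if g == "treated"]
    control = [v for _, g, v in rows if g == "control"]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Treated-minus-control difference computed separately within each study.
within = {s: group_difference([r for r in samples if r[0] == s])
          for s in sorted({r[0] for r in samples})}
# The same difference computed on the pooled samples, ignoring study origin.
pooled = group_difference(samples)

print("within-study differences:", within)  # +1.0 in each study
print("pooled difference:", pooled)         # -3.0
```

Within each study the treated samples are higher by one unit, yet the pooled comparison suggests they are lower, because study membership is confounded with group membership.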
For example, even when the same chip is used in different studies,
joint normalization of the pooled data typically does not remove the study
batch effect. 9,10 This problem becomes worse when different arrays are used
in different studies, and pooling does not even make sense for data from
heterogeneous assays, where signals are noncommensurable.