Raw expression data that are either downloaded from public repositories (like CEL files from the GEO database) or produced by local micro-array experiments,
Synthetic data obtained from numeric raw expression data by applying transformation, statistical and data mining tools, and also related statistical results downloaded from public sites,
Related scientific publications selected by investigators from public repositories like PUBMED (note 15),
Implicit background knowledge of the investigators.

All these data are processed by AMI Analysis Tools and/or Annotation Storing Tools. These modules produce semantic annotations and relational data that may then be queried by AMI Querying Tools.

In summary, in this section we have presented examples of data analysis scenarios that demonstrate the need for a powerful semantic querying system. We have then detailed the AMI framework, which provides capabilities to process either semantically enhanced meta-analyses on synthetic data or standard meta-analyses on raw data.

STANDARD META-ANALYSES OF DNA MICRO-ARRAYS

Meta-analyses on multiple independent expression data sets are one option to provide a more comprehensive view than individual analyses, both for cross-validation of previous results and for comparison with novel analyses. For instance, for the particular issue of identifying human skin biomarkers, it would be valuable to group together expression data such as the GSE6281, GSE7216, GSE6475 and GSE9120 series (see subsection "Description of data sets"), which respectively concern nickel allergy on skin biopsies, reactions of human keratinocytes to cytokines IL-19, IL-20..., inflammatory acne on skin biopsies, and reactions of human keratinocytes to IL-1, and to search for differentially expressed genes or gene clusters.

A so-called standard meta-analysis consists of applying a set of statistical techniques to combine and analyze results from multiple independent datasets. Indeed, while DNA micro-array technology allows the expression levels of thousands of genes to be measured simultaneously under various conditions and provides genome-wide insight, the small number of samples (conditions) in each study limits the power of statistical inference. Therefore, with the increasing amount of public expression data, a new interest has emerged in combining independent datasets from multiple studies in order to increase the sample size and elicit more genetic markers (Moreau et al., 2003). Growing volumes of experiments offer new opportunities but also amplify the statistical and computational complexity. In fact, when combining different data sets, one has to consider at least data scales, distributions, and sample similarity. Specific mathematical methods to pre-process and transform data sets are necessary to obtain a valid integrated data set. Some recent attempts have been made to address the new statistical issues raised when different datasets are combined.

This section aims at presenting the main solutions proposed for standard meta-analyses of expression datasets, and their limits. It is structured in two subsections. The first subsection presents several recent meta-analysis methods and mainly discusses the issues of data integration and of the search for differentially expressed genes. In the second subsection we present conclusive tests that demonstrate how carefully meta-analyses have to be conducted.
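As a rough illustration of the pre-processing needed before independent expression data sets can be pooled (they may differ in scale and distribution), the following minimal Python sketch standardizes each gene within each study and then merges the genes common to all studies. Gene-wise z-scoring is only one simple option chosen here for illustration, not a method prescribed by this section. The function names `standardize_per_gene` and `integrate`, the gene labels and the toy values are all assumptions.

```python
from statistics import mean, stdev


def standardize_per_gene(dataset):
    """Z-score each gene's values within a single study.

    `dataset` maps a gene identifier to the list of expression values
    measured for that gene across the samples of one study
    (at least two samples per gene are required by `stdev`).
    """
    standardized = {}
    for gene, values in dataset.items():
        mu = mean(values)
        sigma = stdev(values)  # sample standard deviation
        # A gene that is constant within a study carries no signal there.
        standardized[gene] = [
            0.0 if sigma == 0 else (v - mu) / sigma for v in values
        ]
    return standardized


def integrate(datasets):
    """Pool standardized values for the genes shared by every study.

    `datasets` is a non-empty list of per-study dictionaries.
    """
    scaled = [standardize_per_gene(d) for d in datasets]
    shared = set.intersection(*(set(d) for d in scaled))
    return {g: [v for d in scaled for v in d[g]] for g in sorted(shared)}


if __name__ == "__main__":
    # Hypothetical toy studies measured on very different scales.
    study_a = {"GENE1": [2.0, 4.0, 6.0], "GENE2": [1.0, 1.0, 1.0]}
    study_b = {"GENE1": [200.0, 400.0, 600.0], "GENE3": [5.0, 7.0, 9.0]}
    print(integrate([study_a, study_b]))
```

In this toy case only GENE1 is present in both studies, and after standardization its six pooled values lie on the same scale even though the raw values differ by a factor of one hundred. Real integrations would also have to deal with differing distributions and sample similarity, which this sketch does not attempt.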
Overview of Methods
For the purpose of demonstration, we first give
some examples of datasets that could be typically
involved in a meta-analysis. Then we present solu-
tions for addressing data integration issues, and