Table 4. Statistically overrepresented terms from the two-component probes in GDS596

Term                                 GO ID        p-value
cytoskeleton                         GO:0005856   4.28e-3
non-membrane-bound organelle         GO:0043228   8.3e-3
organelle part                       GO:0044422   6.33e-2
intracellular                        GO:0005622   1.17e-1
cytoplasm                            GO:0005737   1.75e-1
Table 5. Statistically overrepresented terms from the two-component probes in GDS592

Term                                 GO ID        p-value
proteinaceous extracellular matrix   GO:0005578   6.18e-2
intracellular                        GO:0005622   5.48e-2
midbody                              GO:0030496   9.85e-2
cell soma                            GO:0043025   9.33e-2
extracellular space                  GO:0005615   1.16e-1

observation where the D statistic of the lognormal is smaller than that of the
gamma by an order of magnitude. Note that the likelihood in Tables 2 and 3 does
not always improve monotonically with the number of components. We believe this
is due to the EM procedure getting trapped in poor-quality local optima when
more components are used.
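
To make the goodness-of-fit comparison concrete, the following is a minimal sketch of how a D statistic could be computed for a fitted lognormal. It assumes that D denotes the Kolmogorov-Smirnov distance between the empirical and fitted CDFs, which this excerpt does not state explicitly. The function names and the synthetic sample are illustrative assumptions, not the authors' code or data.

```python
import math
import random

def lognormal_cdf(x, mu, sigma):
    """CDF of a lognormal with log-scale mean mu and log-scale sd sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def fit_lognormal(values):
    """Maximum-likelihood fit: mean and sd of the log-transformed values."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    var = sum((v - mu) ** 2 for v in logs) / len(logs)
    return mu, math.sqrt(var)

def ks_statistic(values, cdf):
    """D = sup_x |F_n(x) - F(x)|, evaluated at the sorted sample points."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Synthetic stand-in for one probe's expression values (assumption).
rng = random.Random(0)
sample = [rng.lognormvariate(1.0, 0.5) for _ in range(500)]

mu_hat, sigma_hat = fit_lognormal(sample)
d_fit = ks_statistic(sample, lambda x: lognormal_cdf(x, mu_hat, sigma_hat))
# A deliberately mis-specified model gives a much larger D.
d_off = ks_statistic(sample, lambda x: lognormal_cdf(x, mu_hat + 1.0, sigma_hat))
```

A smaller D indicates a closer match between the fitted and empirical distributions, so comparing D across candidate families (for example lognormal versus gamma) on the same probe is one way to rank them.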
Marginal Distribution of CSN3 Gene. CSN3 is a component of the COP9
signalosome complex, a complex involved in signal transduction. Figure 3 shows
examples of how one- and two-component mixtures of normal, lognormal, and
gamma distributions fit the marginal distribution of this gene. The data are
taken from the GDS596 dataset.
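
As an illustration of fitting one- and two-component lognormal mixtures, the sketch below runs EM on log-transformed values (a lognormal mixture on the original scale is a normal mixture on the log scale), keeps the best of several random restarts to reduce the risk of poor local optima, and compares the fits by BIC. The data are synthetic, and all function names, the restart count, and the variance floor are assumptions made for this example rather than the procedure used in the paper.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(logs, weights, mus, sigmas):
    """Log-likelihood of a normal mixture on the log scale."""
    total = 0.0
    for x in logs:
        p = sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))
        total += math.log(max(p, 1e-300))  # guard against underflow
    return total

def em_fit(logs, k, rng, n_iter=100, min_sigma=1e-3):
    """One EM run from a random start; returns (weights, mus, sigmas)."""
    n = len(logs)
    mus = rng.sample(logs, k)                      # random data points as means
    spread = (max(logs) - min(logs)) / (2 * k) or 1.0
    sigmas = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in logs:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            tot = sum(p) or 1e-300
            resp.append([pj / tot for pj in p])
        # M-step: re-estimate weights, means, and standard deviations.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, logs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, logs)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)  # floor avoids collapse
    return weights, mus, sigmas

def fit_best(logs, k, n_restarts=5, seed=0):
    """Keep the restart with the highest log-likelihood."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        params = em_fit(logs, k, rng)
        ll = log_likelihood(logs, *params)
        if best is None or ll > best[0]:
            best = (ll, params)
    return best

def bic(ll, k, n):
    """BIC with 3k - 1 free parameters: k means, k sds, k - 1 weights."""
    return -2.0 * ll + (3 * k - 1) * math.log(n)

# Synthetic bimodal "expression" values (assumption, not GDS596 data).
rng = random.Random(1)
expression = ([rng.lognormvariate(1.0, 0.3) for _ in range(150)]
              + [rng.lognormvariate(4.0, 0.3) for _ in range(150)])
log_expr = [math.log(v) for v in expression]

ll1, params1 = fit_best(log_expr, 1)
ll2, params2 = fit_best(log_expr, 2)
bic1 = bic(ll1, 1, len(log_expr))
bic2 = bic(ll2, 2, len(log_expr))
chosen_k = 1 if bic1 <= bic2 else 2   # lower BIC is preferred
```

The log-scale likelihood differs from the lognormal likelihood on the original scale by a term that depends only on the data, so it does not affect the comparison between component counts within the lognormal family. That term would need to be restored before comparing against normal or gamma mixtures.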
Gene Ontology Comparison. From the two datasets we identified two-component
genes, that is, the genes with probes for which the BIC criterion suggested two
components for the learned lognormal mixture (there are 256 such genes for
GDS596 and 56 for GDS592). We examined the overrepresented terms for each
dataset using the web tool Babelomics ( http://www.babelomics.org/ ) [14].
One cellular component term, intracellular, occurs in both datasets.
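
For readers unfamiliar with over-representation analysis, the sketch below shows one common way such a p-value can be computed: the upper tail of a hypergeometric distribution. Whether Babelomics uses exactly this test, and with which multiple-testing correction, is not stated in this excerpt, so the method here is an assumption. The function name and the counts in the example are hypothetical and are not taken from the paper.

```python
from math import comb

def hypergeom_upper_tail(n_total, n_annotated, n_selected, n_hits):
    """P(X >= n_hits) when n_selected genes are drawn without replacement
    from n_total genes, of which n_annotated carry the GO term."""
    upper = min(n_annotated, n_selected)
    denom = comb(n_total, n_selected)
    num = sum(
        comb(n_annotated, i) * comb(n_total - n_annotated, n_selected - i)
        for i in range(n_hits, upper + 1)
    )
    return num / denom

# Hypothetical counts for illustration only: 10,000 genes on the array,
# 400 annotated with the term, 256 two-component genes, 20 of them annotated.
p_value = hypergeom_upper_tail(10_000, 400, 256, 20)
```

A small p-value indicates that the term appears among the selected genes more often than expected by chance; with many terms tested, a multiple-testing correction would normally be applied before interpreting the results.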
4 Conclusions
In this paper we provide a statistical framework using normal, lognormal, and
gamma mixture models for analyzing the marginal distributions of expression