Biomedical Engineering Reference
In-Depth Information
Besides, FLTM models seem appealing for enhancing association studies: owing to their
hierarchical structure, they would help point out a region containing a genetic factor
associated with a studied disease. However, bottom-up information fading is likely to
occur in the hierarchical structure, and its impact on downstream analyses such as
association studies remains an open question. The key point is to check whether latent
variables covering a causal region are found to be associated with the disease. This
chapter conducts a systematic and comprehensive evaluation of the ability of the FLTM
model to help reveal genetic associations through latent variables. In this framework,
we address the case of a single causal genetic factor.
The chapter is organized as follows. After the "Definitions" section, Section 3
provides the motivation for the FLTM model proposal, together with its context: the
FLTM-based contribution is first put into the perspective of PGM-based works addressing
LD modeling; the contribution is then considered from the viewpoint of LTM learning.
The next section describes the specific FLTM learning algorithm developed. Section 5
briefly outlines the advantages of FLTM, as confirmed by evaluation. In particular, this
section highlights three advantages of the FLTM model that are crucial for detecting
genetic associations: scalability, faithfulness in LD modeling, and high data dimension
reduction. In Section 6, the notion of "indirect association" is defined; the protocol
implemented for the methodical evaluation of the ability of FLTM latent variables to
capture indirect associations is then detailed. Section 6 discusses the results of
intensive tests run on realistic simulated data; finally, tests applied to real
genotypic data are also shown.
2 Definitions
In this chapter, we restrict our attention to discrete and finite variables
(either observed or latent).
Definition 1 (Conditional Independence). Given a subset of variables $S \subseteq X \setminus \{X_i, X_j\}$,
conditional independence between $X_i$ and $X_j$ given $S$ ($X_i \perp\!\!\!\perp X_j \mid S$) is defined as:
$P(X_i, X_j \mid S) = P(X_i \mid S)\, P(X_j \mid S)$. Non-equality entails that the two variables are
conditionally dependent given $S$.
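As an illustration of Definition 1, the following minimal sketch checks the equality $P(X_i, X_j \mid S) = P(X_i \mid S)\, P(X_j \mid S)$ numerically on a small discrete joint distribution. The function name `is_conditionally_independent`, the dictionary layout, the tolerance and the two toy distributions are all assumptions made for this example, and $S$ is reduced to a single variable for simplicity; none of this comes from the chapter itself.

```python
from itertools import product


def is_conditionally_independent(joint, tol=1e-9):
    """Check Xi independent of Xj given S on a finite joint distribution.

    `joint` maps (xi, xj, s) -> P(Xi=xi, Xj=xj, S=s) (hypothetical layout).
    """
    xi_vals = sorted({key[0] for key in joint})
    xj_vals = sorted({key[1] for key in joint})
    s_vals = sorted({key[2] for key in joint})
    for s in s_vals:
        # P(S=s), obtained by summing out Xi and Xj.
        p_s = sum(joint.get((a, b, s), 0.0) for a in xi_vals for b in xj_vals)
        if p_s == 0.0:
            continue  # conditioning on a zero-probability value is undefined
        for a, b in product(xi_vals, xj_vals):
            p_ab = joint.get((a, b, s), 0.0) / p_s                       # P(xi, xj | s)
            p_a = sum(joint.get((a, y, s), 0.0) for y in xj_vals) / p_s  # P(xi | s)
            p_b = sum(joint.get((x, b, s), 0.0) for x in xi_vals) / p_s  # P(xj | s)
            if abs(p_ab - p_a * p_b) > tol:
                return False
    return True


# Toy distribution built as P(S) P(Xi|S) P(Xj|S): independent given S by construction.
p_s = {0: 0.4, 1: 0.6}
p_xi_given_s = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_xj_given_s = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.5, 1: 0.5}}
independent_joint = {
    (a, b, s): p_s[s] * p_xi_given_s[s][a] * p_xj_given_s[s][b]
    for a, b, s in product((0, 1), repeat=3)
}

# Toy distribution where Xi always equals Xj: dependent whatever the value of S.
dependent_joint = {(0, 0, 0): 0.25, (1, 1, 0): 0.25, (0, 0, 1): 0.25, (1, 1, 1): 0.25}

print(is_conditionally_independent(independent_joint))
print(is_conditionally_independent(dependent_joint))
```

The check compares exact probabilities; with distributions estimated from data, a statistical test would be needed instead of a fixed tolerance.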
Definition 2 (Bayesian Network). BNs are defined by a directed acyclic graph $G(X, E)$
and a set of parameters $\theta$. The set of nodes $X = \{X_1, \dots, X_p\}$ represents $p$ random
variables and the set of edges $E$ captures the conditional dependences between these
variables (i.e. the structure). The set of parameters $\theta$ describes conditional probability
distributions $\theta_i = [P(X_i \mid Pa_{X_i})]$ where $Pa_{X_i}$ denotes node $i$'s parents. If a node has
no parent, then it is described by an a priori probability distribution. The variables are
described for $n$ observations. $X$ is a BN with respect to $G$ if it satisfies the local Markov
property stating that each variable is conditionally independent of its non-descendants
given its parent variables: $X_i \perp\!\!\!\perp X \setminus \mathrm{desc}(X_i) \mid Pa_{X_i}$ for all $i \in \{1, \dots, p\}$, where
$\mathrm{desc}(X_i)$ is the set of descendants of $X_i$.
Due to the local Markov property, the joint probability distribution can be written as a
product of individual distributions, conditional on the parent variables:
$P(X) = \prod_{i \in \{1, \dots, p\}} \theta_i$.
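The factorization can be made concrete with a short sketch. It assumes a hypothetical three-node network in which a node `H` is the single parent of `X1` and `X2`, all binary; the node names, the table layout and the probability values are invented for illustration and are not taken from the chapter. The joint probability of a full assignment is computed as the product of each node's conditional distribution given its parents.

```python
from itertools import product

# Hypothetical toy structure: H -> X1 and H -> X2 (all variables binary).
parents = {"H": (), "X1": ("H",), "X2": ("H",)}

# theta[node][parent_values][value] = P(node = value | parents = parent_values).
# A node without parents has a single entry keyed by the empty tuple (its prior).
theta = {
    "H": {(): {0: 0.3, 1: 0.7}},
    "X1": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}},
    "X2": {(0,): {0: 0.6, 1: 0.4}, (1,): {0: 0.1, 1: 0.9}},
}
order = ["H", "X1", "X2"]


def joint_probability(assignment):
    """Return P(X) as the product over nodes of P(X_i | Pa_{X_i})."""
    prob = 1.0
    for node in order:
        parent_values = tuple(assignment[p] for p in parents[node])
        prob *= theta[node][parent_values][assignment[node]]
    return prob


# Enumerate the full joint distribution over (H, X1, X2).
joint = {
    values: joint_probability(dict(zip(order, values)))
    for values in product((0, 1), repeat=len(order))
}

total = sum(joint.values())                                    # should be 1 up to rounding
p_x1_is_1 = sum(p for (h, x1, x2), p in joint.items() if x1 == 1)  # marginal P(X1 = 1)
print(total, p_x1_is_1)
```

Storing one table per node keeps the number of parameters linear in the number of nodes for a fixed number of parents, whereas the explicit joint table enumerated here grows exponentially with $p$ and is only practical for toy examples.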