Biomedical Engineering Reference
In-Depth Information
of the physical constraint δ. The physical constraint imposed by the sliding window size
δ makes it possible to adjust the variable bandwidth of the sparse dependence matrix.
It should be noted that, unlike SNPs, latent variables are not characterized by a physical
location on the chromosome. Instead, the locations of the SNPs subsumed by a given
latent variable are averaged to provide the location of this latent
variable.
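The averaging rule above can be sketched in a few lines. This is a minimal illustration, assuming positions are given as plain numbers (for example base-pair coordinates); the function name `latent_location` is hypothetical and does not come from the text.

```python
def latent_location(snp_positions):
    """Location assigned to a latent variable: the mean of the positions
    of the SNPs it subsumes (positions are assumed to be numeric)."""
    if not snp_positions:
        raise ValueError("a latent variable must subsume at least one SNP")
    return sum(snp_positions) / len(snp_positions)


# Three SNPs subsumed by one latent variable (made-up coordinates).
location = latent_location([100, 200, 600])
```

With such a location, a latent variable can be compared against the window size δ in the same way as an observed SNP.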
Data Imputation for Latent Variables. Data imputation is performed locally, that is, by
considering the LCM rooted in the latent variable, whose leaves are the variables
in the cluster. For simplicity, the cardinality of the latent variable is estimated as
an affine function of the number of leaves. Parameter learning is first performed in
this LCM, through the EM algorithm. This step yields the marginal distribution of the
latent variable and the conditional distributions of the child variables. (Linear)
probabilistic inference can then be carried out, based on the following principle:
$$
P(H = c \mid x^j) = \frac{\left[\prod_{i=1}^{p} P(x_i^j \mid H = c)\right] P(H = c)}{\sum_{c'=1}^{k} \left[\prod_{i=1}^{p} P(x_i^j \mid H = c')\right] P(H = c')},
$$
with k the cardinality of latent variable H, c a possible value for H, j an observation,
i.e. an individual, and $x^j$ the vector of values $\{x_1^j, \ldots, x_p^j\}$ corresponding to the variables
in the cluster $\{X_1, \ldots, X_p\}$.
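The inference step above amounts to a Bayes-rule computation in a naive-Bayes-shaped model. The sketch below illustrates it under stated assumptions: the parameters are taken as already learned (for instance by EM) and stored in plain nested lists, and the names `latent_posterior`, `prior` and `cond` are hypothetical. How a value of H is then chosen from this posterior (most probable state, or a random draw) is not specified in this passage, so the example stops at the posterior.

```python
def latent_posterior(prior, cond, x):
    """Return [P(H = c | x) for c in 0..k-1] for one individual.

    prior[c]      : P(H = c), marginal distribution of the latent variable
    cond[i][c][v] : P(X_i = v | H = c), conditional distribution of child i
    x[i]          : observed value of child variable X_i for this individual
    """
    joint = []
    for c, p_c in enumerate(prior):
        # Numerator of the formula: product over children times the prior.
        prob = p_c
        for i, value in enumerate(x):
            prob *= cond[i][c][value]
        joint.append(prob)
    total = sum(joint)  # denominator: sum over all k states of H
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / total for p in joint]


# Toy LCM: binary latent variable with two binary children (made-up numbers).
prior = [0.6, 0.4]
cond = [
    [[0.9, 0.1], [0.2, 0.8]],  # P(X_1 | H = 0), P(X_1 | H = 1)
    [[0.7, 0.3], [0.1, 0.9]],  # P(X_2 | H = 0), P(X_2 | H = 1)
]
posterior = latent_posterior(prior, cond, [1, 1])
```

The cost is linear in the number of children for each state of H, which matches the "(linear)" qualifier in the text. Plain products are used because clusters are assumed to be small; for large clusters, summing logarithms would avoid underflow.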
Local Parameter Learning. In parallel with structure growing, the parameters of the
forest of LTMs are learned locally (see Subsection 4.2). At a given iteration, for any
variable identified as a leaf node in an LCM (corresponding to a cluster), the current
marginal distribution of this variable is replaced with its conditional distribution learnt
in the LCM. Thus, during the bottom-up construction of the FLTM, marginal distributions
are successively replaced with conditional distributions.
Validation of Latent Variables. The subsumption of the candidate cluster into the
latent variable H is validated through a criterion averaging a normalized dependence
measure between H and each of H 's child nodes:
$$
\mathrm{Criter} = \frac{1}{|C_H|} \sum_{i \in C_H} \frac{I(X_i, H)}{\min(\mathcal{H}(X_i), \mathcal{H}(H))} \geq \tau_{latent},
$$
with $|C_H|$ the size of cluster $C_H$.
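The validation criterion can be illustrated with a small sketch. It rests on several assumptions: the dependence measure I is read as mutual information, the normalisation uses Shannon entropy, both are estimated from empirical counts over the individuals, and a cluster is accepted when the criterion reaches τ latent. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy (in nats) of a sequence of discrete values."""
    n = len(values)
    return -sum((k / n) * math.log(k / n) for k in Counter(values).values())


def mutual_information(xs, hs):
    """Empirical mutual information, via I(X, H) = H(X) + H(H) - H(X, H)."""
    return entropy(xs) + entropy(hs) - entropy(list(zip(xs, hs)))


def validation_criterion(children, h):
    """Average over the child variables of I(X_i, H) / min(H(X_i), H(H))."""
    h_entropy = entropy(h)
    scores = []
    for xs in children:
        denom = min(entropy(xs), h_entropy)
        # Assumption: a constant variable (zero entropy) contributes 0.
        scores.append(mutual_information(xs, h) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def is_validated(children, h, tau_latent):
    """Accept the latent variable when the criterion reaches the threshold."""
    return validation_criterion(children, h) >= tau_latent


# Toy data over four individuals: the first child copies H exactly,
# the second is empirically independent of H.
h = [0, 0, 1, 1]
children = [[0, 0, 1, 1], [0, 1, 0, 1]]
criterion = validation_criterion(children, h)
```

In this toy case the first child has a normalised dependence of 1 and the second of 0, so the criterion is about 0.5. The latent variable would be kept for a threshold of 0.4 and rejected for a threshold of 0.6.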
4.3 Role of Parameters
In the forest of LTMs, the subsumption process is controlled through thresholds
τ_pairwise and τ_latent, and constraint δ. No latent variable is allowed to subsume
variables which are not highly pairwise dependent (τ_pairwise) or which lie in regions
too far from one another (δ); τ_latent controls bottom-up information fading through the
hierarchy. τ_pairwise, τ_latent and δ thus determine the number of connected components
(trees) and the number of layers in the forest. These three parameters govern the trade-off
between faithfulness to the underlying reality and tractability of the modeling.
 