Now we assume the existence of MVs. In PC regression, the missing part $y^{miss}$ in the expression vector $y$ is estimated from the observed part $y^{obs}$ by using the PCA result. Let $w_l^{obs}$ and $w_l^{miss}$ be the parts of each principal axis $w_l$ corresponding to the observed and missing parts of $y$, respectively. Similarly, let

$$W = (W^{obs}, W^{miss}),$$

where $W^{obs}$ and $W^{miss}$ denote the matrices whose column vectors are $w_1^{obs}, \ldots, w_K^{obs}$ and $w_1^{miss}, \ldots, w_K^{miss}$, respectively.
Factor scores $x = (x_1, \ldots, x_K)$ for the expression vector $y$ are obtained by minimization of the residual error

$$\mathrm{err} = \| y^{obs} - W^{obs} x \|^2 .$$

This is a well-known regression problem, and the least squares solution is given by

$$x = (W^{obs\,T} W^{obs})^{-1} W^{obs\,T} y^{obs} .$$

Using $x$, the missing part is estimated as

$$y^{miss} = W^{miss} x . \qquad (4.23)$$
In the PC regression above, $W$ must be known beforehand. Later, we will discuss how to determine this parameter.
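To make the procedure concrete, here is a minimal NumPy sketch of the PC regression step above, assuming $W$ has already been obtained (e.g., by PCA on rows without MVs). The function and variable names are ours, not from the text.

```python
import numpy as np

def pc_regression_impute(y, miss_mask, W):
    """Estimate missing entries of y via PC regression (Eq. 4.23).

    y         : (D,) vector with NaNs at the missing positions
    miss_mask : (D,) boolean mask, True where y is missing
    W         : (D, K) matrix of principal axes w_1, ..., w_K
    """
    W_obs = W[~miss_mask]     # rows of W at observed positions
    W_miss = W[miss_mask]     # rows of W at missing positions
    y_obs = y[~miss_mask]

    # Least squares factor scores: x = (W_obs^T W_obs)^{-1} W_obs^T y_obs
    x, *_ = np.linalg.lstsq(W_obs, y_obs, rcond=None)

    # Missing part estimated as y_miss = W_miss x   (4.23)
    y_filled = y.copy()
    y_filled[miss_mask] = W_miss @ x
    return y_filled
```

Using `np.linalg.lstsq` computes the same least squares solution as the explicit formula above, but avoids forming the matrix inverse, which is numerically more stable.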
4.4.3.2 Bayesian Estimation
A parametric probabilistic model, called probabilistic PCA (PPCA), has been proposed recently. The probabilistic model is based on the assumption that the residual error $\epsilon$ and the factor scores $x_l$ ($1 \le l \le K$) in the linear combination equation above obey normal distributions:
$$p(x) = \mathcal{N}_K(x \mid 0, I_K), \qquad p(\epsilon) = \mathcal{N}_D(\epsilon \mid 0, (1/\tau) I_D),$$
where $\mathcal{N}_K(x \mid \mu, \Sigma)$ denotes a $K$-dimensional normal distribution for $x$ whose mean and covariance are $\mu$ and $\Sigma$, respectively, $I_K$ is a $(K \times K)$ identity matrix, and $\tau$ is the scalar inverse variance of $\epsilon$.
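As an illustration of these assumptions, the snippet below draws one sample from the generative process they imply. We take the linear model to be $y = Wx + \mu + \epsilon$, as the log-likelihood below suggests; all numeric values and names here are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, tau = 10, 3, 4.0        # dimensions and inverse variance (hypothetical values)
W = rng.normal(size=(D, K))   # stand-in principal axes; parameters in the real model
mu = rng.normal(size=D)       # stand-in mean vector

x = rng.normal(size=K)                            # x   ~ N_K(0, I_K)
eps = rng.normal(scale=1/np.sqrt(tau), size=D)    # eps ~ N_D(0, (1/tau) I_D)
y = W @ x + mu + eps                              # assumed linear model
```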
In this PPCA model, the complete log-likelihood function is written as:

$$\ln p(y, x \mid \theta) \equiv \ln p(y, x \mid W, \mu, \tau) = -\frac{\tau}{2} \| y - Wx - \mu \|^2 - \frac{1}{2} \| x \|^2 + \frac{D}{2} \ln \tau - \frac{K+D}{2} \ln 2\pi .$$
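For readers who want to check the formula numerically, here is a small sketch that evaluates the complete log-likelihood term by term; the function name and argument layout are ours.

```python
import numpy as np

def ppca_complete_log_likelihood(y, x, W, mu, tau):
    """Complete log-likelihood ln p(y, x | W, mu, tau) of the PPCA model.

    y : (D,) observation, x : (K,) factor scores, W : (D, K) axes,
    mu : (D,) mean, tau : scalar inverse variance of the residual error.
    """
    D, K = W.shape
    resid = y - W @ x - mu
    return (-0.5 * tau * resid @ resid           # -(tau/2) ||y - Wx - mu||^2
            - 0.5 * x @ x                        # -(1/2) ||x||^2
            + 0.5 * D * np.log(tau)              # +(D/2) ln tau
            - 0.5 * (K + D) * np.log(2*np.pi))   # -((K+D)/2) ln 2*pi
```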