In Factor Analysis, there are two problems:

1. Detection: given $\hat{\mathbf{R}}$, estimate $Q$. The hypothesis that the factor rank is $q$ is denoted by $\mathcal{H}_q$.
2. Identification: given $\hat{\mathbf{R}}$ and $Q$, estimate $\mathbf{D}$ and $\mathbf{A}$, or $\boldsymbol{\Lambda}_s$ and $\mathbf{U}_s$.

We consider the latter problem first.
7.3 Computing the Factor Analysis Decomposition
Assume we know $Q$. Let $\boldsymbol{\theta}$ be a minimal parametrization of $(\mathbf{A}, \mathbf{D})$, dependent on $Q$, such that $\mathbf{R}(\boldsymbol{\theta}) = \mathbf{A}\mathbf{A}^H + \mathbf{D}$. If we start from a likelihood perspective, we obtain after standard derivations that the maximum likelihood estimate of $\mathbf{R}$ is obtained by finding the model parameters $\boldsymbol{\theta}$ such that

$$
\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \; N \left[ \ln \bigl| \mathbf{R}(\boldsymbol{\theta}) \bigr| + \operatorname{tr}\!\bigl( \mathbf{R}(\boldsymbol{\theta})^{-1} \hat{\mathbf{R}} \bigr) \right],
$$

where $\hat{\mathbf{R}} = \frac{1}{N} \sum_{n=1}^{N} \mathbf{x}(n)\mathbf{x}(n)^H$ is the sample covariance matrix. This is exactly the same problem as we saw before in (26), and we can follow the same solution strategy.
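For concreteness, the ML cost above can be evaluated numerically as follows. This is a minimal NumPy sketch; the function names and interface are illustrative assumptions, not part of the original development.

```python
import numpy as np

def sample_covariance(X):
    """R_hat = (1/N) * sum_n x(n) x(n)^H, with the N snapshots
    stacked as the columns of the J x N matrix X."""
    N = X.shape[1]
    return (X @ X.conj().T) / N

def ml_cost(A, d, R_hat):
    """ML cost ln|R(theta)| + tr(R(theta)^{-1} R_hat), up to the factor N,
    for the factor model R(theta) = A A^H + D with D = diag(d)."""
    R = A @ A.conj().T + np.diag(d)
    _, logdet = np.linalg.slogdet(R)   # log|R|, numerically stable
    return logdet + np.trace(np.linalg.solve(R, R_hat)).real
```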
In particular, we can use the result from [31] that the ML problem is asymptotically (large $N$) equivalent to the Weighted Least Squares problem

$$
\hat{\boldsymbol{\theta}}_w
= \arg\min_{\boldsymbol{\theta}} \left\| \mathbf{C}_w^{-1/2} \bigl( \hat{\mathbf{r}} - \mathbf{r}(\boldsymbol{\theta}) \bigr) \right\|^2
= \arg\min_{\boldsymbol{\theta}} \bigl( \hat{\mathbf{r}} - \mathbf{r}(\boldsymbol{\theta}) \bigr)^H \mathbf{C}_w^{-1} \bigl( \hat{\mathbf{r}} - \mathbf{r}(\boldsymbol{\theta}) \bigr),
\qquad (38)
$$
where, as before, $\hat{\mathbf{r}} = \operatorname{vec}(\hat{\mathbf{R}})$, $\mathbf{r}(\boldsymbol{\theta}) = \operatorname{vec}(\mathbf{R}(\boldsymbol{\theta}))$, and the weighting matrix $\mathbf{C}_w$ is the covariance of $\hat{\mathbf{r}}$, i.e., $\mathbf{C}_w = \frac{1}{N}(\mathbf{R}^T \otimes \mathbf{R})$. This is precisely the setting of [31], and we can use the algorithms proposed there: Gauss-Newton iterations, the scoring algorithm, or sequential estimation algorithms.
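To make Eq. (38) concrete, the weighted cost can be evaluated directly. The sketch below assumes, as an illustration, that the weighting is evaluated at the model covariance $\mathbf{R}(\boldsymbol{\theta})$; the function name is hypothetical.

```python
def wls_cost(A, d, R_hat, N):
    """Weighted least-squares cost of Eq. (38) with C_w = (1/N)(R^T kron R),
    here evaluated at the model covariance R(theta) = A A^H + diag(d)."""
    R = A @ A.conj().T + np.diag(d)
    resid = (R_hat - R).reshape(-1, order='F')  # r_hat - r(theta), column-stacking vec
    Cw = np.kron(R.T, R) / N                    # asymptotic covariance of vec(R_hat)
    # For larger J, the Kronecker identity gives the equivalent form
    # N * tr( (R^{-1}(R_hat - R))^2 ), avoiding the J^2 x J^2 matrix.
    return (resid.conj() @ np.linalg.solve(Cw, resid)).real
```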
It is also possible to propose an alternating least squares approach. Given an estimate for $\mathbf{D}$, we can, as mentioned above, whiten $\hat{\mathbf{R}}$ by $\mathbf{D}$, do an eigenvalue decomposition on $\mathbf{D}^{-1/2} \hat{\mathbf{R}} \mathbf{D}^{-1/2}$, and estimate $\mathbf{A}$ of size $J \times Q$, taking into account some suitable constraints to make $\mathbf{A}$ unique. For $\mathbf{A}$ known, the optimal $\mathbf{D}$ in turn is given by $\operatorname{diag}(\hat{\mathbf{R}} - \mathbf{A}\mathbf{A}^H)$. Given a reasonable initial point (e.g., $\mathbf{D}^{(0)} = \operatorname{diag}(\hat{\mathbf{R}})$), we can easily alternate between these two solutions. Convergence is to a local optimum and may be very slow.
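A sketch of this alternation follows. The particular choice $\mathbf{A} = \mathbf{D}^{1/2}\mathbf{U}_s(\boldsymbol{\Lambda}_s - \mathbf{I})^{1/2}$ used below is one common way to fix the uniqueness constraints left unspecified in the text, not necessarily the one intended here.

```python
def als_factor_analysis(R_hat, Q, n_iter=100):
    """Alternating least squares sketch for R_hat ~ A A^H + D.
    Given D: whiten, take the Q dominant eigenpairs (U_s, Lambda_s) of
    D^{-1/2} R_hat D^{-1/2}, set A = D^{1/2} U_s (Lambda_s - I)^{1/2}
    (the whitened noise has unit variance). Given A: D = diag(R_hat - A A^H)."""
    d = np.real(np.diag(R_hat)).copy()                 # D^(0) = diag(R_hat)
    for _ in range(n_iter):
        w = 1.0 / np.sqrt(d)
        R_t = w[:, None] * R_hat * w[None, :]          # D^{-1/2} R_hat D^{-1/2}
        lam, U = np.linalg.eigh(R_t)                   # ascending eigenvalues
        Us, ls = U[:, -Q:], lam[-Q:]                   # Q dominant eigenpairs
        A_t = Us * np.sqrt(np.maximum(ls - 1.0, 0.0))  # whitened-domain factor
        A = np.sqrt(d)[:, None] * A_t                  # undo the whitening
        d = np.real(np.diag(R_hat - A @ A.conj().T))   # optimal D for this A
        d = np.maximum(d, 1e-12)                       # keep variances positive
    return A, np.diag(d)
```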
An alternative approach was recently proposed in [36]. The ML cost function is shown to be equivalent to the Kullback-Leibler divergence as often used in information theory, and a suitable algorithm is the Expectation-Maximization (EM) algorithm. This is an iterative estimation algorithm which is shown in [36] to reduce, for current estimates $(\mathbf{A}_k, \mathbf{D}_k)$, to a simple closed-form update.
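The explicit update derived in [36] is not reproduced in this excerpt. For orientation, the classical EM iteration for the Gaussian factor model, written in terms of the sample covariance, takes the following form; this is a sketch of the standard updates and not necessarily identical to the iteration of [36].

```python
def em_step(A, d, R_hat):
    """One classical EM update for R = A A^H + D, D = diag(d).
    E-step: W = A^H R^{-1} gives E[s|x] = W x and Cov[s|x] = I - W A.
    M-step, with sufficient statistics expressed through R_hat:
        A_new = R_hat W^H (W R_hat W^H + I - W A)^{-1}
        d_new = diag(R_hat - A_new W R_hat)."""
    Q = A.shape[1]
    R = A @ A.conj().T + np.diag(d)
    W = np.linalg.solve(R, A).conj().T             # A^H R^{-1} (R is Hermitian)
    M = W @ R_hat @ W.conj().T + np.eye(Q) - W @ A
    A_new = R_hat @ W.conj().T @ np.linalg.inv(M)
    d_new = np.real(np.diag(R_hat - A_new @ W @ R_hat))
    return A_new, d_new
```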