Due to the nonnegativity of X and Y, large values x_ij indicate that the j-th basis vector essentially represents the i-th element of the data. Correspondingly, large values y_ij indicate a major contribution of the i-th basis vector to the column a_j.
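As a toy illustration of this reading (the matrices below are invented for the purpose and are not from the text), consider a small factorization in NumPy:

import numpy as np

# Hypothetical 3x2 factorization A ~= X @ Y; all numbers are made up.
X = np.array([[0.9, 0.0],    # x_11 = 0.9: basis vector 1 mainly represents element 1
              [0.1, 0.2],
              [0.0, 0.8]])   # x_32 = 0.8: basis vector 2 mainly represents element 3
Y = np.array([[1.0, 0.3],    # y_11 = 1.0: basis vector 1 dominates column a_1
              [0.0, 0.7]])   # y_22 = 0.7: basis vector 2 dominates column a_2
A = X @ Y
print(A)                     # [[0.9, 0.27], [0.1, 0.17], [0.0, 0.56]]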
Compared with SVD, we of course need to trade the advantages off against the disadvantages: NMF typically exhibits a worse rank-k approximation than SVD. Moreover, as opposed to SVD, the approximation is not unique. While the left singular vectors of SVD are mutually orthogonal, this does not hold in general for the basis vectors x_i of NMF.
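To see the approximation-quality gap concretely, here is a quick numerical check. This is a sketch assuming NumPy and scikit-learn are available; scikit-learn's NMF solver is not the ALS procedure discussed below, but any rank-k NMF obeys the same bound.

import numpy as np
from sklearn.decomposition import NMF  # assumption: scikit-learn is installed

rng = np.random.default_rng(0)
A = rng.random((50, 40))   # random nonnegative test matrix
k = 5

# Best possible rank-k Frobenius error (Eckart-Young): tail singular values of A
s = np.linalg.svd(A, compute_uv=False)
svd_err = np.sqrt(np.sum(s[k:] ** 2))

# Rank-k NMF error: never below the SVD bound
model = NMF(n_components=k, init="random", random_state=0, max_iter=500)
X = model.fit_transform(A)                        # m x k factor
nmf_err = np.linalg.norm(A - X @ model.components_, "fro")

print(f"optimal rank-{k} (SVD) error: {svd_err:.4f}")
print(f"NMF rank-{k} error:           {nmf_err:.4f}")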
A major drawback is the lack of efficient computational methods for NMF that are suitable for high-dimensional problems. While there are technically mature standard procedures for computing the SVD, such as the Lanczos method, as well as adaptive procedures such as Brand's, all with proven convergence, methods for NMF are still in their infancy and poorly understood.
Even though problem (8.28) is convex in each of X and Y separately (for fixed X, minimizing over Y is an ordinary least-squares problem, and vice versa), it is not convex in both variables taken together, which renders determining a globally optimal solution difficult. For most existing algorithms, convergence to a locally optimal solution has been shown at best.
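A small experiment illustrates this non-convexity. Again a sketch using scikit-learn's NMF; the exact error values depend on the data and solver, but different random initializations typically land in different local optima:

import numpy as np
from sklearn.decomposition import NMF

A = np.random.default_rng(1).random((30, 20))
for seed in (0, 1, 2):   # vary only the initialization
    m = NMF(n_components=4, init="random", random_state=seed, max_iter=1000)
    X = m.fit_transform(A)
    err = np.linalg.norm(A - X @ m.components_, "fro")
    print(f"seed {seed}: error {err:.6f}")   # errors typically differ slightly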
Most NMF optimization procedures follow the EM (expectation-maximization) principle in alternately fixing one of the variables and obtaining a new iterate by optimizing with respect to the other variable.
A typical procedure is the ALS algorithm (alternating least squares), which is considered an important practical tool for computing an NMF. It should be noted that we may introduce a column normalization immediately after step 6 so as to ensure that each column of X sums to 1. With regard to the factorization of transition probabilities discussed in the previous section, we thus automatically fulfill conditions (8.26) and (8.27); hence, stochasticity of the factorized matrix is preserved.
Algorithm 8.3 ALS
Input: positive matrix A ∈ R^{m×n}
Output: positive factor matrices X ∈ R^{m×k}, Y ∈ R^{k×n}
1: initialize X with small random values
2: repeat
3:    Y := (X^T X)^{-1} X^T A = X^+ A
4:    Y := [Y]_+
5:    X := A Y^T (Y Y^T)^{-1} = A Y^+
6:    X := [X]_+
7: until convergence
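Below is a minimal NumPy sketch of Algorithm 8.3. The function name nmf_als, the fixed iteration count used in place of a real convergence test, and the normalize flag (the optional column normalization after step 6 mentioned above) are our own choices, not from the text; np.linalg.lstsq computes the same least-squares updates as steps 3 and 5 without forming (X^T X)^{-1} explicitly, which is numerically more stable.

import numpy as np

def nmf_als(A, k, n_iter=200, eps=1e-9, normalize=False, seed=0):
    """ALS sketch for NMF: A (m x n, nonnegative) ~= X @ Y with X, Y >= 0."""
    m, n = A.shape
    X = np.random.default_rng(seed).random((m, k)) * 0.1  # step 1: small random init
    for _ in range(n_iter):                               # step 2: repeat
        Y, *_ = np.linalg.lstsq(X, A, rcond=None)         # step 3: Y := X^+ A
        Y = np.maximum(Y, 0)                              # step 4: Y := [Y]_+
        Xt, *_ = np.linalg.lstsq(Y.T, A.T, rcond=None)    # step 5: X := A Y^+
        X = np.maximum(Xt.T, 0)                           # step 6: X := [X]_+
        if normalize:
            # optional normalization after step 6: each column of X sums to 1
            X = X / np.maximum(X.sum(axis=0), eps)
    return X, Y

# usage: factor a random nonnegative matrix and report the residual
A = np.random.default_rng(1).random((20, 15))
X, Y = nmf_als(A, k=4)
print(np.linalg.norm(A - X @ Y, "fro"))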
Despite its good practical performance, we cannot guarantee even local convergence of ALS. There is a vast variety of further algorithms for NMF. At present, researchers are working flat out to make these methods suitable for high-dimensional problems. The results, however, are still hard to assess.