Database Reference
In-Depth Information
3. Paatero's penalty function approach (29; 28), and
4. Lee-and-Seung-style (24) multiplicative updates (39; 32; 20).
The first class is not recommended because one does not obtain least squares
estimates, meaning that the residual error may increase. Hence, when employ-
ing such a technique in an iterative, multi-way algorithm such as PARAFAC-
ALS, the algorithm may actually diverge (10). The three remaining classes of
algorithms have better convergence properties, and nonnegative least-squares
approaches solve a bound-constrained linear least squares problem. Paatero's
PMF3 algorithm (28) uses a logarithmic penalty function and solves for all
modes simultaneously using a Gauss-Newton approach, which enjoys fast con-
vergence but is slower on larger problems. The multiplicative update is ap-
pealing because it is simple and fast to program, scales well with very large
datasets, but it can be slow to converge.
With the exception of Paatero's PMF3, each approach harkens back to
PARAFAC-ALS except that the factor matrices are updated differently. Each
method generally relies on the fact that the residual norm of the various matrix
formulations of the PARAFAC model are equal:
X ( m×npq )
B ) T
||
A ( D
C
|| F =
||
X ( n×pqm )
B ( A
D
C ) T
|| F =
X ( p×qmn )
D ) T
||
C ( B
A
|| F =
X ( q×mnp )
A ) T
||
D ( C
B
|| F .
Each of these matrix systems may be treated as a separate nonnegative
factorization problem using the techniques mentioned previously and solved
in an alternating fashion.
For example, Friedlander and Hatz (16) solve each subproblem as a bound
constrained linear least-squares problem. They impose sparseness constraints
by regularizing the nonnegative tensor factorization with an l 1 -norm penalty
function. While this function is nondifferentiable, it effectively removes small
values yet keeps large entries. While the solution of the standard problem is
unbounded (due to the indeterminacy of scale), regularizing the problem has
the added benefit of keeping the solution bounded.
Alternatively, Welling and Weber (39), and subsequently others (32; 20;
15; 27), update A using the multiplicative update introduced in (24) while
Search WWH ::




Custom Search