9.2.3 Problems of Tensor Factorizations
There exist many other tensor decomposition methods. The main problem, however, is that for most of them, unlike for the Tucker decomposition, no efficient standard algorithms for computing the decomposition exist. Such algorithms therefore still need to be developed.
To this end, we shall denote the sought-after decomposition of a given tensor $A$ by $A_\Theta$, with $\Theta$ denoting the sought-after tensor components, e.g., the factors $U$ of the CP decomposition. Thus, the task consists in determining an optimizer $\Theta^\ast$ of

$$\min_\Theta f(A, A_\Theta),$$

mostly in the form

$$\min_\Theta \| A - A_\Theta \|_F.$$
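To make the objective concrete, the following minimal sketch (in NumPy; the helpers cp_reconstruct and cp_objective are illustrative names, not part of any library) evaluates the Frobenius-norm misfit of a CP model from its factor matrices:

    import numpy as np

    def cp_reconstruct(factors):
        """Assemble the full tensor from CP factor matrices.

        Each factors[k] has shape (n_k, r); column j of every factor
        contributes one rank-one term to the sum.
        """
        r = factors[0].shape[1]
        A = np.zeros(tuple(U.shape[0] for U in factors))
        for j in range(r):
            term = factors[0][:, j]
            for U in factors[1:]:
                term = np.multiply.outer(term, U[:, j])
            A += term
        return A

    def cp_objective(A, factors):
        """Frobenius-norm misfit ||A - A_Theta||_F of the CP model."""
        return np.linalg.norm(A - cp_reconstruct(factors))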
Iterative procedures appear suitable for solving this optimization problem, since they are optimal with respect to memory usage for extremely sparsely populated tensors. Their convergence rates, however, are the critical issue.
In the current literature, endless varieties of gradient descent methods (stochastic, partitioned, prioritized, etc.) are often proposed. Mostly, the case made for them is their easy implementation. These procedures are then praised as "rapid" and "robust"; estimates of their convergence rates are generously forgone. Of course, the exact opposite of what is promised holds true: gradient descent procedures depend heavily on the problem and its parameters and often converge rather slowly. Granted, they may now and then be "fine-tuned" to the solution of a particular high-dimensional problem instance, but they fail at solving other, even tiny, problems with the same parameter assignment. All in all, there is no satisfactory guaranteed upper bound on the number of iterations (and thus on the computational complexity) for any general class of problem instances. Thus, gradient descent methods are suitable for certain experiments, but not for practical deployment.
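The step-size sensitivity is easy to reproduce. As a hedged illustration, the sketch below runs plain gradient descent on the order-2 special case $f(U, V) = \frac{1}{2}\|A - UV^T\|_F^2$ (the same behavior occurs for higher-order factorizations); all names and parameter values are chosen for this example only:

    import numpy as np

    def gd_factorize(A, r, step, iters=500, seed=0):
        """Plain gradient descent on f(U, V) = 0.5 * ||A - U V^T||_F^2."""
        rng = np.random.default_rng(seed)
        U = rng.standard_normal((A.shape[0], r))
        V = rng.standard_normal((A.shape[1], r))
        for _ in range(iters):
            R = A - U @ V.T              # current residual
            # Simultaneous update; the gradients are -R V and -R^T U.
            U, V = U + step * (R @ V), V + step * (R.T @ U)
        return np.linalg.norm(A - U @ V.T)

    A = np.random.default_rng(1).standard_normal((40, 30))
    for step in (5e-2, 1e-2, 1e-3):      # same problem, three step sizes
        print(f"step={step:g}  final misfit={gd_factorize(A, r=5, step=step):.3e}")

Depending on the step size, the very same problem instance diverges, converges acceptably, or crawls, which is precisely the parameter dependence criticized above.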
To ensure the necessary convergence rate, more sophisticated iteration procedures are required, e.g., projection methods. This can be seen very clearly in the field of matrices, where Krylov-subspace procedures predominate. For symmetric matrices, this mainly comes down to the conjugate gradient (CG) method, mostly in connection with a preconditioner (PCG), which may arguably be considered the default procedure of numerical analysis. For nonsymmetric matrices, especially for the SVD, one mostly resorts to the Lanczos method, which was presented in Sect. 8.4.1. Devising robust Lanczos procedures, though, is difficult, and they cannot be carried over immediately to other formulations, since they require orthogonality of the bases. There are, however, orthogonalized approaches to the higher-order case, but mostly, again, for the Tucker format. An example is the Higher-Order Orthogonal Iteration (HOOI) [KB09].
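For reference, here is a compact sketch of HOOI in NumPy, following the usual formulation as alternating truncated SVDs of partially projected unfoldings; the helpers unfold and mode_dot are defined for this example only:

    import numpy as np

    def unfold(T, mode):
        """Mode-k unfolding: mode k becomes the row index."""
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def mode_dot(T, M, mode):
        """Mode-k product T x_k M, where M has shape (rows, T.shape[mode])."""
        rest = [T.shape[i] for i in range(T.ndim) if i != mode]
        Y = (M @ unfold(T, mode)).reshape([M.shape[0]] + rest)
        return np.moveaxis(Y, 0, mode)

    def hooi(A, ranks, iters=20):
        """Tucker approximation A ~ G x_1 U_1 ... x_d U_d via HOOI."""
        d = A.ndim
        # HOSVD initialization: leading left singular vectors per unfolding.
        U = [np.linalg.svd(unfold(A, k), full_matrices=False)[0][:, :ranks[k]]
             for k in range(d)]
        for _ in range(iters):
            for k in range(d):
                Y = A                 # project onto all factors except mode k
                for j in range(d):
                    if j != k:
                        Y = mode_dot(Y, U[j].T, j)
                U[k] = np.linalg.svd(unfold(Y, k),
                                     full_matrices=False)[0][:, :ranks[k]]
        G = A                         # core tensor: project onto all factors
        for k in range(d):
            G = mode_dot(G, U[k].T, k)
        return G, U

Note that each step requires only the SVD of a small projected unfolding, and the factor bases stay orthonormal by construction, which is exactly the orthogonality that Lanczos-type procedures rely on.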