Most current procedures for computing tensor factorizations are EM-like
(comparable to the ALS method from Sect. 8.4.3). In each step, a part of the tensor
coefficients is kept fixed, such that the resulting optimization problem with respect to
the "free" variables is convex and can thus be solved easily. Once it is solved,
the just-obtained coefficients are fixed in turn, and the previously fixed coefficients are
computed anew. The method terminates as soon as the error has fallen below a
prescribed bound, or other technical termination criteria are satisfied (e.g., a
maximum number of iterations has been reached). Methods of this type mostly
turn out to be robust in practice, but their convergence often cannot be guaranteed. For
canonical decompositions, ALS generally converges slowly (unlike for the
Tucker decomposition, where ALS mostly works efficiently).
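To make the alternating scheme concrete, here is a minimal sketch of ALS for a rank-$r$ canonical (CP) decomposition of a 3-way tensor, assuming NumPy; the names cp_als, khatri_rao, and unfold are our illustrative helpers, not taken from the book or any particular library. Each inner step fixes two factor matrices and solves a linear least-squares problem (convex in the free factor) for the third.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I x r), (J x r) -> (I*J x r)."""
    r = A.shape[1]
    return (A[:, None, :] * B[None, :, :]).reshape(-1, r)

def unfold(X, mode):
    """Mode-k matricization with Fortran-style column ordering."""
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order="F")

def cp_als(X, r, n_iter=50):
    """ALS sweeps for a rank-r CP model of a 3-way tensor X.

    Each update fixes two factors and solves a least-squares problem for
    the third. In practice one also checks a fit tolerance, as described
    in the termination criteria above.
    """
    rng = np.random.default_rng(0)
    A, B, C = (rng.standard_normal((n, r)) for n in X.shape)
    for _ in range(n_iter):
        A = np.linalg.lstsq(khatri_rao(C, B), unfold(X, 0).T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(C, A), unfold(X, 1).T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(B, A), unfold(X, 2).T, rcond=None)[0].T
    return A, B, C

# Example: recover a random rank-3 tensor.
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((10, 3)) for _ in range(3))
X = np.einsum("ip,jp,kp->ijk", A0, B0, C0)
A, B, C = cp_als(X, r=3, n_iter=200)
X_hat = np.einsum("ip,jp,kp->ijk", A, B, C)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))  # typically a small residual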
Additionally, for non-Tucker decompositions, there may be a problem with the
stability of ranks. For example, small perturbations can considerably decrease the
canonical rank [DSL08]. Thus, in general CP decompositions are not an optimal choice.
All this brings us back to the Tucker decomposition. The method for
computing the HOSVD presented in Sect. 9.1.3 and its adaptive version from
Sect. 9.1.4 reduce the HOSVD to singular value decompositions of matrices.
For all its cost, the matrix SVD has been studied comprehensively and is numer-
ically well controlled. The presented procedure may therefore be considered robust.
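As an illustration, here is a minimal sketch of a (non-adaptive) truncated HOSVD along these lines, assuming NumPy; hosvd and mode_multiply are our illustrative names, and unfold is the same helper as in the ALS sketch above. One matrix SVD per mode-$k$ matricization yields the factor $U_k$, and projecting onto the factors gives the core tensor.

```python
import numpy as np

def unfold(X, mode):
    """Mode-k matricization (as in the ALS sketch above)."""
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order="F")

def mode_multiply(X, M, mode):
    """Contract the matrix M against the tensor X along the given mode."""
    shape = list(X.shape)
    shape[mode] = M.shape[0]
    Y = M @ unfold(X, mode)
    Y = np.reshape(Y, [shape[mode]] + shape[:mode] + shape[mode + 1:], order="F")
    return np.moveaxis(Y, 0, mode)

def hosvd(X, ranks):
    """Truncated HOSVD: one matrix SVD per mode, then project for the core."""
    Us = []
    for k, t in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, k), full_matrices=False)
        Us.append(U[:, :t])                # t_k dominant left singular vectors
    core = X
    for k, U in enumerate(Us):
        core = mode_multiply(core, U.T, k) # core = X x_1 U1^T ... x_d Ud^T
    return core, Us

# Usage: compress a 20 x 20 x 20 tensor to Tucker rank (5, 5, 5) and rebuild.
X = np.random.default_rng(0).standard_normal((20, 20, 20))
core, Us = hosvd(X, (5, 5, 5))
X_hat = core
for k, U in enumerate(Us):
    X_hat = mode_multiply(X_hat, U, k)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))  # error of the rank-(5,5,5) approximation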
First, let us estimate the complexity of the Tucker decomposition (9.1) in more
detail. For simplicity, here and in the following, we take all components of the
Tucker rank to be equal, i.e., $t_k = t$ for $k = 1, \ldots, d$, and from now on
refer to $t$ as the Tucker rank.
Similarly, for complexity estimates we assume that all mode sizes are equal:
$n_k = n$ for $k = 1, \ldots, d$. The number of parameters of (9.1) is then
$O(dnt + t^d)$. This is
acceptable for moderate dimensions, like 3 or 4, provided the rank $t$ is not too high. So
if we have an efficient algorithm that can handle large mode sizes (unlike the
truncated SVD applied to all k -mode matricizations), it can be applied to large
problems. In fact, such algorithms are available, e.g., the cross-approximation
[OST08].
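To put the parameter count $O(dnt + t^d)$ in perspective: for $d = 3$, $n = 1000$, and $t = 20$, the Tucker format requires about $3 \cdot 1000 \cdot 20 + 20^3 = 68{,}000$ parameters, whereas the full tensor has $n^d = 10^9$ entries.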
Nevertheless, for larger dimensions the classical Tucker decomposition (9.1) is
definitely not suitable because of the $O(t^d)$ complexity of the core tensor. This
problem is solved by the introduction of hierarchical SVD-based decompositions
that will be discussed in the next section. Here, once again, the power of hierarchi-
cal approaches, which we already addressed in Chap. 6, shows up.
9.3 Hierarchical Tensor Factorization
9.3.1 Hierarchical Singular Value Decomposition
The H-SVD, where "H" stands for "hierarchical," is a topic of current
research [Gra10, HK09]. The main idea behind hierarchical Tucker decompositions
is simple: to reduce the complexity of the core tensor C, we split the set of
dimension indices hierarchically, which leads to a dyadic decomposition.
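As a first glimpse of what such a split looks like, here is a tiny sketch, assuming Python; dimension_tree is a hypothetical helper, not part of any library. It recursively bisects the mode set $\{1, \ldots, d\}$, producing the balanced binary tree of index clusters on which hierarchical Tucker formats are built.

```python
def dimension_tree(modes):
    """Recursively bisect a tuple of mode indices into a balanced binary tree.

    Each node (a tuple of modes) corresponds to one matricization of the
    tensor: its rows are indexed by the modes in the node, its columns by
    the remaining modes.
    """
    if len(modes) == 1:
        return modes                      # leaf: a single mode
    mid = len(modes) // 2
    return (dimension_tree(modes[:mid]),  # left cluster
            dimension_tree(modes[mid:]))  # right cluster

# For d = 4 the dyadic split of {1, 2, 3, 4} is
# (((1,), (2,)), ((3,), (4,))), i.e. {1,2,3,4} -> {1,2} u {3,4} -> singletons.
print(dimension_tree((1, 2, 3, 4)))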