where $\times_n$ is the tensor product of multiplying a matrix on mode $n$. Each low-rank matrix ($\mathbf{U} \in \mathbb{R}^{|U| \times r_U}$, $\mathbf{I} \in \mathbb{R}^{|I| \times r_I}$, $\mathbf{T} \in \mathbb{R}^{|T| \times r_T}$) corresponds to one factor. The core tensor $\mathcal{C} \in \mathbb{R}^{r_U \times r_I \times r_T}$ contains the interactions between the different factors. The ranks of the decomposed factors are denoted by $r_U$, $r_I$, $r_T$, and Eq. (2.2) is called the rank-$(r_U, r_I, r_T)$ Tucker decomposition. An intuitive interpretation of Eq. (2.2) is that the tagging data depends not only on how similar an image's visual features and a tag's semantics are, but also on how much these features/semantics match with the users' preferences.
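As a minimal sketch of this model, the NumPy snippet below builds a rank-$(r_U, r_I, r_T)$ Tucker reconstruction, assuming Eq. (2.2) has the usual Tucker form $\hat{\mathcal{Y}} = \mathcal{C} \times_u \mathbf{U} \times_i \mathbf{I} \times_t \mathbf{T}$; the shapes are toy values chosen for illustration, and the code is not the chapter's implementation.

```python
# Sketch of a rank-(r_U, r_I, r_T) Tucker model (illustrative shapes only).
import numpy as np

n_users, n_images, n_tags = 50, 200, 100   # |U|, |I|, |T| (toy values)
r_U, r_I, r_T = 8, 12, 10                  # ranks of the decomposed factors

rng = np.random.default_rng(0)
U = rng.standard_normal((n_users, r_U))    # user factor,  U in R^{|U| x r_U}
I = rng.standard_normal((n_images, r_I))   # image factor, I in R^{|I| x r_I}
T = rng.standard_normal((n_tags, r_T))     # tag factor,   T in R^{|T| x r_T}
C = rng.standard_normal((r_U, r_I, r_T))   # core tensor,  C in R^{r_U x r_I x r_T}

# Mode-n products C x_u U x_i I x_t T reconstruct the tagging tensor: each entry
# Y_hat[u, i, t] is a trilinear combination of one user row, one image row and
# one tag row, weighted by the core tensor.
Y_hat = np.einsum('abc,ua,ib,tc->uit', C, U, I, T)
print(Y_hat.shape)   # (50, 200, 100) == (|U|, |I|, |T|)
```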
Typically, the latent factors $\mathbf{U}$, $\mathbf{I}$, $\mathbf{T}$ can be inferred by directly approximating $\mathcal{Y}$, and the tensor factorization problem is reduced to minimizing a point-wise loss on $\mathcal{Y}$:
$$\min_{\mathbf{U},\mathbf{I},\mathbf{T},\mathcal{C}} \; \sum_{(u,i,t)\,\in\,|U|\times|I|\times|T|} \bigl(\hat{y}_{u,i,t} - y_{u,i,t}\bigr)^2 \qquad (2.3)$$
where $\hat{y}_{u,i,t} = \mathcal{C} \times_u \tilde{\mathbf{u}}_u \times_i \tilde{\mathbf{i}}_i \times_t \tilde{\mathbf{t}}_t$, with $\tilde{\mathbf{u}}_u$, $\tilde{\mathbf{i}}_i$ and $\tilde{\mathbf{t}}_t$ the rows of $\mathbf{U}$, $\mathbf{I}$, $\mathbf{T}$ corresponding to user $u$, image $i$ and tag $t$. As this optimization scheme tries to fit to the numerical values of 1 and 0, we refer to it as the 0/1 scheme. To alleviate the sparsity problem and better utilize the tagging data, in this chapter we propose RMTF for factor inference, which is detailed in Sect. 2.3.1.
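To make the 0/1 scheme concrete, the following sketch (again illustrative, not the chapter's RMTF implementation) computes the per-entry prediction $\hat{y}_{u,i,t}$ and the point-wise squared loss of Eq. (2.3) over all triples, using toy shapes; variable names mirror the notation above.

```python
# Point-wise 0/1 squared loss of Eq. (2.3) on a toy binary tagging tensor Y.
import numpy as np

def predict_entry(C, U, I, T, u, i, t):
    """y_hat_{u,i,t}: core tensor contracted with the rows of U, I, T for (u, i, t)."""
    return np.einsum('abc,a,b,c->', C, U[u], I[i], T[t])

def pointwise_01_loss(C, U, I, T, Y):
    """Sum of squared errors over ALL (u, i, t) triples, fitting the 1s and 0s alike."""
    Y_hat = np.einsum('abc,ua,ib,tc->uit', C, U, I, T)
    return np.sum((Y_hat - Y) ** 2)

# Toy usage: a sparse binary tensor Y with a handful of observed (u, i, t) tags.
rng = np.random.default_rng(1)
U = rng.standard_normal((5, 3))   # 5 users,  rank 3
I = rng.standard_normal((7, 4))   # 7 images, rank 4
T = rng.standard_normal((6, 2))   # 6 tags,   rank 2
C = rng.standard_normal((3, 4, 2))
Y = np.zeros((5, 7, 6))
Y[0, 1, 2] = Y[3, 4, 5] = 1.0     # two observed tagging triples
print(pointwise_01_loss(C, U, I, T, Y))
```

Because the loss runs over the full $|U| \times |I| \times |T|$ grid, the overwhelmingly many zero entries dominate the objective, which is exactly the sparsity issue the 0/1 scheme suffers from.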
Tag Refinement. From the perspective of subspace learning, the derived factor matrices $\mathbf{U}$, $\mathbf{I}$, $\mathbf{T}$ can be viewed as the feature representations on the latent user, image and tag subspaces, respectively. Each row of the factor matrices corresponds to one object (user, image or tag). The core tensor $\mathcal{C}$ defines a multilinear operation and captures the interactions among the different subspaces. Therefore, multiplying a factor matrix with the core tensor amounts to a change of basis. We define
$$\mathbf{T}^{UI} := \mathcal{C} \times_t \mathbf{T} \qquad (2.4)$$
then $\mathbf{T}^{UI} \in \mathbb{R}^{r_U \times r_I \times |T|}$ can be explained as the tags' feature representations on the user $\times$ image subspace. Each $r_U \times r_I$ matrix slice corresponds to one tag's feature representation. By summing $\mathbf{T}^{UI}$ over the user dimensions, we can obtain the tags' representations on the image subspace. Therefore, the cross-space image-tag association matrix $\mathbf{X}^{IT} \in \mathbb{R}^{|I| \times |T|}$ can be calculated as³:
$$\mathbf{X}^{IT} = \mathbf{I} \cdot \bigl(\mathbf{T}^{UI} \times_u \mathbf{1}_{r_U}\bigr) \qquad (2.5)$$
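The refinement step of Eqs. (2.4)-(2.5) can be sketched as follows, continuing the toy shapes used above; the names `T_UI` and `X_IT` simply mirror the symbols in the equations, and the snippet is an illustration under those assumptions rather than the authors' code.

```python
# Eqs. (2.4)-(2.5): tag-image association scores from the learnt factors.
import numpy as np

rng = np.random.default_rng(2)
r_U, r_I, r_T = 3, 4, 2
n_images, n_tags, K = 7, 6, 3
C = rng.standard_normal((r_U, r_I, r_T))   # learnt core tensor
I = rng.standard_normal((n_images, r_I))   # learnt image factor
T = rng.standard_normal((n_tags, r_T))     # learnt tag factor

# Eq. (2.4): T_UI = C x_t T, the tags' representations on the user x image
# subspace; one r_U x r_I slice per tag.
T_UI = np.einsum('abc,tc->abt', C, T)              # shape (r_U, r_I, |T|)

# Eq. (2.5): collapse the user mode with the all-ones vector 1_{r_U}, then map
# the resulting r_I x |T| matrix into the image space with I.
ones_rU = np.ones(r_U)
X_IT = I @ np.einsum('abt,a->bt', T_UI, ones_rU)   # shape (|I|, |T|)

# For each image, keep the K tags with the highest association scores.
top_K_tags = np.argsort(-X_IT, axis=1)[:, :K]
print(top_K_tags.shape)   # (|I|, K): K refined tag indices per image
```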
The tags with the $K$ highest associations to image $i$ are reserved as the final annotations:
³ In practice, for new images not in the training dataset, we can approximate their positions in the learnt image subspace by using approximated eigenfunctions based on the kernel trick [2].