The affinity graphs are used as regularization terms to impose smoothness constraints on the latent factors. All affinity graphs are normalized. Taking the image affinity graph W^I as an example, the regularization term is:
$$\sum_{m=1}^{|I|} \sum_{n=1}^{|I|} W^{I}_{m,n} \, \left\| \mathbf{i}_m - \mathbf{i}_n \right\|_2^2 \qquad (2.20)$$
where $\| \cdot \|_2$ denotes the Frobenius norm. The basic idea is to make the latent representations of two images as close as possible if there is a strong affinity between them. We can achieve this by minimizing $\operatorname{tr}(I^{\top} L^{I} I)$, where $\operatorname{tr}(\cdot)$ denotes the trace of a matrix and $L^{I}$ is the Laplacian matrix of the image affinity matrix $W^{I}$. Similar regularization terms can be added for the user and tag factors. In this way, the extracted data characteristics are kept consistent with such prior knowledge, which alleviates the sparsity problem and provides control over the outcomes.
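For concreteness, the following is a minimal NumPy sketch (toy data, unnormalized Laplacian $L^I = D - W^I$ assumed for simplicity, rather than the normalized graphs used in the chapter) verifying that the pairwise penalty of Eq. (2.20) equals $2\operatorname{tr}(I^{\top} L^{I} I)$, so minimizing the trace term enforces the same smoothness:

```python
import numpy as np

rng = np.random.default_rng(0)
num_images, k = 5, 3                         # toy sizes, not from the chapter
I_fac = rng.normal(size=(num_images, k))     # latent image factors, one row per image

# Symmetric non-negative affinity matrix W^I (toy values).
W = rng.random((num_images, num_images))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)

# Unnormalized graph Laplacian L^I = D - W^I, with D the diagonal degree matrix.
D = np.diag(W.sum(axis=1))
L = D - W

# Pairwise smoothness penalty of Eq. (2.20): sum_{m,n} W^I_{m,n} ||i_m - i_n||^2.
pairwise = sum(W[m, n] * np.sum((I_fac[m] - I_fac[n]) ** 2)
               for m in range(num_images) for n in range(num_images))

# Equivalent trace form: the pairwise sum equals 2 * tr(I^T L^I I).
trace_form = 2.0 * np.trace(I_fac.T @ L @ I_fac)

assert np.isclose(pairwise, trace_form)
```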
Combining with Eq. (2.13), we obtain the overall objective function:
$$\min_{U, I, T, C} \; g = f + \beta \left( \|U\|_2^2 + \|I\|_2^2 + \|T\|_2^2 \right) + \alpha \left( \operatorname{tr}(U^{\top} L^{U} U) + \operatorname{tr}(I^{\top} L^{I} I) + \operatorname{tr}(T^{\top} L^{T} T) \right) \qquad (2.21)$$

where $\|U\|_2^2 + \|I\|_2^2 + \|T\|_2^2$ is the $\ell_2$ regularization term that penalizes large parameters, and $\alpha$ and $\beta$ are weights controlling the strength of the corresponding constraints.
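As an illustration, a small Python sketch of how Eq. (2.21) could be evaluated; the fitting loss f comes from Eq. (2.13) and is not reproduced here, so it is passed in as a precomputed value (an assumption of this sketch, not the chapter's code):

```python
import numpy as np

def objective_g(U, I_fac, T, L_U, L_I, L_T, f_value, alpha, beta):
    """Evaluate the objective of Eq. (2.21).

    f_value is assumed to be the fitting loss f of Eq. (2.13), computed elsewhere.
    L_U, L_I, L_T are the Laplacians of the user, image, and tag affinity graphs.
    """
    # Graph smoothness terms, weighted by alpha.
    graph_reg = (np.trace(U.T @ L_U @ U)
                 + np.trace(I_fac.T @ L_I @ I_fac)
                 + np.trace(T.T @ L_T @ T))
    # Squared Frobenius norms of the factors, weighted by beta.
    norm_reg = (np.linalg.norm(U) ** 2
                + np.linalg.norm(I_fac) ** 2
                + np.linalg.norm(T) ** 2)
    return f_value + beta * norm_reg + alpha * graph_reg
```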
2.3.3 Optimization and Parameter Learning Algorithms
Next, we present an algorithm to solve the optimization problem. Directly optimizing Eq. (2.21) is infeasible, so we use an iterative optimization algorithm. To begin, we provide the following theorem:
Theorem 1 g is strictly convex w.r.t. U, I, T, and C, respectively.

We propose an alternating learning algorithm (ALA) to learn the factors by iteratively optimizing each subproblem. According to Theorem 1, each subproblem has a unique solution. In practice, since g is convex w.r.t. I, it is also convex w.r.t. each row i_m.^7 Therefore, when optimizing I, we update one row i_m at a time with the other rows i_1, ..., i_{m-1}, i_{m+1}, ..., i_{|I|} fixed. We prove that the learning algorithm has a good convergence property.
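A minimal Python sketch of one possible realization of this alternating scheme follows. The block solvers and the row gradient depend on the concrete forms in Eqs. (2.13) and (2.21), so they are left as user-supplied callables, and a simple gradient step stands in for the chapter's exact row update; this is an illustrative assumption, not the authors' implementation:

```python
import numpy as np

def alternating_learning(factors, solve_block, row_grad_I, objective,
                         max_iters=100, tol=1e-5, step=1e-2):
    """Sketch of an alternating learning loop over the factors U, I, T, C.

    factors      -- dict with keys 'U', 'I', 'T', 'C' holding NumPy arrays
    solve_block  -- solve_block(name, factors): minimizer of g over that block
                    with the others fixed (each block problem is convex)
    row_grad_I   -- row_grad_I(m, factors): gradient of g w.r.t. row i_m
    objective    -- objective(factors): evaluates g of Eq. (2.21)
    """
    prev = objective(factors)
    for _ in range(max_iters):
        # Update U, T, and C with the remaining factors fixed.
        for name in ('U', 'T', 'C'):
            factors[name] = solve_block(name, factors)
        # Update I one row at a time, all other rows held fixed.
        I_fac = factors['I']
        for m in range(I_fac.shape[0]):
            I_fac[m] -= step * row_grad_I(m, factors)
        cur = objective(factors)
        if abs(prev - cur) < tol:   # monotone decrease suggests a local optimum
            break
        prev = cur
    return factors
```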
Theorem 2 The alternating learning algorithm converges to a local optimum.
The proof of Theorem 1 directly follows that of regularized matrix factorization [18] and is omitted here. We provide the proof of Theorem 2 in Appendix A. With the
7 The user factor U and the tag factor T are handled in the same way as the image factor I.