3.3 gPLS: A Partial Least Squares Regression Approach
Saigo et al. [34] proposed gPLS, an iterative mining method based on partial least squares regression (PLS). To apply PLS to graph data, a sparse version of PLS is developed first and then combined with a weighted pattern mining algorithm. The mining algorithm is called iteratively with different weight vectors, creating one latent component per mining call. Branch-and-bound search is integrated into graph mining with a designed gain function and a pruning condition. In this sense, gPLS is very similar to the branch-and-bound mining approach in gboost.
Partial Least Squares Regression. This part is a brief introduction to partial least squares regression (PLS). Assume there are $n$ training examples $(x_1, y_1), \ldots, (x_n, y_n)$. The output $y_i$ is assumed to be centered, $\sum_i y_i = 0$. Denote by $X$ the design matrix, where each row corresponds to $x_i$. The regression function of PLS is

$$f(x) = \sum_{i=1}^{m} \alpha_i w_i^T x,$$
where $m$ is the pre-specified number of components that form a subset of the original space, and the $w_i$ are weight vectors that reduce the dimensionality of $x$, satisfying the following orthogonality condition:

$$w_i^T X^T X w_j = \begin{cases} 1 & (i = j) \\ 0 & (i \neq j). \end{cases}$$
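Stated in code, the regression function is just a weighted sum of projections onto the weight vectors. The following is a minimal sketch, assuming the $m$ weight vectors are stored as the columns of a matrix `W` and the coefficients in a vector `alpha`; these names are illustrative, not from the original.

```python
import numpy as np

def pls_predict(x, alpha, W):
    """Evaluate f(x) = sum_{i=1}^m alpha_i * w_i^T x.

    `alpha` has shape (m,) and `W` has shape (d, m), one weight
    vector per column; both names are assumptions for illustration.
    """
    return float(alpha @ (W.T @ x))

# Tiny check against the summation written out term by term.
rng = np.random.default_rng(0)
x, alpha, W = rng.normal(size=5), rng.normal(size=3), rng.normal(size=(5, 3))
assert np.isclose(pls_predict(x, alpha, W),
                  sum(alpha[i] * W[:, i] @ x for i in range(3)))
```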
Basically, the $w_i$ are learned first in a greedy way, and then the coefficients $\alpha_i$ are obtained by least squares regression without any regularization. The solutions for $\alpha_i$ and $w_i$ are

$$\alpha_i = \sum_{k=1}^{n} y_k w_i^T x_k, \qquad (13.5)$$

and

$$w_i = \arg\max_{w} \frac{\left( \sum_{k=1}^{n} y_k w^T x_k \right)^2}{w^T w},$$

subject to $w^T X^T X w = 1$ and $w^T X^T X w_j = 0$, $j = 1, \ldots, i-1$.

Next we present an alternative derivation of PLS called non-deflation sparse PLS. Define the $i$-th latent component as $t_i = X w_i$, and let $T_{i-1}$ denote the matrix of latent components obtained so far, $T_{i-1} = (t_1, \ldots, t_{i-1})$. The residual vector is computed by
$$r_i = \left( I - T_{i-1} T_{i-1}^T \right) y.$$

Then multiply it with $X^T$ to obtain

$$v_i = \frac{1}{\eta} X^T \left( I - T_{i-1} T_{i-1}^T \right) y,$$

where $\eta$ is a normalization constant.
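To make the recursion concrete, here is a minimal dense sketch of the non-deflation iteration. It assumes $\eta = \|X^T r_i\|$ as the normalization and uses hard top-$k$ thresholding of $v_i$ as a stand-in for the weighted graph pattern mining step of gPLS; the function name, parameters, and these two choices are assumptions, not the authors' implementation.

```python
import numpy as np

def non_deflation_pls(X, y, m, k=None):
    """Sketch of the non-deflation PLS recursion (assumed reconstruction).

    The sparsification of v_i, done in gPLS by weighted pattern mining,
    is mimicked here by keeping the k largest entries of v_i in absolute
    value (k=None keeps all of them, giving a dense PLS variant).
    """
    n, d = X.shape
    y = y - y.mean()                 # outputs are assumed centered
    T = np.zeros((n, 0))             # latent components t_1, ..., t_{i-1}
    alphas = []
    for _ in range(m):
        r = y - T @ (T.T @ y)        # r_i = (I - T_{i-1} T_{i-1}^T) y
        v = X.T @ r                  # v_i = (1/eta) X^T r_i,
        v /= np.linalg.norm(v)       # with eta = ||X^T r_i|| assumed
        if k is None:
            w = v
        else:                        # top-k surrogate for the mining call
            w = np.zeros(d)
            keep = np.argsort(np.abs(v))[-k:]
            w[keep] = v[keep]
        t = X @ w                    # latent component t_i = X w_i ...
        t -= T @ (T.T @ t)           # ... kept orthogonal to t_1, ..., t_{i-1}
        t /= np.linalg.norm(t)       # ... and unit-norm (w^T X^T X w = 1)
        T = np.column_stack([T, t])
        alphas.append(float(t @ y))  # alpha_i = t_i^T y, cf. Eq. (13.5)
    return T, np.array(alphas)
```

With the components in hand, the in-sample fit is `T @ alphas`. The appeal of the non-deflation form is that each iteration only needs $X^T r_i$, the quantity the weighted pattern miner in gPLS evaluates over subgraph features, so the design matrix never has to be materialized.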