8.6 A Note on Efficient Computation of Large Elements of Low-Rank Matrices
All of the above-described factorization-based approaches to recommendation
leave us with the computational burden of determining the largest entries of each
column (or row) of a huge low-rank matrix. Interestingly enough, the authors did
not find any publications about this fundamental problem!
A naïve approach would involve computing an explicit representation of the
matrix entries, which incurs O(mn) floating point operations as well as a storage
complexity of O(mn), and then running a sorting algorithm on each column (or row),
which takes at least another m operations per column. This amounts to an overall
quadratic complexity, which renders the naïve approach unsuitable for realtime
recommendation applications, where m and n, commonly denoting the number of
products in a shop and the number of customers or sessions, respectively, are
notoriously huge.
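To make the cost concrete, the following is a minimal sketch of the naïve approach in Python with NumPy. It assumes factor matrices X (r × m) and Y (r × n) as defined later in this section; the function name and the top_k parameter are illustrative, not taken from the text.

```python
import numpy as np

def naive_top_entries(X, Y, top_k=10):
    """Naive baseline: materialize A = X^T Y explicitly (O(m*n) time and
    storage) and pick the top_k largest entries of every column."""
    A = X.T @ Y                                        # full m x n matrix
    # argpartition selects the top_k per column in O(m) time per column,
    # cheaper than a full sort, but forming A already costs O(m*n).
    return np.argpartition(A, -top_k, axis=0)[-top_k:, :]

# Toy dimensions: rank r = 16, m = 5000 products, n = 2000 sessions.
rng = np.random.default_rng(0)
X = rng.standard_normal((16, 5000))
Y = rng.standard_normal((16, 2000))
print(naive_top_entries(X, Y).shape)                   # (10, 2000)
```

Even with the sort replaced by a linear-time selection, the O(mn) cost of materializing A dominates, which is exactly the bottleneck discussed above.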
Hence, if we are to deploy factorization-based prediction in a realtime RE, we
must think of a more efficient method, or even settle for a heuristic algorithm. So
how is this to be attained?
Apparently, to achieve a subquadratic complexity, we must somehow manage to
avoid computing all of the matrix elements to begin with. In an analytical applica-
tion, where the columns are discretized versions of continuous functions with a
certain structure, the complexity issue might simply be remedied by computing
only a few entries of each column and then using an interpolation method along
with a continuous optimization procedure so as to obtain a fair estimate of the
largest entries. Since in our setting, though, there is no continuous framework
whatsoever, an interpolation-based remedy is not an option.
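For illustration only, here is a hedged Python sketch of the analytical remedy that is ruled out in our setting: a column whose entries sample a smooth function can have its largest entry estimated from a handful of probes. The function, grid, and sample count are invented for this example.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import minimize_scalar

# A column whose m entries are samples of a smooth (hypothetical) function;
# its largest entry can be estimated without evaluating all m entries.
m = 10_000
grid = np.linspace(0.0, 1.0, m)
column = np.sin(3 * grid) * np.exp(-grid)

# Probe only a few entries and interpolate between them.
probe = np.linspace(0.0, 1.0, 20)
spline = CubicSpline(probe, np.sin(3 * probe) * np.exp(-probe))

# Continuous optimization on the interpolant locates the approximate maximum.
res = minimize_scalar(lambda t: -spline(t), bounds=(0.0, 1.0), method="bounded")
print(res.x, column.max())      # estimated argmax vs. true largest entry
```

The point is that 20 evaluations plus a cheap optimization replace 10,000 evaluations; without continuity, no such shortcut exists for our matrix columns.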
Let us consider a low-rank matrix A := X^T Y, where X and Y are matrices of
dimensionality r × m and r × n, respectively. Each of the entries a_ij of A is given
by the inner product x_i^T y_j of the corresponding columns of X and Y. Let us consider
one single column

a := X^T y = { x_i^T y }_{i = 1, ..., m}

of A, where y denotes the corresponding column of Y and indices have been omitted
for the sake of simplicity. Inspired by the interpolation approach, we would like to
estimate the largest entries of a without computing all of the inner products. Since
there are no structural clues like continuity at hand, we are left to our own devices to
discover some exploitable structure. Yet for the whole thing to pay off, the cost of
the discovery process must not exceed its benefits.
In the following, we shall describe a tentative remedy that is based on recur-
sively clustering the columns of X. The so-called k-means clustering, sometimes
referred to as vector quantization in signal processing and related communities,
splits a set of m vectors into k disjoint clusters such that each vector is assigned
to the cluster with the nearest centroid, minimizing the sum of squared distances
between the vectors and their respective cluster centroids.
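Before turning to the recursive scheme, here is a rough, single-level sketch of the underlying idea in Python: cluster the columns of X, score each cluster centroid against y, and compute inner products only for columns in the most promising clusters. The parameters k, n_probe, and top_k, as well as the single-level structure, are illustrative assumptions, not the method described in the text.

```python
import numpy as np
from sklearn.cluster import KMeans

def clustered_top_entries(X, y, k=32, n_probe=4, top_k=10):
    """Heuristic sketch: use k-means on the columns of X (shape r x m) to
    prune the search for the largest entries of a = X^T y. Entries in
    unprobed clusters may be missed; this is a heuristic, not exact."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X.T)
    centroid_scores = km.cluster_centers_ @ y          # one inner product per cluster
    best_clusters = np.argsort(centroid_scores)[-n_probe:]
    candidates = np.flatnonzero(np.isin(km.labels_, best_clusters))
    scores = X[:, candidates].T @ y                    # inner products for candidates only
    return candidates[np.argsort(scores)[-top_k:][::-1]]
```

With k ≪ m clusters and only n_probe of them examined, the per-column cost drops from m inner products to roughly k + n_probe · m / k, at the price of possibly missing large entries whose columns fall into unprobed clusters.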