✦ Dimensionality Reduction by PCA: By representing the matrix of points by a small number of its eigenvectors, we can approximate the data in a way that minimizes the root-mean-square error for the given number of columns in the representing matrix.
✦ Singular-Value Decomposition: The singular-value decomposition of a matrix consists of three matrices, U, Σ, and V. The matrices U and V are column-orthonormal, meaning that as vectors, the columns are orthogonal and their lengths are 1. The matrix Σ is a diagonal matrix, and the values along its diagonal are called singular values. The product of U, Σ, and the transpose of V equals the original matrix.
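These defining properties can be checked directly in NumPy (assumed here; the example matrix is hypothetical):

```python
import numpy as np

# A small example matrix (hypothetical values).
M = np.array([[1.0, 1.0, 0.0],
              [3.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])

# full_matrices=False gives the "thin" SVD: M = U @ diag(s) @ VT.
U, s, VT = np.linalg.svd(M, full_matrices=False)

# U and V are column-orthonormal: U^T U = I and V^T V = I.
assert np.allclose(U.T @ U, np.eye(U.shape[1]))
assert np.allclose(VT @ VT.T, np.eye(VT.shape[0]))

# The product U Sigma V^T recovers the original matrix.
reconstructed = U @ np.diag(s) @ VT
```

Note that `np.linalg.svd` returns Σ as a vector of singular values (in decreasing order) and V already transposed.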
✦ Concepts: SVD is useful when there are a small number of concepts that connect the rows and columns of the original matrix. For example, if the original matrix represents the ratings given by movie viewers (rows) to movies (columns), the concepts might be the genres of the movies. The matrix U connects rows to concepts, Σ represents the strengths of the concepts, and V connects the concepts to columns.
✦ Queries Using the Singular-Value Decomposition: We can use the decomposition to relate new or hypothetical rows of the original matrix to the concepts represented by the decomposition. Multiply a row by the matrix V of the decomposition to get a vector indicating the extent to which that row matches each of the concepts.
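A sketch of such a query, continuing the movie-ratings example (NumPy assumed; the ratings are hypothetical). A new viewer's rating row, multiplied by V, yields one coordinate per concept:

```python
import numpy as np

# Hypothetical ratings matrix: rows are viewers, columns are movies
# (three science-fiction titles, then two romances).
M = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
              [3.0, 3.0, 3.0, 0.0, 0.0],
              [4.0, 4.0, 4.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 2.0, 2.0],
              [0.0, 0.0, 0.0, 1.0, 1.0]])

U, s, VT = np.linalg.svd(M, full_matrices=False)
V = VT.T[:, :2]   # keep the two strong concepts (sci-fi and romance)

# A new viewer who rated only the first science-fiction movie.
q = np.array([4.0, 0.0, 0.0, 0.0, 0.0])

# q V maps the row into concept space: one coordinate per concept.
# (Singular vectors are determined only up to sign, so compare magnitudes.)
concepts = q @ V
```

For this viewer the science-fiction coordinate is large in magnitude while the romance coordinate is zero, so the query correctly associates the viewer with the first concept.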
✦ Using SVD for Dimensionality Reduction: In a complete SVD for a matrix, U and V are typically as large as the original. To use fewer columns for U and V, delete the columns corresponding to the smallest singular values from U, V, and Σ. This choice minimizes the error in reconstructing the original matrix from the modified U, Σ, and V.
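A minimal sketch of this truncation in NumPy (the input matrix is arbitrary random data for illustration). The Frobenius-norm error of the rank-k approximation is exactly the root of the sum of squares of the discarded singular values, and no rank-k matrix does better:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))

U, s, VT = np.linalg.svd(M, full_matrices=False)

# Keep only the k columns corresponding to the largest singular values.
k = 2
Uk, sk, VTk = U[:, :k], s[:k], VT[:k, :]
Mk = Uk @ np.diag(sk) @ VTk

# The reconstruction error equals the root of the sum of squares of the
# singular values that were dropped.
err = np.linalg.norm(M - Mk, 'fro')
```

Because the singular values come sorted in decreasing order, "delete the columns for the smallest singular values" is just slicing off the tail.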
✦ Decomposing Sparse Matrices: Even in the common case where the given matrix is sparse, the matrices constructed by SVD are dense. The CUR decomposition seeks to decompose a sparse matrix into sparse, smaller matrices whose product approximates the original matrix.
✦ CUR Decomposition: This method chooses from a given sparse matrix a set of columns C and a set of rows R, which play the roles of U and the transpose of V in SVD; the user can pick any number of rows and columns. The choice of rows and columns is made randomly, with a distribution that depends on the Frobenius norm, or the square root of the sum of the squares of the elements. Between C and R is a square matrix called U that is constructed by taking the pseudoinverse of the intersection of the chosen rows and columns.
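The sampling step can be sketched as follows, a simplified illustration rather than a full CUR implementation (NumPy assumed; the helper name `cur_decompose` and the scaling details are one common variant):

```python
import numpy as np

def cur_decompose(M, r, rng):
    """Sketch of CUR: sample r columns and r rows with probability
    proportional to their squared Frobenius norm, scale them, and build
    the middle matrix from the pseudoinverse of the intersection."""
    fro2 = np.sum(M ** 2)
    col_p = np.sum(M ** 2, axis=0) / fro2   # column-selection distribution
    row_p = np.sum(M ** 2, axis=1) / fro2   # row-selection distribution

    cols = rng.choice(M.shape[1], size=r, replace=True, p=col_p)
    rows = rng.choice(M.shape[0], size=r, replace=True, p=row_p)

    # Scale the chosen columns and rows, as is standard for CUR sampling.
    C = M[:, cols] / np.sqrt(r * col_p[cols])
    R = M[rows, :] / np.sqrt(r * row_p[rows])[:, None]

    # W is the intersection of the chosen rows and columns; its
    # Moore-Penrose pseudoinverse supplies the middle matrix U.
    W = M[np.ix_(rows, cols)]
    U = np.linalg.pinv(W)
    return C, U, R

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 5))
C, U, R = cur_decompose(M, 3, rng)
approx = C @ U @ R
```

Because the rows and columns are drawn with replacement from the Frobenius-norm distribution, frequently large rows and columns are likely to be represented, and C and R inherit the sparsity of M.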
11.6 References for Chapter 11
Principal-component analysis was first discussed over a century ago, in [6].
SVD is from [3]; [4] is a standard reference on matrix computations. There have been many applications of this idea. Two worth mentioning are latent semantic indexing [1] and the CUR decomposition [2]; [5] extends CUR-style decompositions to tensors.
[1] S. Deerwester, S.T. Dumais, G.W. Furnas, T.K. Landauer, and R. Harshman, "Indexing by latent semantic analysis," J. American Society for Information Science 41:6 (1990).
[2] P. Drineas, R. Kannan, and M.W. Mahoney, "Fast Monte Carlo algorithms for matrices III: Computing a compressed approximate matrix decomposition," SIAM J. Computing 36:1 (2006), pp. 184-206.
[3] G.H. Golub and W. Kahan, "Calculating the singular values and pseudo-inverse of a matrix," J. SIAM Series B 2:2 (1965), pp. 205-224.
[4] G.H. Golub and C.F. Van Loan, Matrix Computations, JHU Press, 1996.
[5] M.W. Mahoney, M. Maggioni, and P. Drineas, "Tensor-CUR decompositions for tensor-based data," SIGKDD (2006), pp. 327-336.
[6] K. Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine 2:11 (1901), pp. 559-572.