The relationship between PCA and SVD
We mentioned earlier that there is a close relationship between PCA and SVD. In fact, we
can recover the same principal components and also apply the same projection into the
space of principal components using SVD.
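The connection can be sketched mathematically. For a mean-centered data matrix X with n rows (observations), the SVD factors X into left singular vectors, singular values, and right singular vectors:

```latex
X = U \Sigma V^{\top}
\quad\Rightarrow\quad
\frac{1}{n-1}\, X^{\top} X \;=\; V \left( \frac{\Sigma^{2}}{n-1} \right) V^{\top}
```

Since PCA diagonalizes the covariance matrix on the right-hand side, the columns of V are exactly the principal components, and the projection of the data onto them is X V = U Σ.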
In our example, the right singular vectors derived from computing the SVD will be equivalent to the principal components we have calculated. We can see that this is the case by
first computing the SVD on our image matrix and comparing the right singular vectors to
the result of PCA. As was the case with PCA, SVD computation is provided as a function
on a distributed RowMatrix :
val svd = matrix.computeSVD(10, computeU = true)
println(s"U dimension: (${svd.U.numRows}, ${svd.U.numCols})")
println(s"S dimension: (${svd.s.size}, )")
println(s"V dimension: (${svd.V.numRows}, ${svd.V.numCols})")
We can see that SVD returns a matrix U of dimension 1055 x 10, a vector S of the singular values of length 10, and a matrix V of the right singular vectors of dimension 2500 x 10:
U dimension: (1055, 10)
S dimension: (10, )
V dimension: (2500, 10)
The matrix V is exactly equivalent to the result of PCA (ignoring the sign of the values and
floating point tolerance). We can verify this with a utility function to compare the two by
approximately comparing the data arrays of each matrix:
def approxEqual(array1: Array[Double], array2: Array[Double],
    tolerance: Double = 1e-6): Boolean = {
  // note we ignore the sign of the principal component / singular vector elements
  val bools = array1.zip(array2).map { case (v1, v2) =>
    math.abs(math.abs(v1) - math.abs(v2)) <= tolerance
  }
  bools.fold(true)(_ & _)
}
We will test the function on some test data:
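As a minimal sketch of such a test (the function is repeated here so the snippet is self-contained, and the test arrays are illustrative rather than taken from the image data):

```scala
// Repeated here so the snippet stands alone.
def approxEqual(array1: Array[Double], array2: Array[Double],
    tolerance: Double = 1e-6): Boolean = {
  // we ignore the sign of the elements being compared
  val bools = array1.zip(array2).map { case (v1, v2) =>
    math.abs(math.abs(v1) - math.abs(v2)) <= tolerance
  }
  bools.fold(true)(_ & _)
}

val a = Array(1.0, 2.0, 3.0)
println(approxEqual(a, a))                       // identical arrays match
println(approxEqual(a, a.map(-_)))               // sign differences are ignored
println(approxEqual(a, Array(1.0, 2.0, 3.001)))  // 0.001 exceeds the tolerance
```

The first two calls return true, while the third returns false because the element-wise difference exceeds the default tolerance of 1e-6.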