Dimensionality Reduction - Mining of Massive Datasets

Database Reference

In-Depth Information

measure of importance we must use is the square of the Frobenius norm, that is, the sum

of the squares of the elements of the row or column. Let the square of the Frobenius

norm of M . Then each time we select a row, the probability p i with which we select row i is

Each time we select a column, the probability q j with which we select column j is

EXAMPLE 11.12 Let us reconsider the matrix M from Fig. 11.6 , which we repeat here as

Fig. 11.12 . The sum of the squares of the elements of M is 243. The three columns for the

science-fiction movies The Matrix , Alien , and Star Wars each have a squared Frobenius

norm of 1 2 + 3 2 + 4 2 + 5 2 = 51, so their probabilities are each 51/243 = .210. The remaining

two columns each have a squared Frobenius norm of 4 2 + 5 2 + 2 2 = 45, and therefore their

probabilities are each 45/243 = .185.

Figure 11.12 The matrix M , repeated from Fig. 11.6

The seven rows of M have squared Frobenius norms of 3, 27, 48, 75, 32, 50, and 8, re-

spectively. Thus, their respective probabilities are .012, .111, .198, .309, .132, .206, and

.033.

□

Now let us select r columns for the matrix C . For each column, we choose randomly

from the columns of M . However, the selection is not with uniform probability; rather, the

j th column is selected with probability q j . Recall that probability is the sum of the squares

of the elements in that column divided by the sum of the squares of all the elements of the

matrix. Each column of C is chosen independently from the columns of M , so there is some

chance that a column will be selected more than once. We shall discuss how to deal with

this situation after explaining the basics of CUR-decomposition.

Having selected each of the columns of M , we scale each column by dividing its ele-

ments by the square root of the expected number of times this column would be picked.

That is, we divide the elements of the j th column of M , if it is selected, by The scaled

column of M becomes a column of C .

Rows of M are selected for R in the analogous way. For each row of R we select from

the rows of M , choosing row i with probability p i . Recall p i is the sum of the squares of the

elements of the i th row divided by the sum of the squares of all the elements of M . We then

scale each chosen row by dividing by if it is the i th row of M that was chosen.

Search WWH ::

Custom Search

Home