Evaluating dimensionality reduction models
Both PCA and SVD are deterministic models. That is, given a certain input dataset, they
will always produce the same result. This is in contrast to many of the models we have seen
so far, which depend on some random element (most often the random initialization of the
model weight vectors).
Both models are also guaranteed to return the top k principal components or singular values,
and hence, the only parameter is k. As with clustering models, increasing k improves, or at
least never worsens, the measured model performance (for clustering, the relevant error
function; for PCA and SVD, the total amount of variability explained by the k components).
Therefore, selecting a value for k is a trade-off between capturing as much structure of the
data as possible and keeping the dimensionality of the projected data low.
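
The trade-off described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in the text: it assumes the data has been centered, so that the variance explained by each component is proportional to its squared singular value, and it assumes a non-empty list of singular values with at least one nonzero entry. The singular values, the 0.95 threshold, and the helper names explained_variance_ratio and smallest_k are all hypothetical.

```python
from itertools import accumulate


def explained_variance_ratio(singular_values):
    """Cumulative fraction of total variance captured by the top k
    components, for k = 1..n (assumes centered data, so variance is
    proportional to the squared singular values)."""
    squared = [s * s for s in singular_values]
    total = sum(squared)
    return [c / total for c in accumulate(squared)]


def smallest_k(singular_values, threshold=0.95):
    """Smallest k whose top k components explain at least `threshold`
    of the total variance."""
    ratios = explained_variance_ratio(singular_values)
    for k, ratio in enumerate(ratios, start=1):
        if ratio >= threshold:
            return k
    return len(singular_values)


# Hypothetical singular values, sorted in descending order.
sigma = [10.0, 5.0, 2.0, 1.0]

cumulative = explained_variance_ratio(sigma)  # non-decreasing in k
k = smallest_k(sigma, threshold=0.95)

for i, ratio in enumerate(cumulative, start=1):
    print(f"k={i}: {ratio:.3f} of variance explained")
print(f"smallest k reaching the threshold: {k}")
```

Because the cumulative ratio never decreases as k grows, the explained variance alone cannot pick k; a threshold such as the one assumed here (or a visual inspection of where the curve flattens) is one common way to stop adding components.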