Extracting the right features from your data
Like most of the machine learning models we have encountered so far, K-means clustering
requires numerical vectors as input. The same feature extraction and transformation
approaches that we have seen for classification and regression are applicable for clustering.
Because K-means, like least squares regression, uses a squared error function as its
optimization objective, it tends to be sensitive to outliers and to features with large
variance, which can dominate the distance computation.
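To make the effect of a large-variance feature concrete, here is a minimal sketch in plain Python. The three customers and their (age, income) values are made up purely for illustration, and squared_distance is a hypothetical helper, not part of any library. It computes the squared Euclidean distance, which is the per-point term that the K-means objective sums.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical customers: (age in years, annual income in dollars).
alice = (25.0, 50_000.0)
bob = (60.0, 50_500.0)    # very different age, similar income
carol = (26.0, 53_000.0)  # similar age, moderately different income

d_ab = squared_distance(alice, bob)    # 35**2 + 500**2
d_ac = squared_distance(alice, carol)  # 1**2 + 3000**2

# Income differences swamp age differences, so alice ends up "closer"
# to bob than to carol despite the 35-year age gap.
print(d_ab, d_ac, d_ab < d_ac)
```

On raw values like these, the income column effectively decides the distance on its own, simply because it is measured on a much larger scale.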
As in the regression and classification cases, input data can be normalized and standardized
to overcome this, which might improve the quality of the resulting clusters. In some cases,
however, it might be desirable not to standardize the data if, for example, the objective is
to find segmentations according to certain specific features.
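One common way to standardize is to rescale each feature to zero mean and unit standard deviation before clustering. The sketch below assumes this z-score approach and uses only the Python standard library. The standardize function and the sample rows are illustrative assumptions, not code from the text.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; divide by 1.0 to avoid a division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        tuple((x - m) / s for x, m, s in zip(row, means, stds))
        for row in rows
    ]

# Hypothetical (age in years, annual income in dollars) rows.
rows = [(25.0, 50_000.0), (60.0, 50_500.0), (26.0, 53_000.0)]
scaled = standardize(rows)

for row in scaled:
    print(row)
```

After this transformation, every feature contributes on a comparable scale to the squared distance. If the goal is to segment mainly by one particular feature, leaving that feature unscaled (or skipping this step entirely) keeps its influence on the clusters.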