Database Reference
6 {1,1}:{26,54} 1
7 {1,1}:{29,63} 1
8 {1,1}:{25,101.07} 1
9 {1,1}:{32,41.05} 1
10 {1,1}:{32,0} 6
The output consists of the km_coord table. This table contains the coordinates
for each point ID (pid), the customer_id, and the assigned cluster ID (cid).
The coordinates (coords) are stored as sparse vectors. Sparse vectors are useful
when values in an array are repeated many times. The first group of braces gives
the run lengths, and the second gives the value repeated in each run. For example,
{1,200,3}:{1,0,1} represents the following vector containing 204 elements,
{1,0,0,…,0,1,1,1}, where the zero is repeated 200 times.
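The run-length notation above can be checked with a short sketch. The Python below is only an illustration of the encoding, not the database's own implementation; the function name expand_svec is hypothetical, and the parser assumes the simple {counts}:{values} text form shown in the listings.

```python
def expand_svec(text):
    """Expand a run-length-encoded vector such as '{1,200,3}:{1,0,1}'.

    expand_svec is a hypothetical helper for illustration only; it assumes
    the plain '{counts}:{values}' text form shown in the output above.
    """
    counts_part, values_part = text.split(":")
    counts = [int(c) for c in counts_part.strip("{}").split(",")]
    values = [float(v) for v in values_part.strip("{}").split(",")]
    if len(counts) != len(values):
        raise ValueError("counts and values must have the same length")
    dense = []
    for count, value in zip(counts, values):
        # Repeat each value as many times as its run length says.
        dense.extend([value] * count)
    return dense


vec = expand_svec("{1,200,3}:{1,0,1}")
print(len(vec))                        # total number of elements
print(expand_svec("{1,1}:{26,54}"))    # a coords value from km_coord
```

With run lengths of 1, every entry in km_coord simply expands to the point's two coordinates.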
The coordinates for each cluster center, or centroid, are stored in the SQL table
km_centers.
SELECT *
FROM km_centers
ORDER BY coords
cid coords
6 {1,1}:{44.1131730722154,6.31487804161302}
1 {1,1}:{39.8000419034649,61.6213603286732}
4 {1,1}:{39.2578830823738,167.758556117954}
5 {1,1}:{40.9437092852768,409.846906145043}
3 {1,1}:{42.3521947160391,1150.68858851676}
2 {1,1}:{41.2411873840445,4458.93716141001}
Because the age values are similar for each centroid, it appears that the sales values
dominated the distance calculations. After visualizing the clusters, it is advisable to
repeat the analysis after rescaling, as discussed in Chapter 4.
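As a rough sketch of what rescaling could look like, the Python below standardizes each attribute to zero mean and unit standard deviation, using the five points listed in the km_coord output. Z-score standardization is an assumption here, chosen as one common option; the rescaling method described in Chapter 4 may differ. The function name standardize is hypothetical.

```python
from statistics import mean, pstdev

# The five points shown in the km_coord output: (first, second) coordinate.
points = [(26, 54), (29, 63), (25, 101.07), (32, 41.05), (32, 0)]


def standardize(rows):
    """Rescale every column to zero mean and unit standard deviation.

    standardize is a hypothetical helper; it assumes no column is constant,
    because a constant column has zero spread and cannot be divided by it.
    """
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [pstdev(col) for col in columns]
    return [
        tuple((x - c) / s for x, c, s in zip(row, centers, spreads))
        for row in rows
    ]


scaled = standardize(points)
for raw, z in zip(points, scaled):
    print(raw, tuple(round(v, 2) for v in z))
```

After this step both attributes vary on a comparable scale, so neither one dominates a Euclidean distance purely because of its units.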