Table 9-20
(continued)
mapClusters(
ClusteringApplyContent content,
String baseDestPhysAttrName)
Maps all clusters in the model and the specified content
value to a set of named destination attributes. When this
method is used, the apply output data will have apply
contents for all the leaf clusters. The base attribute name
specified by the user will be used to generate the columns in
the apply output data. For example, when a user calls the
following methods where the input model has four leaf
clusters, i.e., {1, 2, 3, 4}, the apply task creates apply output
data with columns ClusterId_1, ClusterId_2, ClusterId_3,
ClusterId_4, Probability_1, Probability_2,
Probability_3, Probability_4. The column Probability_1
has the probability value associated with the cluster id value
in column ClusterId_1. Similarly, the other columns will
have cluster ids and associated probabilities.
mapClusters(ClusteringApplyContent.clusterIdentifier, "ClusterId");
mapClusters(ClusteringApplyContent.probability, "Probability");
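The column naming described above can be sketched in plain Java. This is only an illustration of the naming pattern (base name, underscore, leaf cluster id); the class ApplyColumnNames and the method columnsFor are hypothetical names introduced here and are not part of the JDM API or of any DME implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mimics the column naming described in the text, not the JDM implementation. */
final class ApplyColumnNames {

    /** Builds one column name per leaf cluster in the form baseName + "_" + clusterId. */
    static List<String> columnsFor(String baseName, int[] leafClusterIds) {
        List<String> columns = new ArrayList<>();
        for (int id : leafClusterIds) {
            columns.add(baseName + "_" + id);
        }
        return columns;
    }

    public static void main(String[] args) {
        // Four leaf clusters, as in the example in the text.
        int[] leafClusters = {1, 2, 3, 4};

        // One group of columns per mapClusters call: ids first, then probabilities.
        List<String> all = new ArrayList<>(columnsFor("ClusterId", leafClusters));
        all.addAll(columnsFor("Probability", leafClusters));

        System.out.println(all);
    }
}
```

Each call contributes one column per leaf cluster, so two mapClusters calls on a model with four leaf clusters yield eight output columns.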
Listing 9-17 illustrates the use of the clustering interfaces for the
customer segmentation problem discussed in Section 7.5. Lines 34 to 41
create the clustering settings object, which specifies Euclidean as the
aggregation function and absolute difference in values as the attribute
comparison function for the age attribute. All other attributes use the
DME's default attribute comparison function. In addition, the maximum
number of clusters is set to 50, and each cluster's case count must be
between 500 and 100,000. Building this segmentModel is similar to
building the other types of models, as shown in lines 69 to 71. Once the
segmentModel is built, we apply it to the apply input data to find the
most probable cluster id using the ClusteringApplySettings.mapTopCluster
method. Lines 47 to 53 show the creation of the apply settings object,
and lines 74 to 79 show the execution of the dataset (batch) apply
task. As with classification and regression, clustering models can
also support real-time single-record apply operations. Lines 95 to 119
show how to retrieve the clustering model and each cluster's details. In
this example, we retrieve the age attribute statistics, such as
frequencies, and show how applications can obtain further cluster
details from the model.
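To make the two settings concrete, the following plain-Java sketch shows one common reading of them: each attribute pair is compared by absolute difference, and the per-attribute results are combined with a Euclidean aggregation (square root of the sum of squares). The class SegmentDistance and its methods are hypothetical names, and the exact formula a given DME applies is an assumption here, since vendors may normalize or weight attributes differently.

```java
/** Illustrative sketch; names and formula are assumptions, not the JDM API or a DME's implementation. */
final class SegmentDistance {

    /** Attribute comparison: absolute difference in values (the function chosen for age). */
    static double absoluteDifference(double a, double b) {
        return Math.abs(a - b);
    }

    /** Euclidean aggregation: square root of the sum of squared per-attribute comparisons. */
    static double euclidean(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("cases must have the same number of attributes");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = absoluteDifference(x[i], y[i]);
            sumOfSquares += d * d;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        // Two hypothetical cases described by two numeric attributes each.
        double[] caseA = {30.0, 2.0};
        double[] caseB = {42.0, 7.0};
        System.out.println(euclidean(caseA, caseB));
    }
}
```

Under this reading, the attribute comparison function decides how far apart two values of a single attribute are, while the aggregation function decides how those per-attribute distances combine into one distance between cases.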
 