Self-organizing map
Statistics computed for dataset clusterings measure cluster cohesion
(how similar the observations within a cluster are) and separation (how
distinct a cluster is from the other clusters in the clustering). All are available
in the SOM viewer.
1. Mean squared error (MSE) - a measure of cluster cohesion. The MSE
magnitude is only meaningful relative to the MSE of other clusters within
the clustering or other clusterings of the same dataset.
2. Silhouette coefficient - a combined measure of both cohesion and
separation. Its range is [-1.0, 1.0], where -1.0 is the worst possible
and 1.0 is the best possible.
3. Correlation coefficient - another combined measure of both cohesion and
separation. Its range is [-1.0, 1.0], where -1.0 is the worst possible and
1.0 is the best possible.
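The first two statistics can be illustrated with a minimal sketch. The text does not state the exact formulas the SOM viewer uses, so this assumes that MSE is the mean squared Euclidean distance of a cluster's observations to the cluster centroid, and it uses the standard silhouette definition. The function names (`cluster_mse`, `silhouette`) and the sample data are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def cluster_mse(points):
    """Assumed cohesion measure: mean squared distance to the centroid."""
    c = centroid(points)
    return sum(dist(p, c) ** 2 for p in points) / len(points)

def silhouette(clusters):
    """Mean silhouette coefficient; needs at least two clusters."""
    scores = []
    for i, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # convention for single-member clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(dist(p, q) for q in cluster) / (len(cluster) - 1)
            # b: mean distance to the members of the nearest other cluster
            b = min(
                sum(dist(p, q) for q in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Two tight, well-separated clusters (made-up data).
clusters = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
mse_first = cluster_mse(clusters[0])
overall_silhouette = silhouette(clusters)
print(mse_first, overall_silhouette)
```

For this data the silhouette lands close to 1.0 because each cluster is compact and far from the other. The MSE value says little on its own and is useful only when compared with the MSE of other clusters.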
Model relationships
Study the relationships between input attributes and the output to evaluate the
strength and nature of the relationship. Also look for interactions between
inputs. Interactions occur when a change in the value of one input attribute alters
how a second input attribute contributes to the output. For example,
the presence of a formal dining room may not add as much value to a small
house as to a large house.
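The dining room example can be made concrete with a small sketch. Both pricing functions and all of their numbers are hypothetical, chosen only to contrast a model with an interaction against a purely additive one.

```python
def value_with_interaction(area_m2, has_dining_room):
    """Hypothetical model: the dining room premium grows with house size."""
    base = 50_000 + 1_000 * area_m2
    premium = has_dining_room * (2_000 + 100 * area_m2)  # interaction term
    return base + premium

def value_additive(area_m2, has_dining_room):
    """Hypothetical model: the dining room premium is the same for any size."""
    return 50_000 + 1_000 * area_m2 + 15_000 * has_dining_room

def dining_room_effect(model, area_m2):
    """Change in predicted value when a dining room is added."""
    return model(area_m2, 1) - model(area_m2, 0)

small_effect = dining_room_effect(value_with_interaction, 80)
large_effect = dining_room_effect(value_with_interaction, 300)
additive_small = dining_room_effect(value_additive, 80)
additive_large = dining_room_effect(value_additive, 300)
print(small_effect, large_effect, additive_small, additive_large)
```

In the interacting model the effect of the dining room depends on the area, so it differs between the small and the large house. In the additive model the effect is identical at every area, which is what the absence of an interaction looks like.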
If studied carefully, the relationships will provide insights into the functioning
of the world being modeled.
1. Surface models in the classification surface viewers and the regression
model viewers depict the strength of relationships between inputs and outputs.
2. If the curves at opposite edges of a surface have different shapes, this
indicates an interaction between the inputs.
3. Tree graphs available for the decision tree classifier provide insights into
the importance of inputs.
a. Tree branching attributes at the top of the tree provide greater
differentiation between output class values than those lower on the tree.
b. The presence of attributes on one branch of a tree that do not exist on
another is an indicator of input interactions.
4. For linear and polynomial regression models only, the coefficients in
the regression summary directly represent input contributions to the output
value.
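To illustrate the last point for the linear case, the sketch below fits an ordinary least squares model with plain Python and reads the input contributions off the coefficients. The data, the attribute names (area, rooms), and the helper functions `solve` and `fit_linear` are hypothetical; the tool's own regression summary is not reproduced here.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(rows, targets):
    """Least squares fit; returns [intercept, coef_1, ..., coef_k]."""
    x = [[1.0] + list(r) for r in rows]  # leading 1.0 models the intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)  # normal equations: (X^T X) w = X^T y

# Made-up observations: (area, rooms) -> value, generated from a known rule.
rows = [(50, 1), (80, 2), (120, 3), (150, 3), (200, 5)]
targets = [20 + 1.5 * area + 10 * rooms for area, rooms in rows]

intercept, coef_area, coef_rooms = fit_linear(rows, targets)
print(intercept, coef_area, coef_rooms)
```

Because the targets were generated from a linear rule, the fit recovers it: each unit of area adds about 1.5 to the output and each room adds about 10, holding the other input fixed. The normal equations are used here for brevity; they are adequate for small, well-conditioned examples like this one.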