For regression models, JDM defines two possible contents:
predicted value and confidence. Unlike classification, regression predicts
a continuous value. Each predicted value comes with an associated
confidence level that indicates the prediction quality. Generally, the
narrower the confidence band, the more accurate the prediction.
Confidence values are represented as a percentage or as a value
between 0 and 1, where 1 is the highest confidence and 0 the lowest.
For more details about confidence, refer to [Wikipedia-Confidence
2006].
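The two regression contents can be pictured as a small value object. The following is a minimal sketch only: RegressionPrediction, ofFraction, and ofPercent are hypothetical names chosen for illustration and are not part of the JDM API. It assumes confidence is stored on the 0-to-1 scale and converted from a percentage on request.

```java
// Hypothetical holder for the two regression apply contents.
// Illustration only; this class is NOT part of the JDM API.
final class RegressionPrediction {
    private final double predictedValue;
    private final double confidence; // stored on the 0..1 scale

    private RegressionPrediction(double predictedValue, double confidence) {
        // Reject values outside the documented range (1 = highest, 0 = lowest).
        if (Double.isNaN(confidence) || confidence < 0.0 || confidence > 1.0) {
            throw new IllegalArgumentException(
                "confidence must be in [0, 1]: " + confidence);
        }
        this.predictedValue = predictedValue;
        this.confidence = confidence;
    }

    /** Creates a prediction from a confidence already expressed in [0, 1]. */
    static RegressionPrediction ofFraction(double predictedValue, double confidence) {
        return new RegressionPrediction(predictedValue, confidence);
    }

    /** Creates a prediction from a confidence expressed as a percentage (0..100). */
    static RegressionPrediction ofPercent(double predictedValue, double percent) {
        return new RegressionPrediction(predictedValue, percent / 100.0);
    }

    double predictedValue() { return predictedValue; }

    double confidence() { return confidence; }

    double confidencePercent() { return confidence * 100.0; }
}
```

Keeping a single internal scale avoids mixing percentages and fractions when predictions from different sources are compared.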
For clustering models, JDM defines four possible contents: cluster
identifier, probability, quality-of-fit, and distance. The clustering apply
operation computes one or more of these contents for each combination
of case and cluster. For example, when a clustering model has three
clusters, the apply operation can compute the probability, quality-of-fit,
and distance for each case with respect to each of the three clusters.
The cluster probability measures the degree of confidence
with which a case belongs to a given cluster. The quality-of-fit indicates
how well a case fits the cluster. The distance indicates how “far” a
given case is from the cluster centroid.
Sections 9.4, 9.5, and 9.8 detail the use of these contents in apply
settings.
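To make the per-case, per-cluster computation concrete, here is a minimal sketch that scores one case against three centroids. It is not the JDM API: ClusterApplySketch and its methods are hypothetical, the distance is assumed to be Euclidean, and the probability is assumed to be a softmax over negative distances. A real clustering algorithm defines its own distance and probability measures. Quality-of-fit is left out because the text does not give a formula for it.

```java
// Hypothetical sketch of clustering apply contents for a single case.
// Illustration only; none of these names come from the JDM API.
final class ClusterApplySketch {

    /** ASSUMPTION: Euclidean distance between a case and a cluster centroid. */
    static double distance(double[] caseValues, double[] centroid) {
        if (caseValues.length != centroid.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < caseValues.length; i++) {
            double d = caseValues[i] - centroid[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Distance from one case to every cluster centroid. */
    static double[] distances(double[] caseValues, double[][] centroids) {
        double[] result = new double[centroids.length];
        for (int c = 0; c < centroids.length; c++) {
            result[c] = distance(caseValues, centroids[c]);
        }
        return result;
    }

    /**
     * ASSUMPTION: probabilities as a softmax over negative distances,
     * so closer clusters get higher probability and the values sum to 1.
     */
    static double[] probabilities(double[] distances) {
        double min = Double.POSITIVE_INFINITY;
        for (double d : distances) {
            min = Math.min(min, d);
        }
        double[] weights = new double[distances.length];
        double total = 0.0;
        for (int c = 0; c < distances.length; c++) {
            // Subtracting the minimum keeps the exponentials numerically stable.
            weights[c] = Math.exp(-(distances[c] - min));
            total += weights[c];
        }
        for (int c = 0; c < weights.length; c++) {
            weights[c] /= total;
        }
        return weights;
    }

    /** Cluster identifier, taken here as the index of the nearest centroid. */
    static int closestCluster(double[] distances) {
        int best = 0;
        for (int c = 1; c < distances.length; c++) {
            if (distances[c] < distances[best]) {
                best = c;
            }
        }
        return best;
    }
}
```

For a case at (0, 0) and centroids at (0, 0), (3, 4), and (10, 10), the sketch yields one distance and one probability per cluster, and the first cluster is reported as the closest.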
8.1.3 Models
A model object contains a compact representation of the knowledge
contained in the build data, providing details at the function and
algorithm level, as well as the build settings used to create the model.
Model is the base interface for all types of models. In JDM, each mining
function has an associated model object. A model may also contain
an algorithm-specific representation, exposed through the
base interface ModelDetail, which encapsulates algorithm-specific
details. For example, a model that is built using ClassificationSettings
with the decision tree algorithm is an instance of ClassificationModel
with a TreeModelDetail. Here, the ClassificationModel instance provides
content that is common across all classification algorithms, such as
the attributes used by the model build, the target attribute used, and
the settings used to build the model. TreeModelDetail provides the
model details specific to the decision tree algorithm, such as the list of
tree nodes, their hierarchical structure, and node details like
predicted target and probability.
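The split between function-level and algorithm-level content can be sketched with a few stand-in types. The types below only mirror the roles described above; they are hypothetical, carry a Sketch prefix to make that clear, and do not reproduce the signatures of the JDM interfaces. The node fields and the childrenOf helper are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in types named after the concepts in the text.
// Illustration only; these are NOT the JDM interfaces.
interface SketchModelDetail {
    String algorithm();
}

/** One tree node with its predicted target and probability. */
final class SketchTreeNode {
    final int id;
    final int parentId; // -1 marks the root
    final String predictedTarget;
    final double probability;

    SketchTreeNode(int id, int parentId, String predictedTarget, double probability) {
        this.id = id;
        this.parentId = parentId;
        this.predictedTarget = predictedTarget;
        this.probability = probability;
    }
}

/** Algorithm-level detail: the node list and its hierarchy. */
final class SketchTreeModelDetail implements SketchModelDetail {
    private final List<SketchTreeNode> nodes;

    SketchTreeModelDetail(List<SketchTreeNode> nodes) {
        this.nodes = Collections.unmodifiableList(new ArrayList<SketchTreeNode>(nodes));
    }

    public String algorithm() { return "decisionTree"; }

    List<SketchTreeNode> nodes() { return nodes; }

    /** Recovers the hierarchy by collecting the direct children of a node. */
    List<SketchTreeNode> childrenOf(int parentId) {
        List<SketchTreeNode> children = new ArrayList<SketchTreeNode>();
        for (SketchTreeNode node : nodes) {
            if (node.parentId == parentId) {
                children.add(node);
            }
        }
        return children;
    }
}

/** Function-level content shared by every classification algorithm. */
final class SketchClassificationModel {
    private final List<String> attributes;
    private final String targetAttribute;
    private final SketchModelDetail detail;

    SketchClassificationModel(List<String> attributes, String targetAttribute,
                              SketchModelDetail detail) {
        this.attributes = Collections.unmodifiableList(new ArrayList<String>(attributes));
        this.targetAttribute = targetAttribute;
        this.detail = detail;
    }

    List<String> attributes() { return attributes; }

    String targetAttribute() { return targetAttribute; }

    SketchModelDetail detail() { return detail; }
}
```

Code that needs only the attributes or the target works against the classification model alone, while code that walks the tree first checks that the detail is the tree-specific type.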