For regression models, JDM defines two possible contents: predicted value and confidence. Unlike classification, regression predicts a continuous value. It provides a confidence level associated with the predicted value that indicates the prediction quality. Generally, the narrower the confidence band, the more accurate the prediction. Confidence values are represented either as a percentage or as a value between 0 and 1, where 1 is the highest confidence and 0 the lowest. For more details about confidence, refer to [Wikipedia-Confidence 2006].
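
The two regression contents and the two confidence representations can be sketched in plain Java. This is only an illustration: RegressionPrediction, fromPercent, and the sample numbers are hypothetical names and values, not part of the JDM API.

```java
import java.util.Locale;

/** Hypothetical holder for the two regression apply contents; not a JDM type. */
record RegressionPrediction(double predictedValue, double confidence) {

    /** Rejects a confidence outside the 0-to-1 scale described in the text. */
    RegressionPrediction {
        if (confidence < 0.0 || confidence > 1.0) {
            throw new IllegalArgumentException("confidence must be in [0, 1]: " + confidence);
        }
    }

    /** Builds a prediction from a confidence given as a percentage (0 to 100). */
    static RegressionPrediction fromPercent(double predictedValue, double confidencePercent) {
        return new RegressionPrediction(predictedValue, confidencePercent / 100.0);
    }

    /** Returns the confidence expressed as a percentage. */
    double confidencePercent() {
        return confidence * 100.0;
    }
}

class RegressionApplyDemo {
    public static void main(String[] args) {
        // Made-up values: a predicted continuous value with 90 percent confidence.
        RegressionPrediction p = RegressionPrediction.fromPercent(182_500.0, 90.0);
        System.out.printf(Locale.ROOT, "predicted=%.1f confidence=%.2f%n",
                p.predictedValue(), p.confidence());
    }
}
```

Storing the confidence on the 0-to-1 scale and converting at the boundary keeps the two representations from being mixed in later comparisons.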

For clustering models, JDM defines four possible contents: cluster identifier, probability, quality-of-fit, and distance. The clustering apply operation computes one or more of these contents for each combination of case and cluster. For example, when a clustering model has three clusters, the apply operation can compute the probability, quality-of-fit, and distance for each case with respect to each of the three clusters. The cluster probability measures the degree of confidence with which a case belongs to a given cluster. The quality-of-fit indicates how well a case fits the cluster. The distance indicates how “far” a given case is from the cluster centroid.
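
The per-case, per-cluster nature of these contents can be sketched as follows for a model with three clusters. The sketch is self-contained and does not use the JDM API: ClusterScore and ClusteringApplySketch are hypothetical names, the distance is assumed to be Euclidean, and the probability formula (a normalized exponential of the negative distance) is an illustrative assumption only. Actual algorithms define probability and quality-of-fit in their own ways, so quality-of-fit is left out here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-case, per-cluster apply output; not a JDM type. */
record ClusterScore(int clusterId, double distance, double probability) {}

class ClusteringApplySketch {

    /** Euclidean distance between a case and a cluster centroid (assumed metric). */
    static double distance(double[] point, double[] centroid) {
        double sum = 0.0;
        for (int i = 0; i < point.length; i++) {
            double d = point[i] - centroid[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Scores one case against every cluster. The probability is an assumed
     * weighting: exp(-distance), normalized over all clusters, so that closer
     * clusters score higher and the values sum to 1.
     */
    static List<ClusterScore> apply(double[] point, double[][] centroids) {
        double[] dist = new double[centroids.length];
        double min = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            dist[c] = distance(point, centroids[c]);
            min = Math.min(min, dist[c]);
        }
        // Subtracting the smallest distance avoids underflow for far-away cases.
        double[] weight = new double[centroids.length];
        double total = 0.0;
        for (int c = 0; c < centroids.length; c++) {
            weight[c] = Math.exp(-(dist[c] - min));
            total += weight[c];
        }
        List<ClusterScore> scores = new ArrayList<>();
        for (int c = 0; c < centroids.length; c++) {
            scores.add(new ClusterScore(c + 1, dist[c], weight[c] / total));
        }
        return scores;
    }

    public static void main(String[] args) {
        // Made-up centroids for three clusters and one case to score.
        double[][] centroids = { {0.0, 0.0}, {3.0, 4.0}, {10.0, 0.0} };
        for (ClusterScore s : apply(new double[] {0.0, 0.0}, centroids)) {
            System.out.println(s);
        }
    }
}
```

Each case yields one ClusterScore per cluster, which mirrors the "case and cluster combination" described above.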

Sections 9.4, 9.5, and 9.8 detail the use of these contents in apply settings.

8.1.3 Models

A model object contains a compact representation of the knowledge contained in the build data, providing details at the function and algorithm level, as well as the build settings used to create the model. Model is the base interface for all types of models. In JDM, each mining function has an associated model object. A model may also contain algorithm-specific representations implemented through the base interface ModelDetail, which encapsulates algorithm-specific details. For example, a model that is built using ClassificationSettings with the decision tree algorithm has an instance of ClassificationModel with TreeModelDetail. Here, the ClassificationModel instance provides content that is common across all classification algorithms, such as the attributes used by the model build, the target attribute used, and the settings used to build the model. TreeModelDetail provides the model details specific to the decision tree algorithm, such as the list of tree nodes, their hierarchical structure, and node details such as the predicted target and probability.
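
The split between function-level and algorithm-level content can be sketched with simplified stand-in types. The type names below echo the text, but every method, field, and sample value is a hypothetical simplification written for this illustration. None of it reproduces the actual javax.datamining signatures.

```java
import java.util.List;

/** Simplified stand-in for the base model interface; methods are hypothetical. */
interface Model {
    String miningFunction();
    ModelDetail modelDetail();
}

/** Simplified stand-in for the base of algorithm-specific details. */
interface ModelDetail {
    String algorithm();
}

/** One decision tree node; parentId is -1 for the root (assumed convention). */
record TreeNode(int id, int parentId, String predictedTarget, double probability) {}

/** Algorithm-level content: the tree nodes and their hierarchy. */
record TreeModelDetail(List<TreeNode> nodes) implements ModelDetail {
    public String algorithm() {
        return "decisionTree";
    }

    /** Returns the direct children of the given node. */
    List<TreeNode> childrenOf(int nodeId) {
        return nodes.stream().filter(n -> n.parentId() == nodeId).toList();
    }
}

/** Function-level content shared by all classification algorithms. */
record ClassificationModel(List<String> attributes, String targetAttribute,
                           ModelDetail modelDetail) implements Model {
    public String miningFunction() {
        return "classification";
    }
}

class ModelSketch {
    /** Builds a tiny made-up model: a root node with two leaf children. */
    static ClassificationModel sample() {
        TreeModelDetail tree = new TreeModelDetail(List.of(
                new TreeNode(0, -1, "no", 0.60),
                new TreeNode(1, 0, "no", 0.85),
                new TreeNode(2, 0, "yes", 0.70)));
        return new ClassificationModel(List.of("age", "income"), "churn", tree);
    }

    public static void main(String[] args) {
        ClassificationModel model = sample();
        System.out.println(model.miningFunction() + " model, target " + model.targetAttribute());
        // Algorithm-specific details are reached through the generic detail object.
        if (model.modelDetail() instanceof TreeModelDetail tree) {
            System.out.println(tree.childrenOf(0).size() + " children under the root");
        }
    }
}
```

Code that needs only the attributes or the target can work against ClassificationModel alone, while code that inspects tree structure narrows the detail object to TreeModelDetail first.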
