values. As we discuss in Sections 4.2 and 4.3, classification and
regression also differ in how model quality is assessed.
Unsupervised functions do not use a target, and are typically
used to find the intrinsic structure, relations, or affinities in a dataset.
Unlike supervised mining functions, which predict an outcome,
unsupervised mining functions cover a wide range of analytical
capabilities including clustering and association. Clustering may be
used to identify naturally occurring groups in the data, for example,
similar proteins or cancer cells, or retail customer segments. Associ-
ation models return items and rules that can be used, for example,
to identify products for cross-sell to retail customers. There are
other unsupervised mining functions, such as sequential patterns
[… 2005], feature extraction [Lui … 1998], and anomaly detection
[… 2005], not yet officially covered by JDM.
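To make the cross-sell example concrete, the following is a minimal sketch of how association rules between pairs of items can be scored by support and confidence. It is not the JDM API; the basket contents, the function names, and the two thresholds are assumptions chosen only for illustration.

```python
from itertools import combinations

# Hypothetical market baskets; every item name is invented for illustration.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    The thresholds are arbitrary illustrative defaults, not recommendations.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to be worth reporting
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules

rules = pair_rules(baskets)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

On this toy data only the bread and butter pair clears both thresholds, so a retailer might suggest butter to a customer buying bread, and vice versa. Production association algorithms search larger itemsets and prune far more aggressively than this brute-force pair scan.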
Another dimension of data mining involves whether the result-
ing model is descriptive or predictive. Descriptive data mining
describes a dataset in a concise, enlightening, and summary manner,
and presents interesting generalized properties of the data. Descriptive
data mining results in models that provide transparency, that is, the
ability to understand why the model behaves as it does. The extent to
which a model is descriptive often depends on the algorithm used to
produce it. For example, in supervised learning, decision trees typi-
cally provide human interpretable rules that explain why a given
prediction was made, whereas a neural network used on the same
data provides no readily discernible explanation of its predictions and is
considered a “black box,” or opaque. Most unsupervised mining functions,
such as clustering or association rules, by definition are descriptive.
Some mining functions and algorithms, such as the decision tree
algorithm, are both descriptive and predictive.
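The dual role of a decision tree can be illustrated with a short sketch in which one traversal yields both a prediction and the human-readable rule behind it. The tree here is written by hand rather than learned from data, and its attributes, thresholds, and labels are hypothetical.

```python
# Hand-written toy tree; attributes, thresholds, and labels are hypothetical.
TREE = {
    "attribute": "months_as_customer",
    "threshold": 12,
    "below": {"label": "churn"},
    "at_or_above": {
        "attribute": "support_calls",
        "threshold": 3,
        "below": {"label": "stay"},
        "at_or_above": {"label": "churn"},
    },
}

def predict_with_rule(tree, case):
    """Walk the tree and return (predicted label, list of conditions tested)."""
    conditions = []
    node = tree
    while "label" not in node:  # internal nodes carry a test, leaves a label
        attr, cut = node["attribute"], node["threshold"]
        if case[attr] < cut:
            conditions.append(f"{attr} < {cut}")
            node = node["below"]
        else:
            conditions.append(f"{attr} >= {cut}")
            node = node["at_or_above"]
    return node["label"], conditions

label, conditions = predict_with_rule(
    TREE, {"months_as_customer": 30, "support_calls": 5}
)
print(f"IF {' AND '.join(conditions)} THEN {label}")
```

The returned label is the predictive output, and the list of conditions is the descriptive one: it states exactly which tests led to the prediction. A neural network offers no comparable trace.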
Predictive mining functions perform inference on the available
data, and attempt to predict outcomes or assignments for new data.
In addition to the prediction, predictive mining functions may also
provide a probability or confidence value that indicates how strong the
prediction is, given what the model has learned. For example, a clustering model
may assign a case to Cluster 5 with a 95 percent probability (a
strong assignment), or a classification model may predict the
customer will churn with a 55 percent probability (a weak predic-
tion). Supervised mining functions by definition are predictive.
Supervised algorithms supporting predictive data mining include
naïve Bayes, neural networks, support vector machines, and decision
trees. The clustering mining function, with algorithms such as
k-means, may also be considered predictive when used to assign