Chapter 5: Classification Models in VisMiner
Classification is a form of prediction modeling that uses selected input attribute
values to predict a nominal or categorical output value. In constructing a
classification model, a dataset is used that contains historical data from past
events in which the values of both the input and output attributes are known. The
classification methodology uses those values to construct a model that best fits
the data; that is, the model accurately predicts the output category based on
input values. The process of model construction is sometimes referred to as
training. Once constructed and validated, the model can be used in the future to
predict the category when the input attribute values are known, but the value of
the output attribute is not yet known. For example, an insurance company may
want to build a classification model to predict if an insurance claim is likely to
be fraudulent or legitimate.
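
The train-then-predict cycle described above can be illustrated with a minimal
sketch in Python. This is not how VisMiner builds its models; it is a
one-split decision tree (a "stump") written only to show the idea. The claim
records, the attribute meanings (claim amount, number of prior claims), and
the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical historical claims: (claim_amount, prior_claims) -> known outcome.
history = [
    ((1200.0, 0), "legitimate"),
    ((900.0, 1), "legitimate"),
    ((15000.0, 4), "fraudulent"),
    ((1800.0, 0), "legitimate"),
    ((22000.0, 3), "fraudulent"),
    ((17500.0, 5), "fraudulent"),
]


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows):
    """Training: find the single attribute/threshold split with the fewest errors."""
    best = None
    for attr in range(len(rows[0][0])):
        values = sorted({inputs[attr] for inputs, _ in rows})
        # Candidate thresholds are midpoints between consecutive observed values,
        # so both sides of every split contain at least one row.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [label for inputs, label in rows if inputs[attr] <= threshold]
            right = [label for inputs, label in rows if inputs[attr] > threshold]
            left_label, right_label = majority(left), majority(right)
            errors = sum(label != left_label for label in left)
            errors += sum(label != right_label for label in right)
            if best is None or errors < best["errors"]:
                best = {
                    "attr": attr,
                    "threshold": threshold,
                    "left": left_label,
                    "right": right_label,
                    "errors": errors,
                }
    return best


def predict(model, inputs):
    """Prediction: apply the learned split to inputs whose outcome is unknown."""
    side = "left" if inputs[model["attr"]] <= model["threshold"] else "right"
    return model[side]


model = train_stump(history)      # fit the model to past events
new_claim = (16000.0, 2)          # output attribute not yet known
prediction = predict(model, new_claim)
print(prediction)
```

A real modeler would also hold back part of the historical data to validate
the model before using it on new claims; that step is omitted here for brevity.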
This chapter introduces the functionality of three modelers or methodologies
for classification as they are implemented in VisMiner: decision trees, artificial
neural networks, and support vector machines.
Dataset Preparation
The dataset used for classification modeling in VisMiner must be in a tabular
format. The input attributes may be of any data type: numeric, ordinal, or
nominal. The output attribute must be nominal or discrete (integer) numeric.
It is important to remember that when using VisMiner to build a classification
model, the dataset should contain only the attributes (input and output) to be
used by the modeling process. There should not be any row identifiers or other
attributes included. Therefore, if your dataset contains unneeded attributes,
before starting the modeler, create a derived set containing only the attributes
that you want to include. (See Chapters 2 and 3 for details on how this is done.)
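
Outside of VisMiner, the same preparation step can be sketched in a few lines
of Python. The example below is an illustration only and does not use any
VisMiner feature; the column names (ClaimID, Adjuster, and so on) and the
derive function are assumptions made for the example. It keeps the chosen
input attributes and the nominal output attribute and drops the row
identifier and any other unneeded column.

```python
import csv
import io

# Hypothetical raw data: ClaimID is a row identifier and Adjuster is unneeded.
raw_csv = """ClaimID,Amount,PriorClaims,Adjuster,Outcome
1001,1200,0,Lee,legitimate
1002,15000,4,Kim,fraudulent
1003,900,1,Lee,legitimate
"""

# Attributes the modeler should see: inputs first, nominal output last.
keep = ["Amount", "PriorClaims", "Outcome"]


def derive(source_text, columns):
    """Return CSV text containing only the named columns, in the given order."""
    reader = csv.DictReader(io.StringIO(source_text))
    missing = [name for name in columns if name not in reader.fieldnames]
    if missing:
        raise ValueError(f"columns not found: {missing}")
    out = io.StringIO()
    # extrasaction="ignore" silently drops every column not listed in `columns`.
    writer = csv.DictWriter(
        out, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


derived_csv = derive(raw_csv, keep)
print(derived_csv)
```

Checking for missing column names before writing guards against a typo
silently producing a derived set that lacks the output attribute.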
 