In preparing datasets for classification, also consider the cardinality of
nominal attributes. For classification modeling, VisMiner normally supports
attributes with a maximum cardinality of 10; the one exception is decision
trees, where the maximum is raised to 30. Any attempt to build a model from a
dataset whose nominal attributes exceed the allowed cardinality will be
rejected.
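As an informal illustration of that check (not VisMiner's own code), the sketch below counts the distinct values of each nominal attribute in a pandas DataFrame and flags those that exceed the limits stated above. The file name is an assumption for the example.

```python
# Illustrative only: VisMiner performs this check internally; this is not its API.
import pandas as pd

MAX_CARDINALITY = 10        # general limit for classification modelers
MAX_CARDINALITY_TREE = 30   # relaxed limit for decision trees

df = pd.read_csv("Iris.csv")  # assumed file name

# Count distinct values of each nominal (non-numeric) attribute.
nominal_cols = df.select_dtypes(include=["object", "category"]).columns
for col in nominal_cols:
    n = df[col].nunique()
    if n > MAX_CARDINALITY_TREE:
        print(f"{col}: cardinality {n} - too high for any modeler")
    elif n > MAX_CARDINALITY:
        print(f"{col}: cardinality {n} - allowed for decision trees only")
    else:
        print(f"{col}: cardinality {n} - OK")
```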
Tutorial - Building and Evaluating Classification Models
We begin with a simple dataset - Iris.csv which was explored in Chapter 2. Our
objective is to build a model that correctly classifies iris varieties based on the
four flower measurements: petal width, petal length, sepal width, and sepal
length.
In the VisMiner Control Center, open the Iris.csv dataset.
In VisMiner, a modeling algorithm is applied by dragging the modeler that
implements it onto the target dataset. To start, drag the “Dec Tree
Classifier” down to the Iris dataset and release.
As the modeler processes the dataset, its gears turn; you know it has finished
when the gears stop and then disappear. Because the dataset is so small and the
decision tree algorithm is quite simple, the modeler finishes almost
immediately. For large datasets, depending on the algorithm, processing may
take minutes or even hours. While a modeler is processing, the Control Center
remains active, so you may perform other operations while you wait, such as
preparing other datasets or exploring data with the data viewers.
In the case of the Iris dataset, the classification modeler immediately begins
processing as soon as the modeler is released over the dataset. This is because
there is only one nominal attribute in the dataset (variety). In datasets
where there are multiple nominal attributes, you will first be required to identify
the output (classification) attribute before processing begins.
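VisMiner builds the model entirely through this drag-and-drop interaction, so no code is required. Purely as an illustration of the equivalent step outside VisMiner, the sketch below fits a decision tree with scikit-learn, naming the output attribute explicitly. The file name and column names are assumptions based on the dataset described above.

```python
# Not VisMiner code: a minimal scikit-learn sketch of the same modeling step.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

iris = pd.read_csv("Iris.csv")  # assumed file and column names
features = ["petal width", "petal length", "sepal width", "sepal length"]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(iris[features], iris["variety"])   # "variety" is the output attribute

# Print the learned splits, roughly what a decision tree viewer displays.
print(export_text(tree, feature_names=features))
```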
Model Evaluation
After processing is complete, it is time to evaluate the resulting model. The
evaluation has two objectives:
How well does the model perform?
How do inputs contribute to model predictions?
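As a hedged illustration of both questions outside VisMiner, the scikit-learn sketch below holds out part of the data, reports accuracy and a confusion matrix (how well the model performs), and prints the tree's feature importances (how each input contributes). The file and column names are again assumptions.

```python
# Illustration only, not VisMiner: evaluate a decision tree on held-out data
# and inspect how each input contributes. File/column names are assumed.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

iris = pd.read_csv("Iris.csv")
features = ["petal width", "petal length", "sepal width", "sepal length"]
X_train, X_test, y_train, y_test = train_test_split(
    iris[features], iris["variety"], test_size=0.3, random_state=0,
    stratify=iris["variety"])

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = tree.predict(X_test)

# Objective 1: how well does the model perform?
print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))

# Objective 2: how do inputs contribute to model predictions?
for name, importance in zip(features, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")
```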
 