VisMiner Reference by Task - Visual Data Mining: The VisMiner Approach

Boundary plot - most useful for detecting patterns with respect to political boundaries. The boundaries currently supported are US state, US county, three-digit zip code, and five-digit zip code. If your data is summarized, or can be summarized via aggregation, by any of these political boundaries, use the boundary plot to visualize patterns based on geographic location.
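
As an illustration of "summarized via aggregation", the sketch below rolls observation-level records up to one row per state before plotting. It is plain Python, not VisMiner functionality; the `records` data and the `summarize_by_boundary` helper are hypothetical names introduced only for this example.

```python
from collections import defaultdict

# Hypothetical observation-level records: (state, sales amount).
records = [
    ("UT", 120.0),
    ("UT", 80.0),
    ("CO", 200.0),
    ("NV", 50.0),
    ("CO", 100.0),
]

def summarize_by_boundary(rows):
    """Return {boundary: (count, total, mean)} for (boundary, value) rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for boundary, value in rows:
        totals[boundary] += value
        counts[boundary] += 1
    return {b: (counts[b], totals[b], totals[b] / counts[b]) for b in totals}

# One summary row per state, ready to be shaded on a boundary plot.
summary = summarize_by_boundary(records)
for state in sorted(summary):
    count, total, mean = summary[state]
    print(f"{state}: n={count} total={total:.1f} mean={mean:.1f}")
```

The same roll-up applies to county or zip-code keys; only the grouping column changes.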

Location plot - most useful for detecting patterns with respect to geographic point locations encoded as latitude and longitude. If your dataset contains location information, such as an address, but does not include latitude and longitude, you can add them using external geocoding tools or by joining your dataset with datasets that contain these values.
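
The join described above might look like the following minimal sketch, which attaches coordinates to each record by looking up its zip code. It assumes a lookup table keyed by five-digit zip code is available; `zip_centroids`, `customers`, and `add_coordinates` are hypothetical names, and the coordinate values are illustrative placeholders rather than authoritative data.

```python
# Hypothetical lookup table: five-digit zip code -> (latitude, longitude).
# The coordinates are illustrative placeholders, not authoritative values.
zip_centroids = {
    "84602": (40.25, -111.65),
    "80202": (39.75, -104.99),
}

# Hypothetical dataset with a location attribute but no coordinates.
customers = [
    {"id": 1, "zip": "84602"},
    {"id": 2, "zip": "80202"},
    {"id": 3, "zip": "99999"},  # no match in the lookup table
]

def add_coordinates(rows, lookup, key="zip"):
    """Left-join rows to the lookup; unmatched rows get None coordinates."""
    joined = []
    for row in rows:
        lat, lon = lookup.get(row[key], (None, None))
        joined.append({**row, "latitude": lat, "longitude": lon})
    return joined

located = add_coordinates(customers, zip_centroids)

# Rows that could not be located should be reviewed before plotting.
unmatched = [row["id"] for row in located if row["latitude"] is None]
print(unmatched)
```

A left join keeps every original observation, so records that fail to match can be inspected or sent to a geocoding tool instead of silently disappearing.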

Model Building - Algorithm Application

To create a model using one of the available data mining algorithms, drag the modeler (data mining algorithm) over the target dataset and drop it. Before doing this, however, be sure that the dataset is ready for processing. The modeler will use all observations and all attributes contained in the dataset. If you don't want to use all of the data, first create a subset that eliminates any unnecessary or unwanted attributes and observations.
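
Because the modeler uses everything in the dataset, subsetting has two parts: dropping attributes (columns) and dropping observations (rows). The sketch below shows both in plain Python; it is not VisMiner code, and the `dataset` contents and the `make_subset` helper are hypothetical.

```python
# Hypothetical dataset: "id" and "notes" are not useful model inputs,
# and observation 2 has a missing "age" value.
dataset = [
    {"id": 1, "age": 34, "income": 52000, "notes": "n/a", "churned": "no"},
    {"id": 2, "age": None, "income": 61000, "notes": "call back", "churned": "yes"},
    {"id": 3, "age": 45, "income": 48000, "notes": "", "churned": "no"},
]

def make_subset(rows, keep_attributes, keep_row):
    """Keep only the listed attributes of the rows that pass keep_row."""
    return [
        {attr: row[attr] for attr in keep_attributes}
        for row in rows
        if keep_row(row)
    ]

subset = make_subset(
    dataset,
    keep_attributes=["age", "income", "churned"],
    keep_row=lambda row: row["age"] is not None,
)
print(subset)
```

The original dataset is left untouched, so the full data remains available for a different subset later.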

Choose a modeler based on the objectives of your data mining and the capabilities of the modelers. The features of the available modelers are summarized in Table A.1. They are divided into three categories: cluster analysis, classification (prediction of a nominal value), and regression (prediction of a numeric value). Cluster analysis is oriented more toward dataset preparation (sub-population extraction) than toward a data mining end point. When conducting classification or regression modeling, it is a good idea to apply multiple modelers and compare their performance results. No single modeler works best across all datasets.
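
The advice to apply multiple modelers and compare them can be sketched as follows. The two classifiers here (a majority-class baseline and a 1-nearest-neighbor rule) are simple stand-ins chosen for brevity; they are assumptions for this example and are not taken from Table A.1. All data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical labeled observations: ((feature1, feature2), class label).
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((3.0, 3.0), "b"), ((3.2, 2.9), "b")]
validation = [((1.1, 0.9), "a"), ((2.9, 3.1), "b"), ((3.1, 3.0), "b")]

def fit_majority(rows):
    """Baseline modeler: always predict the most common training label."""
    label = Counter(y for _, y in rows).most_common(1)[0][0]
    return lambda x: label

def fit_nearest_neighbor(rows):
    """1-nearest-neighbor modeler using squared Euclidean distance."""
    def predict(x):
        def dist(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], x))
        return min(rows, key=dist)[1]
    return predict

def accuracy(model, rows):
    """Fraction of rows whose predicted label equals the actual label."""
    return sum(model(x) == y for x, y in rows) / len(rows)

# Fit every modeler on the same training data, score on the same validation data.
modelers = {"majority": fit_majority, "nearest_neighbor": fit_nearest_neighbor}
results = {name: accuracy(fit(train), validation) for name, fit in modelers.items()}
for name, score in results.items():
    print(f"{name}: {score:.2f}")
```

Scoring every modeler on the same held-out observations is what makes the comparison fair; a ranking obtained on one dataset should not be assumed to carry over to another.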

Model Evaluation

Once generated, data mining models should be studied and evaluated from two perspectives:

How well does the model perform with respect to the training, validation, and test datasets?

What is the nature of the relationships between the inputs and the output variable?

The evaluation approach employed varies with the data mining objective (classification, regression, or cluster analysis) and the algorithm used to build the model.
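
To make the first perspective concrete, the sketch below scores one model on training, validation, and test partitions, selecting the metric by objective. The metric choices (accuracy for classification, root mean squared error for regression) are common conventions assumed here, not measures confirmed by the source, and the `partitions` values are made-up actual/predicted pairs.

```python
import math

def accuracy(actual, predicted):
    """Classification metric: fraction of exact label matches."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Regression metric: root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Assumed mapping from data mining objective to performance metric.
METRICS = {"classification": accuracy, "regression": rmse}

def evaluate(objective, partitions):
    """Score each partition, given as name -> (actual, predicted)."""
    metric = METRICS[objective]
    return {name: metric(actual, predicted)
            for name, (actual, predicted) in partitions.items()}

# Hypothetical actual and predicted values for a regression model.
partitions = {
    "training": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    "validation": ([2.0, 4.0], [2.0, 2.0]),
    "test": ([3.0, 5.0], [4.0, 4.0]),
}
scores = evaluate("regression", partitions)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

A training error far below the validation and test errors, as in this made-up data, is the usual sign that a model has fit the training observations too closely.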
