1
Introduction
Data mining has been defined as the search for useful and previously unknown
patterns in large datasets. Yet when faced with the task of mining a large
dataset, it is not always obvious where to start and how to proceed. The purpose
of this book is to introduce a methodology for data mining and to guide you in
the application of that methodology using software specifically designed
to support the methodology. In this chapter, we provide an overview of the
methodology. The chapters that follow add detail to that methodology and
contain a sequence of exercises that guide you in its application. The exercises
use VisMiner, a powerful visual data mining tool which was designed around
the methodology.
Data Mining Objectives
Normally in data mining a mathematical model is constructed for the purpose of
prediction or description. A model can be thought of as a virtual box that
accepts a set of inputs, then uses those inputs to generate output.
Prediction modeling algorithms use selected input attributes and a single
selected output attribute from your dataset to build a model. The model, once
built, is used to predict an output value based on input attribute values.
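To make the build-then-predict cycle concrete, here is a minimal sketch in Python. It is not how VisMiner builds models; it assumes a single numeric input attribute and fits a straight line by ordinary least squares, one of many possible prediction algorithms. The attribute names (ad_spend, sales), the data values, and the train function are hypothetical and chosen only for illustration.

```python
def train(rows, input_name, output_name):
    """Build a model: fit output = slope * input + intercept by least squares."""
    xs = [row[input_name] for row in rows]
    ys = [row[output_name] for row in rows]
    n = len(rows)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(input_value):
        # The "virtual box": accepts an input value, generates an output value.
        return slope * input_value + intercept

    return model


# Hypothetical historical data: both input and output values are known.
history = [
    {"ad_spend": 1.0, "sales": 3.0},
    {"ad_spend": 2.0, "sales": 5.0},
    {"ad_spend": 3.0, "sales": 7.0},
    {"ad_spend": 4.0, "sales": 9.0},
]

model = train(history, "ad_spend", "sales")  # build the model from the dataset
prediction = model(5.0)                      # predict an output for a new input
print(prediction)
```

The function returned by train plays the role of the model: once built, it needs only an input attribute value to produce a predicted output value.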
The dataset used to build the model is assumed to contain historical data
from past events in which the values of both the input and output attributes are
known. The data mining methodology uses those values to construct a model
that best fits the data. The process of model construction is sometimes referred
to as training. The primary objective of model construction is to use the model
for predictions in the future using known input attribute values when the value
 