Database Reference
right transformation at the beginning, we may obtain a surprising effect
that hints to us about the transformation needed. Thus, the process
reflects upon itself and leads to an understanding of the transformation
needed. The four steps above complete the preparation; the next four
steps form the Data Mining part, where the focus is on the
algorithmic aspects employed for each project.
5. Choosing the appropriate Data Mining task. We are now ready
to decide which type of Data Mining task best fits our needs, e.g.
classification, regression, or clustering. This mostly depends on the
goals and the previous steps. There are two major goals in Data
Mining: prediction and description. Prediction is often referred to as
supervised Data Mining, while descriptive Data Mining includes the
unsupervised classification and visualization aspects of Data Mining.
Most data mining techniques are based on inductive learning where
a model is constructed explicitly or implicitly by generalizing from a
sufficient number of training examples. The underlying assumption of
the inductive approach is that the trained model is applicable to future
cases. The strategy also takes into account the level of meta-learning
for the particular set of available data.
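To make the distinction between the two goals concrete, the following
minimal Python sketch applies both to the same toy data. The data values,
the function names (fit_threshold, two_means), and the choice of a midpoint
threshold and a two-center grouping are assumptions made purely for
illustration; they are not methods prescribed by the text.

```python
# Toy data, invented for illustration: (hours_studied, passed) pairs.
labeled = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]

def fit_threshold(examples):
    """Prediction (supervised): generalize a cut point from labeled examples."""
    negatives = [x for x, y in examples if y == 0]
    positives = [x for x, y in examples if y == 1]
    # Assumes the classes are separable, with negatives below positives.
    return (max(negatives) + min(positives)) / 2

def two_means(values, iterations=10):
    """Description (unsupervised): group unlabeled values around two centers."""
    low, high = min(values), max(values)  # assumes at least two distinct values
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high

threshold = fit_threshold(labeled)

def predict(x):
    """Apply the induced model to a future, unseen case."""
    return int(x > threshold)

centers = two_means([x for x, _ in labeled])  # the labels are ignored here

print(threshold, predict(5.0), centers)
```

The supervised function needs the labels to induce its model, whereas the
unsupervised one only summarizes the structure of the values themselves.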
6. Choosing the Data Mining algorithm. Having settled the strategy,
we can now decide on the tactics. This stage includes selecting
the specific method to be used for searching patterns. For example,
when weighing precision against understandability, neural networks tend
to favor the former, while decision trees favor the latter. Meta-
learning focuses on explaining what causes a Data Mining algorithm to
be successful or unsuccessful when facing a particular problem. Thus,
this approach attempts to understand the conditions under which a
Data Mining algorithm is most appropriate.
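One simple, empirical way to approach this choice is to compare candidate
algorithms on held-out data. The sketch below is an illustration under
stated assumptions: the data are invented, the one-rule "stump" stands in
for an understandable decision tree, and a nearest-neighbor learner stands
in for a more flexible but less readable model (it is not a neural
network). All function names are hypothetical.

```python
# Toy one-dimensional data, invented for illustration: class 1 sits in the middle.
train = [(1, 0), (2, 0), (4, 1), (5, 1), (6, 1), (8, 0), (9, 0)]
held_out = [(1.5, 0), (4.5, 1), (5.5, 1), (8.5, 0)]

def accuracy(model, examples):
    """Fraction of examples whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in examples) / len(examples)

def fit_stump(examples):
    """Single readable rule 'predict 1 if x > t' (a one-level decision tree)."""
    xs = sorted({x for x, _ in examples})  # assumes at least two distinct x values
    cuts = [(lo + hi) / 2 for lo, hi in zip(xs, xs[1:])]
    best = max(cuts, key=lambda t: accuracy(lambda x: int(x > t), examples))
    return lambda x: int(x > best)

def fit_nearest_neighbor(examples):
    """Copies the label of the closest training example; no explicit rule results."""
    return lambda x: min(examples, key=lambda e: abs(e[0] - x))[1]

candidates = {"stump": fit_stump, "nearest_neighbor": fit_nearest_neighbor}
scores = {name: accuracy(fit(train), held_out) for name, fit in candidates.items()}
chosen = max(scores, key=scores.get)

print(scores, chosen)
```

On this particular data the single-threshold rule cannot capture a class
that lies between two regions of the other class, so the more flexible
learner scores higher; on data with a single clean boundary the readable
rule would do just as well. Observing which data characteristics favor
which algorithm is the kind of question meta-learning addresses.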
7. Employing the Data Mining algorithm. In this step, we might
need to employ the algorithm several times until a satisfactory result is
obtained. In particular, we may have to tune the algorithm's control
parameters such as the minimum number of instances in a single leaf
of a decision tree.
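The repeated runs described in this step can be sketched as a loop over
candidate parameter values. The following Python example is a minimal
illustration, not a production tree learner: it grows a tree over a single
numeric attribute, chooses splits by raw misclassification count (real
decision tree learners typically use criteria such as entropy or the Gini
index), and uses invented data. The names build_tree, predict, and min_leaf
are hypothetical.

```python
# Toy data, invented for illustration; (3, 1) acts as a noisy training example.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0),
         (6, 1), (7, 1), (8, 1), (9, 1), (10, 1)]
validation = [(1.2, 0), (2.8, 0), (3.2, 0), (4.6, 0), (6.5, 1), (8.5, 1)]

def majority(examples):
    labels = [y for _, y in examples]
    return max(sorted(set(labels)), key=labels.count)  # ties go to the smaller label

def errors(examples):
    label = majority(examples)
    return sum(y != label for _, y in examples)

def build_tree(examples, min_leaf):
    """Grow a 1-D tree; a split is kept only if both sides hold >= min_leaf examples."""
    xs = sorted({x for x, _ in examples})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        cut = (lo + hi) / 2
        left = [e for e in examples if e[0] <= cut]
        right = [e for e in examples if e[0] > cut]
        if min(len(left), len(right)) < min_leaf:
            continue
        cost = errors(left) + errors(right)
        if best is None or cost < best[0]:
            best = (cost, cut, left, right)
    if best is None or errors(examples) == 0:
        return majority(examples)  # leaf node
    _, cut, left, right = best
    return (cut, build_tree(left, min_leaf), build_tree(right, min_leaf))

def predict(tree, x):
    while isinstance(tree, tuple):
        cut, left, right = tree
        tree = left if x <= cut else right
    return tree

# Employ the algorithm once per candidate value and keep the best one.
scores = {}
for min_leaf in (1, 2, 6):
    tree = build_tree(train, min_leaf)
    hits = sum(predict(tree, x) == y for x, y in validation)
    scores[min_leaf] = hits / len(validation)
best_min_leaf = max(scores, key=scores.get)

print(scores, best_min_leaf)
```

With a minimum of one instance per leaf the tree isolates the noisy
example and misclassifies nearby validation cases; with a very large
minimum it cannot split at all. The intermediate setting performs best
on this toy validation set.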
8. Evaluation. In this stage, we evaluate and interpret the extracted
patterns (rules, reliability, etc.) with respect to the goals defined in the
first step. This step focuses on the comprehensibility and usefulness
of the induced model. At this point, we document the discovered
knowledge for further use.
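As an illustration of evaluating an extracted rule, the sketch below
measures how often a rule applies (support) and how often it is right when
it applies (confidence, one possible reading of "reliability"), then checks
the result against a goal. The rule, the held-out data, the required
confidence of 0.75, and all names are assumptions invented for this
example.

```python
# Held-out (x, label) cases, invented for illustration.
held_out = [(2, 0), (3, 0), (4, 0), (5, 1), (6, 1), (7, 0), (8, 1), (9, 1)]

def evaluate_rule(condition, conclusion, examples):
    """Return (support, confidence) of 'if condition(x) then conclusion'."""
    covered = [y for x, y in examples if condition(x)]
    support = len(covered) / len(examples)
    if not covered:
        return support, 0.0
    confidence = sum(y == conclusion for y in covered) / len(covered)
    return support, confidence

# Hypothetical extracted pattern: "if x > 4.5 then class 1".
support, confidence = evaluate_rule(lambda x: x > 4.5, 1, held_out)

# Hypothetical goal carried over from the first step of the process.
required_confidence = 0.75
meets_goal = confidence >= required_confidence

# Document the discovered knowledge in a simple, reusable record.
record = {
    "rule": "if x > 4.5 then class 1",
    "support": support,
    "confidence": confidence,
    "meets_goal": meets_goal,
}
print(record)
```

Keeping the rule in readable form alongside its measured support and
confidence addresses both comprehensibility and usefulness, and gives a
record that can be revisited when the goals change.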