A Case-Based Data Mining Platform - Data Mining: Theory, Methodology, Techniques, and Applications


the processing flows will be quite similar. Based on these facts, when we deal with a
new problem, we can use a similar case's processing flow as a template to solve it.

At this point, however, a past case's processing flow cannot yet be reused, because we
have not addressed how to get the right case at the right time. This is a problem of
similarity-based retrieval: we compute the similarity score between the new problem and
each past case, and then select the most similar case to help solve the new problem. To
support similarity-based retrieval, we further need to define meaningful and comparable
attributes from which similarity scores can be calculated. Generally, these attributes
include industry type, problem type, business objective, data mining goal, and others,
which together determine a data mining case's processing flow at a general level. To
simplify the description, we use the term data mining task to cover these meaningful and
comparable attributes. A data mining task is attached to the data mining system to
retrieve similar data mining cases. It is also the third extension to the generic data
mining model.
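The retrieval step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name MiningTask, the attribute weights, and the exact-match scoring rule are hypothetical and are not taken from the chapter, which does not specify how similarity scores are computed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MiningTask:
    """Hypothetical container for the comparable attributes of a data mining task."""
    industry_type: str
    problem_type: str
    business_objective: str
    mining_goal: str


# Assumed weights; the chapter does not say how attributes are weighted.
WEIGHTS = {
    "industry_type": 0.2,
    "problem_type": 0.3,
    "business_objective": 0.2,
    "mining_goal": 0.3,
}


def similarity(a: MiningTask, b: MiningTask) -> float:
    """Weighted share of attributes on which the two tasks agree exactly."""
    total = sum(WEIGHTS.values())
    matched = sum(
        w for name, w in WEIGHTS.items() if getattr(a, name) == getattr(b, name)
    )
    return matched / total


def retrieve_most_similar(
    new_task: MiningTask, case_base: List[Tuple[MiningTask, List[str]]]
) -> Tuple[MiningTask, List[str]]:
    """Return the (task, processing_flow) pair whose task scores highest."""
    return max(case_base, key=lambda case: similarity(new_task, case[0]))
```

Exact matching is the simplest possible local similarity measure; a real system would likely use graded similarity for attributes such as industry type.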

Now we can illustrate the extended data mining model. As shown in Figure 1, the central
part of this model is a process builder. It retrieves similar cases based on the data
mining task, loads data from the data base, calls operators from the operator base,
reuses processing flows to generate model(s) for the new data mining problem, and
outputs the model(s) to the model base.

[Figure 1 components: Task, Data Base, Operator Base, Processing Flow, Process Builder, Model Base]

Fig. 1. Extended Data Mining Model for Knowledge Reuse
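To make the role of the process builder in Figure 1 concrete, the following is a minimal sketch, assuming that a processing flow is an ordered list of operator names and that the data base, operator base, and model base can be modeled as plain dictionaries. The class ProcessBuilder, its method names, and the toy operators are hypothetical; the chapter does not define an API.

```python
from typing import Any, Callable, Dict, List, Tuple

Task = Dict[str, str]   # attribute name -> value (assumed representation)
Flow = List[str]        # ordered operator names (assumed representation)


class ProcessBuilder:
    """Toy process builder: retrieve a flow, run it, store the result."""

    def __init__(
        self,
        case_base: List[Tuple[Task, Flow]],
        data_base: Dict[str, Any],
        operator_base: Dict[str, Callable[[Any], Any]],
        model_base: Dict[str, Any],
    ) -> None:
        self.case_base = case_base
        self.data_base = data_base
        self.operator_base = operator_base
        self.model_base = model_base

    def retrieve_flow(self, task: Task) -> Flow:
        """Pick the past case sharing the most attribute values with the task."""
        def overlap(case: Tuple[Task, Flow]) -> int:
            past_task, _ = case
            return sum(1 for k, v in task.items() if past_task.get(k) == v)

        _, flow = max(self.case_base, key=overlap)
        return flow

    def build(self, task: Task, dataset_name: str, model_name: str) -> Any:
        """Reuse the retrieved flow on the named dataset and save the model."""
        result = self.data_base[dataset_name]
        for op_name in self.retrieve_flow(task):
            result = self.operator_base[op_name](result)
        self.model_base[model_name] = result
        return result


# Usage with toy data: the "model" here is just the mean of the cleaned values.
builder = ProcessBuilder(
    case_base=[({"problem_type": "forecast"}, ["drop_missing", "mean_model"])],
    data_base={"sales": [1.0, None, 3.0]},
    operator_base={
        "drop_missing": lambda rows: [r for r in rows if r is not None],
        "mean_model": lambda rows: sum(rows) / len(rows),
    },
    model_base={},
)
model = builder.build({"problem_type": "forecast"}, "sales", "sales_mean")
```

Keeping the four bases as injected dependencies mirrors the separation shown in the figure: the builder itself holds no data, operators, or models.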

This data mining model uses the concept of case-based reasoning (CBR). Case-based
reasoning [1] is a sub-field of Artificial Intelligence (AI). It has been widely used to
solve problems such as configuration, classification, planning, and prediction [13].
From the perspective of case-based reasoning, this data mining model takes knowledge
retrieval and knowledge reuse into consideration, and it also identifies the content of
data mining cases. In the next section, we take a closer look at the data mining case.

3 Data Mining Case

From the case-based reasoning perspective, a case is a knowledge container [9]. A case
should be defined and represented at an operable level. In this section, we introduce
the definition and representation of a data mining case.
