A Case-Based Data Mining Platform - Data Mining: Theory, Methodology, Techniques, and Applications


retrieved, they will be displayed in the similar case management window. The user can
view them one by one and select a suitable one as the template for solving the new
problem. In general, the most similar case is the suitable one.

5. When a suitable similar case is selected as the template, it is opened in the case
builder window. For convenience, the processing flow displayed in the case building
window of Figure 5 is used as the running example from here on.

6. In the case building window, right-click the Data_1 node to invoke the data
connector. The data connector links the current problem's input data to the platform;
in it, the user sets the input data's location, name, type, and other information.

7. Right-click the following Oper_1 node to invoke the operator adapter. In the
operator adapter, the user first checks whether the operator is usable by viewing its
category, function, and guidelines. If the operator is not suitable, the user needs to
delete it or insert a new operator ahead of it. If it is suitable, the user connects it
by setting its path, name, input ID, parameters, and output ID. When setting the
operator's parameters, the user can refer to each parameter's guideline to see how to
set its value. After all required values of the operator have been set, the user can
execute it.

8. Check and execute the remaining operator nodes in the same way. Note that between
two successive operators there is a data node, which is both the output of the former
operator and the input of the latter. The user can view, save, or export this
intermediate data. Toward the end of the flow, some intermediate models will be
generated; the user can view each one and then decide whether to accept or discard it.
When the final model has been generated, the model-building application scenario ends.
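
The alternation of data nodes and operator nodes in this scenario can be illustrated with a minimal sketch. It is not the platform's implementation: the class `OperatorNode`, the function `run_flow`, and the two toy operators are hypothetical names introduced here, and a plain Python callable stands in for the operator that the user would locate by its path. Only the settings named in the steps (name, input ID, parameters, output ID) and the node labels Data_1 and Oper_1 come from the text.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class OperatorNode:
    """One operator in the flow (hypothetical structure for illustration)."""
    name: str
    input_id: str                     # ID of the data node the operator reads
    output_id: str                    # ID of the data node the operator writes
    func: Callable[..., Any]          # stands in for the operator found at its path
    params: Dict[str, Any] = field(default_factory=dict)


def run_flow(operators: List[OperatorNode], data: Dict[str, Any]) -> Dict[str, Any]:
    """Execute the operators in order, keeping every output as a named data node."""
    nodes = dict(data)  # copy so the caller's input data is left untouched
    for op in operators:
        if op.input_id not in nodes:
            raise KeyError(f"{op.name}: input data node {op.input_id!r} is missing")
        nodes[op.output_id] = op.func(nodes[op.input_id], **op.params)
    return nodes


# Two toy operators (assumptions, not operators of the platform).
def drop_missing(rows):
    return [r for r in rows if r is not None]


def scale(rows, factor=1.0):
    return [r * factor for r in rows]


# Flow: Data_1 -> Oper_1 -> Data_2 -> Oper_2 -> Data_3
flow = [
    OperatorNode("Oper_1", "Data_1", "Data_2", drop_missing),
    OperatorNode("Oper_2", "Data_2", "Data_3", scale, {"factor": 0.5}),
]
nodes = run_flow(flow, {"Data_1": [2, None, 4]})
print(nodes["Data_2"])  # intermediate data node, available for viewing or export
print(nodes["Data_3"])  # output of the last operator
```

Keeping every intermediate output under its own ID mirrors the data node that sits between two successive operators, so each one remains available for inspection after the flow has run.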

From this application scenario, we can see that the reusable knowledge, whether the
whole processing flow, the operator guidelines, or the parameter guidelines, is very
helpful for solving a new data mining problem. This knowledge acts as a supervisor at
the user's side. It can resolve many of the user's uncertainties, such as what steps
should be taken, what operators should be used, and how the parameters should be set.
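
As a rough illustration of how such stored knowledge could answer these questions, the sketch below turns the flow of a past case into step-by-step hints. The structure `OperatorKnowledge`, the function `advise`, and the guideline texts are all hypothetical and chosen only for this example; they are not taken from the platform or from its case representation language.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OperatorKnowledge:
    """Guidelines attached to one operator of a past case (hypothetical layout)."""
    name: str
    guideline: str
    parameter_guidelines: Dict[str, str] = field(default_factory=dict)


def advise(flow: List[OperatorKnowledge]) -> List[str]:
    """List which step to take, which operator to use, and how to set its parameters."""
    hints = []
    for step, op in enumerate(flow, start=1):
        hints.append(f"Step {step}: use {op.name} ({op.guideline})")
        for param, text in op.parameter_guidelines.items():
            hints.append(f"  set {param}: {text}")
    return hints


# Invented content standing in for a retrieved case.
past_case = [
    OperatorKnowledge("Oper_1", "removes incomplete records"),
    OperatorKnowledge("Oper_2", "builds the model",
                      {"max_depth": "start small and increase"}),
]
hints = advise(past_case)
print("\n".join(hints))
```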

5 Conclusion

Data mining is a complex and time-consuming process. Data mining practice in industry
depends heavily on data mining professionals to provide solutions. In this paper, we
have proposed a case-based data mining platform, which reuses the knowledge captured in
past data mining cases to solve new, similar problems. This platform is under
development. The XML-based data mining case representation language has been defined,
and the storage bases, the functional modules, and the user interface have been
designed. From its application scenario, we can see that this platform can eliminate
many perplexities, such as what steps should be taken, what
