CHANGE DETECTION IN CLASSIFICATION MODELS INDUCED FROM TIME SERIES DATA - Data Mining in Time Series Databases

Database Reference

In-Depth Information

•

Adapting to Concept and Population Drift [14,18,19,21,34].

•

Activity Monitoring [9].

Rather than challenging the problem of detecting significant changes, the

above methods deal directly with data mining in changing environment.

This chapter introduces a novel methodology for detecting a significant

change in a classification model of data mining, by identifying distinct cat-

egories of changes and implementing a set of statistical estimators. The

major contribution of our change detection procedure (as will be described

in later sections), is the ability to make a confident claim that the model

that was pre-built based on a suciently large dataset is no longer valid for

any future use such as prediction, rule induction, etc., and consequently, a

new model must be constructed.

The rest of the chapter is organized as follows. In Section 2, we present

our change detection procedure in data mining models, by defining the data

mining classification model characteristics (2.1), describing the variety of

possible significant changes (2.2), definition of the hypothesis testing (2.3),

and the methodology for the change detection procedure (2.4). In Section 3,

we describe an experimental evaluation of the change detection method-

ology by introducing artificial changes in databases and implementing the

change detection methodology to identify these changes. Section 4 describes

evaluation of the change detection methodology on real-world datasets.

Section 5 presents validation of the work's assumptions and Section 6 con-

cludes this chapter by summing-up the main contributions of our method,

and presenting several options for future research in implementation and

extension of our methodology.

2. Change Detection in Classification Models

of Data Mining

2.1. Classification Model Characteristics

Classification is the task which involves constructing a model predicting the

(usually categorical) label of a classifying attribute and using it to classify

new data. The induced classification model can be represented as a set of

rules (see the RISE system [8], the BMA method for rule induction [6],

PARULEL and PARADISER etc.), a decision tree (see [34-36] ), neural

networks (see [35]), information-theoretic connectionists networks (see IFN

[25]) and so on.

Given a database

D

which consist of

X

set of records. The data mining

model

M

is one that by using an algorithm (

G

) generates a set of hypothesis

Data Mining in Time Series Databases

Search WWH ::

Custom Search

Home