Segmentation of Continuous Data Streams Based on a Change Detection Methodology - Advanced Techniques in Knowledge Discovery and Data Mining

Change-Detection Procedure

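Stage 1:

For periods (1,..., K-1), build the model $M_{K-1}$ using DM algorithm G.
Define the validation set $D_{K-1(val)}$.
Count the records $n_{K-1} = |D_{K-1(val)}|$.
Calculate the validation error rate $\hat{e}_{M_{K-1},K-1}$ according to V.
Calculate the distribution and $n_{K-1}$ for every candidate and target variable existing in periods (1,..., K-1).

Stage 2:

For period K, define the set of records $D_K$.
Count the records $n_K = |D_K|$.
Calculate the validation error rate $\hat{e}_{M_{K-1},K}$ according to V.
Calculate $\hat{d} = \mathrm{ABS}(\hat{e}_{M_{K-1},K} - \hat{e}_{M_{K-1},K-1})$ and $\hat{\sigma}_d^2$.
Calculate [...].
Calculate and return CD($\alpha$).

Stage 3:

For every candidate and target variable existing in periods (1,..., K-1) and in period K, calculate: [...], and [...].
Calculate and return XP($\alpha$).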
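The three stages can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes that the change in validation error is judged with a normal-approximation test on the difference of two proportions, and that the change in a variable's distribution is judged with Pearson's chi-square statistic against the proportions observed in the earlier periods. The function names and argument layout are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def error_rate_change(e_prev, n_prev, e_cur, n_cur, alpha=0.05):
    """Return (d_hat, threshold, changed) for two validation error rates.

    Assumption: each error rate is a binomial proportion, so the variance
    of the difference is the sum of e * (1 - e) / n over the two samples.
    """
    d_hat = abs(e_cur - e_prev)
    var_d = e_prev * (1 - e_prev) / n_prev + e_cur * (1 - e_cur) / n_cur
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    threshold = z * sqrt(var_d)
    return d_hat, threshold, d_hat > threshold


def pearson_statistic(prev_counts, cur_counts):
    """Pearson chi-square of current counts against previous proportions.

    Both arguments map each value of one variable to its record count.
    Values with a zero count in the previous periods are skipped because
    their expected count is zero (a simplification of this sketch).
    """
    n_prev = sum(prev_counts.values())
    n_cur = sum(cur_counts.values())
    stat = 0.0
    for value, count in prev_counts.items():
        if count == 0:
            continue
        expected = n_cur * count / n_prev
        observed = cur_counts.get(value, 0)
        stat += (observed - expected) ** 2 / expected
    return stat
```

The Pearson statistic would then be compared with a chi-square critical value whose degrees of freedom equal the number of distinct values minus one, for example 3.841 for one degree of freedom at the 0.05 level.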

The complexity of this procedure is at most $O(n_K)$. Information about the distributions of the target and candidate variables can easily be stored to simplify the change-detection methodology.
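One way to keep such distribution information is a running count per variable and value, updated as each period arrives. The sketch below assumes that records are dictionaries of categorical values. The class name `DistributionStore` and its methods are hypothetical and do not come from the source.

```python
from collections import Counter, defaultdict


class DistributionStore:
    """Running value counts per variable, accumulated over periods."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # variable -> value -> count
        self.n_records = 0

    def absorb(self, records):
        """Add one period's records (dicts of variable -> value)."""
        for record in records:
            self.n_records += 1
            for variable, value in record.items():
                self.counts[variable][value] += 1

    def proportions(self, variable):
        """Relative frequency of each value seen so far for a variable."""
        total = sum(self.counts[variable].values())
        return {v: c / total for v, c in self.counts[variable].items()}
```

Each call to `absorb` makes a single pass over the period's records, so keeping the counts up to date costs time linear in the number of new records.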

Using the outputs of the methodology, the user can distinguish among the eight possible variations of a change in the data-mining classification model. According to this new information, the user can act in several ways. For example, the user can reapply the algorithm from scratch

absorbing the new period and using the same incremental algorithm, making

c KK and performing the procedure again. The user can also investigate the

type of the change and its magnitude and effect on the other characteristics of the

DM model, and incorporate other known methods dealing with the specific

detected changes. One can also incorporate multiple-model approaches such as

weighting, arbitrage, and combining methods, and use the prior knowledge of the

change.

The methodology is not restricted to databases with a constant number of variables. The basic assumption is that if the addition of a new variable influences the connection between the target variable and the candidate variables in a manner that affects the validation accuracy (V is the method to select
