1. Introduction
As masses of data are incrementally accumulated into large databases over
time, we tend to assume that the new data behaves in a way that resembles
the prior knowledge we have about the operation or facts it describes.
Change detection in time series is not a new subject, and it has been a
topic of continued interest. For instance, Jones et al. [17] have developed
a change detection mechanism for serially correlated multivariate data,
and Yao [39] has estimated the number of change points in a time series
using the BIC criterion (a generic form of this criterion is sketched
below). However, change detection in classification is still not well
developed.
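For orientation, a BIC-type criterion for selecting the number of change
points k takes the following generic penalized-likelihood form (a general
statement of the criterion under the notation defined here, not necessarily
Yao's exact formulation):

    \mathrm{BIC}(k) = -2\,\ln \hat{L}_k + p(k)\,\ln n, \qquad \hat{k} = \arg\min_{k} \mathrm{BIC}(k),

where \hat{L}_k is the maximized likelihood of a model with k change points,
p(k) is its number of free parameters, and n is the length of the series;
the estimated number of change points is the value \hat{k} that minimizes
the criterion.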
There are many algorithms and methods that deal with the incremental
learning problem, which is concerned with updating an induced model upon
receiving new data. These methods are specific to the underlying data
mining model. Examples include Utgoff's method for incremental induction
of decision trees (ITI) [35,36], Wei-Min Shen's semi-incremental learning
method (CDL4) [34], David W. Cheung's technique for updating association
rules in large databases [5], Alfonso Gerevini's constraint network
updating technique [12], Byoung-Tak Zhang's method for feedforward neural
networks (SELF) [40], the standard backpropagation algorithm for neural
networks [27], Liu and Setiono's incremental feature selection method
(LVI) [24], and others.
The main topic in most incremental learning theories is how the model
(which could be a set of rules, a decision tree, a neural network, and so
on) is refined or reconstructed efficiently as new data is encountered.
This problem has been addressed by many of the algorithms mentioned above,
and many of them perform significantly better than running the original
algorithm from scratch, particularly when the records arrive on-line and
the changes are of low magnitude. An important question that one must
examine whenever a new mass of data is accumulated is: "Is it really wise
to keep on reconstructing or verifying the current model, when everything
or something in our notion of the model may have significantly changed?"
In other words, the main problem is not how to reconstruct the model
better, but rather how to detect a change in a model based on a time
series database.
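As a concrete, though simplified, illustration of this distinction, the
following Python sketch monitors the current model's 0/1 error stream and
applies a standard two-proportion z-test between an older reference window
and the newly accumulated window. The window contents, the 0.05
significance level, and the rebuild-versus-refine decision are assumptions
chosen for illustration only; this is not the method developed in this
text.

    import math

    def error_rates_differ(ref_errors, new_errors):
        """Two-proportion z-test on 0/1 error indicators from two windows.

        Returns True when the recent error rate differs significantly
        (two-sided, roughly alpha = 0.05) from the reference error rate.
        """
        n1, n2 = len(ref_errors), len(new_errors)
        p1, p2 = sum(ref_errors) / n1, sum(new_errors) / n2
        pooled = (sum(ref_errors) + sum(new_errors)) / (n1 + n2)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        if se == 0:          # degenerate case: both windows error-free or all-error
            return False
        z = abs(p1 - p2) / se
        return z > 1.96      # two-sided critical value for alpha = 0.05

    # Hypothetical error streams: 1 marks a misclassified record.
    ref_errors = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]   # older batch
    new_errors = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1]   # newly accumulated batch

    if error_rates_differ(ref_errors, new_errors):
        print("change detected: reconstruct the model from scratch")
    else:
        print("no significant change: keep refining the model incrementally")

Any comparable drift test (for example, a Hoeffding bound on the
difference of the two error rates) could replace the z-test here; the
point is only that detection, rather than reconstruction, is the primary
question.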
Some researchers have proposed various representations of the problem of
large, time-changing databases and populations, including:
- defining robustness and discovering robust knowledge from databases [15],
  or learning stable concepts [13] in domains with hidden changes in
  concepts;
- identifying and modeling a persistent drift in a database [11].