The main contribution of the change-detection methodology described here is a
new way of detecting change in data-mining models, together with an account of
the eight possible changes that such models can undergo. These notions are
defined, and a specific change-detection procedure, based on a set of statistical
hypothesis tests, is designed to solve the change-detection problem. The
implications of the method for data stream segmentation are also described, and
the methodology was implemented as a new method for finding a segmentation of a
data stream.
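The hypothesis-testing step can be illustrated with a minimal sketch. It assumes that change is assessed by comparing a model's error rate on earlier periods with its error rate on the newest period, using a two-sided two-proportion z-test. This section does not give the exact statistic, so the test, the function name error_rate_change_test, and the default significance level alpha are illustrative assumptions rather than the procedure itself.

```python
import math

def error_rate_change_test(errors_old, n_old, errors_new, n_new, alpha=0.05):
    """Two-sided two-proportion z-test on classification error rates.

    Hypothetical helper: the choice of test and the names are assumptions,
    not taken from the source text.
    Returns (z statistic, p-value, change_detected).
    """
    p_old = errors_old / n_old
    p_new = errors_new / n_new
    # Pooled error rate under the null hypothesis of "no change".
    pooled = (errors_old + errors_new) / (n_old + n_new)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    if se == 0:
        # Both periods have identical, degenerate error rates (all 0 or all 1).
        return 0.0, 1.0, False
    z = (p_new - p_old) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value, p_value < alpha

if __name__ == "__main__":
    # 5% error on earlier periods versus 12% error on the newest period.
    print(error_rate_change_test(50, 1000, 120, 1000))
```

The test needs only four counts per period, so checking each new period costs one pass over its records to count errors.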
Because change detection is still a new application area for classification
models in data mining, many issues remain to be investigated:
1. Implementing voting techniques for combining several models according to the
cause(s) and magnitude(s) of the change(s) detected in period K, such as
exponential smoothing, voting weights based on the change-detection confidence
level, and neglecting old or problematic periods.
2. Integrating the change-detection methodology into an existing data-mining
method. As seen, the complexity of the change-detection procedure is only O(n).
One could integrate this procedure with an existing incremental learning
algorithm, which would go on to rebuild the existing model efficiently when the
procedure detects a significant change in the newly obtained data. This option
also applies to meta-learning and combining methods.
3. Implementing the methodology in a new search algorithm for finding an
optimal segmentation of a data stream with respect to a variety of data-mining
models.
4. Using the change-detection statistical hypothesis tests to monitor specific
attributes.
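The first item in the list above, weighting per-period models by age and by change-detection confidence, could be prototyped along the following lines. This is only a sketch under stated assumptions: the weighting rule, the decay parameter, and the names period_weights and weighted_vote are hypothetical and are not defined in the text.

```python
from collections import Counter

def period_weights(change_confidences, decay=0.5):
    """Weights for per-period models, oldest first (hypothetical scheme).

    change_confidences[i] is the confidence in [0, 1] that a change was
    detected in period i. Older models are discounted exponentially by
    `decay`, and additionally by (1 - confidence) for every detected change
    that lies between them and the newest period.
    """
    n = len(change_confidences)
    weights = [0.0] * n
    weights[-1] = 1.0  # the newest model always gets full raw weight
    for i in range(n - 2, -1, -1):
        weights[i] = weights[i + 1] * decay * (1.0 - change_confidences[i + 1])
    total = sum(weights)
    return [w / total for w in weights]

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight."""
    tally = Counter()
    for label, weight in zip(predictions, weights):
        tally[label] += weight
    return tally.most_common(1)[0][0]

if __name__ == "__main__":
    # Three period models; a change was detected with 99% confidence in the
    # newest period, so the two older models are almost ignored.
    w = period_weights([0.0, 0.0, 0.99], decay=0.5)
    print(w, weighted_vote(["a", "a", "b"], w))
```

With no detected change and a mild decay, the older models can still outvote the newest one. A high-confidence change in the latest period shifts nearly all of the weight to the most recent model, which corresponds to neglecting old periods.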