Frequent Pattern Mining in Data Streams - Frequent Pattern Mining


Chapter 9

Frequent Pattern Mining in Data Streams

Victor E. Lee, Ruoming Jin and Gagan Agrawal

Abstract As the volume of digital commerce and communication has exploded, the
demand for data mining of streaming data has likewise grown. One of the fundamental
data mining tasks, for both static and streaming data, is frequent pattern mining. The
goal of pattern mining is to identify frequently occurring patterns and structures.
Such patterns may indicate scientific phenomena, economic or social trends, or even
security threats. Moreover, not only is pattern discovery important by itself, but it is
also a building block for machine learning tasks such as association rule induction.
Traditionally, algorithms for pattern discovery have processed the entire dataset as a
batch, with no restriction on how many passes through the data would be taken.
However, when the data are arriving in a continuous and unending stream, our
algorithm must be limited to a single pass. Moreover, the length of the stream is
indeterminate, so we cannot wait for it to end. We generate an initial result after seeing
a certain quantity of data, and then we periodically revise the result. A particular
challenge for frequent pattern discovery is the combinatorial explosion of candidate
patterns.

In this chapter, we present a structured review of online frequent pattern mining
techniques. We classify the methods according to the type of pattern and data, the
time window being considered, and the quality of the approximation.

Keywords Frequent pattern mining · Streaming data · Lossy counting · Sliding window
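
To make the single-pass, periodically revised setting described in the abstract concrete, here is a minimal sketch of lossy counting, the technique named in the keywords, restricted to single items rather than general patterns. It is an illustration written for this summary, not code from the chapter. The class name LossyCounter, its method names, and the parameters epsilon and support are hypothetical choices. The sketch assumes the usual formulation: the stream is divided into buckets of width ceil(1/epsilon), and low-count entries are pruned at each bucket boundary.

```python
import math
from typing import Dict, Hashable, List


class LossyCounter:
    """Approximate single-pass frequency counts (illustrative sketch)."""

    def __init__(self, epsilon: float) -> None:
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        self.width = math.ceil(1 / epsilon)  # items per bucket
        self.n = 0                           # items seen so far
        # item -> [observed count, maximum possible undercount]
        self.entries: Dict[Hashable, List[int]] = {}

    def add(self, item: Hashable) -> None:
        """Process one stream item; prune when a bucket is complete."""
        self.n += 1
        bucket = (self.n - 1) // self.width + 1  # current bucket id, from 1
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # The item may have been seen and pruned in earlier buckets.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: keep only entries that could still matter.
            self.entries = {
                key: entry
                for key, entry in self.entries.items()
                if entry[0] + entry[1] > bucket
            }

    def frequent(self, support: float) -> Dict[Hashable, int]:
        """Return items whose observed count is at least (support - epsilon) * n."""
        threshold = (support - self.epsilon) * self.n
        return {
            key: entry[0]
            for key, entry in self.entries.items()
            if entry[0] >= threshold
        }


# Usage: "a" is every other item; all remaining items are distinct.
counter = LossyCounter(epsilon=0.1)
for i in range(100):
    counter.add("a" if i % 2 == 0 else f"x{i}")
print(counter.frequent(support=0.4))  # {'a': 50}
```

Under these assumptions, each observed count undercounts the true count by at most epsilon times the number of items seen, so the result can be queried at any point in the stream and revised as more data arrives. Extending the idea from items to itemsets is where the combinatorial explosion of candidates mentioned in the abstract becomes the main difficulty.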
