Database Reference
In-Depth Information
mining algorithm called MOMENT is proposed, and the main idea is
based on the fact that the boundary between closed frequent itemsets
and frequent itemsets moves very slowly. A closed enumeration tree is
developed in order to keep track of the boundary between closed frequent
itemsets and the rest of the itemsets. Another method which is able to
mine frequent itemsets over arbitrary time granularities is referred to
as FPSTREAM [42]. This method is essentially an adaptation of the
FP-Treemethodtodatastreams.
Damped Window Model: We note that pure sliding windows are
not the only way by which the evolution of data streams can be taken
into account during the mining process. A second way is to introduce
a decay factor into the computation. Specifically, the weight of each
transaction is multiplied by a factor of f< 1, when a new transaction
arrives. The overall effect of such an approach is to create an exponential
decay function on the arrivals in the data stream. Such a model is quite
effective for evolving data stream, since recent transactions are counted
more significantly during the mining process. An algorithm proposed in
[27] maintains a lattice for recording the potentially frequent itemsets
and their counts. While the counts of each lattice may change upon the
arrival of each transaction, a key observation is that it is sucient to
update the counts in a lazy way. Specifically, the decay factor is applied
only to those itemsets whose counts are affected by the current trans-
action. However, the decay factor will have to be applied in a modified
way by taking into account the last time that the itemset was touched
by an update. In other words, if t c be the current transaction index, and
the last time the count for the itemset was updated was at transaction
index t s <t c , then we need to multiply the current counts of that item-
set by f t s −t c before incrementing the count of this modified value. This
approach works because the counts of each itemset reduce by the same
decay factor in each iteration, as long as a transaction count is not added
to it. We note that such a lazy approach is also applicable to other min-
ing problems, where statistics are represented as the sum of decaying
values. For example, in [11], a similar lazy approach is used in order
to maintain decay-based micro-cluster statistics for a high dimensional
projected stream clustering algorithm.
3.4 Change Detection in Data Streams
As discussed earlier, the patterns in a data stream may evolve over
time. In many cases, it is desirable to track and analyze the nature
of these changes over time. In [8, 37, 59], a number of methods have
been discussed for change detection of data streams. In addition, data
Search WWH ::




Custom Search