Uncertain Frequent Pattern Mining - Frequent Pattern Mining

Database Reference

In-Depth Information

Table 14.7 Streaming

uncertain data D 3

Batch ID

Set of items with existential probability

B 1

{ t 1 ={ f :0.7},

t 2 ={ f :0.7, g :0.9, h :0.7, i :0.5},

t 3 ={ f :0.7, g :0.9, h :0.8}}

B 2

{ t 4 ={ f :0.7, g :0.9},

t 5 ={ g :0.6},

t 6 ={ g :0.6}}

B 3

{ t 7 ={ f :0.7, g :0.9, h :0.7, i :0.5},

t 8 ={ f :0.7, g :0.9, h :0.8},

t 9 ={ f :0.7, g :0.9, i :0.5}}

and

reduce:

{ x }

,

{ x }

-proj. DB

→

list of

frequent k -itemset X , expSup ( X , D )

.

(14.9)

To recap, by using the above two sets of “map” and “reduce” functions, the MR-

growth (i) first finds all frequent 1-itemsets with their expected support and (ii) then

buids appropriate local UF-trees (for projected databases) to find all frequent k -

itemsets (for k

≥

2) with their expected support.

8

Streaming Uncertain Frequent Pattern Mining

In addition to static probabilistic datasets of uncertain data, dynamic streams of

uncertain data can also be generated (e.g., by wireless sensors) in many real-life

applications (e.g., environment surveillance). This leads to stream mining [ 22 ].

8.1

SUF-growth

To mine frequent patterns from streams of uncertain data, Leung and Hao [ 36 ]

extended the UF-growth algorithm (ref. Sect. 5.1) to produce an exact mining al-

gorithm called SUF-growth . During the mining process, the (i) sliding window

model, (ii) time-fading model, or (iii) landmark model is commonly used in pro-

cessing batches of transactions in the data streams. When using a sliding window

model, SUF-growth captures only the contents of streaming data in batches of trans-

actions belonging to the current window (that captures the recent w batches) in a tree

structure called SUF-tree . When the window slides, SUF-growth removes from the

SUF-tree those data belonging to older batches and adds to the SUF-tree those data

belonging to newer batches. Hence, each tree node in the SUF-tree consists of three

components: (i) an item, (ii) its existential probability, and (iii) a list of its w occur-

rence counts in the path. By doing so, when the window slides, the oldest occurrence

counts (representing the oldest streaming data) are replaced by the newest occurrence

Frequent Pattern Mining

Search WWH ::

Custom Search

Home