Database Reference
In-Depth Information
Table 14.7 Streaming
uncertain data D 3
Batch ID
Set of items with existential probability
B 1
{ t 1 ={ f :0.7},
t 2 ={ f :0.7, g :0.9, h :0.7, i :0.5},
t 3 ={ f :0.7, g :0.9, h :0.8}}
B 2
{ t 4 ={ f :0.7, g :0.9},
t 5 ={ g :0.6},
t 6 ={ g :0.6}}
B 3
{ t 7 ={ f :0.7, g :0.9, h :0.7, i :0.5},
t 8 ={ f :0.7, g :0.9, h :0.8},
t 9 ={ f :0.7, g :0.9, i :0.5}}
and
reduce:
{ x }
,
{ x }
-proj. DB
list of
frequent k -itemset X , expSup ( X , D )
.
(14.9)
To recap, by using the above two sets of “map” and “reduce” functions, the MR-
growth (i) first finds all frequent 1-itemsets with their expected support and (ii) then
buids appropriate local UF-trees (for projected databases) to find all frequent k -
itemsets (for k
2) with their expected support.
8
Streaming Uncertain Frequent Pattern Mining
In addition to static probabilistic datasets of uncertain data, dynamic streams of
uncertain data can also be generated (e.g., by wireless sensors) in many real-life
applications (e.g., environment surveillance). This leads to stream mining [ 22 ].
8.1
SUF-growth
To mine frequent patterns from streams of uncertain data, Leung and Hao [ 36 ]
extended the UF-growth algorithm (ref. Sect. 5.1) to produce an exact mining al-
gorithm called SUF-growth . During the mining process, the (i) sliding window
model, (ii) time-fading model, or (iii) landmark model is commonly used in pro-
cessing batches of transactions in the data streams. When using a sliding window
model, SUF-growth captures only the contents of streaming data in batches of trans-
actions belonging to the current window (that captures the recent w batches) in a tree
structure called SUF-tree . When the window slides, SUF-growth removes from the
SUF-tree those data belonging to older batches and adds to the SUF-tree those data
belonging to newer batches. Hence, each tree node in the SUF-tree consists of three
components: (i) an item, (ii) its existential probability, and (iii) a list of its w occur-
rence counts in the path. By doing so, when the window slides, the oldest occurrence
counts (representing the oldest streaming data) are replaced by the newest occurrence
 
Search WWH ::




Custom Search