Uncertain Frequent Pattern Mining - Frequent Pattern Mining

Table 14.9 Vectors for domain items in the probabilistic dataset $D_2$

Vector for every domain item: $\vec{a}$, $\vec{b}$, $\vec{c}$, $\vec{d}$, $\vec{e}$

Entries (row and column positions not preserved): 0.2, 0.6, 0.9, 0.6, 0.5, 0.2, 0.4, 0.6, 0.8, 0.9, 0.5, 0.7, 0.3

and other representations. For instance, Leung et al. [44] proposed the U-VIPER algorithm, in which $D$ is represented vertically by a collection of fixed-size vectors, one for each domain item $x$. The length of each vector equals the number of transactions (i.e., $n$) in the probabilistic dataset of uncertain data. When mining uncertain data, a Boolean value (say, 0 or 1) denoting whether or not transaction $t_i$ contains $x$ is not enough. If $x$ is likely to be present in a transaction $t_i$, then its associated existential probability $P(x, t_i)$, which expresses the likelihood of $x$ appearing in transaction $t_i$ of the dataset, also needs to be captured.

Because storing $P(x, t_i)$ alongside the Boolean value "1" for $t_i$ (i.e., the $i$-th element of the vector for $x$) would waste space, U-VIPER replaces the Boolean value "1" with $P(x, t_i)$ as the $i$-th element of the vector $\vec{x}$ representing domain item $x$. Specifically, the $i$-th element of $\vec{x}$ (denoted as $\vec{x}[i]$) stores (i) "0" if $x$ is absent from $t_i$ and (ii) $P(x, t_i)$ if $x$ is likely to be present in $t_i$:

$$
\vec{x}[i] = \mathit{expSup}(\{x\}, t_i) =
\begin{cases}
0 & \text{if } x \notin t_i \\
P(x, t_i) & \text{if } x \in t_i
\end{cases}
\tag{14.10}
$$
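The construction in Eq. (14.10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the U-VIPER implementation: the function name `build_vectors`, the dictionary-based transaction format, and the toy dataset are all hypothetical (the toy data is not the $D_2$ of Table 14.9).

```python
from typing import Dict, List

# Assumed input format: each transaction maps an item to its
# existential probability P(x, t_i); absent items are simply missing.
Transaction = Dict[str, float]


def build_vectors(dataset: List[Transaction]) -> Dict[str, List[float]]:
    """Build one fixed-length vector per domain item, as in Eq. (14.10)."""
    n = len(dataset)
    items = sorted({x for t in dataset for x in t})
    # Start with all zeros: 0 means "x is absent from t_i".
    vectors = {x: [0.0] * n for x in items}
    for i, t in enumerate(dataset):
        for x, p in t.items():
            vectors[x][i] = p  # P(x, t_i) takes the place of the Boolean 1
    return vectors


# Hypothetical toy dataset with n = 3 transactions.
toy = [
    {"a": 0.5, "b": 0.9},
    {"a": 0.6, "c": 0.4},
    {"b": 0.7},
]
vectors = build_vectors(toy)
```

Every vector has length $n$ regardless of how many transactions actually contain the item, which is what makes the later element-wise operations straightforward.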

See Table 14.9. With this vector-based representation, the expected support of any 1-itemset $\{x\}$ can be computed by summing all non-zero $P(x, t_i)$ values in $\vec{x}$ (i.e., taking the $L_1$-norm of $\vec{x}$):

$$
\mathit{expSup}(\{x\}, D) = \sum_{i=1}^{n} \mathit{expSup}(\{x\}, t_i) = \sum_{i=1}^{n} P(x, t_i) = \sum_{i=1}^{n} \vec{x}[i] = \|\vec{x}\|_1
\tag{14.11}
$$
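As a minimal sketch of Eq. (14.11), the expected support of a 1-itemset is simply the sum of the entries of its vector. The function name `exp_sup_item` and the example vector are hypothetical and chosen only for illustration.

```python
from typing import List


def exp_sup_item(vec: List[float]) -> float:
    """Expected support of {x} in D: the L1-norm of its vector."""
    # Entries are probabilities in [0, 1], so abs() changes nothing here;
    # it is kept only to match the definition of the L1-norm.
    return sum(abs(v) for v in vec)


# Hypothetical vector for an item a over n = 3 transactions.
vec_a = [0.5, 0.6, 0.0]
support_a = exp_sup_item(vec_a)  # roughly 1.1, up to floating-point rounding
```

Zero entries contribute nothing to the sum, so summing every entry gives the same result as summing only the non-zero $P(x, t_i)$ values.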

The $i$-th element of the vector of any $(k+1)$-itemset $X \equiv Y \cup \{z\}$ (where $Y$ is a $k$-itemset and $z$ is an item) for $k \geq 1$ can be formed by taking the product of $\mathit{expSup}(Y, t_i)$ and $P(z, t_i)$. The expected support of $X$ is then the dot product of $\vec{Y}$ and $\vec{\{z\}}$
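The extension step described above can be sketched as follows. This is an illustrative sketch rather than the published algorithm: the helper names `extend_vector` and `dot`, and the three example vectors, are hypothetical.

```python
from typing import List


def extend_vector(vec_y: List[float], vec_z: List[float]) -> List[float]:
    """Vector of X = Y ∪ {z}: element-wise product of expSup(Y, t_i) and P(z, t_i)."""
    if len(vec_y) != len(vec_z):
        raise ValueError("vectors must have the same length n")
    return [y * z for y, z in zip(vec_y, vec_z)]


def dot(u: List[float], v: List[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length n")
    return sum(a * b for a, b in zip(u, v))


# Hypothetical vectors for items a, b, c over n = 3 transactions.
vec_a = [0.5, 0.6, 0.0]
vec_b = [0.9, 0.5, 0.7]
vec_c = [0.0, 0.4, 0.2]

vec_ab = extend_vector(vec_a, vec_b)    # per-transaction expSup({a, b}, t_i)
exp_sup_ab = dot(vec_a, vec_b)          # expected support of {a, b} in D
vec_abc = extend_vector(vec_ab, vec_c)  # reuse the 2-itemset vector for k = 2
exp_sup_abc = dot(vec_ab, vec_c)        # expected support of {a, b, c} in D
```

Keeping the element-wise product `vec_ab` around, rather than only the scalar `exp_sup_ab`, is what allows the same step to be repeated for larger itemsets. The sum of the entries of `vec_ab` equals `dot(vec_a, vec_b)`, so the two views agree.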
