Database Reference
In-Depth Information
Algorithm 1. Pattern-Growth Approach
1: Input: SLU probabilistic database D p and support threshold θ .
2: Output: All sequences s with ES ( s, D p ) ≥ θm .
3: L 1 ComputeFrequent-1-sequences( D p )
4: for all sequences x ∈ L 1 do
5: Compute B i,x arrays
6: Call ProjectedDB( x )
7: function ProjectedDB( s )
8: L S Compute Frequent S-extensions { fast L 1 computation }
9: L I Compute Frequent I-extensions { fast L 1 computation }
10: Output all Frequent Sequences {s extended with x , for all x in L S and L I }
11: for all x ∈ L S do
12: t ←s ·{x} { S-extension }
13: Compute B i,t arrays
14: ProjectedDB( t )
15: for all x ∈ L I do
16: t ←s 1 ,...,s q ∪{x} { I-extension }
17: Compute B i,t arrays
18: ProjectedDB( t )
19: end function
arrays for each source (Line 5) and also keep track of all the sources where
Pr[ x D i
] > 0 (projected database) and then, call the ProjectedDB( x ) sub-
routine (Line 6).
In the ProjectedDB sub-routine, we first compute all the frequent S- and
I-extensions of s using the fast L 1 computation by applying Eq. 5 and Eq. 6
accordingly (Line 8 and 9). In step 10, output all the frequent S- and I-extensions
of s computed in the previous steps. In steps 11-18, for every sequence t which
is a frequent S- or I-extension of s , compute B i,t
arrays and also keep track of
all the sources where Pr[ t D i
] > 0, and call the ProjectedDB sub-routine
recursively to mine all frequent sequential patterns.
Table 3. Number of DP computations (in millions) performed by each algorithm, for
the set of experiments in Fig. 1 (L) and in Fig. 2 (R)
BFS+P DFS+P PGA
θ = 0.5 % 3.141
C10D10K
C =10 =1% BFS+P DFS+P PGA
D = 10K
3.199 1.494
1.465
1.711
0.886
θ = 1 % 1.711
1.465 0.886
D = 20K
2.890
3.370
1.735
θ = 2 % 0.879
0.781 0.487
D = 40K
5.716
6.718
3.464
θ = 4 % 0.505
0.487 0.281
D = 80K
11.460
13.423
6.936
gazelle
D
=10
= 25%
C = 10
K, θ
θ
= 0.01 % 0.777
0.375 0.261
0.100
0.111
0.081
θ
= 0.02 % 0.149
0.172 0.152
C = 20
0.690
0.694
0.314
θ
= 0.03 % 0.073
0.129 0.122
C = 40
13.044
13.891
4.868
θ
= 0.04 % 0.045
0.110 0.107
C = 80
2353.677 2782.807 881.480
 
Search WWH ::




Custom Search