Information Technology Reference
In-Depth Information
It indicates that
CG
is a subset of
CA
, and may be much smaller than
k
k
CA
,and then can be candidate sets for the size-
k
large itemsets. The differences
k
between
CG
and
CA
depends on distributing degree of the itemsets. The set
k
k
of candidate sets
CG i
can be generated locally at each site
S
at the k-th iteration.
k
i
After the exchange of support counts, the gl-large itemsets
GL i
in
CG i (
k
)
can be
k
found at the end of that iteration. Based on
GL i
,
t he candidate sets at
S i
for the
k
(
+ 1)-st iteration can then be generated. By using this approach, the number of
candidate sets generated can be substantially reduced to about 10 - 25% of that
generated in Count Distribution.
k
12.6.2 Local pruning of candidate sets
When the set of candidate set C'G (k) is generated, to find the globally large
itemsets, the support counts of the candidate sets must be exchainged among all
the sites. Notice that some candidate sets in CG (k) can be pruned by a local
pruning technique before count exchange starts. The general idea is that at each
site S i , if a candidate set X CG i
is not locally large at site S i , there is no need
for S i , to find out its global support count to determine whether it is gllobally
large. This is because in this case, either X is small, or it will be locally large at
some other site, and hence only the site(s) at which X is locally large need to be
responsible to find the global support count of X . Therefore, in order to compute
all the large k-itemsets, at each site Si, the candidate sets can be confined to only
the sets X CG i
k
which are locally large at site Si. For clarity, the notations
used in this section are listed in Table 12.2.
k
Table 12.2 Notation list
D
number of transaction in DB
minimum support minsup
Globally large k -itemsets
candidate sets generated from L ( k )
global support total count of X
number of transactions in DB i
Gl-large k -itemsets at site S i
candidate sets generated from GL i
s
L(k)
CA(k)
X.sup
DI
GLi k
CGi k
LLi k
X.supi
k
-1
local large k -itemsets in CG i
local support total count at site S i
k
Search WWH ::




Custom Search