Information Technology Reference
In-Depth Information
UP to now, several DHT-based publish/subscribe systems have been proposed,
including topic-based [15-16] and content-based [2-4] systems. Most of them are
designed on the basis of existing DHT networks, and the other proposals are put
forward based on novel DHT structures [5].
In this paper, we propose a DHT-based publish/subscribe system that can be ap-
plied on any DHT overlay networks. In DMPSS, we only provide methods that
specify where subscriptions are stored and where events are published to. Users can
choose the overlay network according to their needs. And the routing mechanism of
the message is decided by the overlay that the user chooses.
2.3
Frequent Itemset Mining
Frequent itemsets are itemsets that appear in a data set with frequency no less than
a user-specified threshold. For example, a set of items, such as milk and bread, that
appears frequently together in a transaction data set, is a frequent itemset. Given a
itemset database D , the support of a itemset
α
is the percentage of itemsets in D
which contain
α
.Let I
= {
i 1 ,
i 2 ,...,
i n }
be a set of all items. A k -itemset
α
,which
consists of k items from I , is frequent if the support of
α
no lower than a user-
specified minimum support threshold [17].
In DMPSS, we extract attribute sets from subscriptions and find frequent itemsets
using frequent itemset mining method, such as FP-growth [18] and Apriori [19].
Then we install subscriptions and events in terms of frequent itemsets. The details
of DMPSS are illustrated in the next section.
3Sy emD ign
The usage of frequent itemset for subscription and event installation is the most im-
portant innovation of DMPSS. The frequent itemset based policy of DMPSS realizes
two advantages. First, it achieves even matching load distribution on RPs. Second,
it reduces the event publication cost by grouping subscriptions that don't contain
frequent itemsets onto a certain number of RPs.
In this section, details of DMPSS are presented in Section 3
.
1 - Section 3
.
6.
Section 3
.
7 introduces a load balancing policy.
3.1
Frequent Itemsets Mining from Historical Recorders of
Subscriptions
At first, we extract attribute sets from historical records of subscriptions. Then set
the minimum support threshold minsup and find the frequent itemsets using frequent
itemset mining method.
In order to management subscriptions that don't contain frequent itemset, we
select M non-frequent itemset randomly, and call them virtual frequent itemset. The
value of M is limited by
Search WWH ::




Custom Search