Database Reference
In-Depth Information
estimate beforehand how many patterns will be generated with a particular threshold
[ 3 ].
An algorithm called TSP [ 38 ] is proposed to discover top- k closed sequential pat-
terns of length no less than min_l , where k is the desired number of closed sequential
patterns to be mined and min_l is the minimal length of each pattern. TSP is a multi-
pass search space traversal algorithm that finds the most frequent patterns early in
the mining process and allows dynamic raising of min_support which is then used
to prune unpromising branches in the search space. Also, TSP devises an efficient
closed pattern verification method which guarantees that during the mining process
the candidate result set consists of the desired number of closed sequential patterns.
The efficiency of TSP is further improved by applying the minimum length con-
straint in the mining and by employing the early termination conditions developed
in CloSpan [ 25 ].
5.7
Frequent Episode Mining
Introduced by Mannila et al. [ 39 ], an episode is defined as a collection of events that
occur relatively close to each other in a given partial order, and the task of frequent
episode mining is to find all episodes that occur frequently in an event sequence,
given a class of episodes and an input sequence of events. The algorithm proposed
in [ 39 ] works iteratively, alternating between building and recognition phases. First,
in the building phase of an iteration i , a collection C i of new candidate episodes of
i elementary events is built, using the information available from smaller frequent
episodes. Then, these candidate episodes are recognized in the event sequence and
their frequencies are computed. The collection L i consists of frequent episodes in C i .
In the next iteration i
1, candidate episodes in C i + 1 are built using the information
about the frequent episodes in L i . The algorithm starts by constructing C i to contain
all episodes consisting of single elementary events. In the end, the frequent episodes
in L i for all i are output. This is a typical Apriori-like algorithm under which the
downward closure principal holds, that is, all subepisodes of a frequent episode are
frequent.
Tatti and Cule [ 40 , 41 ] introduced a technique for discovering closed episodes.
They introduced a new subclass of general episodes, called strict episodes, which
can ease the computational burden caused by the traditional closure-definition of
serial episodes. An episode is strict if all nodes with the same label are connected.
This class of strict episodes is large, containing all serial and parallel episodes, as
well as episodes with unique labels. A natural subset relationship between episodes
is introduced based on the subset relationship of sequences covering the episodes,
and the subset relationship can be computed efficiently for strict episodes. In order to
mine closed episodes they defined an auxiliary closure operator. This closure satisfies
the needed properties so that they can use the existing framework for mining closed
patterns. Discovering the true closed episodes is done via a post-processing step.
+
Search WWH ::




Custom Search