Databases Reference
In-Depth Information
The next definition extends the concept of generator itemset to the do-
main of sequences. Different sequences can have the same sequential closure,
i.e., they are represented in the same closed sequence. Among the sequences
with the same sequential closure, the shortest sequences are called generator
sequences.
Definition 7 (Generator Sequence). An arbitrary sequence X in
D
is a
generator sequence iff there is not a sequence Y in
D
such that (i) Y
Ψ X
and (ii)sup Φ ( X )= sup Φ ( Y ) .
Special cases of the above definitions are the contiguous closed sequence
and the contiguous generator sequence , where the matching functions in set Ψ
define a contiguous subsequence relation. Instead, we have an unconstrained
closed sequence and an unconstrained generator sequence when Ψ defines an
unconstrained subsequence relation.
Knowledge about generators associated to a closed sequence X allow
generating all sequences having X as sequential closure. For example, let
closed sequence X be associated to a generator sequence Z .Consideran
arbitrary sequence Y with Z
Ψ X . Then, X is the sequen-
tial closure of Y . From Property 2, it follows that sup Φ ( Z )
Ψ Y and Y
sup Φ ( Y )and
sup Φ ( Y )
sup Φ ( X ). Being X the sequential closure of Z , Z and X have
equal support. Hence, Y has the same support as X . It follows that sequence
X is the sequential closure of Y according to Definition 6.
In the example dataset, ADBA is a contiguous closed sequence with sup-
port 33.33% under the maximum gap constraint 2. ADBA represents con-
tiguous sequences BA , DB , DBA , ADB , ADBA which satisfy the same gap
constraint. BA and DB are contiguous generator sequence for ADBA .
In the context of association rules, an arbitrary itemset has a unique clo-
sure. The property of uniqueness is lost in the sequential pattern domain.
Hence, for an arbitrary sequence X the sequential closure can include sev-
eral closed sequences. We call this set the closure sequence set of X , denoted
CS
( X ). According to Definition 6, the sequential closure for a sequence X is
defined based on the pair of matching functions ( Ψ,Φ ). Being a collection of
sequential closures, the closure sequence set of X is defined with respect to
the same pair ( Ψ,Φ ).
Property 3. Let X be an arbitrary sequence in
D
and
CS
( X ) the set of
sequences in
which are the sequential closure of X. The following properties
are verified. (i) If X is a closed sequence, then
D
CS
( X ) includes only sequence
X. (ii) Otherwise,
CS
( X ) may include more than one sequence.
In Property 3, case (i) trivially follows from Definition 5. We prove case (ii)
by means of an example. Consider the contiguous closed sequences ADCA and
ACA , which satisfy maximum gap 2 in the example dataset. The generator
sequence C is associated to both closed sequences. Instead, D is a generator
only for ADCA . From Property 3 it follows that a generator sequence can
generate different closed sequences.
Search WWH ::




Custom Search