Definition 9: We define the Class of a Sequence S, denoted $S_C$, as the cluster composed
of the multidimensional sequences whose sequences are similar to S, where S is a
sequential pattern.
To define characteristic rules, we adopt and formalize the definition given in [9],
which defines the characterization of a subset as the description of the properties
specific to that subset, compared with all objects in the database.
Definition 10: We denote by se a subset of the database DB, by prop a multidimensional
property $(a_{i_1}, \ldots, a_{i_k})$, by $freq_{se}(prop)$ the number of objects in se that
meet the property prop, and by card(se) the cardinality of se. The significance of prop
in the subset se is defined as:
$$F^{DB}_{se}(prop) = \frac{freq_{se}(prop) / card(se)}{freq_{DB}(prop) / card(DB)}$$
Definition 11: Let R be a real number standing for the significance threshold. prop is
said to be characteristic of se, denoted prop ∈ se [significance], if and only if:
$F^{DB}_{se}(prop) = significance \geq R$.
Definition 12: Let $S_c$ be the class of a sequential pattern $SP_c$. We define a
multidimensional sequential rule as: prop ∈ $SP_c$ [significance].
This multidimensional sequential rule means that the multidimensional property
prop is characteristic of the sequential pattern $SP_c$ with the computed significance.
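As a minimal sketch of Definitions 10 and 11, the significance of a property can be computed by counting matching records in the subset and in the whole database. Everything concrete here is an assumption made for illustration: the toy records, the two attributes (a gender code and an age group), the helper names `meets`, `freq`, `significance` and `is_characteristic`, and the threshold value. The subset se is picked by hand (records whose sequence equals "HWH") rather than built from a similarity measure, and the threshold test is taken to be significance ≥ R.

```python
# Each record mirrors the tuple (RID, sequence, (a_1, ..., a_m)).
# The attribute values are invented for illustration only.
DB = [
    (1, "HWH",   ("F", "adult")),
    (2, "HWH",   ("M", "adult")),
    (3, "HSWSH", ("F", "adult")),
    (4, "HSH",   ("F", "child")),
    (5, "HSH",   ("M", "child")),
    (6, "HWH",   ("M", "adult")),
    (7, "HMH",   ("F", "senior")),
    (8, "HLH",   ("M", "senior")),
]


def meets(record, prop):
    """True if the record's attributes match every (index, value) pair in prop."""
    _rid, _seq, attrs = record
    return all(attrs[i] == value for i, value in prop.items())


def freq(records, prop):
    """Number of records that meet the property."""
    return sum(1 for r in records if meets(r, prop))


def significance(se, db, prop):
    """Relative frequency of prop in se divided by its relative frequency in db."""
    freq_db = freq(db, prop)
    if not se or freq_db == 0:
        raise ValueError("undefined for an empty subset or a property absent from db")
    return (freq(se, prop) / len(se)) / (freq_db / len(db))


def is_characteristic(se, db, prop, threshold):
    """Assumed reading of the threshold test: significance >= R."""
    return significance(se, db, prop) >= threshold


# Hypothetical subset standing in for the class of a sequential pattern.
SE = [r for r in DB if r[1] == "HWH"]

ADULT = {1: "adult"}   # property on the second attribute
FEMALE = {0: "F"}      # property on the first attribute

print(significance(SE, DB, ADULT))
print(is_characteristic(SE, DB, ADULT, threshold=1.5))
print(is_characteristic(SE, DB, FEMALE, threshold=1.5))
```

A significance above 1 means the property is more frequent in the subset than in the database as a whole; with these toy records, all three members of the subset are adults while only half of the database is.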
Table 8.1 shows an example of a multidimensional sequence database. The tuple
(1, <s_1, s_2, .., s_n>, a_1, …, a_m) stands for a multidimensional sequence of the database.
Table 8.1. A Multidimensional Sequence Database
RID   S                               A_1   A_2   ...   A_i   ...   A_m
1     <s_1, s_2, .., s_i, .., s_n>    a_1   a_2   ...   a_i   ...   a_m
...
k     ...
8.2.3 Description of the Use Case and Datasets
The target application is the analysis of population time use, and more precisely of
people's daily activities and displacements. The dataset describes the daily activities
and displacements carried out by each person of a surveyed household, at the scale of a
whole urban area. Each person's day can be seen as a sequence of activities, also called
an activity program [24]. For example, during a day, an individual can leave home, drive
the children to school, go to work, pick the children up from school, and return home.
This sequence can be described as (Home, School, Work, School, Home). To simplify the
notation, we represent each activity by a specific character, e.g. H for Home, W for
Work, and S for School. Other activities are Market (denoted M), Restaurant (R),
Leisure (L), etc. This alphabet can be as long as necessary. Then, by removing the comma
separators, a sequence can be simplified to a character string, e.g. HSWSH for the
previous sequence. Although we have used activity programs as an example in our
experiments, the analysis is also relevant for other sequences, such as the transport
mode used for displacements, the departure time, and so on.
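The encoding of an activity program as a character string can be sketched as follows. The mapping covers only the six activities named in the text; the names `ACTIVITY_CODES`, `encode` and `decode` are hypothetical helpers introduced for this illustration.

```python
# One character per activity, as listed in the text; extend as needed.
ACTIVITY_CODES = {
    "Home": "H",
    "Work": "W",
    "School": "S",
    "Market": "M",
    "Restaurant": "R",
    "Leisure": "L",
}


def encode(program):
    """Turn a sequence of activity names into a compact character string."""
    return "".join(ACTIVITY_CODES[activity] for activity in program)


def decode(code):
    """Recover the activity names from a character string."""
    inverse = {char: name for name, char in ACTIVITY_CODES.items()}
    return [inverse[char] for char in code]


day = ["Home", "School", "Work", "School", "Home"]
print(encode(day))  # HSWSH
```

Because each activity maps to exactly one character, the string form is reversible, and ordinary string operations can be applied to activity programs.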