4.2 Features Capturing the Manifestation of Activity Relationships
We distinguish between two classes of features: (i) global features and (ii) local
features. Global features are defined over an event log, while local features can
be defined at the trace level. Based on the follows (precedes) relation, we propose
two global features, viz., Relation Type Count and Relation Entropy, and two
local features, viz., Window Count and J-measure. These features are defined as
follows:
- Relation Type Count (RC): The relation type count with respect to the follows (precedes) relation is a function $f_{RC} : \Sigma \to \mathbb{N}_0^3$ defined over the set of activities. $f_{RC}$ of an activity $b \in \Sigma$ with respect to the follows (precedes) relation over an event log $\mathcal{L}$ is a triple $\langle c_a, c_s, c_n \rangle$, where $c_a$, $c_s$, and $c_n$ are the numbers of activities in $\Sigma$ that always, sometimes, and never follow (precede) $b$ in $\mathcal{L}$, respectively. For the event log $\mathcal{L}$ mentioned above, $f_{RC}(a) = \langle 2, 9, 0 \rangle$, since e and h always follow a while all other activities in $\Sigma \setminus \{e, h\}$ sometimes follow a. $f_{RC}(i) = \langle 1, 4, 6 \rangle$, since only j always follows i; b, d, e, and k sometimes follow i, while a, c, f, g, h, and i never follow i. For an event log containing $|\Sigma|$ activities, this results in a feature vector of dimension $3|\Sigma|$ (if either the follows or the precedes relation is considered) or $2 \times 3|\Sigma|$ (if both the follows and precedes relations are considered).
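The counting behind the relation type count can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `relation_type_count` is hypothetical, the three traces quoted in this section (acaebfh, ahijebd, aeghijk) stand in for the event log, and "always", "sometimes", and "never" are interpreted per occurrence of the reference activity. Under these assumptions the sketch yields the two triples quoted for a and i.

```python
def relation_type_count(log, alphabet, b):
    """Return (c_a, c_s, c_n) for the follows relation with respect to activity b.

    Assumption: x "always" follows b if x appears after every occurrence of b
    in the log, "never" if it appears after none, and "sometimes" otherwise.
    """
    # For each occurrence of b, collect the activities seen later in that trace.
    follower_sets = [
        set(trace[i + 1:])
        for trace in log
        for i, act in enumerate(trace)
        if act == b
    ]
    c_a = c_s = c_n = 0
    for x in sorted(alphabet):
        hits = sum(x in followers for followers in follower_sets)
        if follower_sets and hits == len(follower_sets):
            c_a += 1
        elif hits == 0:
            c_n += 1
        else:
            c_s += 1
    return c_a, c_s, c_n


# Stand-in log: the three traces quoted in this section (an assumption).
log = ["acaebfh", "ahijebd", "aeghijk"]
alphabet = set("abcdefghijk")

print(relation_type_count(log, alphabet, "a"))  # (2, 9, 0)
print(relation_type_count(log, alphabet, "i"))  # (1, 4, 6)
```

Concatenating the triples of all activities gives the feature vector of dimension 3|Σ| described above.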
- Relation Entropy (RE): The relation entropy with respect to the follows (precedes) relation is a function $f_{RE} : \Sigma \to \mathbb{R}^{+}$ defined over the set of activities. $f_{RE}$ of an activity $b \in \Sigma$ with respect to the follows (precedes) relation is the entropy of the relation type count metric. In other words, $f_{RE}(b) = -p_a \log p_a - p_s \log p_s - p_n \log p_n$, where $p_a = c_a/|\Sigma|$, $p_s = c_s/|\Sigma|$, and $p_n = c_n/|\Sigma|$. For the above example event log $\mathcal{L}$, $f_{RE}(a) = 0.68$ (corresponding to $f_{RC}(a) = \langle 2, 9, 0 \rangle$) and $f_{RE}(i) = 1.32$ (corresponding to $f_{RC}(i) = \langle 1, 4, 6 \rangle$); these values correspond to base-2 logarithms with the convention $0 \log 0 = 0$. For an event log containing $|\Sigma|$ activities, this results in a feature vector of dimension $|\Sigma|$ or $2 \times |\Sigma|$, depending on whether either or both of the follows/precedes relations are considered.
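The two entropy values quoted for a and i can be checked with a short Python sketch. The function name `relation_entropy` is hypothetical; the base-2 logarithm and the convention 0 log 0 = 0 are assumptions, chosen because they reproduce the quoted numbers.

```python
import math


def relation_entropy(counts):
    """Entropy (base 2, an assumption) of a relation type count triple."""
    total = sum(counts)  # equals |Sigma|, since the three counts partition the alphabet
    entropy = 0.0
    for c in counts:
        if c > 0:  # convention: 0 * log(0) contributes nothing
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


print(round(relation_entropy((2, 9, 0)), 2))  # 0.68
print(round(relation_entropy((1, 4, 6)), 2))  # 1.32
```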
- Window Count (WC): The window count with respect to the follows (precedes) relation is a function $f_{WC} : \Sigma \times \Sigma \to \mathbb{N}_0$ defined over the set of activity pairs. Given a trace $t$ and a window of size $l$, let $S_l$ be the set of all subsequences $t(i, i+l-1)$ such that $t(i) = a$ and there exists a $j$ such that $i < j < i+l$ and $t(j) = b$. The window count of the relation b follows a is defined as the number of sequences of length $l$ in which b follows a. In other words, $f_{WC}(a, b) = |S_l|$. For the above example event log $\mathcal{L}$, using a window of size $l = 4$, $f_{WC}(a, b) = 1$ for trace acaebfh and 0 for traces ahijebd and aeghijk.
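A minimal Python sketch of the window count for a single trace follows. The function name `window_count` is hypothetical, and two details are assumptions not settled by the definition: windows are identified by their start position, and a window that runs past the end of the trace is truncated rather than discarded. Neither choice affects the three example traces.

```python
def window_count(trace, a, b, l):
    """Count windows of size l that start at an occurrence of a and contain a later b."""
    count = 0
    for i, act in enumerate(trace):
        # Window covers positions i .. i+l-1; b must occur strictly after position i.
        if act == a and b in trace[i + 1:i + l]:
            count += 1
    return count


for trace in ["acaebfh", "ahijebd", "aeghijk"]:
    print(trace, window_count(trace, "a", "b", 4))
# acaebfh 1
# ahijebd 0
# aeghijk 0
```

In acaebfh only the window starting at the second a (aebf) contains b; the window starting at the first a (acae) does not.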
- J-Measure: Smyth and Goodman [11] have proposed a metric called J-measure, based on [12], to quantify the information content (goodness) of a rule. We adopt this metric as a feature to characterize the significance of the relationship between activities. The basis lies in the fact that one can consider the relation b follows a as a rule: "if activity a occurs, then activity b will probably occur". The J-measure with respect to the follows (precedes) relation is a function $f_J : \Sigma \times \Sigma \to \mathbb{R}^{+}$ defined over the set of activity pairs. Let $p(a)$ and $p(b)$ denote the probability of occurrence of activities a and b, respectively, in a trace $t$. Let $p_l(a\,F\,b)$ denote the probability that b follows