- $\sigma \subseteq M \times M$ is a subprocess relation refining a process
model with subprocess models, such that $\forall m_i, m_j \in M$, where
$j = 1, 2, \ldots, n$ and $i \neq j$, if $(m_i, m_j) \in \sigma$ then
$(m_j, m_i) \notin \sigma^+$, where $\sigma^+$ is a transitive reflexive
closure of $\sigma$.
Definition 3 explicitly enumerates the model collection activities and property
value types. The relation $\sigma$ formalizes the subprocess relation that exists
between models. Note that, according to the definition, $\sigma$ permits only a
process model hierarchy without loops. Without loss of generality, in the
remainder of this paper we discuss abstraction of process models within a
process model collection.
Indeed, a process model $m_i$ can be seen as a trivial process model collection
$c = (\{m_i\}, A_i, P_i, \emptyset)$.
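
The loop-freeness condition on $\sigma$ can be checked mechanically. The following is a minimal sketch, assuming models are identified by strings and $\sigma$ is given as a set of (parent, subprocess) pairs; the function names `transitive_closure` and `is_loop_free` are hypothetical and not part of the definition. Reflexive pairs are left out of the closure because the condition only concerns pairs with $i \neq j$.

```python
def transitive_closure(sigma, models):
    """Return the transitive closure of sigma as a set of (source, target) pairs."""
    closure = set(sigma)
    # Warshall's algorithm: allow each model in turn as an intermediate step.
    for k in models:
        for i in models:
            for j in models:
                if (i, k) in closure and (k, j) in closure:
                    closure.add((i, j))
    return closure


def is_loop_free(sigma, models):
    """Check: if (m_i, m_j) is in sigma and i != j, then (m_j, m_i) is not in the closure."""
    closure = transitive_closure(sigma, models)
    return all((mj, mi) not in closure for (mi, mj) in sigma if mi != mj)


# Hypothetical collection: m1 is refined by m2, which is refined by m3.
models = ["m1", "m2", "m3"]
hierarchy = {("m1", "m2"), ("m2", "m3")}
print(is_loop_free(hierarchy, models))
print(is_loop_free(hierarchy | {("m3", "m1")}, models))
```

Adding the pair `("m3", "m1")` closes a cycle, so the second relation violates the condition while the first satisfies it.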
2.2 Activity Aggregation as Cluster Analysis Problem
In this paper we interpret activity aggregation as a problem of cluster analysis.
Consider a process model $m_i = (A_i, G_i, F_i, P_i, \mathit{props}_i)$ from a
process model collection $c = (M, A, P, \sigma)$. The set of objects to be
clustered is the set of activities $A_i$. The objects are clustered according to
a distance measure: objects that are “close” to each other according to this
measure are put together. The distance between objects is evaluated through
analysis of activity property values $P$. The activity clusters that result from
cluster analysis correspond to coarse-grained activities of the abstract process
model. While cluster analysis provides a large variety of algorithms, e.g., see
[29], we focus on one algorithm that suits the business process model
abstraction use case considered here.
In the considered scenario, the user demands control over the number of
activities in the abstract process model. For example, a popular practical
guideline is that five to seven activities are displayed on each level in the
process model [30]. Given a fixed number, e.g., 6, the clustering algorithm has
to ensure that the number of clusters equals the number requested by the user.
We turn to the k-means clustering algorithm, as it is simple to implement and
typically exhibits good performance [16]. K-means clustering partitions an
activity set into k clusters. The algorithm assigns an activity to the cluster
whose centroid is closest to this activity. To evaluate the distance between
activities, we analyze activity property values $P$. We foresee a number of
alternative activity distance measures and elaborate on them in the next
section.
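
To illustrate how a fixed k drives the aggregation, here is a minimal k-means sketch in plain Python. Several choices are assumptions made only for illustration and are not taken from the paper: activities are encoded as numeric tuples (placeholder values below), distance is Euclidean, and the initial centroids are simply the first k activities in sorted name order. The function name `k_means` and the activity names `a1` to `a4` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def k_means(vectors, k, max_iter=100):
    """Partition activities into k clusters.

    vectors: dict mapping an activity name to its numeric vector.
    Returns a list of k sets of activity names.
    """
    names = sorted(vectors)
    if not 1 <= k <= len(names):
        raise ValueError("k must be between 1 and the number of activities")
    # Assumption: deterministic seeding with the first k activities.
    centroids = [vectors[name] for name in names[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each activity joins the cluster with the closest centroid.
        new_assignment = {
            name: min(range(k), key=lambda c: euclidean(vectors[name], centroids[c]))
            for name in names
        }
        if new_assignment == assignment:
            break  # no activity changed cluster, so the partition is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[name] for name in names if assignment[name] == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    clusters = [set() for _ in range(k)]
    for name, c in assignment.items():
        clusters[c].add(name)
    return clusters


# Placeholder vectors for four hypothetical activities; the user requests k = 2.
activity_vectors = {
    "a1": (1, 1, 0),
    "a2": (0, 0, 1),
    "a3": (1, 0, 0),
    "a4": (0, 1, 1),
}
print(k_means(activity_vectors, 2))
```

Each returned set would become one coarse-grained activity of the abstract model. Note that plain k-means can leave a cluster empty when several activities share the same vector, so a practical implementation would need a re-seeding rule to guarantee exactly k non-empty clusters.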
2.3 Activity Distance Measures
To introduce the distance measure among activities, we represent activities as
vectors in a vector space. Such an approach is inspired by the vector space
model, an algebraic model widely used in information retrieval [28]. The space
dimensions correspond to activity property values $P$, and the vector space can
be captured as a vector $(p_1, \ldots, p_{|P|})$, where $p_j \in P$ for
$j = 1, \ldots, |P|$. Consider an example set of property values
$P = \{\text{FA data}, \text{QA data}, \text{Raw data}\}$ and the corresponding
vector space presented in Fig. 2. A vector $v_a$ representing an activity
$a \in A_i$ in process model $m_i = (A_i, G_i, F_i, P_i, \mathit{props}_i)$ is
constructed as follows. If activity $a$
 