for each pair of activities we can observe the outcome: either the activities
belong to different subprocesses or to the same one. For a process model collection
c = (M, A, P, σ), the function diff formalizes this observation:
\[
\mathit{diff}(a_k, a_l) =
\begin{cases}
0, & \text{if } a_k, a_l \in A_i;\\
1, & \text{otherwise.}
\end{cases}
\tag{4}
\]
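As an illustration of Equation (4), the following minimal sketch computes diff for every pair of activities. It assumes that each A_i is the set of activities of one subprocess; the function signature, the variable names, and the toy activity labels are hypothetical and not taken from the studied model collection.

```python
from itertools import combinations


def diff(a_k, a_l, subprocesses):
    """Return 0 if both activities occur in the same subprocess, else 1."""
    same = any(a_k in acts and a_l in acts for acts in subprocesses)
    return 0 if same else 1


# Hypothetical toy data: each set holds the activities of one subprocess (one A_i).
subprocesses = [
    {"receive order", "check order"},
    {"ship goods", "send invoice"},
]

# Observe the outcome for every unordered pair of activities.
activities = sorted(set().union(*subprocesses))
observations = {
    (a, b): diff(a, b, subprocesses) for a, b in combinations(activities, 2)
}
```

The resulting dictionary holds one observed outcome per activity pair, which is the form in which the values of diff can later serve as the dependent variable of a regression.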
To mine the process model collection fingerprint W, we select its value in such a
way that the behavior of function dist_agg approximates the behavior of diff. The
discovery of vector W is realized by means of linear regression. In our setting,
the values dist_t are considered the independent variables and the value of function
diff is the dependent variable. The components of vector W are the regression
coefficients. The standardized coefficients indicate the impact of each activity
property type on the abstraction style. Hence, it is possible to reveal the criteria
employed by the human designer during abstraction. Furthermore, the regression's
coefficient of determination R^2 allows us to judge how well the obtained
statistical model explains the observed behavior. For our purposes, R^2 indicates
whether the discovered statistical model can be used for business process model
abstraction.
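The regression step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes ordinary least squares with an intercept term, uses NumPy, and runs on synthetic data. The function name fit_fingerprint, the three property types, and the generated distances are all hypothetical.

```python
import numpy as np


def fit_fingerprint(dist, diff_values):
    """Ordinary least squares of diff on the per-type distances dist_t.

    dist: array of shape (n_pairs, n_types), one column per property type.
    diff_values: array of shape (n_pairs,) holding the observed 0/1 outcomes.
    Returns (intercept, weights, standardized weights, R^2).
    """
    dist = np.asarray(dist, dtype=float)
    y = np.asarray(diff_values, dtype=float)
    # Assumption: the model includes an intercept column.
    X = np.column_stack([np.ones(len(y)), dist])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    intercept, weights = coef[0], coef[1:]
    # Coefficient of determination: 1 - SS_res / SS_tot.
    residuals = y - X @ coef
    centered = y - y.mean()
    r2 = 1.0 - (residuals @ residuals) / (centered @ centered)
    # Standardized coefficients: rescale by the spread of each variable.
    standardized = weights * dist.std(axis=0, ddof=1) / y.std(ddof=1)
    return intercept, weights, standardized, r2


# Synthetic example: three hypothetical property types, 200 activity pairs.
rng = np.random.default_rng(0)
dist = rng.random((200, 3))
# By construction, the first property type dominates the outcome.
y = (0.8 * dist[:, 0] + 0.2 * dist[:, 1] > 0.5).astype(float)

intercept, weights, standardized, r2 = fit_fingerprint(dist, y)
```

In this sketch the vector weights plays the role of W, the standardized coefficients allow the property types to be ranked by their impact, and r2 indicates how much of the variation in diff the fitted model explains.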
3 Empirical Validation
The proposed activity aggregation mechanism calls for validation. The goal of
the validation is to learn how well the proposed operation approximates the
abstraction style of human modelers. We performed an empirical validation of
the approach by conducting an experiment with a real-world business process
model collection. This section describes the explored process model collection,
explains the experiment design, and discusses the validation results.
3.1 Validation Setup
As a research object we chose a set of business process models from a large
telecommunication service provider. This organization is currently in the process
of setting up a repository with high-quality process models, which are brought
together for the purpose of consultation and re-use by business users. The model
set includes 30 elaborate models, enriched with activity properties of the
following types: roles, responsible roles, IT systems, and data objects. Note
that a special type of role, the responsible role, is distinguished in these
models. In addition to these non-control-flow types of information, we also study
the impact that activity labels and neighboring control flow elements
have on the decision to aggregate activities into the same subprocess. To
compare activities with respect to their labels, the corresponding vector space is
formed by the words that appear in the labels. Against this background, finding
the distance between activities becomes an information retrieval task, as labels
can be treated as documents in information retrieval. The comparison of
activities with respect to their neighbors shows whether the neighborhoods of the two