Establishing an activity's granularity level is also a recurrent challenge in
process mining, where logs often contain very fine-grained records. As a
result, process models mined directly from the logs can be overloaded with
information, making them hard to comprehend. Activity clustering is an efficient
means to raise the abstraction level of the mined models. In [14,15], Günther
and van der Aalst propose activity aggregation mechanisms based on clustering
algorithms. These mechanisms make extensive use of information that is present in
process logs but less common in process models, i.e., timestamps of activity starts
and stops, activity frequencies, and transition probabilities. Thus, in contrast
to the activity aggregation approach proposed in this paper, process mining
considers other activity property types for clustering and utilizes other clustering
algorithms.
5 Conclusions and Future Work
Although business process model abstraction has been addressed in a number of
research endeavors, this paper proposes a novel approach in this area. Specifically,
it exploits semantic aspects—beyond the control-flow perspective—to determine
a similarity between different activities for the purpose of simplifying process
model abstraction. Relevant levels of similarity can be determined on the basis
of existing process models in which abstraction was already applied.
Our main contribution is a method to discover sets of related activities, where
each set corresponds to a coarse-grained activity of an abstract process model.
As a second contribution, we propose an approach to discover an abstraction
style inherent to a given process model collection, which is reusable for the
abstraction of new process models. Both contributions are of practical interest,
as they address model management issues recurrently appearing in process
model projects. The experimental validation provides strong support for the
applicability and effectiveness of the presented ideas.
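
To make the idea of discovering sets of related activities more concrete, the following minimal sketch represents each activity as a binary vector over its semantic property values and groups activities whose cosine similarity exceeds a threshold. It is an illustration under stated assumptions, not the algorithm of this paper: the greedy grouping strategy, the function names, the example activities, and the threshold value are all hypothetical.

```python
from math import sqrt


def to_vector(properties, vocabulary):
    """Binary vector: 1.0 if the activity has the property value, else 0.0."""
    return [1.0 if value in properties else 0.0 for value in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equally long vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_activities(activities, threshold):
    """Greedy grouping (an assumption, not the paper's clustering algorithm):
    an activity joins the first group in which it is similar enough to every
    member; otherwise it starts a new group."""
    vocabulary = sorted({p for props in activities.values() for p in props})
    vectors = {name: to_vector(props, vocabulary)
               for name, props in activities.items()}
    groups = []
    for name in activities:
        for group in groups:
            if all(cosine(vectors[name], vectors[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical activities annotated with roles and data objects.
activities = {
    "Receive order": {"role:Clerk", "data:Order"},
    "Check order": {"role:Clerk", "data:Order", "data:Customer"},
    "Ship goods": {"role:Warehouse", "data:Parcel"},
    "Print label": {"role:Warehouse", "data:Parcel", "data:Label"},
}
groups = group_activities(activities, threshold=0.5)
```

In this sketch, each resulting group plays the role of one coarse-grained activity of the abstract model. The threshold stands in for the similarity level that, in the approach described here, would be derived from existing process models in which abstraction was already applied.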
Our approach is characterized by a number of limitations and assumptions.
First of all, it builds on the assumption that all kinds of semantic information,
such as data objects, roles, and resources, can be observed within the descriptions
of process models in industrial collections. The process model collection
we obtained through our cooperation with a large telecommunication company
clearly confirms this idea, and the same holds for other industrial repositories,
such as the SAP Reference Model [20]. Secondly, in our validation we have merely
considered whether or not two activities appear within the same subprocess
or process model, although it can be imagined that a more fine-grained
correspondence measure could yield even more useful results.
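
The binary criterion used in the validation, i.e., whether two activities appear in the same subprocess or not, can be illustrated with a pairwise agreement score between a reference grouping and a computed one. The sketch below is an assumption about how such a check might be written, not the evaluation code of this paper. The measure it computes is the Rand index, and the function names and example groupings are hypothetical.

```python
from itertools import combinations


def same_group_pairs(groups):
    """Set of unordered activity pairs that share a group."""
    pairs = set()
    for group in groups:
        for a, b in combinations(sorted(group), 2):
            pairs.add((a, b))
    return pairs


def pairwise_agreement(reference, computed):
    """Fraction of activity pairs on which both groupings agree, i.e., the
    pair is together in both or separated in both (Rand index)."""
    ref_pairs = same_group_pairs(reference)
    cmp_pairs = same_group_pairs(computed)
    names = sorted({a for group in reference for a in group})
    all_pairs = set(combinations(names, 2))
    if not all_pairs:
        return 1.0
    agree = sum(1 for p in all_pairs if (p in ref_pairs) == (p in cmp_pairs))
    return agree / len(all_pairs)


# Hypothetical groupings: subprocess membership vs. a computed clustering.
reference = [["A", "B"], ["C", "D"]]
computed = [["A", "B", "C"], ["D"]]
score = pairwise_agreement(reference, computed)
```

A more fine-grained correspondence measure, as suggested above, could replace the binary test `(p in ref_pairs) == (p in cmp_pairs)` with a graded value per pair.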
These and other limitations guide our future research plans. The direct next
step for us is the use of advanced vector space models reflecting the relations
between different activity property values. Such models enable activity clustering
algorithms to consider the structure of organigrams and data object relations.
Meanwhile, it can also be beneficial to consider other clustering algorithms
and compare the outcome with the solution introduced in this paper. From a
 