to be in steady state. The starting point for process mining is an event log contain-
ing a sequence of business events recorded by one or more information systems.
Based on such an event log, processes can be discovered. Today's process dis-
covery techniques are able to extract meaningful process models from event logs
not containing any explicit process information. Using ProM, we have analyzed
processes in more than 100 organizations. These practical experiences show that
it is very unrealistic to assume that the process being studied is in steady state:
while analyzing the process, changes can take place. For example, governmen-
tal and insurance organizations reduce the fraction of cases being checked when
there is too much work in the pipeline. In case of a disaster, hospitals and banks
change their operating procedures, etc. Such changes are indirectly reflected in
the event log. Moreover, analyzing such changes is of the utmost importance
when supporting or improving operational processes.
In the data mining and machine learning communities, such second-order dy-
namics are referred to as concept drift, and have been studied in both supervised
and unsupervised settings. Concept drift has been shown to be important in
many applications and several success stories have been reported in the liter-
ature [1,2,3]. However, existing work tends to focus on simple structures such as
changing variables rather than changes to complex artifacts such as process mod-
els describing concurrency, choices, loops, cancelation, etc. In handling concept
drifts in process mining, the following three main problems can be identified:
1. Change (Point) Detection: The first and most fundamental problem is to
detect concept drift in processes, i.e., detect that a process change has taken
place. If so, the next step is to identify the time periods at which changes
have taken place.
2. Change Localization and Characterization: Once a point of change has been
identified, the next step is to characterize the nature of change, and identify
the region(s) of change (localization) in a process. Uncovering the nature
of change is a challenging problem that involves both the identification of
change perspective (for example, control-flow, data, resource, sudden, grad-
ual etc.) and the exact change in itself.
3. Unravel Process Evolution: Having identified, localized and characterized the
changes, it is necessary to put all of these in perspective. There is a need
for techniques/tools that exploit and relate these discoveries. Unraveling the
evolution of a process should result in the discovery of the change process
(describing the second order dynamics).
In this paper, we focus on the first two problems. We propose features and
techniques to detect changes (drifts), change points, and change localization
in event logs from a control-flow perspective. The techniques proposed in this
paper show significant promise in handling concept drifts. We further provide
an outlook on some of the topics in concept drift and believe that this niche
area, with its broad scope and relevance, will attract considerable interest in the
research community.
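To make the idea of change point detection from a control-flow perspective concrete, the following is a minimal sketch under stated assumptions. It is not the set of features or techniques proposed in this paper: it substitutes a simple illustrative measure, the total variation distance between the directly-follows pair frequencies of two adjacent windows of traces. The function names, the window size, and the threshold are all hypothetical choices made for this example.

```python
from collections import Counter


def follows_counts(traces):
    """Count directly-follows pairs (a, b) over a list of traces."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def window_distance(left, right):
    """Total variation distance between the directly-follows
    distributions of two windows of traces (0 = identical, 1 = disjoint)."""
    cl, cr = follows_counts(left), follows_counts(right)
    nl, nr = sum(cl.values()), sum(cr.values())
    if nl == 0 or nr == 0:
        return 0.0
    pairs = set(cl) | set(cr)
    # Counter returns 0 for pairs that were never observed in a window.
    return 0.5 * sum(abs(cl[p] / nl - cr[p] / nr) for p in pairs)


def drift_scores(traces, window=20):
    """Score every trace index i by comparing the `window` traces
    before i with the `window` traces starting at i."""
    return [
        (i, window_distance(traces[i - window:i], traces[i:i + window]))
        for i in range(window, len(traces) - window + 1)
    ]


def most_likely_change_point(traces, window=20, threshold=0.3):
    """Return (index, score) of the highest-scoring index, or None if
    no score reaches the (assumed, illustrative) threshold."""
    scores = drift_scores(traces, window)
    if not scores:
        return None
    best = max(scores, key=lambda item: item[1])
    return best if best[1] >= threshold else None


# Synthetic log: the process swaps the order of b and c after 50 cases.
log = [["a", "b", "c"]] * 50 + [["a", "c", "b"]] * 50
stable_log = [["a", "b", "c"]] * 100

change = most_likely_change_point(log)
no_change = most_likely_change_point(stable_log)
```

On the synthetic log, the score peaks at the index where the two windows lie entirely on opposite sides of the change, while the stable log yields no candidate. A real event log would call for a statistical test over the window populations rather than a fixed threshold, and the highest-scoring index only indicates where to look when localizing the change.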
The remainder of this paper is structured as follows. Related work is presented
in Section 2. Section 3 describes the various aspects and nature of change. Section
4 introduces various features and techniques for detecting drifts in event logs.
Section 5 describes the effectiveness of the features and techniques proposed in