A Goal-Based Approach for Learning in Business Processes - Intentional Perspectives on Information Systems Engineering


Step 1 : Use existing domain knowledge for an initial classification of process instances based on contextual properties that are known to affect process behavior.

For each partition, separately apply the following three steps.
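As a minimal sketch of Step 1, assuming each process instance is stored as a dictionary of contextual properties, the initial classification can be expressed as a partition on one property that domain experts already know to be relevant. The property name `order_type` and the sample instances are hypothetical and do not come from the source.

```python
from collections import defaultdict
from typing import Dict, List


def partition_instances(
    instances: List[Dict[str, str]], prop: str
) -> Dict[str, List[Dict[str, str]]]:
    """Group process instances by the value of one known contextual property."""
    partitions: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for instance in instances:
        partitions[instance[prop]].append(instance)
    return dict(partitions)


# Hypothetical instances; "order_type" stands in for a property
# that domain knowledge says affects process behavior.
instances = [
    {"id": "i1", "order_type": "standard"},
    {"id": "i2", "order_type": "express"},
    {"id": "i3", "order_type": "standard"},
]
partitions = partition_instances(instances, "order_type")
```

Each value of `partitions` would then be processed on its own by the remaining steps.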

Step 2 : Establish the behavioral similarity of the process instances.

(a) Path similarity categories are formed by applying a clustering algorithm to the path data of the instances. The number of path-similar clusters is selected according to a goodness-of-fit criterion, such as the Akaike Information Criterion (AIC). The clustering algorithm can be applied several times, producing a series of clustering results with an increasing number of clusters in each cluster set. Finally, the best cluster set is selected as the one that attains the first minimum of the ratio of AIC changes.

(b) Categorize termination states into a small number of categories based on a set of predefined rules. The aim is to achieve a coarse-grained categorization with a clear distinction between categories.

(c) Combine path similarity categories with termination state categories into behavioral similarity categories.
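The three parts of Step 2 can be sketched roughly as follows. The text does not define the "ratio of AIC changes" precisely, so this sketch assumes one plausible reading: each change in AIC between consecutive cluster counts is divided by the first such change, and the chosen cluster count is the first local minimum of that sequence. The AIC values, the termination rules, and all function names are hypothetical; the clustering algorithm itself is not shown.

```python
from typing import Dict, Tuple


def ratio_of_aic_changes(aic_by_k: Dict[int, float]) -> Dict[int, float]:
    """Each AIC change divided by the first AIC change (assumed definition)."""
    ks = sorted(aic_by_k)
    # Change in AIC when moving from the previous cluster count to k.
    changes = {k: aic_by_k[k] - aic_by_k[prev] for prev, k in zip(ks, ks[1:])}
    first_change = changes[ks[1]]
    return {k: change / first_change for k, change in changes.items()}


def first_minimum(ratios: Dict[int, float]) -> int:
    """Smallest k whose ratio is a local minimum, else the global minimum."""
    ks = sorted(ratios)
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        if ratios[k] <= ratios[prev] and ratios[k] < ratios[nxt]:
            return k
    return min(ks, key=lambda k: ratios[k])


# Hypothetical predefined rules mapping termination states to coarse categories.
TERMINATION_RULES = {
    "paid": "success",
    "delivered": "success",
    "cancelled": "failure",
    "rejected": "failure",
}


def termination_category(state: str) -> str:
    """Step 2(b): map a termination state to a coarse-grained category."""
    return TERMINATION_RULES.get(state, "other")


def behavioral_category(path_cluster: int, termination_state: str) -> Tuple[int, str]:
    """Step 2(c): pair a path cluster with a termination category."""
    return (path_cluster, termination_category(termination_state))


# Hypothetical AIC values from clustering runs with 1 to 5 clusters.
aic_by_k = {1: 1000.0, 2: 700.0, 3: 640.0, 4: 520.0, 5: 500.0}
best_k = first_minimum(ratio_of_aic_changes(aic_by_k))
category = behavioral_category(2, "cancelled")
```

With these made-up values the first local minimum occurs at three clusters, even though the ratio is lower still at five, which is the difference between taking the first minimum and the global one.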

Step 3 : Establish the contextual properties that affect behavior. This is accomplished by training a decision tree algorithm, using the context data as inputs and the behavioral categories as the dependent variable (the label). The objective of using the decision tree is to discover the contextual semantics behind each behavioral category. We use a modified Chi-squared Automatic Interaction Detection (CHAID) decision tree growing algorithm to construct the decision tree that represents the context groups and their relationships. CHAID tries to split the context data of the process instances into nodes that contain instances whose dependent variable values (namely, their behavioral similarity categories) are the same. Each path from the root node to a leaf node in the decision tree represents a different combination of contextual properties. Each leaf node contains a certain distribution of instances among behavioral categories, allowing the identification of the most probable category for that leaf.
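To illustrate the idea behind Step 3, the sketch below performs a single chi-squared based split, which is only the core of what CHAID does. It is not the modified CHAID algorithm the text refers to: it ranks attributes by the raw Pearson chi-squared statistic rather than by adjusted p-values, does not merge attribute values, and does not recurse. The raw statistic is only comparable across attributes with the same number of values, as in this example. The attribute names, labels, and data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def chi_squared(values: Sequence[str], labels: Sequence[str]) -> float:
    """Pearson chi-squared statistic of the value-by-label contingency table."""
    n = len(values)
    observed = Counter(zip(values, labels))
    value_totals = Counter(values)
    label_totals = Counter(labels)
    stat = 0.0
    for value, value_total in value_totals.items():
        for label, label_total in label_totals.items():
            expected = value_total * label_total / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


def best_split(rows: List[Dict[str, str]], labels: List[str]) -> str:
    """Pick the context attribute most strongly associated with the labels."""
    return max(rows[0], key=lambda a: chi_squared([r[a] for r in rows], labels))


def leaf_distributions(
    rows: List[Dict[str, str]], labels: List[str], attribute: str
) -> Dict[str, Counter]:
    """Distribution of behavioral categories in each leaf of a one-level split."""
    leaves: Dict[str, Counter] = {}
    for row, label in zip(rows, labels):
        leaves.setdefault(row[attribute], Counter())[label] += 1
    return leaves


# Hypothetical context data and behavioral categories for eight instances.
rows = [
    {"customer_type": "new", "channel": "web"},
    {"customer_type": "new", "channel": "phone"},
    {"customer_type": "new", "channel": "web"},
    {"customer_type": "new", "channel": "phone"},
    {"customer_type": "returning", "channel": "web"},
    {"customer_type": "returning", "channel": "phone"},
    {"customer_type": "returning", "channel": "web"},
    {"customer_type": "returning", "channel": "phone"},
]
labels = ["B1", "B1", "B1", "B2", "B2", "B2", "B2", "B2"]

split_attribute = best_split(rows, labels)
leaves = leaf_distributions(rows, labels, split_attribute)
most_probable = {leaf: dist.most_common(1)[0][0] for leaf, dist in leaves.items()}
```

Here `most_probable` maps each leaf to the behavioral category that dominates it, which corresponds to the "most probable category" mentioned in the step.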

Step 4 : Form the context groups. Based on Postulates 1 and 2, join tree paths into context groups if the following two conditions are satisfied:

(a) The hypothesis that the process instances in their leaves are of the same population (considering their behavioral similarity categories) cannot be rejected.

(b) If their leaves include behavioral categories that stand for similar paths but different termination states, the hypothesis that termination states for similar paths in both leaves are of the same population cannot be rejected.
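Condition (a) can be sketched as a homogeneity test between the category distributions of two leaves. The text does not name the statistical test or the significance level, so this sketch assumes a Pearson chi-squared test at the 0.05 level with a small table of standard critical values, which limits it to leaves with at most five behavioral categories. Condition (b) is not implemented; it could apply the same test to termination states within similar paths. The greedy grouping and the leaf data are also hypothetical simplifications.

```python
from collections import Counter
from typing import Dict, List

# Chi-squared critical values at the 0.05 level, keyed by degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def same_population(leaf_a: Counter, leaf_b: Counter) -> bool:
    """True if the same-population hypothesis cannot be rejected (assumed test)."""
    categories = sorted(set(leaf_a) | set(leaf_b))
    if len(categories) < 2:
        return True  # both leaves hold one and the same category
    n_a, n_b = sum(leaf_a.values()), sum(leaf_b.values())
    n = n_a + n_b
    stat = 0.0
    for category in categories:
        category_total = leaf_a[category] + leaf_b[category]
        for observed, n_leaf in ((leaf_a[category], n_a), (leaf_b[category], n_b)):
            expected = n_leaf * category_total / n
            stat += (observed - expected) ** 2 / expected
    return stat < CRITICAL_0_05[len(categories) - 1]


def form_context_groups(leaves: Dict[str, Counter]) -> List[List[str]]:
    """Greedily join tree paths whose leaves pass the test against a whole group."""
    groups: List[List[str]] = []
    for name, distribution in leaves.items():
        for group in groups:
            if all(same_population(distribution, leaves[m]) for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical leaf distributions over two behavioral categories.
leaves = {
    "path_1": Counter({"B1": 30, "B2": 10}),
    "path_2": Counter({"B1": 28, "B2": 12}),
    "path_3": Counter({"B1": 5, "B2": 35}),
}
context_groups = form_context_groups(leaves)
```

Because "cannot be rejected" is not a transitive relation, the sketch requires a new path to pass the test against every member of a group before joining it; other grouping strategies are possible.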
