A New Similarity Search Approach on Process Models
Siyun Li and Jian Cao
Department of CSE, Shanghai Jiao Tong University,
Dongchuan Rd. 800, 200240 Shanghai, China
{lisiyun,cao-jian}@sjtu.edu.cn
Abstract.
We investigate the problem of similarity search queries in process model
repositories: given a target model, compare it with the process models in
the repository and find the patterns they share. We seek an effective way
to mine these similar patterns. Using four representative models, we
evaluate a new approach that takes both semantic and topological aspects
into account. The experimental results show that combining semantic and
topological analysis yields higher retrieval quality in similarity search
on process models.
Keywords: process model, similarity search, process mining.
1 Introduction
Large enterprises usually maintain model repositories that contain a large
number of process models [1]. In many application scenarios, people need to
compare process models and retrieve relevant patterns from these
repositories. For example, when compressing a model repository, similarity
search makes it possible to detect similar and relevant model patterns,
which can then be extracted as shared model components. The retrieved
patterns can also serve as recommended references when one builds new
process models. Moreover, similarity search helps to reveal the correlations
among a wide variety of process models across different application domains.
In this paper, we focus on the problem of similarity search queries in
process model repositories: given a target model, compare it with the
process models in the repository and find the patterns they share. To tackle
this problem, we need to determine the degree of similarity between pairs of
models. Previous work has approached this from several perspectives,
including text similarity, structural similarity and behavioral similarity
[2]. However, these approaches are not completely satisfactory in real-life
applications because they are limited in either retrieval quality or
computational complexity. This paper proposes a new approach that uses
causal similarity mining to strike a balance between retrieval quality and
computational complexity.
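To make the query setting concrete, the following is a minimal sketch of a similarity search over a repository. It is not the approach proposed in this paper: it assumes that each process model is reduced to a set of activity labels and uses Jaccard overlap as a stand-in for text similarity. The names label_similarity and similarity_search, the threshold parameter, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def label_similarity(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two activity-label sets (a stand-in metric only)."""
    if not a and not b:
        return 1.0  # two empty models are treated as identical
    return len(a & b) / len(a | b)


def similarity_search(
    target: Set[str],
    repository: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Score every repository model against the target and rank the hits."""
    scored = [
        (name, label_similarity(target, labels))
        for name, labels in repository.items()
    ]
    # Keep only models at or above the threshold, best match first.
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data: each model is just its set of activity labels.
repository = {
    "order": {"receive order", "check stock", "ship goods"},
    "claim": {"receive claim", "assess claim"},
}
target = {"receive order", "check stock", "send invoice"}
print(similarity_search(target, repository))
```

A label-only comparison like this ignores control flow entirely, which is one reason to also consider topological information when ranking candidate models.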
The rest of the paper is structured as follows. Section 2 elaborates the
process model notion and the similarity metric used in this paper. Section 3
presents the details of two approaches for mining similar patterns from
pairs of process models.