In this context, it is a paradox that acquisition is costly although detailed
information about processes is often already available in the form of informal
textual specifications. Such textual documents can be policies, reports, forms,
manuals, content of knowledge management systems, and e-mail messages. Con-
tent management professionals estimated that 85% of the information in com-
panies is stored in such an unstructured format [6]. Moreover, the amount of
unstructured text is growing at a much faster rate than structured data [7]. It
seems reasonable to assume that these texts are relevant sources of information
for the construction of conceptual models.
In this paper, we develop an approach to directly extract business process
models from textual descriptions. Our contribution is a corresponding technique
that does not make any assumptions about the structure of the provided text. We
combine an extensive set of tools from natural language processing (NLP) in an
innovative way and augment it with an anaphora resolution mechanism, which
was developed specifically for our approach. The evaluation of our technique
with a set of 47 text-model pairs from industry and textbooks reveals that on
average 77% of the model is correctly generated. We furthermore discuss current
limitations and directions of improvement.
The paper is structured as follows. Section 2 introduces the foundations of
our approach, namely BPMN process models and natural language processing
techniques. Section 3 identifies a set of language processing requirements, and
illustrates how they are tackled in the various steps of our generation approach.
Section 4 presents our evaluation results based on a sample of text-model pairs.
Section 5 discusses related work before Section 6 concludes the paper.
2 Background
Generating models requires an understanding of the essential concepts of BPMN
process models and of state-of-the-art techniques for natural language processing.
In this section, we introduce BPMN and then natural language processing tools.
The Business Process Model and Notation (BPMN) is a standard for process
modeling whose version 2.0 was recently published [8]. It includes four
categories of elements, namely Flow Objects (Activities, Events and Gateways),
Swimlanes (Pools and Lanes), Artifacts (e.g. Data Objects, Text Annotations
or Groups), and Connecting Objects (Sequence Flows, Message Flows and
Associations). Elements of the first three categories are nodes; those of the
last category are edges. Figure 1 shows a
BPMN example of a claims handling process provided by QUT. The process is
subdivided into three pools (one with two lanes) capturing the actors of the pro-
cess. Activities are depicted as rounded boxes. Different events (round elements
with icons) for sending and receiving messages affect the execution of the process.
The diamond-shaped elements define specific routing behavior as gateways.
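
The four element categories described above can be sketched as a small lookup
structure. The following Python fragment is only an illustration: the names
Category, ELEMENT_TYPES, category_of, is_edge and is_node are our own
assumptions and are not taken from the BPMN standard or from the approach
presented in this paper. The Artifacts entry lists only the examples named in
the text.

```python
from enum import Enum


class Category(Enum):
    """The four BPMN element categories named in the text."""
    FLOW_OBJECT = "Flow Objects"
    SWIMLANE = "Swimlanes"
    ARTIFACT = "Artifacts"
    CONNECTING_OBJECT = "Connecting Objects"


# Element types mentioned in the text, grouped by category.
# Assumption: the Artifacts tuple is illustrative, not exhaustive.
ELEMENT_TYPES = {
    Category.FLOW_OBJECT: ("Activity", "Event", "Gateway"),
    Category.SWIMLANE: ("Pool", "Lane"),
    Category.ARTIFACT: ("Data Object", "Text Annotation", "Group"),
    Category.CONNECTING_OBJECT: ("Sequence Flow", "Message Flow", "Association"),
}


def category_of(element_type: str) -> Category:
    """Return the category that an element type belongs to."""
    for category, types in ELEMENT_TYPES.items():
        if element_type in types:
            return category
    raise ValueError(f"unknown BPMN element type: {element_type!r}")


def is_edge(element_type: str) -> bool:
    """Connecting Objects are edges of the process graph."""
    return category_of(element_type) is Category.CONNECTING_OBJECT


def is_node(element_type: str) -> bool:
    """Elements of the other three categories are nodes."""
    return not is_edge(element_type)
```

With this lookup, is_node("Gateway") and is_edge("Message Flow") both hold,
which mirrors the node/edge distinction made in the text.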
A BPMN process model is typically the result of analyzing textual descrip-
tions of a process. A claims handling process provided by QUT is described as
follows: “The process starts when a customer submits a claim by sending in
relevant documentation. The Notification department at the car insurer checks
 