Process Model Generation from Natural Language Text - Advanced Information Systems Engineering


Afterwards, each sentence is parsed by the Stanford Parser using the factored

model for English [11]. We use the factored model rather than the pure

probabilistic context-free grammar because it provides better results in determining

the dependencies between markers such as “if” or “then”, which are important for

the process model generation. Next, complex sentences are split into individual

phrases. This is accomplished by scanning for sentence tags on the top level of

the Parse Tree and within nested prepositional, adverbial, and noun phrases.
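
The splitting step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the parse tree is available as nested tuples of the form (label, child, ...) with Penn Treebank style tags, and the names CLAUSE_TAGS and split_clauses are hypothetical. For brevity it splits at every nested clause tag instead of restricting the scan to prepositional, adverbial, and noun phrases.

```python
# Hypothetical tag set: clause-level labels at which a sentence is split.
CLAUSE_TAGS = {"S", "SBAR"}


def split_clauses(tree):
    """Split a constituency tree into clause strings.

    `tree` is a nested tuple (label, child, ...); leaves are plain strings.
    Each nested clause becomes its own phrase; the enclosing clause keeps
    only the tokens that are not part of a nested clause.
    """
    clauses = []

    def collect(clause):
        slot = len(clauses)
        clauses.append([])          # reserve the slot so outer clauses come first
        clauses[slot] = own_tokens(clause)

    def own_tokens(node):
        label, *children = node
        if not label[0].isalpha():  # punctuation tags such as "," or "."
            return []
        tokens = []
        for child in children:
            if isinstance(child, str):
                tokens.append(child)
            elif child[0] in CLAUSE_TAGS and label != "SBAR":
                # An S directly under SBAR stays with its marker ("If ...").
                collect(child)
            else:
                tokens.extend(own_tokens(child))
        return tokens

    collect(tree)
    return [" ".join(tokens) for tokens in clauses if tokens]


# Hand-written parse of "If the claim is valid, the clerk approves it."
EXAMPLE_TREE = (
    "ROOT",
    ("S",
     ("SBAR", ("IN", "If"),
      ("S", ("NP", ("DT", "the"), ("NN", "claim")),
            ("VP", ("VBZ", "is"), ("ADJP", ("JJ", "valid"))))),
     (",", ","),
     ("NP", ("DT", "the"), ("NN", "clerk")),
     ("VP", ("VBZ", "approves"), ("NP", ("PRP", "it"))),
     (".", ".")),
)

if __name__ == "__main__":
    for phrase in split_clauses(EXAMPLE_TREE):
        print(phrase)
```

In this sketch the conditional marker "If" stays attached to its subordinate clause, so a later step could still inspect it.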

Once the sentence is broken down into individual constituent phrases, actions

can be extracted. First, we determine whether the parsedSentence is in active

or passive voice by searching for the appropriate grammatical relations (Issue

1.1). Then, all Actors and Actions are extracted by analyzing the grammatical

relations. To overcome the problem of example sentences mentioned earlier (Issue

3.2), the actions are also filtered. This filter simply checks whether the

sentence contains a word from a stop word list called example indicators. Then, we

extract all objects from the phrase and each Action is combined with each Object.

The same is done with all Actors. This procedure is necessary as an Action

is supposed to be atomic according to the BPMN specification [8] and Issue

2.1. Therefore, a new Action has to be created for each piece of information as

illustrated in the following example sentences. In each sentence, a conjunction

relation causes the extraction of several Actors, Actions, or Resources; the affected

element type is given in parentheses. As a last step, all extracted Actions are added to the World Model.

◦ “Likewise the old supplier creates and sends the final billing to the customer.” (Action)

◦ “It is given either by a sales representative or by a pre-sales employee in case of a more technical presentation.” (Actor)

◦ “At this point, the Assistant Registry Manager puts the receipt and copied documents into an envelope and posts it to the party.” (Resource)
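
The extraction steps described above (voice detection, example filtering, and combining each Action with each Actor and Object) might look roughly like the following sketch. It is illustrative only and rests on several assumptions: dependencies are given as (relation, governor, dependent) triples, the passive check relies on the Stanford typed dependency names nsubjpass, auxpass, and agent, and the contents of EXAMPLE_INDICATORS are made up, since the text does not list the actual stop words. All function names are hypothetical.

```python
from itertools import product

# Assumed relation names from the Stanford typed dependencies that signal passive voice.
PASSIVE_RELATIONS = {"nsubjpass", "auxpass", "agent"}

# Hypothetical stop word list; the actual "example indicators" are not given in the text.
EXAMPLE_INDICATORS = {"for example", "for instance", "e.g."}


def is_passive(relations):
    """Return True if any (relation, governor, dependent) triple marks passive voice."""
    return any(name in PASSIVE_RELATIONS for name, _, _ in relations)


def is_example_sentence(sentence):
    """Return True if the sentence contains one of the example indicators."""
    lowered = sentence.lower()
    return any(indicator in lowered for indicator in EXAMPLE_INDICATORS)


def combine(actors, verbs, objects):
    """Create one atomic action per (actor, verb, object) combination.

    An empty actor or object list is treated as a single missing value so
    that verbs without an explicit actor or object are not dropped.
    """
    return [
        {"actor": actor, "verb": verb, "object": obj}
        for actor, verb, obj in product(actors or [None], verbs, objects or [None])
    ]


if __name__ == "__main__":
    # "Likewise the old supplier creates and sends the final billing to the customer."
    for action in combine(["old supplier"], ["create", "send"], ["final billing"]):
        print(action)
```

For the first example sentence, the conjunction "creates and sends" yields two separate actions that share the same actor and object, which is what keeps each Action atomic.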

3.2 Text Level Analysis

This section describes the text level analysis. It analyzes the sentences taking

their relationships into account. The structural overview of this phase is shown

in Figure 3. We use the Stanford Parser and WordNet here, and also an anaphora

resolution algorithm. During each of the five steps, the Actions previously added

to the World Model are augmented with additional information.

An important part of the algorithm presented here is the determination heuristic

for resolving relative references within the text (Issue 4.1). Existing libraries

cannot be integrated seamlessly with the output provided by the Stanford Parser.

Therefore, we implemented a simple anaphora resolution technique for the

resolution of determiners and pronouns. This procedure is described in detail in [25].

An experimental evaluation using our test data set showed that this approach

achieved a good accuracy of 63.06%.

The second step in our analysis is the detection of conditional markers. These

markers can be either a single word like “if”, “then”, “meanwhile” or “otherwise”,

or a short phrase like “in the meantime” or “in parallel”. All of these
