and edges within the generated models. We can see that the transformation procedure tends to produce models which are on average 9-15% larger than what a human would create. This can be partially explained by noise and meta sentences which were not filtered appropriately. On the other hand, humans tend to abstract during the process of modeling, so the generated model often retains details from the text that a human modeler would have abstracted away. The results are highly encouraging, as our approach is able to correctly recreate 77% of the model on average. At the level of individual models, a similarity of up to 96% can be reached, which means that only minor corrections by a human modeler are required.
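To give an intuition of what such a similarity value expresses, the following sketch compares the element labels of a generated model with those of a human-created reference model using a Dice-style overlap. This is an illustration only: the evaluation relies on its own similarity metric, and the model labels used here are invented for the example.

def label_similarity(generated, reference):
    """Dice-style overlap of element labels between two process models."""
    g, r = set(generated), set(reference)
    if not g and not r:
        return 1.0
    return 2 * len(g & r) / (len(g) + len(r))

# Invented example: the generated model contains one extra Activity
# caused by an unfiltered meta sentence.
generated = {"Check application", "Configure system", "Report result", "Record data elements"}
reference = {"Check application", "Configure system", "Report result"}
print(round(label_similarity(generated, reference), 2))  # 0.86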
During the detailed analysis we identified different sources of failure that resulted in decreased metric values: noise, differing levels of abstraction, and processing problems within our system. Noise includes sentences or phrases that are not part of the process description, as for instance "This object consists of data elements such as the customer's name and address and the assigned power gauge." While such information can be important for understanding a process, it leads to unwanted Activities within the generated model. To tackle this problem, further filtering mechanisms are required.
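For illustration, such a filter could be realized as a simple pattern-based classifier that discards sentences matching typical meta-sentence cues before the transformation starts. The patterns below are invented for this sketch and would have to be derived from a corpus analysis; our prototype does not use this exact mechanism.

import re

# Illustrative cues that often indicate descriptive meta text rather than
# process steps (example patterns only, not the filter rules of our system).
META_PATTERNS = [
    r"\bconsists of data elements\b",
    r"\bthis (object|document|section) (describes|contains|consists)\b",
]

def is_meta_sentence(sentence):
    """Return True if the sentence looks like meta text to be filtered out."""
    lowered = sentence.lower()
    return any(re.search(pattern, lowered) for pattern in META_PATTERNS)

sentences = [
    "The clerk checks the application.",
    "This object consists of data elements such as the customer's name and address.",
]
# Only the first sentence survives the filter and is passed on to model generation.
print([s for s in sentences if not is_meta_sentence(s)])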
Low similarity also results from differences in the level of granularity. To address this problem, we could apply automated abstraction techniques such as [34] to the generated model.
Finally, the employed natural language processing components failed at certain points during the analysis. In some cases, the Stanford Parser did not classify verbs correctly. For instance, the parser classified "the second activity checks and configures" as a noun phrase, such that the verbs "check" and "configure" could not be extracted into Actions. Furthermore, important verbs related to business processes, such as "report", are not contained in FrameNet. As a consequence, no message flow is created between report activities and a Black Box Pool. We expect this problem to be solved in the future as the FrameNet database grows.
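Whether a verb is covered at all can be tested up front. The sketch below uses NLTK's FrameNet reader for such a check; it is not the FrameNet component employed in our implementation, and which verbs are covered depends on the installed FrameNet release.

from nltk.corpus import framenet as fn  # requires nltk.download('framenet_v17')

def framenet_covers_verb(lemma):
    """Return True if FrameNet lists a verbal lexical unit for the lemma."""
    # Lexical unit names have the form 'send.v'; fn.lus() filters them by regex.
    return len(fn.lus(r'(?i)^%s\.v$' % lemma)) > 0

for verb in ('send', 'check', 'report'):
    print(verb, framenet_covers_verb(verb))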
With WordNet, for instance, there is a problem with times such as "2:00 pm", where "pm", as an abbreviation for "Prime Minister", is classified as an Actor. To solve this problem, reliable word sense disambiguation has to be conducted. Nevertheless, overall good results were achieved by using WordNet as a general-purpose ontology.
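A lightweight sense check of the kind mentioned above is sketched below using NLTK's WordNet interface and its classic Lesk implementation; this is not the component used in our system, and the example sentence is invented for illustration.

from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')
from nltk.wsd import lesk              # classic Lesk word sense disambiguation

context = "The customer is informed by 2:00 pm at the latest".split()

# Inspect the WordNet senses of 'pm' (e.g. the 'Prime Minister' reading
# next to the time-of-day reading).
for synset in wn.synsets('pm'):
    print(synset.name(), '-', synset.definition())

# Lesk selects the sense whose gloss shares the most words with the context.
# A check along these lines could veto the 'Prime Minister' reading before
# 'pm' is promoted to an Actor in the generated model.
print(lesk(context, 'pm'))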
5 Related Work
Recently, there has been increasing interest in the derivation of conceptual models from text. This research is mainly conducted by six different groups.
Two approaches generate UML models. The Klagenfurt Conceptual Pre-design
Model and a corresponding tool are used to parse German text and fill instances
of a generic meta-model [35]. The stored information can be transformed to UML
activity diagrams and class diagrams [18]. The transformation from text to the
meta-model requires the user to make decisions about the relevant parts of a
sentence. In contrast, the approach described in [36] is fully automated. It uses use-case descriptions in a format called RUCM to generate activity diagrams and class diagrams [17]. Yet, the system is not able to parse free text.
The RUCM input is required to be in a restricted format allowing only 26 types