sentence of the file parsed by WMATRIX. One can notice, for example, that the word
“road” has POS tag = NN1 which represents a singular noun and “traffic” has the
SEM tag = M3 which represents the vehicles and transport on land semantic class.
Another important concept used in EA-Miner is stemming [11, 28], which
is the process of reducing related words to the same canonical form. For example, the
words “availability” and “available” are both stemmed to “avail”. This makes it possible to
recognize words that are not exactly equal and treat them as the same concept (e.g.,
for a requirements engineer the words “driver” and “drivers” denote the same viewpoint).
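The effect of stemming on the examples above can be illustrated with a minimal Python sketch. It uses a toy suffix-stripping rule, not the stemming algorithms cited in [11, 28] and not EA-Miner's actual implementation; the suffix list and the function names `toy_stem` and `group_by_stem` are assumptions made only for this illustration.

```python
from collections import defaultdict

# Toy suffix list, checked in order (longest first). This is an assumption
# for illustration only; real stemmers use far richer rule sets.
SUFFIXES = ("ability", "able", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_by_stem(words):
    """Map each stem to the surface forms that reduce to it."""
    groups = defaultdict(list)
    for w in words:
        groups[toy_stem(w)].append(w)
    return dict(groups)


groups = group_by_stem(["availability", "available", "driver", "drivers"])
for stem, forms in groups.items():
    print(stem, forms)
```

Words that share a stem end up in the same group, which is what allows “driver” and “drivers” to be treated as a single viewpoint.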
3.1 Using NLP Techniques
The NLP techniques offered by WMATRIX are used by EA-Miner to automate the
construction and transformation of AORE models represented as different
requirements artifacts. Fig. 3 gives an overview of the transformations that
turn an initial set of elicitation-level RE documents (e.g., user manuals, legacy
specifications, interview transcripts), called model M0, into a structured AORE model
M1 and later into a filtered model M2.
[Figure: M0 (elicitation-level RE documents) → M1 (structured AORE model) → M2 (filtered AORE model)]
Fig. 3. Model transformations
The automation discussed here focuses mainly on NLP-based mining analysis to
identify the concepts (e.g., use cases, viewpoints, early aspects) of an AORE
approach (Sect. 3.1.1). We also show how, after the model is created, filtering
techniques (Sect. 3.1.2) can be used to discard concepts that were improperly
identified or to merge elements that represent the same concept but were identified
as different ones (e.g., “driver” and “drivers” identified as different viewpoints,
as mentioned above).
3.1.1 Identification of Model Concepts
The identification of base concerns in some AORE approaches (e.g., viewpoint-based
[16, 17] or use-case-based [22]) can use the information produced by POS
tagging to list possible candidates. Viewpoint candidates can be words
whose POS tag represents a noun (POS tag = NN1 or NN2). For example, Fig. 2b
shows that “road”, “drivers”, “vehicles” and “gates” are viewpoint candidates, since
their POS tags represent nouns. Similarly, candidates for use cases can be words
whose POS tag represents an action verb, such as the word “charged” (POS = VVN)
in Fig. 2b; auxiliary verbs such as “are” and “can” can be ignored.
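The candidate selection just described can be sketched in Python as a simple filter over (word, POS tag) pairs. This is an illustration under stated assumptions, not EA-Miner's code: the token list is hypothetical and reuses only the words named above, the tags for “drivers”, “vehicles”, “gates”, “are” and “can” are assumed, and the rule that lexical (action) verb tags start with “VV” is an assumption based on the VVN example.

```python
# Tags treated as nouns, following the text (singular and plural common nouns).
NOUN_TAGS = {"NN1", "NN2"}

# Assumption: lexical (action) verb tags share the "VV" prefix, as in VVN.
# Auxiliary and modal verbs carry other tags and are therefore skipped.
ACTION_VERB_PREFIX = "VV"


def viewpoint_candidates(tagged):
    """Return the words whose POS tag marks them as nouns."""
    return [word for word, pos in tagged if pos in NOUN_TAGS]


def use_case_candidates(tagged):
    """Return the words whose POS tag marks them as action verbs."""
    return [word for word, pos in tagged if pos.startswith(ACTION_VERB_PREFIX)]


# Hypothetical tagger output; the word order is arbitrary and the tags other
# than NN1 for "road" and VVN for "charged" are assumed for illustration.
tagged_tokens = [
    ("road", "NN1"),
    ("drivers", "NN2"),
    ("are", "VBR"),
    ("charged", "VVN"),
    ("vehicles", "NN2"),
    ("can", "VM"),
    ("gates", "NN2"),
]

viewpoints = viewpoint_candidates(tagged_tokens)
use_cases = use_case_candidates(tagged_tokens)
print("viewpoint candidates:", viewpoints)
print("use case candidates:", use_cases)
```

Because the filter keeps every noun and every action verb, the resulting lists are only candidates that the engineer still has to review.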
One problem with this approach is that the initial list can contain many false
positives, especially when the input file is large. The same problem is mentioned
in [33], which suggests looking at noun phrases to identify objects during
object-oriented analysis. This is why it is important to provide tool support and
methodological guidance so that the engineer can prune the list to arrive at a set
of good candidates. Section 3.1.2 provides