Semantic tagging [11, 28] assigns a word or multiword expression to a specific
class of meaning. The semantic tags are represented in a tagset arranged in 21 top-
level categories (e.g., M and S in Table 1) that expand into 232 sub-categories (e.g.,
M3 and S7.4) [28]. Each of these categories groups words that are related via a
specific meaning (e.g., M3 contains vehicle, car, bus, truck, automobile). The
taxonomy originates from a corpus-based dictionary and has been comparatively
evaluated against publicly available semantic hierarchies [32].
Moreover, the same word (e.g., performance) can have different meanings (e.g.,
the act of a dancer or artist; the processing power of a computer) and thus be present in
more than one semantic category. The semantic tagger deals with this by analyzing
the context of the phrase in which the word is used and by using POS tags for
disambiguation in order to attribute the correct tag. It is important to highlight that
both POS and semantic tagging are handled entirely by WMATRIX and do not
require any input from either the requirements engineer or the EA-Miner tool.
The semantic tagger makes its decisions based on a large-coverage dictionary of
single words and multiword expressions, currently containing 73,894 words and
multiwords. These resources have been constructed manually by linguists for other
corpus-based projects over a number of years.
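The two-level tagset described above can be illustrated with a minimal sketch. It assumes, based only on the examples given here (M and M3, S and S7.4), that a tag's top-level category is its leading capital letter and that the digits after it name the sub-category. The function names `top_level` and `group_by_top_level` are hypothetical and are not part of WMATRIX or EA-Miner.

```python
import re
from collections import defaultdict

# Assumption: a tag is one capital letter (top-level category) followed by an
# optional dotted number path (sub-category), e.g. "M3" or "S7.4".
TAG_PATTERN = re.compile(r"^([A-Z])(\d+(?:\.\d+)*)?")


def top_level(tag: str) -> str:
    """Return the top-level category letter of a semantic tag."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a recognisable semantic tag: {tag!r}")
    return match.group(1)


def group_by_top_level(tagged_words):
    """Group (word, tag) pairs by the top-level category of each tag."""
    groups = defaultdict(list)
    for word, tag in tagged_words:
        groups[top_level(tag)].append(word)
    return dict(groups)


# Words the text lists as members of category M3.
sample = [("vehicle", "M3"), ("car", "M3"), ("truck", "M3")]
print(group_by_top_level(sample))
```

Grouping by the leading letter is only a convenience for browsing tagged output; the tagger's own disambiguation relies on context and POS information, which this sketch does not attempt to model.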
EA-Miner utilizes WMATRIX to pre-process a requirements document provided
as input (Fig. 2). WMATRIX returns another file which consists of the same content
as the input file but tagged with POS and SEM tags. Figure 2a shows an example of a
toll collection system we will use throughout this paper. Figure 2b shows the first
“In a road traffic pricing system, drivers of authorized vehicles are charged at toll gates
automatically. The gates are placed at special lanes called green lanes. A driver has to
install a device (a gizmo) in his/her vehicle. The registration of authorised vehicles
includes the owner's personal data, bank account number and vehicle details. The
gizmo is sent to the client to be activated using an ATM that informs the system upon
gizmo activation. A gizmo is read by the toll gate sensors. The information read is
stored by the system and used to debit the respective account. When an authorised
vehicle passes through a green lane, a green light is turned on, and the amount being
debited is displayed. If an unauthorised vehicle passes through it, a yellow light is
turned on and a camera takes a photo of the plate (used to fine the owner of the
vehicle). There are three types of toll gates: single toll, where the same type of vehicles
pay a fixed amount, entry toll to enter a freeway and exit toll to leave it. The amount
paid on motorways depends on the type of the vehicle and the distance traveled.”
Fig. 2. (a) Toll collection system adapted from [16, 17]. (b) First sentence of the toll system file
with POS and SEM tags. The parsed file is structured in sentences <s> containing words <w>.
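As a rough illustration of how a tool might read such a tagged file, the sketch below walks the <s> and <w> elements with Python's standard XML parser. The caption only states that sentences <s> contain words <w>; the attribute names `pos` and `sem`, the wrapping `<text>` element, and all attribute values in the sample are assumptions made for this example and are not actual WMATRIX output.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: the attribute names and values are placeholders.
SAMPLE = """
<text>
  <s>
    <w pos="NN2" sem="M3">vehicles</w>
    <w pos="VBR" sem="Z5">are</w>
    <w pos="VVN" sem="I1.2">charged</w>
  </s>
</text>
"""


def read_tagged_sentences(xml_text):
    """Return one list per <s>, holding (word, pos, sem) tuples for each <w>."""
    root = ET.fromstring(xml_text.strip())
    sentences = []
    for sentence in root.iter("s"):
        words = [
            ((w.text or "").strip(), w.get("pos"), w.get("sem"))
            for w in sentence.iter("w")
        ]
        sentences.append(words)
    return sentences


for sentence in read_tagged_sentences(SAMPLE):
    for word, pos, sem in sentence:
        print(word, pos, sem)
```

Keeping each sentence as its own list preserves the sentence boundaries of the tagged file, which matters when later analysis needs the words surrounding a given tag.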