In this evaluation, we considered a set of 290 requisites and 1,474 functionalities of the Combat Management System (CMS). The mean number of functionalities for each requisite was 5, with a standard deviation of about 3.5. For each r_i, the set F_i specifies all functionalities realizing r_i: these form the gold standard, i.e. the set of texts expected to be retrieved by the analyst querying with r_i. As they are short texts, individual r_i as well as f_j are modeled according to the Comprehensive model, i.e. the BoW + N-POS + N-Words vector representation, as it achieves the best results in the RA discussed in Section 4.1.
The information acquired during the RA phase is here exploited in order to define a Re-ranking phase: the ranking provided by the semantic similarity function is thus adjusted to filter out all those functionalities that do not share the same characterization as the target requirement r_i, i.e. the same type and capability, as discussed in Section 4.1. Four different retrieval strategies are applied, giving rise to four IR systems:
- NoFilter: for each r_i, the most similar f_j are retrieved and ranked according to sim; no filter is applied;
- Type: the ranking provided by sim is split into two lists: the first, ranked higher, is made of functionalities sharing the same type as r_i, while the second includes the remaining f_j whose type is different. In this way, functionalities f_j of the same type as r_i are always ranked before the others;
- Capability: the two lists are created as before with respect to the capability assigned to the target r_i, so that functionalities with the same capability as r_i are ranked first;
- Type+Capability: the ranking provided by sim is modified as before according to the sharing of both the type and the capability of r_i.
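The four strategies amount to a stable re-partition of the similarity ranking. A minimal sketch, assuming dictionary-shaped requisites and functionalities with illustrative "type" and "capability" fields (these names, and the example data, are assumptions, not the paper's implementation):

```python
def rerank(ranked, req, use_type=False, use_capability=False):
    """Stably move functionalities sharing the requisite's characterization
    ahead of the rest, preserving the sim-based order inside each group."""
    def matches(f):
        ok = True
        if use_type:
            ok = ok and f["type"] == req["type"]
        if use_capability:
            ok = ok and f["capability"] == req["capability"]
        return ok
    same = [f for f in ranked if matches(f)]       # first list, ranked higher
    other = [f for f in ranked if not matches(f)]  # remaining f_j
    return same + other

# Example: a target requisite and a sim-ranked list of functionalities.
req = {"type": "T1", "capability": "C1"}
ranked = [
    {"id": "f1", "type": "T2", "capability": "C1"},
    {"id": "f2", "type": "T1", "capability": "C2"},
    {"id": "f3", "type": "T1", "capability": "C1"},
]
print([f["id"] for f in rerank(ranked, req)])                  # NoFilter
print([f["id"] for f in rerank(ranked, req, use_type=True)])   # Type
```

With no flags the function reproduces NoFilter; setting both flags yields Type+Capability.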
Different strategies are evaluated according to standard IR evaluation metrics: Precision (P), Recall (R), F-measure (F1) and Mean Average Precision (MAP). Precision is expressed as P = tp / (tp + fp), where tp is the number of relevant functionalities retrieved and fp is the number of non-relevant functionalities retrieved. Recall is expressed as R = tp / (tp + fn), where fn is the number of relevant functionalities not retrieved. While Precision estimates the capacity to retrieve correct functionalities, Recall is more interesting in this scenario, as it measures the system's capacity to retrieve all existing functionalities; in many cases, it is more important to retrieve all existing software than to spend more time reading useless documentation. F-measure considers both aspects, as it is estimated as the harmonic mean of Precision and Recall: F1 = 2 · P · R / (P + R).
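The three set-based metrics can be computed directly from the retrieved set and the gold standard F_i. A minimal sketch (function and identifier names are illustrative, not taken from the paper's code):

```python
def precision_recall_f1(retrieved, relevant):
    """Compute P, R and F1 for one requisite, given the retrieved
    functionalities and the gold-standard set F_i."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)   # relevant functionalities retrieved
    fp = len(retrieved - relevant)   # non-relevant functionalities retrieved
    fn = len(relevant - retrieved)   # relevant functionalities not retrieved
    p = tp / (tp + fp) if retrieved else 0.0
    r = tp / (tp + fn) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1

# Example: four retrieved functionalities, three in the gold standard.
p, r, f1 = precision_recall_f1({"f1", "f2", "f3", "f4"}, {"f2", "f4", "f5"})
print(p, r, f1)
```

Here tp = 2, fp = 2 and fn = 1, so P = 0.5 and R = 2/3, with F1 their harmonic mean.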
Finally, MAP provides a single accuracy measure across different recall levels. MAP is based on the oracle given by RF = {(r_i, F_i)}, i.e. the pairs of a requisite r_i and its functionality set F_i. Every requisite r_i also corresponds to a ranked list of retrieved functionalities, ordered according to the similarity function sim. Let this be the list of retrieved functionalities f_j from the top result (i.e. f_1, ranked as the closest by the system) down to f_k, the position at which all functionalities in F_i have been returned. In this way, the
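Under the standard definition of MAP, which the passage appears to follow, the average precision for a requisite is the mean of the precision values measured at the rank of each gold functionality, and MAP averages this over all pairs in RF. A sketch under that assumption (binary relevance given by F_i; names are illustrative):

```python
def average_precision(ranking, relevant):
    """Mean of precision@k over the ranks k at which gold
    functionalities from F_i appear in the system ranking."""
    hits, precisions = 0, []
    for k, f in enumerate(ranking, start=1):
        if f in relevant:
            hits += 1
            precisions.append(hits / k)  # precision at this recall point
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(rf_pairs, rankings):
    """MAP over the oracle RF = {(r_i, F_i)}."""
    return sum(average_precision(rankings[r], F) for r, F in rf_pairs) / len(rf_pairs)

# Example with two requisites and their sim-ranked lists.
rf = [("r1", {"f1", "f3"}), ("r2", {"f2"})]
ranks = {"r1": ["f1", "f2", "f3"], "r2": ["f4", "f2"]}
print(mean_average_precision(rf, ranks))
```

For r1 the gold items appear at ranks 1 and 3, giving AP = (1 + 2/3)/2; for r2 the single gold item appears at rank 2, giving AP = 1/2.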