[Figure labels: Step: Linking; Step: Information Aggregation; User stories; Artifacts]
Fig. 5. Overview of the proposed approach
step, we classify user stories according to their status (to be implemented/not
yet started, in progress, completed) based on the artifacts found. This helps the
product owner to get a better understanding of the current status of the project
at the user story level.
In the first linking step (cf. Figure 6), the information contained in the
development artifacts is analyzed to discover which artifacts belong to the
realization of which user story. For instance, a code comment or a commit
message can refer to the implementation of the fancy case method of the example
user story in Figure 2, which allows linking it to the first task of the user
story. Additionally, the comments of a JUnit test can reference parts of the
user story, so that the test case can be associated with the second task of
this user story. Artifacts that are tightly linked to the code, such as code
comments or commit messages, can be augmented with information derived from bug
reports or the development wiki. Other sources of information that are less
structured and more distant from the code (as shown in Figure 6) might also be
exploited, such as instant messaging (IM) within the company network or social
network posts.
To make the linking step technically more concrete from the NLP perspective,
we need to reason about i) possible instance representations of the artifacts and
the user stories, and ii) possible learning mechanisms able to identify similar
objects.
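As one illustration of these two ingredients, the simplest combination is a
bag-of-words representation (word counts per text) compared by cosine
similarity. The following is a minimal sketch under stated assumptions, not the
approach's actual implementation: the function names, the regular expressions
used for tokenizing and identifier splitting, and the example texts are all
hypothetical, and the word counts are left unweighted.

```python
import math
import re
from collections import Counter


def split_identifier(token):
    """Split a camelCase or snake_case identifier into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", token).replace("_", " ")
    return [part.lower() for part in spaced.split()]


def bag_of_words(text):
    """Count how often each word occurs; identifiers are split into words first."""
    counts = Counter()
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text):
        counts.update(split_identifier(token))
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def link_artifact(artifact_text, user_stories):
    """Return (story id, score) for the user story most similar to the artifact."""
    artifact_vec = bag_of_words(artifact_text)
    scores = {
        story_id: cosine_similarity(artifact_vec, bag_of_words(text))
        for story_id, text in user_stories.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical example data, not taken from the paper.
stories = {
    "US1": "As a user I want to export the report as PDF",
    "US2": "As an admin I want to reset a user password",
}
commit_message = "Implement exportReport to generate PDF output"
best_story, score = link_artifact(commit_message, stories)
```

In this sketch the identifier exportReport contributes the words "export" and
"report", which is what lets the commit message match the first story. A real
setting would likely need stop-word removal and term weighting before the
scores become meaningful.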
For the instance representation, a first attempt might consist of applying
information retrieval [10] techniques: representing the information contained
in the artifact or user story in a simple bag-of-words model in the vector
space (i.e. counting how often a word appears in a user story, possibly
weighted). If we also want to link actual source code to user stories, it will
also be necessary to identify source code identifiers and split them into
actual words [9]. Then, similarity between these unstructured objects (vectors)
can be calculated based on the angle between the feature vectors in the vector
space (e.g. their cosine similarity). Alternatively, deep natural language
processing might be applied to obtain structured objects. For instance, the
example user story could be represented as shown in Figure 7, where natural
language parsing and argument classification have been applied. This
representation could be further enriched with other NLP tools like a semantic
role labeler, a named entity recognizer, or distributional semantic techniques.
Then, machine learning algorithms able to deal with