system equipment. Once the template is given, the text fragments containing relevant information to fill the template slots (i.e. specific values associated with the attributes of a certain Ship instance) need to be identified in the text. The recognition of textual information of interest results from pattern matching against extraction rules. Finally, in a third phase, whenever the information of interest is identified in the text, it is mapped into the appropriate (e.g. Ship) template slot. This chain is not trivial, and contemporary IE systems 1 are
usually integrated with large scale knowledge bases, determining all the lexical,
syntactic and semantic constraints needed for a correct interpretation of usually
domain-specific texts. Unfortunately, the manual development of these resources
is a time-consuming task that is often highly error-prone due to the subjectivity
and intrinsic vagueness that affect the semantic modeling process. The knowledge acquisition task is therefore often approached through the use of Machine Learning algorithms that automatically learn the domain-specific information from annotated
data [9]. Statistical learning methods [10] assume that lexical or grammatical as-
pects of training data are the basic features for modeling the different inferences.
They are then generalized into predictive patterns composing the final induced
model. A statistical language processor is assumed to be able to locate specific
instances of a template type (e.g. Ship ) and their slot information in an incom-
ing text. The resulting instantiated template can be employed to populate an
existing knowledge base whose semantic schema corresponds (or can be mapped)
to the template structure. Moreover, reasoning over the extracted information,
e.g. identifying relations or dependencies with respect to previous requirements,
can then be performed more effectively. For example, retrieval of previously developed components that
respond properly to new requirements could be realized as a form of reasoning.
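The pattern-matching phase described above can be sketched as follows. This is a minimal illustration only: the extraction rules, slot names, and the Ship template structure are invented for this example, not taken from any particular IE system.

```python
import re

# Illustrative extraction rules for a hypothetical "Ship" template.
# Each rule maps a template slot to a pattern whose named group
# captures the slot value; real IE systems use far richer rules.
EXTRACTION_RULES = {
    "name":   re.compile(r"[Tt]he vessel (?P<value>[A-Z][\w-]+)"),
    "length": re.compile(r"(?P<value>\d+(?:\.\d+)?)\s*(?:m|meters) long"),
    "crew":   re.compile(r"crew of (?P<value>\d+)"),
}

def fill_template(text: str) -> dict:
    """Match each extraction rule against the text and fill the
    corresponding template slot with the first captured value."""
    template = {slot: None for slot in EXTRACTION_RULES}
    for slot, pattern in EXTRACTION_RULES.items():
        match = pattern.search(text)
        if match:
            template[slot] = match.group("value")
    return template

report = "The vessel Aurora, 120 meters long, sails with a crew of 85."
print(fill_template(report))
# → {'name': 'Aurora', 'length': '120', 'crew': '85'}
```

The instantiated template returned here is exactly the kind of structure that can then populate a knowledge base whose schema mirrors the template.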
3 Machine Learning for Requirement Analysis
In Requirement Analysis, NLP applications such as Information Extraction can be very useful in supporting analysts to perform this task practically and in a cost-effective way. Statistical NLP approaches provide domain-specific
models of target interpretation tasks by acquiring and generalizing linguistic ob-
servations. Several Statistical Machine Learning paradigms have been defined to
provide robust models that easily adapt across different (and possibly specific)
domains. These techniques are the basis of our proposed approach, and we discuss them hereafter. The problem is normally treated as a Statistical Classification problem: the goal is to identify, for new data whose sub-population is unknown (the test data), the sub-population to which they belong, on the basis of a training set of observations whose sub-population is known (the training data). In this scenario we may be interested, for example, in inducing a template slot for a candidate text. The Support Vector Machine (SVM), as discussed in [11] and [12], represents one of the best-known learning paradigms for classification, based on Statistical Learning Theory. Given
training instances, each one associated with a class and a set of “features”, i.e.
1 OpenCalais: http://viewer.opencalais.com/
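The classification setting described above can be sketched with a small linear SVM trained by sub-gradient descent (a Pegasos-style update). The feature vectors and labels are invented toy data for illustration; a real system would derive them from lexical and grammatical features of annotated text.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style sub-gradient training of a linear SVM
    (no bias term, for brevity). X: feature vectors, y: labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying learning rate
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink weights (regularization), then add the example
            # only if it violates the margin constraint.
            w = [(1 - eta * lam) * wj for wj in w]
            if margin < 1:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Toy 2-D training data (invented), e.g. counts of two indicative
# lexical features per text fragment; +1 / -1 mark the two classes.
X = [[2.0, 0.1], [1.8, 0.3], [0.2, 1.9], [0.1, 2.2]]
y = [1, 1, -1, -1]
w = train_linear_svm(X, y)
print(predict(w, [2.1, 0.2]), predict(w, [0.3, 2.0]))  # → 1 -1
```

In the template-slot scenario, each candidate text fragment would be such a feature vector, and the learned separator decides whether the fragment fills a given slot.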