engineering, and machine learning to implement a learning framework for annotating
cases with knowledge roles. The ultimate goal of the approach is to discover interesting
problem-solving situations (hereafter simply referred to as cases) that can be
used by an experience management system to support new engineers during their
working activities. However, as an immediate benefit, the annotations facilitate the
retrieval of cases on demand, allow the collection of empirical domain knowledge,
and can be formalized with the help of an ontology to also permit reasoning. The experimental
results presented in the chapter are based on a collection of 500 Microsoft
Word documents written in German, amounting to about one million words. Several
processing steps were required to achieve the goal of case annotation. In particular,
we had to (a) transform the documents into an XML format, (b) extract paragraphs
belonging to cases, (c) perform part-of-speech tagging, (d) perform syntactic parsing,
(e) transform the results into an XML representation for manual annotation, (f)
construct features for the learning algorithm, and (g) implement an active learning
strategy. Experimental results demonstrate the feasibility of the learning approach
and a high quality of the resulting annotations.
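To make step (g) more concrete, the following is a minimal sketch of one common active learning selection rule, least-confidence sampling. It is an illustration only: the chapter excerpt does not state which selection strategy was used, and the function name `least_confident`, the instance ids, and the role labels `Symptom` and `Cause` are hypothetical.

```python
from typing import Dict, List, Tuple


def least_confident(
    predictions: Dict[str, Dict[str, float]], batch_size: int
) -> List[str]:
    """Return the ids of the instances whose most probable label has the
    lowest probability, i.e. the ones the classifier is least sure about."""
    scored: List[Tuple[float, str]] = [
        (max(distribution.values()), instance_id)
        for instance_id, distribution in predictions.items()
    ]
    scored.sort()  # ascending: least confident first
    return [instance_id for _, instance_id in scored[:batch_size]]


# Hypothetical classifier output: role probabilities for three
# not-yet-annotated sentence constituents (ids and labels are made up).
predictions = {
    "s1": {"Symptom": 0.95, "Cause": 0.05},
    "s2": {"Symptom": 0.55, "Cause": 0.45},
    "s3": {"Symptom": 0.70, "Cause": 0.30},
}

# The selected instances would be handed to a human annotator next.
to_annotate = least_confident(predictions, batch_size=2)
print(to_annotate)
```

In a loop of this kind, the selected instances are annotated manually, added to the training set, and the classifier is retrained before the next batch is chosen, so that manual effort is spent where the model is most uncertain.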
The chapter is organized as follows. In Section 4.2 we describe our domain of
interest, the related collection of documents, and how knowledge roles can be used
to annotate text. In Section 4.3 we consider work in natural language processing,
especially frame semantics and semantic role labeling, emphasizing parallels to our
task and identifying how resources and tools from these domains can be applied to
perform annotation. Section 4.4 describes in detail all the preparatory steps for the
process of learning to annotate cases. Section 4.5 evaluates the results of learning.
Section 4.6 concludes the chapter and outlines areas of future work.
4.2 Domain Knowledge and Knowledge Roles
4.2.1 Domain Knowledge
Our domain of interest is predictive maintenance in the field of power engineering,
more specifically, the maintenance of insulation systems of high-voltage rotating
electrical machines. In many domains, faults that could result in a breakdown of
the system cannot be tolerated, so components of the system are monitored periodically
or continuously for deviations from the expected behavior, allowing
predictive maintenance actions to be undertaken when necessary. Usually, the findings
related to the predictive maintenance process are documented in several forms: the
measured values in a relational database; the evaluations of measurements/tests
in diagnostic reports written in natural language; or the recognized symptoms in
photographs. The focus of the work described here is the textual diagnostic reports.
In the domain of predictive maintenance, two parties are involved: the service
provider (the company that has the know-how to perform diagnostic procedures and
recommend predictive maintenance actions) and the customer (the operator of the
machine). As part of their business agreement, the service provider submits to the
customer an official diagnostic report. Such a report follows a predefined structure
template and is written in syntactically correct and parsimonious language. In our
case, the language is German.
A report is organized into many sections: summary, reason for the inspection,
data of the inspected machine, list of performed tests and measurements, evaluations