Game Development Reference
In-Depth Information
candidate instances for a class, also given on input. In more detail, OntoSyphon
works as follows:
1. The textual representation of the input class is retrieved from the ontology.
2. Textual phrases and sentences in which the input class is used are retrieved from
the ontology (if present). Alternatively, the ontology neighbors of the input class
(parent classes, siblings in the hierarchy, other related classes) are retrieved to
form artificial phrases.
3. These phrases are used as queries for a keyword search engine operating over the
given document corpus (here, the whole Web can easily be used).
4. The search engine retrieves a set of documents to be mined. Because phrases are
used, only documents with the proper term meanings are retrieved. Without the phrase
search, i.e., with only a keyword search on the textual representation of the input
class, the system might encounter polysemy problems. If, for example, there was
a class “sea”, querying it alone would retrieve documents about the sea as a part of
the ocean, but also about the SEA information system. But if a phrase is derived from
the existing ontology (e.g., “sea ship”) and used as the query, a much more coherent
set of documents with the proper term meaning would be retrieved.
5. Finally, using a predefined set of sentence templates (e.g., “A is a B” or “A such
as B, C, D”), OntoSyphon searches the texts of the retrieved documents for
expressions of hierarchical subordination between named entities, with the input
class being the superior entity. The other participating entities are afterward
written to an instance candidate list.
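The query construction and template matching steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not OntoSyphon's actual implementation: the function names `build_phrase_queries` and `find_instance_candidates`, the `ENTITY` regular expression (a run of capitalized words standing in for named entity recognition), the naive plural handling, and the sample text are all hypothetical.

```python
import re

# Assumption: a run of capitalized words is a crude stand-in for a named entity.
ENTITY = r"[A-Z][\w-]*(?: [A-Z][\w-]*)*"
# Separators allowed between the items of an enumeration.
LIST_SEP = r", and |, | and "


def build_phrase_queries(class_label, neighbor_labels):
    """Steps 2-3: pair the class label with each ontology neighbor as a quoted phrase query."""
    return [f'"{class_label} {neighbor}"' for neighbor in neighbor_labels]


def find_instance_candidates(class_label, text):
    """Step 5: collect entities that a template subordinates to the input class."""
    # The class label is matched case-insensitively; entities stay case-sensitive.
    cls = rf"(?i:{re.escape(class_label)})"
    candidates = []

    # Template "A is a B": B is the input class, A is the candidate instance.
    is_a = re.compile(rf"(?:\b[Tt]he )?({ENTITY}) is an? {cls}\b")
    candidates += [m.group(1) for m in is_a.finditer(text)]

    # Template "<class> such as A1, A2 and A3": every listed entity is a candidate.
    # Only a trailing "s" is accepted as the plural form (a simplification).
    such_as = re.compile(rf"\b{cls}s? such as ({ENTITY}(?:(?:{LIST_SEP}){ENTITY})*)")
    for m in such_as.finditer(text):
        candidates += re.split(LIST_SEP, m.group(1))

    return candidates


sample = (
    "The Baltic Sea is a sea in Northern Europe. "
    "Many travellers visit seas such as Black Sea, Red Sea and Caspian Sea."
)
print(build_phrase_queries("sea", ["ship", "ocean"]))
print(find_instance_candidates("sea", sample))
# prints ['Baltic Sea', 'Black Sea', 'Red Sea', 'Caspian Sea']
```

A real system would replace the capitalization heuristic with proper named entity recognition and would score the candidates, since a single template match is weak evidence.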
2.4.2 Relationship Discovery and Naming
Another group of automated semantics acquisition approaches focuses on the discovery
of relationships between entities. The entities can be anything from simple
terms to refined ontology concepts. In all cases, textual representations of the
entities are sought in textual resources and their relationships are subsequently
mined. Factual statements are often contained within a single sentence as subject and
object (nouns, adjectives) and predicate (verbs), so many approaches focus on mining
sentences for term relationships [51, 60, 71]. Others try to exploit structures such
as tables and lists to access the relationships expressed through them [15].
An example of relationship harvesting was presented by Pantel and Pennacchiotti
[51]. Their approach implements a bootstrapping technique which, when supplied
with a few examples, is able to harvest quality relationships from a natural language
text corpus, even the whole Web. The approach is predicate-oriented: it primarily
looks for occurrences of the relationship (predicate) in the corpus and only
afterward attaches the subjects and objects to it. The method works as follows:
• At start-up, a small set of seed expressions of the same relationship is chosen,
  e.g., “part of”, “consists of”, “comprises”. A generic pattern is then created to
  cover variations of the expression, e.g., “X of Y”.
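To make the predicate-first idea concrete, the following Python sketch turns each seed expression into a pattern with an X slot and a Y slot, locates the expression in the text, and only then reads off the words around it. It is an illustration under simplifying assumptions, not the algorithm of Pantel and Pennacchiotti: the names `slot_pattern` and `find_relation_instances`, the single-word slots, the optional article and copula handling, and the sample sentences are all hypothetical, and the generalization of seeds into a generic pattern such as “X of Y” is not covered.

```python
import re

# Seed expressions taken from the example in the text.
SEED_EXPRESSIONS = ["part of", "consists of", "comprises"]

# Assumptions: slots are single words, optionally preceded by an article,
# and the expression may be preceded by a copula ("is part of").
ARTICLE = r"(?:the |an? )?"
COPULA = r"(?:is |are )?"


def slot_pattern(expression):
    """Build an 'X <expression> Y' pattern with one capture group per slot."""
    return re.compile(
        rf"\b{ARTICLE}(\w+) {COPULA}{re.escape(expression)} {ARTICLE}(\w+)",
        re.IGNORECASE,
    )


def find_relation_instances(text, expressions=SEED_EXPRESSIONS):
    """Find each expression in the text, then attach the surrounding X and Y.

    The slot order follows the surface text; the direction of the relationship
    (part to whole or whole to part) is not normalized here.
    """
    found = {}
    for expression in expressions:
        pattern = slot_pattern(expression)
        found[expression] = [
            (m.group(1).lower(), m.group(2).lower()) for m in pattern.finditer(text)
        ]
    return found


sample = (
    "The engine is part of the car. "
    "A molecule consists of atoms. "
    "The archipelago comprises islands."
)
for expression, pairs in find_relation_instances(sample).items():
    print(expression, pairs)
```

On the sample sentences, each seed expression yields exactly one (X, Y) pair, for example ("engine", "car") for “part of”.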
 