Game Development Reference
In-Depth Information
• The bootstrapping technique relies on the initial retrieval of a large set of
potential occurrences of the given seed patterns (which are phrase stubs). The
retrieval is done through a web search engine (or a similar engine working over
some other corpus).
• After the necessary preprocessing (trimming away the HTML and text fragments),
the candidate sentences are prepared. Not all of them semantically match the
start-up relationship, e.g., “wheel of the car” is a correct instance of the
“part of” relation, while “house of representatives” is not.
• However, if a certain subject-object pair (features) recurs with different
seeds, the features are arguably in the given relationship (in this example, we
have of course left out the algorithm of feature (entity) recognition).
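
The recurrence check described above can be illustrated with a minimal sketch. It assumes the candidate sentences have already been retrieved and cleaned, and it replaces feature (entity) recognition with single-word regular-expression groups. The seed patterns, the names SEEDS and recurring_pairs, and the min_seeds threshold are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical seed patterns (phrase stubs) for the "part of" relation.
# Each entry: (seed name, regex, group index of the part, group index of the whole).
SEEDS = [
    ("x_of_y", re.compile(r"\b(\w+) of (?:the )?(\w+)\b"), 1, 2),
    ("y_has_x", re.compile(r"\b(\w+) has an? (\w+)\b"), 2, 1),
]

def recurring_pairs(sentences, min_seeds=2):
    """Return (part, whole) pairs matched by at least min_seeds different seeds."""
    seen = defaultdict(set)  # (part, whole) -> names of seeds that matched it
    for sentence in sentences:
        text = sentence.lower()
        for name, pattern, part_idx, whole_idx in SEEDS:
            for match in pattern.finditer(text):
                pair = (match.group(part_idx), match.group(whole_idx))
                seen[pair].add(name)
    return {pair for pair, seeds in seen.items() if len(seeds) >= min_seeds}

sentences = [
    "The wheel of the car was damaged.",
    "Every car has a wheel.",
    "He spoke in the house of representatives.",
]
pairs = recurring_pairs(sentences)
print(pairs)
```

In this toy input, the pair (wheel, car) is supported by two different seeds and is kept, while (house, representatives) is matched by a single seed only and is discarded.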
Another, similar “predicate-oriented” approach was presented by Sanchez
and Moreno [59, 60], who focused on the exploration of non-taxonomic relationships,
which are insufficiently represented in ontologies. It first extracts domain-related
verbs and afterward tries to find their occurrences on the Web (accessed through a
search engine, using verb phrases learned from a small domain-related corpus).
There is also interesting work by Weichselbraun et al., which focuses on labeling
(i.e., assigning types or names to) already existing relationships (again stressing
non-taxonomic relationships) [71]. The method mines a corpus of texts, looking
for co-occurrences of entities coupled in an unlabeled relationship, and looks up
candidate predicates. The process is, however, supervised by two ontologies: (1) one
that contains a predefined, finite set of possible relationship labels (domain-related),
and (2) one that contains a taxonomy of all the entities involved in the unlabeled
relationships. The purpose of the second ontology is to provide additional constraints
that are defined on abstract layers of the ontology and thus have to be valid for lower
levels too (which effectively means that not all verbs can be assigned as labels to
certain relationships, even if they are found by text mining as candidates).
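
How constraints defined on abstract layers restrict the labels available at lower levels can be sketched as follows. This is only an illustration of the idea, not the method of Weichselbraun et al.; the toy taxonomy, the label set with its subject and object classes, and the function names are all assumptions.

```python
# Hypothetical entity taxonomy (second ontology): child -> parent.
TAXONOMY = {
    "dog": "animal",
    "animal": "entity",
    "bone": "food",
    "food": "entity",
    "car": "vehicle",
    "vehicle": "entity",
}

# Hypothetical label ontology (first ontology):
# label -> (required subject class, required object class).
LABELS = {
    "eats": ("animal", "food"),
    "drives": ("person", "vehicle"),
}

def ancestors(concept):
    """Return the concept followed by all of its ancestors in the taxonomy."""
    chain = [concept]
    while concept in TAXONOMY:
        concept = TAXONOMY[concept]
        chain.append(concept)
    return chain

def admissible_labels(subject, obj, candidates):
    """Keep only candidates whose class constraints hold for subject and object."""
    result = []
    for verb in candidates:
        if verb not in LABELS:
            continue  # not in the predefined, finite label set
        subject_class, object_class = LABELS[verb]
        if subject_class in ancestors(subject) and object_class in ancestors(obj):
            result.append(verb)
    return result

labels = admissible_labels("dog", "bone", ["eats", "drives", "barks"])
print(labels)
```

Here "drives" is rejected because the constraint stated for the abstract class "person" does not hold for "dog", and "barks" is rejected because it is not part of the predefined label set, even though both were found as candidates.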
It is also necessary to mention lightweight semantics acquisition. Typically, latent
semantic analysis is used as a “generalized vector space method that uses dimension
reduction to generate term correlations” [53]. These correlations or co-occurrences
of terms form a network of related terms, if we adopt the premise that terms which
often occur together are somehow semantically related (although
we cannot name the relationship). But even such lightweight semantics are usable
(e.g., for query expansion). Moreover, unnamed relationships can always be processed
by naming approaches and promoted to full triplets.
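
A network of related terms and its use for query expansion can be sketched with plain co-occurrence counting. Note that this is a simplification: it does not perform the dimension reduction of latent semantic analysis, and the document set, the function names, and the min_count threshold are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs):
    """Count how many documents each unordered pair of terms shares."""
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        counts.update(combinations(terms, 2))  # pairs are alphabetically ordered
    return counts

def expand_query(term, counts, min_count=2):
    """Return terms co-occurring with term at least min_count times."""
    related = {}
    for (a, b), n in counts.items():
        if n < min_count:
            continue
        if a == term:
            related[b] = n
        elif b == term:
            related[a] = n
    # Most frequent first; ties are broken alphabetically.
    return sorted(related, key=lambda t: (-related[t], t))

docs = [
    "engine car wheel",
    "car wheel road",
    "car engine",
    "house garden",
]
counts = cooccurrence(docs)
expansion = expand_query("car", counts)
print(expansion)
```

The edges of the resulting network are unnamed: the sketch can suggest "engine" and "wheel" as expansions for "car", but it cannot say what relationship holds between them.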
2.4.3 Automated Multimedia Description Acquisition
Despite their heterogeneous nature in terms of quality, automated metadata acquisition
approaches are generally used for the annotation of large resource collections. As
the first major group, we take the image description acquisition approaches.
 