number of documents covered, universal versus domain-specific applicability) and
quality (tolerance to bias and errors, degree of structure) of the information and knowl-
edge they are able to acquire. Using these perspectives, we observe that expert-based
approaches generally deliver high-quality but low-quantity results, since they are bound to
limited manpower. On the other hand, automated approaches deliver semantics in high
quantities but with uncertain quality, since they are prone to unusual situations stemming
from the heterogeneity of the spaces they aim to cover. Crowd-based approaches lie
somewhere in between, operating with a numerous yet lay mass of human contribu-
tors. They have potential for both quality and quantity, but are limited by the specificity
of the task they aim to fulfill. They also need to motivate the contributors in the right
way, which is a further limitation. These issues (but not only these) make the field of
crowdsourcing a target for researchers.
Some researchers argue that there is no other way to create accurate domain models
and annotations than to utilize human labor; others argue that virtually any piece of
knowledge is already on the Web, probably with great redundancy, and that it is only
a matter of developing the ultimate harvesting algorithm to collect it [18].
For now, the best way toward the acquisition of semantics lies in combining approach
families to exploit their strong points and neutralize their weaknesses. As an example
of approach chaining, we can imagine an ontology engineering project where experts
first set the top layers of the taxonomy within the ontology, set up the axioms and
the entity and relationship types, and seed the examples. After this, an automated method
is deployed over the corresponding text resource corpus and extracts entities and
relationships according to the patterns previously set by the experts. Lastly, the crowd comes
in to validate the acquired entities and relationships using a simple true/false question-
answering interface. As another example of symbiosis, we can consider a crowd that
prepares image tags for images prior to the training of an automated classifier.
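To make the chaining concrete, the following is a minimal, hypothetical sketch in Python of the first example: experts seed relationship types and extraction patterns, an automated stage applies them over a text corpus, and the extracted candidate triples are turned into true/false questions for the crowd. All names (ExpertSeed, extract_relations, CrowdTask) and the regular-expression pattern are illustrative assumptions, not an existing framework.

```python
# Hypothetical sketch of expert -> automated -> crowd approach chaining.
import re
from dataclasses import dataclass

@dataclass
class ExpertSeed:
    """Relationship types and lexical patterns set by experts."""
    relation_patterns: dict  # relation type -> regex with two capture groups

@dataclass
class CrowdTask:
    """A dichotomous (true/false) validation question for the crowd."""
    question: str
    candidate: tuple  # (subject, relation, object)

def extract_relations(corpus, seed):
    """Automated stage: apply expert-defined patterns over the text corpus."""
    candidates = []
    for text in corpus:
        for relation, pattern in seed.relation_patterns.items():
            for subj, obj in re.findall(pattern, text):
                candidates.append((subj, relation, obj))
    return candidates

def to_crowd_tasks(candidates):
    """Crowd stage: turn each extracted triple into a true/false question."""
    return [CrowdTask(f"Is it true that '{s}' {r.replace('_', ' ')} '{o}'?", (s, r, o))
            for s, r, o in candidates]

# Usage: experts seed one pattern, the extractor runs, the crowd validates.
seed = ExpertSeed(relation_patterns={"is_a": r"(\w+) is a (\w+)"})
corpus = ["A sparrow is a bird.", "A violin is a instrument."]
for task in to_crowd_tasks(extract_relations(corpus, seed)):
    print(task.question)
```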
Considering this, we come to two possible roles of the crowd: semantics creation
or semantics validation. Whether the crowd is supposed to carry out the former or the latter
greatly influences the options the method designer has. Naturally, “validation”
crowdsourcing always depends on an existing metadata set that it aims to improve. On
the other hand, it has a great advantage regarding the design of the contributor's
interface with the crowdsourcing platform: validating something is in general more
ergonomic than creating it (both syntactically and semantically). In the context of
the first example, dichotomous question answering about the validity of a typed
relationship between two terms is syntactically easier than selecting the type from a
long list. This somewhat advocates the use of crowdsourcing for semantic validation
rather than creation, especially if the automated method that creates the metadata is able
to state its confidence (support) in its output, limiting the metadata set that needs
to be validated to only the “unsure” cases.
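As a minimal sketch of this confidence-based filtering (assuming the automated method attaches a support score to each candidate triple; the function name and the 0.9 threshold are illustrative assumptions), high-confidence candidates could be accepted automatically while only the “unsure” remainder is queued for crowd validation:

```python
# Hypothetical routing of extracted triples by extractor confidence.
def route_for_validation(candidates, auto_accept_threshold=0.9):
    """Accept high-confidence triples automatically; send the 'unsure'
    rest to the crowd as true/false validation questions."""
    accepted, crowd_queue = [], []
    for triple, confidence in candidates:
        if confidence >= auto_accept_threshold:
            accepted.append(triple)
        else:
            crowd_queue.append(triple)
    return accepted, crowd_queue

# Usage: only the low-confidence triple is queued for crowd validation.
candidates = [(("sparrow", "is_a", "bird"), 0.97),
              (("violin", "is_a", "fruit"), 0.42)]
accepted, crowd_queue = route_for_validation(candidates)
print(accepted)     # [('sparrow', 'is_a', 'bird')]
print(crowd_queue)  # [('violin', 'is_a', 'fruit')]
```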
The type of resource for which the semantics is created also indicates the
potential outcome of the acquisition method. For structured and unstructured texts,
automated approaches function better if only lightweight structures are demanded
(e.g., keywords), whereas experts or crowds are needed if the semantics (especially
domain models) is required at a higher quality grade. With multimedia, human
work is even more in demand for semantics creation. For our research presented in