counts. Mass opinion on relevance was set to yes or no if one of the respective
weighted vote counts was at least twice as strong as the other. The remaining
pairs were marked as controversial and removed from further evaluation.
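The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name, the `factor` parameter, and the handling of a pair with no votes at all are assumptions, not details taken from the experiment.

```python
def mass_opinion(yes_weight: float, no_weight: float, factor: float = 2.0) -> str:
    """Classify a term pair from its two weighted vote counts.

    A side wins only if its weighted count is at least `factor` times
    the other side's count; everything else is treated as controversial.
    Treating a pair with no votes as controversial is an assumption.
    """
    if yes_weight > 0 and yes_weight >= factor * no_weight:
        return "yes"
    if no_weight > 0 and no_weight >= factor * yes_weight:
        return "no"
    return "controversial"


# Made-up vote counts, for illustration only.
print(mass_opinion(4.0, 2.0))  # yes side is exactly twice as strong
print(mass_opinion(3.0, 2.0))  # neither side dominates
```

Pairs classified as controversial would then simply be dropped before the correctness ratio is computed.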
Results. The results showed that nearly 91% of the relationships in the term
network were correct, which encouraged us to investigate further the properties of the
Little Search Game and of the network it produced.
4.3.3 Ability to Retrieve “Hidden” Relationships
When considering the purpose of the Little Search Game, one may question the
necessity of a human-computation approach to acquiring term relationships when
we could simply infer relatedness from term co-occurrence (let us define the
co-occurrence of term A with B as the ratio of the number of documents containing
both A and B to the number of documents containing A). Unfortunately, the statistical
co-occurrence of terms does not necessarily reflect their true semantic relatedness.
For example, the terms “brain” and “tumor”, which are arguably relevant to each other,
have a co-occurrence ten times lower than that of the nonsense pair
“substance-argument” (in the same corpus, the Web). Many automated approaches to
semantics acquisition suffer from some level of noise, which needs to be corrected
manually. In the case of co-occurrence, this noise renders a subset of valid
(semantically sound) term relationships “hidden”, that is, indistinguishable from
invalid ones.
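As a minimal sketch of the definition given above, the co-occurrence of A with B can be computed from two document counts. The function and parameter names are hypothetical, and the counts in the usage lines are invented purely to show that the measure is asymmetric.

```python
def cooccurrence(docs_with_a: int, docs_with_both: int) -> float:
    """Co-occurrence of term A with term B.

    Ratio of documents containing both A and B to documents containing A.
    Returning 0.0 when A occurs nowhere is an assumption made here to
    avoid division by zero.
    """
    if docs_with_a <= 0:
        return 0.0
    return docs_with_both / docs_with_a


# Invented counts: A appears in 1000 documents, B in 200, both in 50.
a_with_b = cooccurrence(1000, 50)  # co-occurrence of A with B
b_with_a = cooccurrence(200, 50)   # co-occurrence of B with A
print(a_with_b, b_with_a)
```

Because the denominator is the count for the first term only, swapping the terms generally changes the value.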
Fortunately, the mechanics of the Little Search Game make it possible to explore even
these “hidden” term relationships (even though the scoring of the game itself depends
on the “imprecise” co-occurrence measurement). The key factor is the way a regular
player thinks: although he aims to come up with negative search terms that have
high co-occurrence with the task term, he makes his guesses through the prism of
true semantic relatedness. He therefore sometimes enters terms he considers related
to the task term, later realizes that they had no effect on the result count, and
stops using them in subsequent attempts. However, once used, they remain in the
game's logs and can eventually make it through post-hoc filtering.
To confirm this hypothesis, we conducted an experiment examining the term
co-occurrence for relationships present in the LSG term network acquired earlier.
Assuming the correctness of these relationships, we aimed to determine how many
of them are “hidden”, i.e. indistinguishable from nonsense relationships by their
co-occurrence in a corpus (here the whole Web, as indexed by the Bing search engine).
More precisely, we asked how many have a co-occurrence lower than the “noise level”
of the corpus (a co-occurrence value that a significant number of nonsense term
relationships reach) (Fig. 4.3).
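The idea of a “hidden” relationship can be illustrated with a short sketch. The text does not say how the noise level is derived, so the definition used here (the co-occurrence value reached by at least a given fraction of nonsense pairs) is an assumption, as are all names and the sample values.

```python
import math


def noise_level(nonsense_cooccurrences, fraction=0.5):
    """Co-occurrence value reached by at least `fraction` of nonsense pairs.

    Assumed definition: sort the nonsense values in descending order and
    take the value at the position covering `fraction` of them.
    """
    ranked = sorted(nonsense_cooccurrences, reverse=True)
    if not ranked:
        raise ValueError("at least one nonsense pair is required")
    k = max(1, math.ceil(fraction * len(ranked)))
    return ranked[k - 1]


def hidden_relationships(valid_cooccurrences, level):
    """Return the valid pairs whose co-occurrence lies below the noise level."""
    return [pair for pair, value in valid_cooccurrences.items() if value < level]


# Invented values, for illustration only.
level = noise_level([0.1, 0.2, 0.3, 0.4], fraction=0.5)
valid = {("brain", "tumor"): 0.05, ("term a", "term b"): 0.5}
print(level, hidden_relationships(valid, level))
```

Under this reading, the share of hidden relationships is the length of the returned list divided by the number of valid pairs examined.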
We first used the search engine to compute co-occurrence ratios for all term pairs
in the LSG term network. We queried for the number of results p_s containing the source
term (set A), then the number of results p_t containing the target term (set B), and then
the number of results i containing both terms (the intersection of A and B). Then, the co-occurrence