Information Technology Reference
consideration from task attendants. Owing to their varying knowledge levels, abilities, experience, and other factors, the results submitted for a crowdsourcing task carry judgment bias. They suggested that (unrecoverable) errors should be distinguished from (recoverable) bias, and proposed an Expectation Maximization Algorithm with separation of bias and errors designed specifically for crowdsourcing tasks.
An error means that a task attendant submitted meaningless or false task results, whose accuracy is about the same as results submitted randomly by a computer. Such results mostly come from deceptive task attendants, for instance those who submit the same answer for every question or choose answers at random. A bias means that a task attendant submitted systematically skewed results for some identifiable reason. With suitable methods, the company can make effective use of this biased information and improve the accuracy of quality testing. Factors causing bias include differences in people's personal standards, deliberately inverted answers, and so on.
Therefore, both perfect task attendants and spiteful attackers with a simple strategy are valuable, since the cost of revising their results is approximately zero; for uniform submitters or random submitters, in contrast, the revising cost is relatively high. It is thus unnecessary to expect task attendants to have very high accuracy: as long as an attendant's revising cost is low enough and its bias is predictable and correctable, that attendant's results can be accepted.
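The paper gives no code for this revision step; as a minimal sketch for binary tasks (the function name, the gold-question setup, and the `band` threshold below are our own illustrative assumptions), a predictably biased attendant can be revised at almost no cost, while a near-random one cannot:

```python
import numpy as np

def revise_submission(answers, gold_idx, gold_labels, band=0.15):
    """Classify one binary-task attendant from a few gold (check)
    questions and revise predictably biased answers.

    Returns (verdict, revised_answers); revised_answers is None when
    the attendant looks like an unrecoverable random submitter."""
    answers = np.asarray(answers)
    agreement = (answers[gold_idx] == gold_labels).mean()
    if agreement >= 1 - band:      # near-perfect: accept as-is
        return "reliable", answers
    if agreement <= band:          # consistent inverter: flipping fixes all answers
        return "inverted-bias", 1 - answers
    return "error", None           # near-random: no cheap revision exists

# Demo: one inverter and one random submitter on ten binary tasks.
truth = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0])
gold_idx = np.array([0, 1, 2, 3])          # four gold questions
verdict_inv, revised = revise_submission(1 - truth, gold_idx, truth[gold_idx])
verdict_rnd, dropped = revise_submission([0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
                                         gold_idx, truth[gold_idx])
```

Here the inverter's answers are fully recovered by flipping them, whereas the random submitter's answers carry no usable signal and are discarded, matching the revising-cost argument above.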
In some online crowdsourcing tasks, there are task attendants who contribute serious effort yet receive a low quality grade for their submitted results because bias is not separated from errors. These actually high-quality task attendants become alienated from, or even quit in the middle of, crowdsourcing tasks that simply use the mode (majority answer) for judgment. The Expectation Maximization Algorithm with Separation of Bias and Errors can effectively resolve this situation.
This algorithm is carried out entirely by computer and requires no extra information to be prepared. In addition, it can estimate task quality and distinguish bias among task attendants at the same time. However, the algorithm has the following limitations: (1) the quality-testing results depend largely on the quality of the task attendants, so if the proportion of low-quality attendants is relatively large the algorithm does not work well, which usually happens in small-scale crowdsourcing tasks; (2) the algorithm itself is complicated, and it is impracticable for crowdsourcing tasks involving creativity.
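The paper does not include implementation details, but the idea of separating recoverable bias from unrecoverable error can be sketched with a Dawid-Skene-style EM for binary tasks, in which each attendant is modeled by a 2x2 confusion matrix (the function name and the simulation below are illustrative assumptions, not the authors' code). A consistent inverter's confusion matrix is learned as near-inverted, so its answers still contribute information, while a random submitter's matrix stays near 0.5 and is effectively down-weighted:

```python
import numpy as np

def em_bias_error(answers, n_iter=50):
    """EM aggregation with a per-attendant 2x2 confusion matrix.

    answers: (n_workers, n_tasks) array of 0/1 submitted labels.
    Returns (p, conf): p[i] is the posterior P(true label of task i = 1),
    conf[w, t, a] is the estimated P(worker w answers a | true label t)."""
    n_workers, n_tasks = answers.shape
    p = answers.mean(axis=0)                 # init: soft majority vote
    conf = np.full((n_workers, 2, 2), 0.5)
    for _ in range(n_iter):
        # M-step: re-estimate each worker's confusion matrix from soft labels.
        for w in range(n_workers):
            for t in (0, 1):
                weight = p if t == 1 else 1 - p
                conf[w, t, 1] = (weight * answers[w]).sum() / (weight.sum() + 1e-9)
                conf[w, t, 0] = 1 - conf[w, t, 1]
        # E-step: update the posterior of the true labels (uniform prior).
        log1 = np.zeros(n_tasks)
        log0 = np.zeros(n_tasks)
        for w in range(n_workers):
            log1 += np.log(conf[w, 1, answers[w]] + 1e-9)
            log0 += np.log(conf[w, 0, answers[w]] + 1e-9)
        p = 1 / (1 + np.exp(log0 - log1))
    return p, conf

# Simulated task: 3 honest attendants (85% accurate), 1 perfect inverter
# (recoverable bias), 1 random submitter (unrecoverable error).
rng = np.random.default_rng(0)
truth = rng.integers(0, 2, 300)
honest = [(truth ^ (rng.random(300) > 0.85)).astype(int) for _ in range(3)]
answers = np.array(honest + [1 - truth, rng.integers(0, 2, 300)])
p, conf = em_bias_error(answers)
accuracy = ((p > 0.5).astype(int) == truth).mean()
```

In this simulation the inverter's estimated P(answer 1 | true 1) is driven toward 0, so its votes are flipped rather than thrown away, which is exactly the bias-versus-error separation described above.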
Overview of PDCCP
In the Crowdsourcing Platform for Professional Dictionary Compilation (PDCCP), the crowdsourcing task is compiling a cloud-computing dictionary. We extracted cloud-computing-related words and phrases from a large number of references and composed a cloud-computing word database. Next, we created a crowdsourcing online translation platform and allowed every Internet user who can visit the online translation website to take part in this crowdsourcing task.
Because the entries in the word database are all professional cloud-computing terms, or even the most recently coined words, abbreviations, and phrases appearing in references, both the translation work and the examination of translation results are hard for computer