Quality Control Method in Crowdsourcing Platform for Professional Dictionary Compilation Process (PDCCP) - Process-Aware Systems

However, because solution providers are unpredictable, the quality of results obtained through crowdsourcing is hard to guarantee. Poor results can cause further problems and require extra work, so that crowdsourcing loses its advantages in cost and efficiency. Depending on the crowdsourcing task, quality may also be influenced by other factors, such as the difficulty of finding an effective quality testing method or an unreasonably designed task model.

In this paper, we introduce a crowdsourcing platform for professional dictionary compilation (PDCCP) and focus mainly on its quality control component. Our contributions are: (1) we propose a specific quality testing method for quality control in PDCCP, and (2) we experiment with several task distribution strategies in PDCCP. We also compare related quality control methods in the next section and analyze the experimental results in Section 6.

2 Related Work

2.1 Gold Standard Test

In medicine and statistics, a gold standard test [3] is the best diagnostic test or standard testing procedure available under reasonable conditions, or the most accurate test under any conditions. In crowdsourcing quality testing, companies can mix Gold Standard Data with ordinary crowdsourcing subtasks and distribute them to task attendants in the usual way. Because the correct results for the Gold Standard Data are known in advance, the quality of the attendants' submissions on those items can be judged directly. By comparing the submitted results with the known results for the Gold Standard Data, companies can estimate the overall quality of the subtask results. To some degree, the accuracy and recall of this quality test are proportional to the ratio of Gold Standard Data in the task. In addition, the company should not let task attendants become aware of the Gold Standard Data; otherwise the method will not be effective.
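The gold standard procedure described above can be illustrated with a minimal sketch. The function names (`mix_gold_items`, `gold_accuracy`), the item identifiers, and the part-of-speech labels are hypothetical and chosen only for illustration; they are not taken from PDCCP. The sketch assumes each subtask has an identifier and that a submission maps identifiers to answers.

```python
import random


def mix_gold_items(task_ids, gold_ids, seed=0):
    """Shuffle gold items in among ordinary subtasks.

    The returned list carries no marker of which items are gold,
    so attendants cannot tell them apart from ordinary subtasks.
    """
    mixed = list(task_ids) + list(gold_ids)
    random.Random(seed).shuffle(mixed)
    return mixed


def gold_accuracy(submissions, gold_answers):
    """Return the fraction of answered gold items that match the known result.

    submissions:  dict mapping item id -> attendant's answer
    gold_answers: dict mapping gold item id -> known correct answer
    Returns None when the attendant answered no gold items.
    """
    checked = [item for item in gold_answers if item in submissions]
    if not checked:
        return None
    correct = sum(submissions[item] == gold_answers[item] for item in checked)
    return correct / len(checked)


# Hypothetical data: four gold items hidden among two ordinary subtasks.
gold_answers = {"g1": "noun", "g2": "verb", "g3": "noun", "g4": "adjective"}
batch = mix_gold_items(["t1", "t2"], gold_answers.keys())

submissions = {
    "t1": "noun", "t2": "verb",            # ordinary subtasks, quality unknown
    "g1": "noun", "g2": "noun",            # g2 is answered incorrectly
    "g3": "noun", "g4": "adjective",
}
estimate = gold_accuracy(submissions, gold_answers)  # 3 of 4 gold items correct
```

The estimate for the ordinary subtasks is only as reliable as the number of gold items behind it, which is one way to read the remark that testing accuracy depends on the ratio of Gold Standard Data in the task.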

The advantage of this quality testing method is that the algorithm is simple and easy to implement. If the Gold Standard Data is well designed, it can guarantee a certain level of accuracy. Nonetheless, the method relies heavily on the Gold Standard Data, which must be prepared before the crowdsourcing task starts, increasing task time and cost. Furthermore, obtaining Gold Standard Data is impractical for some crowdsourcing tasks, such as those involving creativity. Therefore, the use of the Gold Standard Test is restricted.

2.2 The Expectation-Maximization Algorithm with Separation of Bias and Errors

In 1977, Arthur Dempster, Nan Laird and Donald Rubin proposed the Expectation-Maximization algorithm in their paper [4]. It uses an iterative method to estimate unobservable hidden variables that are important elements of statistical models. Panagiotis G. Ipeirotis, Foster Provost, and Jing Wang [5] of New York University then argued that applying the Expectation-Maximization algorithm directly to crowdsourcing tasks is not quite appropriate. This is because crowdsourcing tasks need a lot of judgments and
