words the author has changed, and the quality score of the previous version. They
categorize users into several groups and assign a static reputation value to each
group, ignoring individual user behavior.
Stvilia et al. [41] construct seven complex metrics, each a combination of simpler measures, for quality measurement. Dondio et al. [39] derive ten metrics from research on collaboration in order to predict quality. Blumenstock [38] investigates over 100 simple metrics, for example, the number of words, characters, sentences, and internal and external links, and evaluates them by using each to classify articles as featured or nonfeatured. Zeng et al., Stvilia et al., and Dondio et al. use a similar method, which makes their evaluation results comparable. Blumenstock demonstrates, with a classification accuracy of 97%, that word count is currently the best single metric for distinguishing between featured and nonfeatured articles. These works assume that featured articles are of much higher quality than nonfeatured articles and thus recast quality measurement as a classification problem. Wöhner and Peters [43] suggest that, with improved evaluation methods, these metrics-based studies would allow the accuracy of individual submissions to be determined. Studying the German Wikipedia, they argue that a significant number of nonfeatured articles are also highly accurate and reliable; however, this category includes a large number of short articles. Their study of the German Wikipedia from January 2008 shows that about 50% of the articles contain fewer than 500 characters, and they therefore assume that some short nonfeatured articles are of high quality, since their subject matter can be explained briefly but precisely.
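The word-count criterion described above can be sketched as a one-feature threshold classifier. The threshold value and the toy corpus below are illustrative assumptions, not values taken from the cited study.

```python
# Minimal sketch of featured/nonfeatured classification using a single
# word-count threshold, in the spirit of Blumenstock's best metric.
# The threshold (2000 words) and the toy articles are assumptions for
# illustration only.

def classify_by_word_count(text, threshold=2000):
    """Predict featured (True) when an article meets the word threshold."""
    return len(text.split()) >= threshold

def accuracy(articles, labels, threshold=2000):
    """Fraction of articles whose prediction matches the given label."""
    correct = sum(
        classify_by_word_count(text, threshold) == label
        for text, label in zip(articles, labels)
    )
    return correct / len(articles)

# Toy corpus: long articles labeled featured (True), short ones False.
articles = ["word " * 3000, "word " * 150, "word " * 2500, "word " * 90]
labels = [True, False, True, False]
print(accuracy(articles, labels))  # 1.0 on this toy data
```

In practice the threshold would be chosen on a training split, and the reported 97% accuracy refers to real featured/nonfeatured article corpora, not this toy example.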
In addition, we and others [43, 44] assume that once an article is marked as featured and displayed on the corresponding pages, it attracts many more Web users as contributors and demands more administrative maintenance. Wöhner and Peters' investigation of the German Wikipedia [43] supports this assumption: for example, over 95% of all articles are edited with greater intensity once they are marked as featured. Wilkinson and Huberman [44], in a similar study of the English Wikipedia, show that featured articles see an increase in the number of edits and editors after being marked as featured. Given these observations, the classification accuracy reported in the related work [39, 41, 42] is valid only if featured articles are considered before they are marked as featured.
14.3 Modeling User Reputation
The long-term goal of this effort is to develop an automated system that can estimate the reputation R_i(t) of a Wikipedia user i at time t based on his/her past behavior. The reputation index R_i(t) should be scaled between 0 and 1 and, for the moment, can be loosely interpreted as the probability that user i produces high-quality content. Here, we take a first step toward this long-term goal by developing several computational models of R_i(t) and testing them, in the