been a subject of widespread debate. For example, in late 2005, American journalist
John Seigenthaler publicly criticized Wikipedia because of a collection of inaccuracies in his biography page, including an assertion that he was involved in the assassination of former U.S. President John F. Kennedy.² Apparently the inaccuracies remained in Wikipedia for 132 days. Because there is no single entity taking responsibility for the accuracy of Wikipedia content, and because users have no other way of differentiating accurate content from inaccurate content, it is commonly thought that Wikipedia content cannot be relied upon, even if inaccuracies are rare [35].
To overcome this weakness, Wikipedia has developed several user-driven approaches for evaluating the quality of its articles. For example, some articles are marked as “featured articles.” Featured articles are considered to be the best articles in Wikipedia, as determined by Wikipedia's editors. Before being listed as such, articles are reviewed as “featured article candidates,” according to special criteria that take into account accuracy, neutrality, completeness, and style.³ In addition, Wikipedia users keep track of articles that have undergone repeated vandalism in order to eliminate and report it.⁴ However, these user-driven approaches do not scale, and only a small number of Wikipedia articles are evaluated in this way. For example, as of March 2010, only 2,825 articles (fewer than 0.1%) in the English Wikipedia were marked as featured. Another difficulty with user-driven evaluation is that Wikipedia content is, by its nature, highly dynamic, and the evaluations often become obsolete rather quickly.
Due to these conditions, recent research work involves automatic quality analysis of Wikipedia [33, 35-43]. Cross [35] proposes a system of text coloring according to the age of the assertions in a particular article; this enables Wikipedia users to see which assertions in an article have survived several edits and which are relatively recent and thus, perhaps, less reliable.
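The idea of coloring text by how long it has survived can be sketched as follows. This is an illustrative toy, not Cross's actual system: the per-word revision annotations, the color names, and the bucket thresholds are all assumptions made for the example.

```python
# Sketch of age-based text coloring (illustration of the idea in Cross [35]).
# Assumption: each word is annotated with the revision in which it first
# appeared; words that have survived more revisions are deemed more stable.

def color_by_age(words, current_rev, thresholds=(2, 10)):
    """Map each (word, introduced_rev) pair to a color bucket.

    thresholds are illustrative: words surviving fewer than 2 revisions
    are colored red ("new"), fewer than 10 orange ("recent"), and the
    rest black ("stable").
    """
    colored = []
    for word, introduced_rev in words:
        age = current_rev - introduced_rev  # revisions survived so far
        if age < thresholds[0]:
            color = "red"      # new, barely vetted text
        elif age < thresholds[1]:
            color = "orange"   # somewhat vetted
        else:
            color = "black"    # long-surviving, stable text
        colored.append((word, color))
    return colored

article = [("Paris", 1), ("is", 1), ("the", 1), ("largest", 95), ("capital", 3)]
print(color_by_age(article, current_rev=100))
```

A renderer would then display each word in its assigned color, so recently inserted (and thus less vetted) assertions stand out visually.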
Adler et al. [37] quantify the reputation of users according to the survival of their edit actions; they then determine ownership of different parts of the text and, based on the reputation of the user, estimate the trustworthiness of each word. Javanmardi et al. [36] present a robust reputation model for wiki users and show that it is not only simpler but also more precise than previous work.
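A minimal sketch of survival-based reputation, loosely inspired by the idea behind Adler et al. [37]: a user gains reputation when text they added survives later revisions and loses it when the text is quickly reverted. The update rule and the reward/penalty weights below are illustrative assumptions, not the actual model from either paper.

```python
# Toy survival-based user reputation (illustrative assumptions throughout).
from collections import defaultdict

def update_reputations(edits, reward=1.0, penalty=2.0):
    """edits: list of (user, survived) pairs, one per edit action.

    Each surviving edit adds `reward` to the user's reputation; each
    reverted edit subtracts `penalty` (reverts weighted more heavily,
    an assumption of this sketch).
    """
    reputation = defaultdict(float)
    for user, survived in edits:
        reputation[user] += reward if survived else -penalty
    return dict(reputation)

edit_log = [("alice", True), ("alice", True), ("bob", False), ("alice", False)]
print(update_reputations(edit_log))  # → {'alice': 0.0, 'bob': -2.0}
```

With per-word ownership information, such reputation scores could then be propagated to individual words to estimate their trustworthiness, as the text above describes.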
Other research methods try to assess the quality of a Wikipedia article in its
entirety. Lih [40] shows that there is a positive correlation between the quality of an article and the number of editors as well as the number of revisions. Liu et al. [33]
present three models for ranking Wikipedia articles according to their level of
accuracy. The models are based on the length of the article, the total number of
revisions, and the reputation of the authors, who are further evaluated by their total
number of previous edits. Zeng et al. [42] compute the quality of a particular article
revision with a Bayesian network from the reputation of its author, the number of
2. http://bit.ly/4Bmrhz
3. http://en.wikipedia.org/wiki/Wikipedia:Featured_articles
4. http://bit.ly/dy3t1Y