been a subject of widespread debate. For example, in late 2005, American journalist
John Seigenthaler publicly criticized Wikipedia because of a collection of inaccuracies in his biography page, including an assertion that he was involved in the assassination of former U.S. President John F. Kennedy.² Apparently the inaccuracies remained in Wikipedia for 132 days. Because there is no single entity taking responsibility for the accuracy of Wikipedia content, and because users have no other way of differentiating accurate content from inaccurate content, it is commonly thought that Wikipedia content cannot be relied upon, even if inaccuracies are rare [35].
To overcome this weakness, Wikipedia has developed several user-driven approaches for evaluating the quality of its articles. For example, some articles are marked as “featured articles.” Featured articles are considered to be the best articles in Wikipedia, as determined by Wikipedia's editors. Before being listed as such, articles are reviewed as “featured article candidates,” according to special criteria that take into account accuracy, neutrality, completeness, and style.³ In addition, Wikipedia users keep track of articles that have undergone repeated vandalism in order to eliminate and report it.⁴ However, these user-driven approaches do not scale, and only a small number of Wikipedia articles are evaluated in this way. For example, as of March 2010, only 2,825 articles (fewer than 0.1%) in the English Wikipedia were marked as featured. Another difficulty with user-driven evaluation is that Wikipedia content is, by its nature, highly dynamic, and the evaluations often become obsolete rather quickly.
Due to these conditions, recent research work involves automatic quality analysis of Wikipedia [33, 35-43]. Cross [35] proposes a system of text coloring according to the age of the assertions in a particular article; this enables Wikipedia users to see which assertions in an article have survived several edits and which are relatively recent and thus, perhaps, less reliable.
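The idea of coloring text by how long it has survived can be sketched as follows. This is an illustrative toy, not Cross's actual system: the per-word revision annotations, the color names, and the bucket thresholds are all assumptions made for the example.

```python
# Sketch of age-based text coloring (illustration of the idea in Cross [35]).
# Assumption: each word is annotated with the revision in which it first
# appeared; words that have survived more revisions are deemed more stable.

def color_by_age(words, current_rev, thresholds=(2, 10)):
    """Map each (word, introduced_rev) pair to a color bucket.

    thresholds are illustrative: words surviving fewer than 2 revisions
    are colored red ("new"), fewer than 10 orange ("recent"), and the
    rest black ("stable").
    """
    colored = []
    for word, introduced_rev in words:
        age = current_rev - introduced_rev  # revisions survived so far
        if age < thresholds[0]:
            color = "red"      # new, barely vetted text
        elif age < thresholds[1]:
            color = "orange"   # somewhat vetted
        else:
            color = "black"    # long-surviving, stable text
        colored.append((word, color))
    return colored

article = [("Paris", 1), ("is", 1), ("the", 1), ("largest", 95), ("capital", 3)]
print(color_by_age(article, current_rev=100))
```

A renderer would then display each word in its assigned color, so recently inserted (and thus less vetted) assertions stand out visually.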
Adler et al. [37] quantify the reputation of users according to the survival of their edit actions; they then determine ownership of different parts of the text and, based on the reputation of the user, estimate the trustworthiness of each word. Javanmardi et al. [36] present a robust reputation model for wiki users and show that it is not only simpler but also more precise than previous work.
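A minimal sketch of survival-based reputation, loosely inspired by the idea behind Adler et al. [37]: a user gains reputation when text they added survives later revisions and loses it when the text is quickly reverted. The update rule and the reward/penalty weights below are illustrative assumptions, not the actual model from either paper.

```python
# Toy survival-based user reputation (illustrative assumptions throughout).
from collections import defaultdict

def update_reputations(edits, reward=1.0, penalty=2.0):
    """edits: list of (user, survived) pairs, one per edit action.

    Each surviving edit adds `reward` to the user's reputation; each
    reverted edit subtracts `penalty` (reverts weighted more heavily,
    an assumption of this sketch).
    """
    reputation = defaultdict(float)
    for user, survived in edits:
        reputation[user] += reward if survived else -penalty
    return dict(reputation)

edit_log = [("alice", True), ("alice", True), ("bob", False), ("alice", False)]
print(update_reputations(edit_log))  # → {'alice': 0.0, 'bob': -2.0}
```

With per-word ownership information, such reputation scores could then be propagated to individual words to estimate their trustworthiness, as the text above describes.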
Other research methods try to assess the quality of a Wikipedia article in its
entirety. Lih [40] shows that there is a positive correlation between the quality of an article and the number of editors as well as the number of revisions. Liu et al. [33]
present three models for ranking Wikipedia articles according to their level of
accuracy. The models are based on the length of the article, the total number of
revisions, and the reputation of the authors, who are further evaluated by their total
number of previous edits. Zeng et al. [42] compute the quality of a particular article
revision with a Bayesian network from the reputation of its author, the number of
2. http://bit.ly/4Bmrhz
3. http://en.wikipedia.org/wiki/Wikipedia:Featured_articles
4. http://bit.ly/dy3t1Y