Assessing Quality Score of Wikipedia Articles using Mutual Evaluation of Editors and Texts

Yu Suzuki
Graduate School of Information Science, Nagoya University
Furo, Chikusa, Nagoya, Aichi 464-8603, Japan
suzuki@db.ss.is.nagoya-u.ac.jp

Masatoshi Yoshikawa
Graduate School of Informatics, Kyoto University
Yoshida-Honmachi, Sakyo, Kyoto 606-8501, Japan
yoshikawa@i.kyoto-u.ac.jp

ABSTRACT

In this paper, we propose a method for assessing quality scores of Wikipedia articles by mutually evaluating editors and texts. The survival-ratio-based approach is a major approach to assessing article quality: when a text survives beyond multiple edits, it is assessed as good quality, because poor-quality texts have a high probability of being deleted by editors. However, vandals, i.e., low-quality editors, frequently delete good-quality texts, which improperly decreases the survival ratios of those texts. As a result, many good-quality texts are unfairly assessed as poor quality. In our method, we take editor quality scores into account when calculating text quality scores, and thereby decrease the impact of vandals on text quality. With this improvement, the accuracy of the text quality scores should improve. However, an inherent problem with this idea is that the editor quality scores are in turn calculated from the text quality scores. To solve this problem, we mutually calculate the editor and text quality scores until they converge. In this paper, we prove that the text quality scores converge. We conducted an experimental evaluation and confirmed that our proposed method can accurately assess text quality scores.

Categories and Subject Descriptors

H.1.2 [Models and Principles]: User/Machine Systems

Keywords

Wikipedia; Quality; Peer Review; Vandalism; Edit History

1. INTRODUCTION

Wikipedia (http://www.wikipedia.org/) is a famous Internet encyclopedia, and is one of the most successful and well-known User Generated Content (UGC) websites.
CIKM'13, Oct. 27–Nov. 1, 2013, San Francisco, CA, USA. Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-2263-8/13/10. http://dx.doi.org/10.1145/2505515.2505610.

Since any user can edit any article, Wikipedia has more and fresher information than existing paper-based encyclopedias. Many experts submit texts to Wikipedia, and these texts should be informative for readers. However, due to the huge number of Wikipedia articles, many texts are not reviewed by experts, so the number of poor-quality texts has also increased dramatically. Moreover, many readers cannot easily judge whether a text is of good quality, because not all readers are experts. Therefore, there is a need to automatically identify which Wikipedia articles are of good quality.

In this paper, we use the survival-ratio-based approach for calculating text quality scores, which is one of the major approaches to measuring text quality [3]. The key idea of the survival-ratio-based approach is to count the number of times editors decide that a text should remain. If many readers find a text excellent, the quality of the text is good; but if many readers feel that a text should be removed, its quality is poor. Adler et al. [2] found that 79% of poor-quality texts are short-lived.
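The survival-ratio idea described above can be sketched in a few lines of Python. This is an illustrative simplification, not the formula used in the cited work: a fragment's score is the fraction of revisions, from its first appearance onward, in which it still survives.

```python
# Sketch of the survival-ratio idea (an illustrative simplification):
# a text fragment that persists across later revisions scores higher,
# because poor-quality text tends to be deleted quickly.

def survival_ratio(revisions, fragment):
    """Fraction of revisions, from the fragment's first appearance
    onward, in which the fragment still survives."""
    try:
        first = next(i for i, rev in enumerate(revisions) if fragment in rev)
    except StopIteration:
        return 0.0  # fragment never appeared in any revision
    later = revisions[first:]
    return sum(fragment in rev for rev in later) / len(later)

# Hypothetical edit history: each revision is the set of fragments it contains.
revisions = [
    {"intro", "history"},
    {"intro", "history", "hoax"},   # a vandal inserts a hoax sentence
    {"intro", "history"},           # the hoax is reverted
    {"intro", "history", "refs"},
]

print(survival_ratio(revisions, "intro"))  # survives all revisions -> 1.0
print(survival_ratio(revisions, "hoax"))   # deleted immediately -> low score
```

Note how the hoax fragment, present in only one of the three revisions after its insertion, scores 1/3, while long-lived text scores 1.0.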
We can infer from this result that when editors find poor-quality texts, many editors remove them. Adler et al. assumed that article quality improves with the number of edits, because all editors delete only poor-quality texts. However, this assumption is not always true, because vandals delete not only poor-quality texts but also good-quality texts. If vandals delete a text, the survival ratio of the text is overly decreased. To avoid the effects of vandals, we need to detect which editors are vandals and which are not, and readjust the survival ratios of texts in accordance with the editor quality scores. However, the editor quality score is calculated from the text quality score, and the text quality score is calculated from the editor quality score. Therefore, calculating the text quality score using the editor quality score is a chicken-and-egg problem.

To solve this problem, we propose a method for mutually calculating text quality scores using both survival ratios of texts and editor quality scores. We define an editor quality score as the average of the quality scores of the texts written by that editor. However, text quality scores are in turn calculated on the basis of editor quality scores; in short, each quality score is calculated from the other, so it is hard to compute the text quality scores directly. To solve this, we first set the editor quality scores to constant values and calculate the text quality scores. Next, we calculate the editor quality scores from the text quality scores. We then recalculate the text quality scores using the updated editor quality scores. In this way, we mutually calculate editor and text quality scores. Using this method, we can calculate a text
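The alternating computation just described (constant initialization, then repeated recalculation of text and editor scores) can be sketched as a fixed-point iteration. The sketch below is a hedged illustration under assumed data structures: `authored` and `judgments` and the editor-quality weighting of survival are illustrative assumptions, not the paper's exact definitions.

```python
# Illustrative sketch of the mutual calculation (assumed, simplified model):
# a text's score is its survival ratio weighted by the quality of the
# editors who kept or deleted it; an editor's score is the average score
# of the texts the editor wrote. The two are recomputed until convergence.

def mutual_scores(texts, editors, authored, judgments, n_iter=50, tol=1e-6):
    """
    texts, editors : lists of ids
    authored[e]    : texts written by editor e (assumed mapping)
    judgments[t]   : list of (editor, +1 keep / -1 delete) decisions on t
    """
    e_score = {e: 0.5 for e in editors}          # step 1: constant init
    t_score = {t: 0.0 for t in texts}
    for _ in range(n_iter):
        # step 2: text score = editor-quality-weighted survival ratio
        new_t = {}
        for t in texts:
            kept = sum(e_score[e] for e, d in judgments[t] if d > 0)
            total = sum(e_score[e] for e, _ in judgments[t])
            new_t[t] = kept / total if total else 0.0
        # step 3: editor score = average quality of the editor's texts
        new_e = {}
        for e in editors:
            ts = authored[e]
            new_e[e] = sum(new_t[t] for t in ts) / len(ts) if ts else 0.5
        delta = max(abs(new_t[t] - t_score[t]) for t in texts)
        t_score, e_score = new_t, new_e
        if delta < tol:                          # step 4: iterate to a fixed point
            break
    return t_score, e_score
```

On a toy history, a text kept by every editor converges to a high score, while a text whose only supporter writes low-scoring texts is driven downward: the weighting means a deletion by a low-quality editor barely hurts a text's score, which is exactly the robustness against vandals argued for above.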