Creative Education 2012. Vol. 3, No. 8, 1320-1325
Published Online December 2012 in SciRes (http://www.SciRP.org/journal/ce)
http://dx.doi.org/10.4236/ce.2012.38193
Copyright © 2012 SciRes.

Building a Better Mousetrap: Replacing Subjective Writing Rubrics with More Empirically-Sound Alternatives for EFL Learners

Andrew D. Schenck, Eoin Daly*
English Education, Department of Liberal Arts Education (LAEC), Ju Si-Gyeong College, Pai Chai University, Daejeon, South Korea
Email: Schenck@hotmail.com, *eointeacher@yahoo.com

Received October 2nd, 2012; revised November 5th, 2012; accepted November 9th, 2012

Although writing rubrics can provide valuable feedback, the criteria they use are often subjective, which compels raters to apply their own tacit biases. The purpose of this study is to determine whether discrete, empirically measurable characteristics of texts can be used in lieu of the rubric to objectively assess the writing quality of EFL learners. The academic paragraphs of 38 participants were evaluated according to several empirically calculable criteria related to cohesion, content, and grammar. These values were then compared, via a multiple regression formula, to scores obtained from holistic scoring by multiple raters. The resulting correlation between variables (R = .873) was highly significant, suggesting that more empirical, impartial means of writing evaluation can now be used in conjunction with technology to provide student feedback and teacher training.

Keywords: Writing Rubrics; Writing Evaluation; Cohesion; Grammar; Word Frequency

Introduction

Several studies recognize the efficacy of the rubric as a means to score writing and provide feedback (Cope, Kalantzis, McCarthey, Vojak, & Kline, 2011; Mansilla, Duraisingh, Wolfe, & Haynes, 2009; Peden & Carroll, 2008).
A study by Beyreli and Ari (2009), for example, found that it could be used to accurately assess ten properties related to structure, language, and organization with a fair degree of inter-rater reliability (from 65% to 81%). Another study revealed that it could be used to evaluate writing holistically, regardless of the participants’ L1 (Sévigny, Savard, & Beaudoin, 2009). Recent adaptations of the rubric have even discovered the potential to increase formative feedback through the use of both technology and self-assessment strategies (Cope, Kalantzis, McCarthey, Vojak, & Kline, 2011; Peden & Carroll, 2008).

While rubrics can provide a systematic means to evaluate student writing, their reliability and validity can be questionable. This is exemplified by recent studies, which reveal that rater bias and the invalidity of writing assessments are negatively impacting summative student evaluation (Graham, Hebert, & Harris, 2011: p. 10; Johnson & VanBrackle, 2012). To overcome these shortcomings, educators have advocated the use of more authentic assessment methods such as self-assessment checklists, writing conferences, and writing portfolios (Schulz, 2009).

Current problems with the reliability and validity of the writing rubric may be caused by the subjectivity of rubric criteria. As pointed out by Fang and Wang (2011), such criteria contain expressions such as “exceptionally clear”, “effectively organized”, “carefully chosen”, and “strong control”, which force teachers to “rely on their own intuition and discursive knowledge in making judgment calls” (Fang & Wang, 2011: p. 148). In reality, this use of vague, subjective descriptors for different categories of writing reflects a deficiency in understanding of what constitutes good writing. Exploration of more objective, empirical measures of writing quality may improve this understanding, thereby allowing for the development of more effective evaluation techniques (Sévigny, Savard, & Beaudoin, 2009).
The purpose of this study, therefore, is to examine multiple empirical criteria and their influence on overall writing quality.

Disparities between Writing Rubrics

Many educators have attempted to increase the validity and reliability of writing evaluation through the development of rubrics. Although rubrics are a useful step forward, key limitations remain. One of the largest problems with such rubrics is the subjectivity and ambiguity of the language they contain. Holistic rubrics, for example, which rely upon general impressions of quality based upon descriptors contained within each proficiency level, often contain vague language which masks the significance of results and lessens the potential for washback (Brown, 2004). Consider the following examples contained within levels 4 and 5 of the Test of English as a Foreign Language (TOEFL) rubric for academic writing (Educational Testing Service, 2008):

Criteria for Rubric Level 4
1) Addresses the topic and task well, though some points may not be fully elaborated.
2) Is generally well organized and well developed, using appropriate and sufficient explanations, exemplifications, and/or details.
3) Displays unity, progression, and coherence, though it may contain occasional redundancy, digression, or unclear connections.

* Corresponding author.
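The study's core design, as described in the abstract, is to regress empirically computable text measures onto holistic rater scores and examine the resulting multiple correlation. The following is a minimal illustrative sketch of that kind of analysis, not the authors' actual instrument: the three features here (type-token ratio, mean sentence length, and connective density as a crude cohesion proxy) are hypothetical stand-ins for the study's cohesion, content, and grammar criteria, and the function names are invented for this example.

```python
import numpy as np

def text_features(paragraph):
    """Compute three toy proxies for writing quality from a non-empty
    paragraph: lexical variety, mean sentence length, and connective
    density (a crude cohesion measure). Illustrative only."""
    words = paragraph.lower().split()
    sentences = [s for s in paragraph.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    connectives = {"however", "therefore", "moreover", "because", "thus"}
    variety = len(set(words)) / len(words)          # type-token ratio
    mean_len = len(words) / len(sentences)          # words per sentence
    cohesion = sum(w.strip(".,;") in connectives for w in words) / len(words)
    return [variety, mean_len, cohesion]

def multiple_r(X, y):
    """Fit y = b0 + X b by ordinary least squares and return the multiple
    correlation R between predicted and observed scores."""
    A = np.column_stack([np.ones(len(X)), X])       # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = A @ coef
    return np.corrcoef(pred, y)[0, 1]
```

In practice, `X` would hold one row of feature values per participant paragraph and `y` the corresponding holistic scores; the returned R would then be compared against a significance threshold, analogous to the R = .873 reported in the abstract.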