Creative Education
2012. Vol. 3, No. 8, 1320-1325
Published Online December 2012 in SciRes (http://www.SciRP.org/journal/ce)
http://dx.doi.org/10.4236/ce.2012.38193
Copyright © 2012 SciRes.
Building a Better Mousetrap: Replacing Subjective Writing
Rubrics with More Empirically-Sound Alternatives for EFL
Learners
Andrew D. Schenck, Eoin Daly*
English Education, Department of Liberal Arts Education (LAEC), Ju Si-Gyeong College, Pai Chai University, Daejeon, South Korea
Email: Schenck@hotmail.com, *eointeacher@yahoo.com
Received October 2nd, 2012; revised November 5th, 2012; accepted November 9th, 2012
Although writing rubrics can provide valuable feedback, the criteria they use are often subjective, compelling raters to fall back on their own tacit biases. The purpose of this study is to determine whether discrete empirical characteristics of texts can be used in lieu of the rubric to objectively assess the writing quality of EFL
learners. The academic paragraphs of 38 participants were evaluated according to several empirically
calculable criteria related to cohesion, content, and grammar. Values were then compared to scores obtained from holistic scoring by multiple raters using a multiple regression formula. The resulting correlation between variables (R = .873) was highly significant, suggesting that more empirical, impartial means
of writing evaluation can now be used in conjunction with technology to provide student feedback and
teacher training.
Keywords: Writing Rubrics; Writing Evaluation; Cohesion; Grammar; Word Frequency
Introduction
Several studies recognize the efficacy of the rubric as a
means to score writing and provide feedback (Cope, Kalantzis,
McCarthey, Vojak, & Kline, 2011; Mansilla, Duraisingh, Wolfe,
& Haynes, 2009; Peden & Carroll, 2008). A study by Beyreli
and Ari (2009), for example, found that it could be accurately
used to assess ten properties related to structure, language, and
organization with a fair degree of inter-rater reliability (from
65% to 81%). Another study revealed that it could be used to
evaluate writing holistically, regardless of the participants’ L1
(Sévigny, Savard, & Beaudoin, 2009). Recent adaptations of
the rubric have even demonstrated the potential to increase formative feedback through the use of both technology and self-assessment strategies (Cope, Kalantzis, McCarthey, Vojak, & Kline, 2011; Peden & Carroll, 2008).
While rubrics can provide a systematic means to evaluate
student writing, their reliability and validity can be questionable.
This is exemplified by recent studies, which reveal that rater
bias and invalidity of writing assessments are negatively impacting summative student evaluation (Graham, Hebert, & Harris,
2011: p. 10; Johnson & VanBrackle, 2012). To overcome these
shortcomings, educators have advocated the use of more authentic assessment methods such as self-assessment checklists,
writing conferences, and writing portfolios (Schulz, 2009).
Current problems with reliability and validity of the writing
rubric may be caused by the subjectivity of rubric criteria. As
pointed out by Fang and Wang (2011), such criteria contain expressions such as “exceptionally clear”, “effectively organized”,
“carefully chosen”, and “strong control”, which force teachers to
“rely on their own intuition and discursive knowledge in making judgment calls” (Fang & Wang, 2011: p. 148). In reality, this use of vague, subjective descriptors for different categories of writing reflects a deficiency in understanding of what constitutes good writing. Exploration of more objective, empirical
measures of writing quality may improve this understanding,
thereby allowing for the development of more effective evaluation techniques (Sévigny, Savard, & Beaudoin, 2009). The purpose of this study, therefore, is to examine multiple empirical
criteria and their influence on overall writing quality.
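As an illustration of the kind of analysis described in the abstract, the relation between discrete text measures and holistic scores can be estimated with an ordinary multiple regression. The sketch below uses invented, synthetic data and hypothetical predictor names (cohesion, content, grammar); it is not the authors' actual dataset or procedure, only a minimal demonstration of how a multiple correlation R between empirical measures and rater scores could be computed.

```python
import numpy as np

# Hypothetical sketch: regress holistic writing scores on three
# empirical text measures. All values below are synthetic; the
# study's actual measures and coefficients differ.
rng = np.random.default_rng(0)
n = 38  # sample size matching the study's 38 participants

# Invented predictor values for each academic paragraph
cohesion = rng.uniform(0, 1, n)   # e.g., a connective-density measure
content = rng.uniform(0, 1, n)    # e.g., a word-frequency measure
grammar = rng.uniform(0, 1, n)    # e.g., an error-free clause ratio

# Simulated holistic scores partly driven by the predictors, plus noise
holistic = (2.0 + 1.5 * cohesion + 1.0 * content + 2.5 * grammar
            + rng.normal(0, 0.3, n))

# Fit the multiple regression via ordinary least squares
X = np.column_stack([np.ones(n), cohesion, content, grammar])
beta, *_ = np.linalg.lstsq(X, holistic, rcond=None)

# Multiple correlation R: correlation of predicted with observed scores
predicted = X @ beta
R = np.corrcoef(predicted, holistic)[0, 1]
print(f"coefficients: {np.round(beta, 2)}, R = {R:.3f}")
```

With real data, the size of R indicates how much of the variance in holistic rater scores the empirical measures jointly account for, which is the comparison reported in this study.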
Disparities between Writing Rubrics
Many educators have attempted to increase the validity and
reliability of writing evaluation through the development of rubrics. Although they are a useful step forward, key limitations
remain. One of the largest problems with such rubrics is the
subjectivity and ambiguity of language they contain. Holistic
rubrics, for example, which rely upon general impressions of
quality based upon descriptors contained within each proficiency level, often contain vague language which masks the significance of results and lessens the potential for washback (Brown,
2004). Consider the following examples contained within levels
4 and 5 of the Test of English as a Foreign Language (TOEFL)
rubric for academic writing (Educational Testing Service, 2008):
Criteria for Rubric Level 4
1) Addresses the topic and task well, though some points
may not be fully elaborated.
2) Is generally well organized and well developed, using appropriate and sufficient explanations, exemplifications, and/or
details.
3) Displays unity, progression, and coherence, though it may
contain occasional redundancy, digression, or unclear connections.
*Corresponding author.