Journal of College Teaching & Learning – August 2008 Volume 5, Number 8

An Investigation Of 'Honesty Check' Items In Higher Education Course Evaluations

Kelly D. Bradley, University of Kentucky, USA
Kenneth D. Royal, University of Kentucky, USA
James W. Bradley, Bluegrass Community and Technical College, USA

ABSTRACT

The reliability and validity of course evaluations in higher education are often assumed. The typical Likert-type surveys used when students evaluate the course and instructor often overlook measurement issues or address them ineffectively. Given the importance placed on higher education course evaluations, with results affecting such events as merit raises and promotion, the proper construction and use of evaluation tools is a critical issue. In an effort to assure 'honesty' in student responses, many institutions include positively and negatively written items that are intended to measure the same construct. Using 537 course evaluations for a mathematics faculty member at a Midwest college, an item analysis is conducted with attention given to means and standard deviations, frequency counts, nonparametric correlations and tests of significant differences between questions that should, in theory, produce either similar or exactly opposite measures. A contention is made that the way an item is asked does matter, at least in some instances, and it should not be assumed that positively and negatively worded versions of an item will directly correlate. The survey research community and institutions utilizing similar rating scale instruments will benefit from the results of this study, as will the education community in general.

Keywords: Measurement, Course Evaluations, Reliability and Validity, Higher Education

INTRODUCTION

The practice of using reverse or negatively worded items on higher education course evaluations, and surveys in general, has been utilized for decades.
The primary purpose of including this type of item is to ensure valid measures by safeguarding against acquiescence; basically, it is an “honesty-check” with the goal of identifying respondents who appear to respond to items haphazardly. Presumably, with proper identification of these respondents, institutions could remove these individuals, or make necessary adjustments prior to data analysis, which would lead to valid results. This process, in and of itself, can have a significant effect upon the reliability and validity of the evaluation process. Interestingly, the survey research literature on best practices regarding the use of positively versus negatively worded items is often muddled and contradictory.

THEORETICAL FRAMEWORK

Reliability And Validity

Falthzik and Jolson's 1974 study lays a foundation for discussion. The authors created two surveys on different topics. Each survey contained six positive statements and six negative statements, which were constructed as inverses of one another. Results suggested that the intensity of responses depended on whether the wording was positive or negative. The authors also found that when the direction of a question is changed from personalized to non-personalized, suggestibility is decreased. Falthzik and Jolson went on to argue that the historical predominance of positive wording on Likert-type surveys may lead respondents to answer differently than they would if questions were worded with the occasional negative statement, all else being the same. T