Teaching & Teacher Education, Vol. 7, No. 4, pp. 303-314, 1991
0742-051X/91 $3.00+0.00
Printed in Great Britain
Pergamon Press plc

STUDENTS’ EVALUATIONS OF TEACHING EFFECTIVENESS: THE STABILITY OF MEAN RATINGS OF THE SAME TEACHERS OVER A 13-YEAR PERIOD

HERBERT W. MARSH
University of Western Sydney, Macarthur, Australia

and

DENNIS HOCEVAR
University of Southern California, U.S.A.

Abstract--Students’ evaluations of teaching effectiveness (SETEs) are weakly and negatively related to teaching experience and age according to Feldman’s (1983) comprehensive review of cross-sectional studies. Cross-sectional studies, however, provide a weak basis for inferring the future ratings of less experienced teachers or the past ratings of more experienced teachers. Considered here are ratings of 6024 classes taught by a diverse cohort of 195 teachers representing 31 academic departments who were evaluated continuously over a 13-year period using the same multidimensional Students’ Evaluations of Educational Quality instrument. For both undergraduate and graduate level courses, there were almost no changes over time for any of the nine content-specific dimensions, the overall course rating, or the overall instructor rating. The findings were consistent for teachers who had little, moderate, or substantial amounts of teaching experience at the start of the study. These results are important because this is apparently the only study to examine the stability of faculty ratings using a longitudinal design with a large and diverse group of teachers over such a long period of time.

Students’ evaluations of teaching effectiveness (SETEs) are widely collected and used for a variety of purposes, such as personnel decisions, feedback to faculty on the effectiveness of their teaching, input into students’ course selection, and research on teaching.
An enormous amount of research has demonstrated that SETEs are multidimensional with a well-defined factor structure, internally consistent, reasonably valid when compared to a variety of other indicators of effective teaching, and relatively unaffected by potential biases to the ratings (see Marsh, 1987, for an overview of this research). Nevertheless, most of this research has considered ratings collected in one specific course on a single occasion, and there is surprisingly little research on the stability of mean ratings received by the same instructor over an extended period of time. The purpose of the present investigation is to examine changes in ratings of a large number of teachers who have been evaluated continuously over a 13-year period with the same multidimensional Students’ Evaluations of Educational Quality (SEEQ) instrument.

The Stability of Students’ Evaluations of Teaching

There are many approaches to the study of stability and change (Goldstein, 1979; Plewis, 1985; Rogosa, 1979; Rogosa, Floden, & Willett, 1984; Willett, 1988). The two most common, however, refer to the stability of means over time (mean stability) and to the