Introduction to the Special Issue on Rater-Mediated Assessments

George Engelhard, Jr.
University of Georgia

Stefanie A. Wind
University of Alabama

Rater-mediated assessments are assessments in which raters evaluate test-taker performances and use rating scale categories to describe the level of performance on one or more domains. In many cases, rubrics and performance-level descriptors provide guidance for raters’ judgmental processes as they use rating scale categories to describe the level of test-taker performances. Even when automated scoring procedures are used, human judgments guide the development and evaluation of the algorithms that ultimately score test-taker responses. Researchers and practitioners worldwide use rater-mediated assessments to evaluate test-taker performances across a variety of settings and content areas, including educational performance assessments (e.g., writing or music assessments), language proficiency assessments, and personnel evaluation, among others.

In general, individuals who use rater-mediated assessments do so because they believe that these assessments provide more relevant insight into test-taker locations on a particular construct compared to assessments that can be scored without rater judgments. However, because human judgment plays a central role in the scoring procedures for rater-mediated assessments, additional considerations are warranted in evaluations of the reliability, validity, and fairness of these procedures. In particular, many researchers have expressed concerns about the susceptibility of rater-mediated assessments to idiosyncrasies in human judgment that could threaten their psychometric quality. In response, researchers have proposed a wide range of indicators for evaluating these procedures that provide information regarding rater consistency (e.g., rater agreement and reliability statistics), rater accuracy (e.g., raters’ alignment with expert ratings), systematic biases related to test-taker characteristics, idiosyncratic use of rating scales (e.g., central tendency), and random errors in judgment.

Overview of the Special Issue

In this special issue, we present seven articles that highlight a range of considerations for the development, implementation, and interpretation of rater-mediated assessments. Each of the articles includes empirical analyses that demonstrate different methodological approaches to evaluating raters’ judgments for evidence of psychometric quality. In addition to methodological differences, the articles reflect different theoretical perspectives on what types of evidence are central to the evaluation of rater judgments. We have loosely grouped the articles into three categories: (1) considerations of raters’ judgmental processes, (2) approaches that directly consider the nested data structures in rater-mediated assessments, and (3)