RESEARCH REPORT
Comparative Evaluation of Three Situational Judgment Test Response
Formats in Terms of Construct-Related Validity, Subgroup Differences,
and Susceptibility to Response Distortion
Winfred Arthur Jr., Ryan M. Glaze,
Steven M. Jarrett, Craig D. White, and Ira Schurig
Texas A&M University
Jason E. Taylor
People Answers, Inc., Dallas, Texas
As a testing method, the efficacy of situational judgment tests (SJTs) is a function of a number of design features. One such design feature is the response format. However, despite the considerable interest in SJT design features, the extant literature offers little guidance as to which response format is superior or the conditions under which one might be preferable to others. Using an integrity-based SJT administered to 31,194 job applicants, we present a comparative evaluation of 3 response formats (rate, rank, and most/least) in terms of construct-related validity, subgroup differences, and score reliability. The results indicate that the rate SJT displayed stronger correlations with the hypothesized personality traits; weaker correlations with general mental ability and, consequently, lower levels of subgroup differences; and higher levels of internal consistency reliability. A follow-up study with 492 college students (Study 2; details of which are presented in the online supplemental materials) also indicates that the rate response format displayed higher levels of internal consistency and retest reliability as well as favorable reactions from test takers. However, it also displayed the strongest relationships with a measure of response distortion, suggesting that it is more susceptible to this threat. Although there were a few exceptions, the rank and most/least response formats were generally quite similar across several of the study outcomes. The results suggest that in the context of SJTs designed to measure noncognitive constructs, the rate response format is the preferred response format, with its main drawback being its susceptibility to response distortion, although it is no more susceptible than the rank response format.
Keywords: situational judgment tests, response formats, subgroup differences, test taker reactions,
noncognitive constructs
Supplemental materials: http://dx.doi.org/10.1037/a0035788.supp
As a predictor method, situational judgment tests (SJTs) are conceptualized as low-fidelity simulations in which test takers are presented with work-related situations and a set of predetermined responses (Motowidlo, Dunnette, & Carter, 1990). Although construct explication in the SJT research literature continues to be poor (see Arthur & Villado, 2008; Christian, Edwards, & Bradley, 2010; Schmitt & Chan, 2006), SJTs can be designed to measure a number of constructs (e.g., job knowledge, interpersonal skills, teamwork, leadership, conscientiousness, agreeableness, emotional stability; Christian et al., 2010). Furthermore, as with any other method (e.g., assessment centers, interviews), there is a clear recognition that the efficacy of SJTs is influenced by their design features. As such, a number of these features have been investigated, including the modes of presentation and level of fidelity (e.g., written, verbal, video-based, or computer-based; Chan & Schmitt, 2002; Clevenger, Pereira, Wiechmann, Schmitt, & Harvey, 2001; Olson-Buchanan et al., 1998; Weekley & Jones, 1997), response instructions (i.e., behavioral- vs. knowledge-based; McDaniel, Whetzel, Hartman, Nguyen, & Grubb, 2006; Ployhart & Ehrhart, 2002), scoring method (Bergman, Drasgow, Donovan, Henning, & Juraska, 2006), and stem complexity (Ployhart & MacKenzie, 2011). Ployhart and MacKenzie (2011) present an informative review and description of these design features.
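To make the response-format distinction concrete, the following is a minimal sketch of how a single SJT item might be scored under the three formats compared in this article (rate, rank, and most/least). The expert key, scoring rules, and all names in the sketch are hypothetical assumptions introduced for illustration; they are not the operational scoring procedure used in the present studies.

# Hypothetical sketch: three ways of scoring one SJT item whose response
# options have expert-keyed effectiveness values. Illustrative only; this
# is not the scoring algorithm used in the studies reported here.

EXPERT_KEY = [4, 1, 3, 2]  # keyed effectiveness of options A-D (4 = best)

def score_rate(ratings):
    """Rate format: the test taker rates every option (e.g., 1-5 Likert).
    Score = negative mean absolute distance from the expert key, so
    higher (closer to 0) is better."""
    return -sum(abs(r - k) for r, k in zip(ratings, EXPERT_KEY)) / len(ratings)

def score_rank(ranks):
    """Rank format: the test taker rank-orders all options (1 = best).
    Score = Spearman rank correlation with the expert ranking."""
    # Convert keyed effectiveness values to an expert ranking (1 = best).
    order = sorted(EXPERT_KEY, reverse=True)
    expert_rank = [order.index(k) + 1 for k in EXPERT_KEY]
    n = len(ranks)
    d2 = sum((r - e) ** 2 for r, e in zip(ranks, expert_rank))
    return 1 - (6 * d2) / (n * (n ** 2 - 1))

def score_most_least(most, least):
    """Most/least format: the test taker picks only the most and least
    effective options; +1 for each pick that matches the key."""
    best = EXPERT_KEY.index(max(EXPERT_KEY))
    worst = EXPERT_KEY.index(min(EXPERT_KEY))
    return int(most == best) + int(least == worst)

# Example responses from one test taker to the same item:
print(score_rate([5, 1, 4, 2]))   # -> -0.5 (mean distance of 0.5 from key)
print(score_rank([1, 4, 2, 3]))   # ->  1.0 (perfect agreement with key)
print(score_most_least(0, 1))     # ->  2   (both picks match the key)

Note how, in this toy scheme, the rate format elicits a judgment about every option, whereas the rank and most/least formats capture only relative orderings; this difference in information granularity is one plausible reason the formats could diverge on the psychometric outcomes examined here.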
This article was published Online First February 3, 2014.
Winfred Arthur Jr., Ryan M. Glaze, Steven M. Jarrett, Craig D. White, and Ira Schurig, Department of Psychology, Texas A&M University; Jason E. Taylor, People Answers, Inc., Dallas, Texas.
This research was partially funded by a Texas A&M College of Liberal Arts Cornerstone Faculty Fellowship awarded to Winfred Arthur Jr.
Correspondence concerning this article should be addressed to Winfred Arthur Jr., Department of Psychology, Texas A&M University, 4235 TAMU, College Station, TX 77843-4235. E-mail: w-arthur@tamu.edu