Separating Sensitivity From Response Bias: Implications of Comparisons of Yes–No and Forced-Choice Tests for Models and Measures of Recognition Memory

Neal E. A. Kroll and Andrew P. Yonelinas, University of California, Davis
Ian G. Dobbins, Harvard Medical School
Christina M. Frederick, University of California, Berkeley

A fundamental challenge to psychological research is the measurement of cognitive processes uncontaminated by response strategies resulting from different testing procedures. Test-free estimates of ability are vital when comparing the performance of different groups or different conditions. The current study applied several sets of measurement models to both forced-choice and yes–no recognition memory tests and concluded that the traditional signal-detection model resulted in distorted estimates of accuracy. Two-factor models were necessary to separate memory sensitivity from response bias. These models indicated that (a) memory accuracy did not differ across the tests and (b) the tests relied on the same underlying memory processes. The results illustrate the pitfalls of using a single-component model to measure accuracy in tasks that reflect 2 or more underlying processes.

One consistent measurement problem faced by psychologists, regardless of subdiscipline, is the separation of discriminative abilities from response or decision strategies, which can seriously contaminate or distort estimates of those abilities. This widespread problem has led to the development of statistical decision models whose primary aim is to effectively remove the contribution of strategic response biases or guessing strategies from estimates of discriminative abilities, allowing effective comparisons of different observers, groups, neuropsychological populations, or experimental conditions on a particular perceptual or cognitive discrimination skill.
These scoring methods have become so commonplace that they are often not viewed as models of discrimination processes but rather as simple “corrections” for guessing. However, to the extent that these models are inappropriate for the cognitive processes being measured, serious errors can arise when comparing individuals, conditions, or groups that have adopted different decision strategies. The consequences of such errors could range from as minor as clouding the interpretation of a small experimental project to as serious as misinterpreting the quality of radiological services at different clinical institutions.

One hallmark of a successful decision model is that it yields similar estimates of accuracy across different testing formats for a given observer. More specifically, if the model’s assumptions regarding the nature of the underlying information and the decision process applied to that information are viable, then it should not matter whether the observer is tested using the sequential presentation of targets and lures (yes–no procedure) or using a simultaneous array of items in which only one is the target (forced-choice procedure). In either case, the estimate of accuracy should be highly similar.

In the present study, we took a closer look at this problem from within the domain of human recognition memory. In particular, we were interested in systematically contrasting forced-choice (FC) and yes–no (YN) recognition accuracy estimates within individuals. This interest was based, in part, on a current controversy regarding the relative ease of the testing formats and the relative performance of memory-impaired individuals across the two test types. Below, we briefly discuss the current debate regarding recognition performance across the two test formats and describe the most commonly used index of accuracy for across-test comparison, the signal-detection theory estimate, d'.
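As an illustrative aside (not part of the original article), the measures contrasted here can be made concrete. A simple correction for guessing treats accuracy as the hit rate minus the false-alarm rate, whereas the signal-detection estimate d' is the difference of the z-transformed hit and false-alarm rates; under the equal-variance model, a yes–no d' also predicts two-alternative forced-choice accuracy as Φ(d'/√2). A minimal sketch using Python's standard library (function names are our own):

```python
from math import sqrt
from statistics import NormalDist

_z = NormalDist().inv_cdf    # inverse of the standard normal CDF
_phi = NormalDist().cdf      # standard normal CDF


def corrected_recognition(hit_rate: float, fa_rate: float) -> float:
    """Threshold-style 'correction for guessing': hits minus false alarms."""
    return hit_rate - fa_rate


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Equal-variance signal-detection sensitivity: d' = z(H) - z(F)."""
    return _z(hit_rate) - _z(fa_rate)


def criterion(hit_rate: float, fa_rate: float) -> float:
    """Response bias c = -(z(H) + z(F)) / 2; zero means unbiased."""
    return -(_z(hit_rate) + _z(fa_rate)) / 2


def predicted_2afc(d_prime_yn: float) -> float:
    """Equal-variance SDT prediction for 2AFC: P(correct) = Phi(d'/sqrt(2))."""
    return _phi(d_prime_yn / sqrt(2))


# Symmetric hit and false-alarm rates imply an unbiased criterion (c = 0),
# and the same d' yields a predicted forced-choice accuracy for comparison.
h, f = 0.84, 0.16
print(d_prime(h, f), criterion(h, f), predicted_2afc(d_prime(h, f)))
```

Because d' and the criterion are estimated jointly, two observers with different response biases can receive the same sensitivity estimate, which is precisely the separation of sensitivity from bias at issue in the comparisons that follow.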
Following this, we show how d' yields consistently discordant estimates across the testing formats, present two modifications of this basic model that potentially eliminate the discrepancy, and discuss the relative merits and practical appeal of each. Our point is not purely methodological; on the contrary, our argument is that the method used to estimate sensitivity, independent of response biases, depends on the theoretical model of the processes underlying sensitivity. Through this article, we aim to shed new light on both the methodological and the theoretical issues.

Signal-detection theory is currently being used in a myriad of applications, including visual detection, psychoacoustics, learning,

Author note: Neal E. A. Kroll and Andrew P. Yonelinas, Department of Psychology, University of California, Davis; Ian G. Dobbins, Nuclear Magnetic Resonance Center, Massachusetts General Hospital, Harvard Medical School; Christina M. Frederick, Department of Psychology, University of California, Berkeley. This work was supported by National Institute of Mental Health Grant MH59352-01. We are grateful to Anne E. Bower for her assistance in data collection. Correspondence concerning this article should be addressed to Neal E. A. Kroll, Department of Psychology, University of California, One Shields Avenue, Davis, California 95616. E-mail: neakroll@ucdavis.edu

Journal of Experimental Psychology: General, 2002, Vol. 131, No. 2, 241–254. Copyright 2002 by the American Psychological Association, Inc. DOI: 10.1037//0096-3445.131.2.241