Measurement of perception and interpretation skills during radiology training: utility of the script concordance approach

LUCIE BRAZEAU-LAMONTAGNE(1), BERNARD CHARLIN(2), ROBERT GAGNON(2), LOUISE SAMSON(2) & CEES VAN DER VLEUTEN(3)
(1) University of Sherbrooke, Canada; (2) University of Montreal, Canada; (3) Maastricht University, The Netherlands

Correspondence: Bernard Charlin, URDESS, Faculté de Médecine-Direction, Université de Montréal, C.P. 6128, succursale centre-ville, Montréal, Québec, H3C 3J7 Canada. Tel: 514 343 7827; fax: 514 343 7650; email: bernard.charlin@umontreal.ca

SUMMARY

Imaging specialties require both perceptual and interpretation skills. Except in very simple cases, data perception and interpretation vary among clinicians, and this variability makes these skills difficult to measure with traditional assessment tools. The script concordance approach is designed to allow standardized assessment in contexts of uncertainty. In this exploratory study, the authors tested the usefulness of the approach for assessing perceptual and interpretation skills in radiology. A perception test (PT) and an interpretation test (IT) were designed according to the approach; both tests used plain chest X-rays. Three groups were tested: clerkship students (20), junior residents (R1–R3; 20) and senior residents (R4–R5; 20). Eleven certified radiologists, all currently appointed to chest reading, provided the answer key, which was built with the aggregate scoring method. Statistical analyses included descriptive statistics, ANOVA, regression analysis, and Pearson and Spearman correlation coefficients. Cronbach alpha values were 0.79 and 0.81 for the PT and IT respectively. Score progression across training levels was statistically significant in both tests, with perception scores progressing more rapidly than interpretation scores. Effect sizes for discriminating lower from higher levels of expertise were large: 2.2 (PT) and 1.6 (IT). The Pearson correlation coefficient between the two tests was 0.58. The Cronbach alpha values indicate reasonable reliability for both tests. The linear progression of scores, each at its own pace, and the positive but moderate correlation between the tests suggest that two different skills are being measured. Further studies are needed to document the usefulness of the approach for assessment in radiology training.
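As an aside for readers who wish to compute comparable statistics on their own score data, the following is a minimal sketch (in Python; all names and numbers are invented, and the paper does not publish its computations) of the two figures quoted above: Cronbach's alpha from an examinee-by-item score matrix, and an effect size under the conventional pooled-standard-deviation definition (Cohen's d), which is assumed here since the paper does not state which formula it used.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an examinee-by-item score matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                              # number of items
    item_var = scores.var(axis=0, ddof=1).sum()      # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)       # variance of examinee totals
    return k / (k - 1) * (1 - item_var / total_var)

def effect_size(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    a, b = np.asarray(group_a, dtype=float), np.asarray(group_b, dtype=float)
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (b.mean() - a.mean()) / pooled

# Invented illustration: test totals for a lower- vs. higher-expertise group.
print(effect_size([11, 12, 10, 13, 12], [16, 17, 15, 18, 16]))  # ~4.2
```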
Introduction

Visual clinical specialties require both perceptual skills, which are mostly non-analytic, and interpretation skills, which involve looking for clues and making a series of value judgments in order to arrive at a diagnosis (Norman et al., 1992). Experience shows that residents' perceptual and interpretation skills do not necessarily develop synchronously. Knowing what to look for does not guarantee against 'creative reading': mistaking composite shadows for real nodules, for instance. Perception–interpretation discrepancies are common difficulties encountered in training radiology residents. So far, such discrepancies have remained resistant to objective demonstration, and radiology training programs need tests that can document the progress of students and residents in both skills.

One reason for the difficulty in achieving reliable tests of reading skills might be the variability that expert radiologists demonstrate when perceiving and interpreting diagnostic images. Research on clinical reasoning has shown that, in similar clinical settings, physicians do not collect exactly the same data and do not follow the same path of thought, even when they reach the same diagnosis (Grant & Marsden, 1988). Moreover, physicians' performance varies substantially on any specific real or simulated case (Barrows et al., 1978; Elstein et al., 1978).

Among experts, unanimous reasoning on real clinical situations is the exception; divergent opinion is the rule, even if they generally agree on the outcome, for instance the diagnosis. Translated into assessment settings, this implies that test answer grids cannot be (and most of the time are not) based on a single examiner (Swanson et al., 1987).

The script concordance approach (Charlin et al., 2000a) offers a way to overcome these difficulties. It rests on three principles, each concerning one of the three components (Norman et al., 1996) of any test: the task required of examinees, the way examinees' answers are recorded, and the way examinees' performance is transformed into a score.

The task presented to candidates is challenging. It represents a real clinical situation, usually described in a vignette (Charlin et al., 2000a). Several options (diagnosis, management or attitude) are relevant to the situation, and items are built from the questions experts ask themselves as they progress toward a solution. For a test on interpretation in radiology, the task is based on an authentic set of images presenting a genuine diagnostic challenge, even for an expert. Items ask how a specific sign, present (positive sign) or absent (negative sign), affects one of the hypotheses relevant to the situation. Items have three parts: the first presents the hypothesis; the second presents a sign (positive or negative) that may have an effect on the hypothesis; the third, a Likert scale, captures examinees' answers. This response format is in accordance with what is known of clinical reasoning processes (Barrows et al., 1978; Elstein et al., 1978; Grant & Marsden, 1988) and allows measurement of the judgments that are constantly made within this process.
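To make the scoring principle concrete, below is a minimal sketch of the aggregate scoring method mentioned in the summary, under the usual script concordance convention (Charlin et al., 2000a) that each Likert anchor earns credit in proportion to the number of reference-panel members who chose it, with the modal answer earning full credit. The panel data and the five-point scale shown here are invented for illustration.

```python
from collections import Counter

def item_credit(panel_answers):
    """Map each Likert anchor to its credit for one item.

    Every anchor chosen by at least one panel member earns partial
    credit: the number of panelists who chose it divided by the count
    of the modal (most frequent) choice, so the modal answer scores 1.0.
    """
    counts = Counter(panel_answers)
    modal = max(counts.values())
    return {anchor: n / modal for anchor, n in counts.items()}

def score_test(panel, examinee_answers):
    """Sum an examinee's credit over all items.

    `panel` holds one list of panelists' Likert choices per item;
    anchors that no panelist chose earn zero credit.
    """
    total = 0.0
    for panel_answers, answer in zip(panel, examinee_answers):
        total += item_credit(panel_answers).get(answer, 0.0)
    return total

# Invented example: an 11-member reference panel (as in the study)
# answering two items on a -2..+2 Likert scale.
panel = [
    [-1, -1, 0, -1, -2, -1, 0, -1, -1, 0, -1],    # item 1
    [+2, +1, +2, +2, +1, +2, +2, 0, +2, +2, +1],  # item 2
]
print(score_test(panel, examinee_answers=[-1, +1]))  # 1.0 + 3/7
```

With this scheme an examinee is not penalized for departing from a single 'correct' answer: any choice that part of the expert panel also made earns partial credit, which is how the approach accommodates the legitimate variability among experts described above.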