Comparing scoring methods in five visuospatial working memory tasks from a distributional perspective

Philippe Golay* & Thierry Lecerf*,#
* Faculty of Psychology and Educational Sciences, University of Geneva, Switzerland
# University of Lausanne, Switzerland

11th Congress of the Swiss Psychological Society, Neuchâtel, Switzerland, 19-20 September, 2009
Contact: Philippe.Golay@unige.ch

INTRODUCTION
• Several scoring methods for Working Memory (WM) span tasks exist, but little effort has been devoted to their systematic comparison; the one exception concerns the verbal domain (Friedman & Miyake, 2005);
• No comparison has been made within the visuospatial domain;
• Unsworth & Engle (2007) showed that scoring methods have a non-negligible influence on the predictive utility of WM span tasks.

METHOD
• All items were administered to all the participants
• 3 items for each list length

RESULTS

-1- Kolmogorov-Smirnov Z
Method     LST  DST  ADST  MVT  SMT  Sum of ranks         Z Rank
Method A     7    5     5    7    7            31              7
Method B     5    6     6    6    6            29              6
Method C     6    7     7    3    5            28              5
Method D     1    2     1    1    1             6              1
Method E     2    1     2    2    2             9              2
Method F     3    3     3    4    4            17              3
Method G     4    4     4    5    3            20              4

-2- Skew of distribution
Method     LST  DST  ADST  MVT  SMT  Sum of ranks      Skew Rank
Method A     7    7     3    7    7            31              7
Method B     1    5     6    3    3            18              3
Method C     3    1     4    1    1            10              1
Method D     6    3     2    4    5            20              5
Method E     4    2     1    6    2            15              2
Method F     2    4     7    2    4            19              4
Method G     5    6     5    5    6            27              6

-3- Kurtosis of distribution
Method     LST  DST  ADST  MVT  SMT  Sum of ranks  Kurtosis Rank
Method A     1    7     5    7    5            25              7
Method B     6    3     2    3    7            21              4
Method C     7    4     7    6    1            25              6
Method D     2    2     4    4    2            14              1
Method E     5    1     3    3    3            15              3
Method F     3    6     6    2    6            23              7
Method G     4    5     1    1    4            15              3

-4- Correlation with Raven
Method     LST  DST  ADST  MVT  SMT  Sum of ranks     Raven Rank
Method A     7    5     6    7    4            29              7
Method B     5    7     2    5    7            26              5
Method C     6    4     7    4    6            27              6
Method D     3    2     4    1    2            12              2
Method E     2    1     5    2    1            11              1
Method F     4    6     1    3    5            19              4
Method G     1    3     3    6    3            16              3
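The sum-of-ranks aggregation behind the four rank tables can be sketched in Python. This is a minimal illustration, not the authors' own procedure: the per-task ranks are copied from the tables, while the variable names and the `dense_rank` helper are hypothetical. Letting tied sums share a rank without gaps is an assumption, chosen because it agrees with the tied "3rd" results for methods F and G in the summary of results.

```python
# Per-task ranks (LST, DST, ADST, MVT, SMT) copied from the four rank tables.
ranks = {
    "Z": {"A": [7, 5, 5, 7, 7], "B": [5, 6, 6, 6, 6], "C": [6, 7, 7, 3, 5],
          "D": [1, 2, 1, 1, 1], "E": [2, 1, 2, 2, 2], "F": [3, 3, 3, 4, 4],
          "G": [4, 4, 4, 5, 3]},
    "Skew": {"A": [7, 7, 3, 7, 7], "B": [1, 5, 6, 3, 3], "C": [3, 1, 4, 1, 1],
             "D": [6, 3, 2, 4, 5], "E": [4, 2, 1, 6, 2], "F": [2, 4, 7, 2, 4],
             "G": [5, 6, 5, 5, 6]},
    "Kurtosis": {"A": [1, 7, 5, 7, 5], "B": [6, 3, 2, 3, 7], "C": [7, 4, 7, 6, 1],
                 "D": [2, 2, 4, 4, 2], "E": [5, 1, 3, 3, 3], "F": [3, 6, 6, 2, 6],
                 "G": [4, 5, 1, 1, 4]},
    "Raven": {"A": [7, 5, 6, 7, 4], "B": [5, 7, 2, 5, 7], "C": [6, 4, 7, 4, 6],
              "D": [3, 2, 4, 1, 2], "E": [2, 1, 5, 2, 1], "F": [4, 6, 1, 3, 5],
              "G": [1, 3, 3, 6, 3]},
}
methods = "ABCDEFG"


def dense_rank(scores):
    """Rank 1 = smallest value; tied values share a rank, with no gaps (assumption)."""
    ordered = sorted(set(scores.values()))
    return {name: ordered.index(value) + 1 for name, value in scores.items()}


# Sum of ranks per criterion (the "Sum of ranks" column of each table).
criterion_sums = {
    criterion: {m: sum(task_ranks) for m, task_ranks in table.items()}
    for criterion, table in ranks.items()
}

# Grand sum over the 4 criteria x 5 tasks = 20 ranks per method, and its mean.
total = {m: sum(criterion_sums[c][m] for c in ranks) for m in methods}
mean_rank = {m: total[m] / 20 for m in methods}
overall = dense_rank(total)

for m in methods:
    print(f"Method {m}: sum = {total[m]}, mean = {mean_rank[m]:.1f}, result = {overall[m]}")
```

The grand sums come out as 116, 94, 90, 52, 50, 78 and 78 for methods A to G, matching the "Sum (mean) of ranks" column of the summary of results.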
SAMPLE DESCRIPTION
160 young adults (psychology students; 95 females)
Age range: 19-35 years (M = 24.38; SD = 2.89)

OBJECTIVES
• Compare seven scoring methods from a distributional perspective (normality, kurtosis & skewness);
• Correlate each score with a fluid intelligence measure to determine the best scoring method.

SCORING METHODS (& SCORES FROM THE SPAN PROCEDURE EXAMPLE)
A. Highest level with at least one item correctly recalled: score 5
B. Highest level with more than 50% of items correctly recalled: score 4
C. Highest level with all items correctly recalled: score 3
D. "Proportion score" (Friedman & Miyake, 2005): score 0.84
   • Recall of 4 out of 5 elements on a trial => score .8; the proportions for all items were then averaged;
E. "Total score" (Friedman & Miyake, 2005): score 61
   • Total number of elements correctly recalled across all trials (recall of 3 out of 5 elements on a trial = 3 points);
F. Linear regression estimation (adapted from Robertson, 2006): score 4.50
   • Linear regression was used to estimate the list length at which the probability of a perfectly correct response was .50;
G. "Ponderation" score (Logie & Pearson, 1997): score 4.33
   • Mean of the levels of the three most complex items successfully recalled
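Scoring methods A to E and G can be sketched in Python on the poster's 15-item example. This is a minimal illustration under stated assumptions: the `example` counts are our reading of the flattened V/X pattern (three items per list length, number of correctly recalled elements per item), and the function name, the dictionary layout and the handling of empty cases are hypothetical. Method F is left out because the text does not say which list lengths enter the regression.

```python
from statistics import mean

# Elements correctly recalled per item, keyed by list length (3 items per level).
# Assumption: read off the example's V/X pattern, grouped by item length.
example = {3: [3, 3, 3], 4: [4, 3, 4], 5: [4, 4, 5], 6: [4, 4, 5], 7: [4, 6, 5]}


def span_scores(trials):
    """Return scores for methods A, B, C, D, E and G from per-item recall counts."""
    # An item is perfectly recalled when all of its elements are correct.
    perfect = {lvl: [n == lvl for n in items] for lvl, items in trials.items()}

    a = max((lvl for lvl, p in perfect.items() if any(p)), default=0)
    b = max((lvl for lvl, p in perfect.items() if sum(p) / len(p) > 0.5), default=0)
    c = max((lvl for lvl, p in perfect.items() if all(p)), default=0)

    # D: mean proportion of elements recalled per item; E: total elements recalled.
    d = mean(n / lvl for lvl, items in trials.items() for n in items)
    e = sum(n for items in trials.values() for n in items)

    # G: mean level of the three most complex perfectly recalled items
    # (fewer than three perfect items: mean of those available, an assumption).
    solved = sorted((lvl for lvl, p in perfect.items() for ok in p if ok), reverse=True)
    g = mean(solved[:3]) if solved else 0.0

    return {"A": a, "B": b, "C": c, "D": d, "E": e, "G": g}


scores = span_scores(example)
for name, value in scores.items():
    print(name, round(value, 2))
```

With these counts the sketch gives 5, 4, 3, 0.84, 61 and 4.33, the example scores listed for methods A, B, C, D, E and G.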
SPAN PROCEDURE: ASCENDING
Example: 15 items, from level 3 to 7 (V = correctly recalled; X = wrong/not recalled)
Level 7:  V V V V X X X   V V V V V V X   V V V V V X X
Level 6:  V V V V X X     V V V V X X     V V V V V X
Level 5:  V V V V X       V V V V X       V V V V V
Level 4:  V V V V         V V V X         V V V V
Level 3:  V V V           V V V           V V V

VISUOSPATIAL WORKING MEMORY SPAN TASKS
1. Location Span Test (LST): Recall of the position of cells that contained an arrow (1)
2. Direction Span Test (DST): Recall of the position of cells pointed to by each arrow (2)
3. Anti Direction Span Test (ADST): Recall of the position of cells opposite to the direction pointed to by an arrow (3)
In each of these 3 tasks:
• Sequential presentation (1 s per arrow)
• 3 items for each list length (3 to 7, and 2 to 7 in the ADST)
[Figure: example grids for the three arrow tasks]
4. Matrix Visual Task (MVT; Vecchi, 2002)
• Recall of the position of white cells only
• Simultaneous presentation (1 s × number of white cells)
• 3 items for each list length (3, 5, 7 & 9)
5. Selective Matrix Task (SMT; Cornoldi, 2002)
• Sequential presentation of series of 3 dots (1 s per dot)
• Determination of whether the 3 dots were aligned or not (yes/no)
• Recall of the position of the last cell of each series
• 3 items for each list length (2 to 6)

FLUID INTELLIGENCE TASK
• Raven Advanced Progressive Matrices (APM)
• Series 2 (36 items)
• Time limit: 40 minutes
Score: number of correct items

Correlations between scoring methods for the Location Span Test (N = 160)
(all correlations statistically significant at level .01)
        A     B     C     D     E     F     G     RAVEN
A       1
B       0.61  1
C       0.39  0.65  1
D       0.72  0.83  0.74  1
E       0.74  0.83  0.72  0.99  1
F       0.73  0.94  0.74  0.90  0.90  1
G       0.87  0.89  0.64  0.88  0.89  0.94  1
RAVEN   0.42  0.47  0.42  0.51  0.52  0.50  0.54  1

The seven scoring methods were ranked with regard to 4 criteria:
1) Normality (Kolmogorov-Smirnov Z): the lower the better
2) Skew of distribution: the lower the better
3) Kurtosis of distribution: the lower the better
4) Correlation with Raven's Advanced Progressive Matrices: the higher the better

Summary of results:
Scoring method  Sum (mean) of ranks  Result  Median of ranks  Result  Z + Skew + Kurtosis + Raven  Result
A               116 (5.8)            6th     7                5th     26                           5th
B               94 (4.7)             5th     5                4th     17                           4th
C               90 (4.5)             4th     4.5              3rd     17                           4th
D               52 (2.6)             2nd     2                1st     9                            2nd
E               50 (2.5)             1st     2                1st     7                            1st
F               78 (3.9)             3rd     4                2nd     15                           3rd
G               78 (3.9)             3rd     4                2nd     15                           3rd

Method A is the worst method according to all criteria.
Methods D & E are the best methods:
• Normality (normal distribution for 4 span tasks)
• Higher correlation with the criterion measure (Raven)
• When all 4 criteria were jointly taken into account

REFERENCES
• Friedman, N. P., & Miyake, A. (2005). Comparison of four scoring methods for the reading span test. Behavior Research Methods, 37(4), 581-590.
• Lecerf, T., Ghisletta, P., & Jouffray, C. (2004). Intraindividual variability and level of performance in four visuo-spatial working memory tasks. Swiss Journal of Psychology, 63(4), 261-272.
• Robertson, S., Myerson, J., & Hale, S. (2006). Are there age differences in intraindividual variability in working memory performance? Journals of Gerontology Series B: Psychological Sciences & Social Sciences, 61(1), P18-24.
• Unsworth, N., & Engle, R. W. (2007). The nature of individual differences in working memory capacity: active maintenance in primary memory and controlled search from secondary memory. Psychological Review, 114(1), 104-132.

CONCLUSION
• Measures that include additional information from partially recalled sets (D, "Proportion score", and E, "Total score") show more adequate distribution characteristics and correlate more highly with fluid intelligence;
• Scores reflecting the size of the span (methods A, B & C) seem appropriate at first glance, but provide less inter-individual discrimination;
• Linear regression estimation suffers from frequent violations of the assumption of increasing difficulty and may lead to absurd values without careful screening (the % of correctly recalled positions does not always decrease monotonically as difficulty increases).
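The screening issue raised for the linear regression estimate can be illustrated with a short Python sketch. It fits an ordinary least-squares line to the proportion correct per list length, solves for the length at which the fitted proportion equals .50, and flags profiles in which the proportion does not decrease monotonically. Fitting all list lengths, the function names and both data profiles are assumptions made for illustration; they are not the poster's procedure or data.

```python
def is_monotone_decreasing(props):
    """True if the proportion correct never increases as list length grows."""
    values = [props[length] for length in sorted(props)]
    return all(a >= b for a, b in zip(values, values[1:]))


def length_at_half(props, target=0.5):
    """Least-squares line through (length, proportion) pairs.

    Returns the list length at which the fitted line crosses `target`,
    or None when the slope is not negative (no meaningful crossing).
    Needs at least two distinct list lengths.
    """
    xs = sorted(props)
    ys = [props[x] for x in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    if slope >= 0:
        return None
    intercept = mean_y - slope * mean_x
    return (target - intercept) / slope


# Hypothetical profiles: proportion correct at list lengths 3 to 7.
regular = {3: 1.0, 4: 0.75, 5: 0.5, 6: 0.25, 7: 0.0}
irregular = {3: 1 / 3, 4: 0.0, 5: 2 / 3, 6: 1 / 3, 7: 2 / 3}

for name, profile in (("regular", regular), ("irregular", irregular)):
    print(name, is_monotone_decreasing(profile), length_at_half(profile))
```

A screening rule along these lines would set aside profiles that fail the monotonicity check or return no crossing, rather than reporting an estimate that falls outside the range of tested list lengths.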