Comparing scoring methods in five visuospatial working memory tasks from a distributional perspective

Philippe Golay * & Thierry Lecerf *, #
* Faculty of Psychology and Educational Sciences, University of Geneva, Switzerland
# University of Lausanne, Switzerland

11th Congress of the Swiss Psychological Society, Neuchâtel, Switzerland, 19-20 September 2009
Contact: Philippe.Golay@unige.ch

INTRODUCTION

Several scoring methods for Working Memory (WM) span tasks exist, but little effort has been devoted to their systematic comparison; the one exception concerns the verbal domain (Friedman & Miyake, 2005);
No comparison has been made within the visuospatial domain;
Unsworth & Engle (2007) showed that scoring methods have a non-negligible influence on the predictive utility of WM span tasks.

METHOD

All items were administered to all participants (3 items for each list length).

RESULTS

Rank of each scoring method (1 = best) in each of the five tasks, for each of the four criteria:

-1- Kolmogorov-Smirnov Z

            LST  DST  ADST  MVT  SMT  Sum of ranks  Z rank
Method A     7    5     5    7    7        31          7
Method B     5    6     6    6    6        29          6
Method C     6    7     7    3    5        28          5
Method D     1    2     1    1    1         6          1
Method E     2    1     2    2    2         9          2
Method F     3    3     3    4    4        17          3
Method G     4    4     4    5    3        20          4

-2- Skew of distribution

            LST  DST  ADST  MVT  SMT  Sum of ranks  Skew rank
Method A     7    7     3    7    7        31          7
Method B     1    5     6    3    3        18          3
Method C     3    1     4    1    1        10          1
Method D     6    3     2    4    5        20          5
Method E     4    2     1    6    2        15          2
Method F     2    4     7    2    4        19          4
Method G     5    6     5    5    6        27          6

-3- Kurtosis of distribution

            LST  DST  ADST  MVT  SMT  Sum of ranks  Kurtosis rank
Method A     1    7     5    7    5        25          7
Method B     6    3     2    3    7        21          4
Method C     7    4     7    6    1        25          6
Method D     2    2     4    4    2        14          1
Method E     5    1     3    3    3        15          3
Method F     3    6     6    2    6        23          7
Method G     4    5     1    1    4        15          3

-4- Correlation with Raven

            LST  DST  ADST  MVT  SMT  Sum of ranks  Raven rank
Method A     7    5     6    7    4        29          7
Method B     5    7     2    5    7        26          5
Method C     6    4     7    4    6        27          6
Method D     3    2     4    1    2        12          2
Method E     2    1     5    2    1        11          1
Method F     4    6     1    3    5        19          4
Method G     1    3     3    6    3        16          3
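The pooling of ranks across criteria can be sketched in a few lines of Python. This is only an illustration, assuming that the "Sum (mean) of ranks" and "Median of ranks" in the summary of results pool all 20 ranks of a method (4 criteria x 5 tasks); the names `RANKS`, `pool_ranks` and `SUMMARY` are hypothetical, and the rank values are transcribed from the four rank tables (-1- to -4-).

```python
from statistics import median

# Ranks transcribed from the four rank tables, in task order LST, DST, ADST, MVT, SMT.
RANKS = {
    "Kolmogorov-Smirnov Z": {
        "A": [7, 5, 5, 7, 7], "B": [5, 6, 6, 6, 6], "C": [6, 7, 7, 3, 5],
        "D": [1, 2, 1, 1, 1], "E": [2, 1, 2, 2, 2], "F": [3, 3, 3, 4, 4],
        "G": [4, 4, 4, 5, 3],
    },
    "Skew": {
        "A": [7, 7, 3, 7, 7], "B": [1, 5, 6, 3, 3], "C": [3, 1, 4, 1, 1],
        "D": [6, 3, 2, 4, 5], "E": [4, 2, 1, 6, 2], "F": [2, 4, 7, 2, 4],
        "G": [5, 6, 5, 5, 6],
    },
    "Kurtosis": {
        "A": [1, 7, 5, 7, 5], "B": [6, 3, 2, 3, 7], "C": [7, 4, 7, 6, 1],
        "D": [2, 2, 4, 4, 2], "E": [5, 1, 3, 3, 3], "F": [3, 6, 6, 2, 6],
        "G": [4, 5, 1, 1, 4],
    },
    "Raven": {
        "A": [7, 5, 6, 7, 4], "B": [5, 7, 2, 5, 7], "C": [6, 4, 7, 4, 6],
        "D": [3, 2, 4, 1, 2], "E": [2, 1, 5, 2, 1], "F": [4, 6, 1, 3, 5],
        "G": [1, 3, 3, 6, 3],
    },
}


def pool_ranks(ranks):
    """Return {method: (sum, mean, median)} over all criteria and tasks."""
    summary = {}
    methods = sorted(next(iter(ranks.values())))
    for method in methods:
        # Pool the 20 ranks of this method (assumption: equal weight per rank).
        pooled = [r for table in ranks.values() for r in table[method]]
        summary[method] = (sum(pooled), sum(pooled) / len(pooled), median(pooled))
    return summary


SUMMARY = pool_ranks(RANKS)
for method, (total, mean_rank, median_rank) in SUMMARY.items():
    print(f"Method {method}: sum={total}  mean={mean_rank:.1f}  median={median_rank}")
```

With the ranks as transcribed, the pooled sums, means and medians agree with the corresponding columns of the summary of results (for example, 50 and 2.5 for method E).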
OBJECTIVES

Compare seven scoring methods from a distributional perspective (normality, kurtosis & skewness);
Correlate each score with a fluid intelligence measure to determine the best scoring method.

SAMPLE DESCRIPTION

160 young adults (psychology students; 95 females)
Age range: 19-35 years (M = 24.38; SD = 2.89)

SCORING METHODS (& SCORES FOR THE SPAN PROCEDURE EXAMPLE)

A. Highest level with at least one item correctly recalled: score 5
B. Highest level with more than 50% of items correctly recalled: score 4
C. Highest level with all items correctly recalled: score 3
D. "Proportion score" (Friedman & Miyake, 2005): score 0.84
   Recall of 4 out of 5 elements on a trial => score .8; the proportions for all items were then averaged;
E. "Total score" (Friedman & Miyake, 2005): score 61
   Total number of elements correctly recalled across all trials (recall of 3 out of 5 elements on a trial = 3 points);
F. Linear regression estimation (adapted from Robertson et al., 2006): score 4.50
   Linear regression was used to estimate the list length at which the probability of a perfectly correct response was .50;
G. "Ponderation" score (Logie & Pearson, 1997): score 4.33
   Mean of the levels of the three most complex items successfully recalled
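As an illustration, methods A to E and G can be written as short Python functions and applied to the 15-item span procedure example. This is a sketch rather than the authors' implementation: the function names, the `TRIALS` encoding (list length, number of elements recalled) and the handling of edge cases (a score of 0 when no level qualifies) are assumptions. Method F is left out because the poster does not specify which list lengths enter the regression.

```python
from statistics import mean

# The poster's example: 15 trials, list lengths 3 to 7, three trials per length.
# Each trial is (list length, number of elements correctly recalled).
TRIALS = [
    (3, 3), (3, 3), (3, 3),
    (4, 4), (4, 3), (4, 4),
    (5, 4), (5, 4), (5, 5),
    (6, 4), (6, 4), (6, 5),
    (7, 4), (7, 6), (7, 5),
]


def perfect_by_level(trials):
    """Map each list length to a list of booleans (trial perfectly recalled?)."""
    levels = {}
    for length, recalled in trials:
        levels.setdefault(length, []).append(recalled == length)
    return levels


def score_a(trials):
    """A: highest level with at least one item correctly recalled."""
    return max((lvl for lvl, ok in perfect_by_level(trials).items() if any(ok)), default=0)


def score_b(trials):
    """B: highest level with more than 50% of items correctly recalled."""
    return max((lvl for lvl, ok in perfect_by_level(trials).items() if mean(ok) > 0.5), default=0)


def score_c(trials):
    """C: highest level with all items correctly recalled."""
    return max((lvl for lvl, ok in perfect_by_level(trials).items() if all(ok)), default=0)


def score_d(trials):
    """D: proportion score, the mean proportion of elements recalled per trial."""
    return mean(recalled / length for length, recalled in trials)


def score_e(trials):
    """E: total score, the number of elements recalled across all trials."""
    return sum(recalled for _, recalled in trials)


def score_g(trials):
    """G: mean level of the three most complex items perfectly recalled."""
    perfect = sorted((length for length, recalled in trials if recalled == length), reverse=True)
    top = perfect[:3]  # assumption: use fewer than three if fewer are available
    return mean(top) if top else 0


for name, fn in [("A", score_a), ("B", score_b), ("C", score_c),
                 ("D", score_d), ("E", score_e), ("G", score_g)]:
    print(name, round(fn(TRIALS), 2))
```

Applied to the example, these functions give 5, 4, 3, 0.84, 61 and 4.33, the scores listed for methods A, B, C, D, E and G.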
VISUOSPATIAL WORKING MEMORY SPAN TASKS

1. Location Span Test (LST): recall of the position of the cells that contained an arrow
2. Direction Span Test (DST): recall of the position of the cells pointed to by each arrow
3. Anti-Direction Span Test (ADST): recall of the position of the cells opposite to the direction pointed to by each arrow
   In each of these 3 tasks:
   Sequential presentation (1 s per arrow)
   3 items for each list length (3 to 7, and 2 to 7 in the ADST)
4. Matrix Visual Task (MVT; Vecchi, 2002)
   Recall of the position of white cells only
   Simultaneous presentation (1 s x number of white cells)
   3 items for each list length (3, 5, 7 & 9)
5. Selective Matrix Task (SMT; Cornoldi, 2002)
   Sequential presentation of series of 3 dots (1 s per dot)
   Judgment of whether the 3 dots were aligned or not (yes/no)
   Recall of the position of the last cell of each series
   3 items for each list length (2 to 6)

FLUID INTELLIGENCE TASK

Raven Advanced Progressive Matrices (APM), Series 2 (36 items)
Time limit: 40 minutes
Score: number of correct items

SPAN PROCEDURE: ASCENDING

Example: 15 items, from level 3 to 7 (V = correctly recalled; X = wrong/not recalled)

Level 7:  V V V V X X X    V V V V V V X    V V V V V X X
Level 6:  V V V V X X      V V V V X X      V V V V V X
Level 5:  V V V V X        V V V V X        V V V V V
Level 4:  V V V V          V V V X          V V V V
Level 3:  V V V            V V V            V V V

Correlations between scoring methods for the Location Span Test (N = 160; all correlations statistically significant at the .01 level)

         A     B     C     D     E     F     G     RAVEN
A        1
B        0.61  1
C        0.39  0.65  1
D        0.72  0.83  0.74  1
E        0.74  0.83  0.72  0.99  1
F        0.73  0.94  0.74  0.90  0.90  1
G        0.87  0.89  0.64  0.88  0.89  0.94  1
RAVEN    0.42  0.47  0.42  0.51  0.52  0.50  0.54  1

The seven scoring methods were ranked with regard to 4 criteria:

1) Normality (Kolmogorov-Smirnov Z): the lower the better
2) Skew of distribution: the lower the better
3) Kurtosis of distribution: the lower the better
4) Correlation with Raven's Advanced Progressive Matrices: the higher the better

Summary of results:

Scoring   Sum (mean)   Result   Median     Result   Z + Skew +         Result
method    of ranks              of ranks            Kurtosis + Raven
A         116 (5.8)    6th      7          5th      26                 5th
B         94 (4.7)     5th      5          4th      17                 4th
C         90 (4.5)     4th      4.5        3rd      17                 4th
D         52 (2.6)     2nd      2          1st      9                  2nd
E         50 (2.5)     1st      2          1st      7                  1st
F         78 (3.9)     3rd      4          2nd      15                 3rd
G         78 (3.9)     3rd      4          2nd      15                 3rd

Method A is the worst method according to all criteria
Methods D & E are the best methods:
   Normality (normal distribution for 4 span tasks)
   Higher correlation with the criterion measure (Raven)
   When all 4 criteria were jointly taken into account

REFERENCES

Friedman, N. P., & Miyake, A. (2005). Comparison of four scoring methods for the reading span test. Behavior Research Methods, 37(4), 581-590.
Lecerf, T., Ghisletta, P., & Jouffray, C. (2004). Intraindividual variability and level of performance in four visuo-spatial working memory tasks. Swiss Journal of Psychology, 63(4), 261-272.
Robertson, S., Myerson, J., & Hale, S. (2006). Are there age differences in intraindividual variability in working memory performance? Journals of Gerontology Series B-Psychological Sciences & Social Sciences, 61(1), P18-24.
Unsworth, N., & Engle, R. W. (2007). The nature of individual differences in working memory capacity: active maintenance in primary memory and controlled search from secondary memory. Psychological Review, 114(1), 104-132.

CONCLUSION

Measures which include additional information from partially correctly recalled sets (D, "Proportion score", and E, "Total score") show more adequate distribution characteristics and correlate more highly with fluid intelligence;
Scores reflecting the size of the span (methods A, B & C) seem appropriate at first glance, but provide less inter-individual discrimination;
Linear regression estimation suffers from frequent violations of the assumption of increasing difficulty (the % of correctly recalled positions does not always decrease monotonically as difficulty increases) and may lead to absurd values without careful screening.
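The caveat about linear regression estimation can be illustrated with a short Python sketch. It assumes an ordinary least-squares line fitted to the proportion of perfectly recalled items as a function of list length, solved for the length at which the fitted proportion equals .50; the poster does not give the exact regression or screening rule, so `regression_span`, `is_monotonic_decreasing` and both response profiles below are hypothetical.

```python
from statistics import mean


def regression_span(prop_correct):
    """Least-squares estimate of the list length at which P(perfect recall) = .50.

    prop_correct maps list length -> proportion of perfectly recalled items.
    Returns None when the fitted line is flat (no crossing point exists).
    """
    lengths = sorted(prop_correct)
    props = [prop_correct[n] for n in lengths]
    mx, my = mean(lengths), mean(props)
    sxx = sum((x - mx) ** 2 for x in lengths)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lengths, props))
    if sxy == 0:
        return None
    slope = sxy / sxx
    intercept = my - slope * mx
    return (0.5 - intercept) / slope


def is_monotonic_decreasing(prop_correct):
    """Screening check: recall never improves as list length increases."""
    props = [prop_correct[n] for n in sorted(prop_correct)]
    return all(a >= b for a, b in zip(props, props[1:]))


# Hypothetical well-behaved profile: recall drops steadily with list length.
regular = {3: 1.0, 4: 1.0, 5: 0.5, 6: 0.0, 7: 0.0}
# Hypothetical irregular profile: recall is better at length 6 than at 3 to 5.
irregular = {3: 1 / 3, 4: 1 / 3, 5: 1 / 3, 6: 2 / 3, 7: 1 / 3}

for label, profile in [("regular", regular), ("irregular", irregular)]:
    print(label, is_monotonic_decreasing(profile), regression_span(profile))
```

For the regular profile the estimate falls inside the administered range of list lengths, whereas for the irregular profile the fitted slope is positive and the estimate falls beyond the longest list, which is the kind of value that screening for monotonicity would flag.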