An Investigation into Keystroke Latency Metrics as an Indicator of Programming Performance

Richard C. Thomas
School of Computer Science & Software Engineering, M002
The University of Western Australia
35 Stirling Hwy, Crawley 6009, Western Australia
richard@csse.uwa.edu.au

Amela Karahasanovic
Simula Research Laboratory
PO Box 134, 1325 Lysaker, Norway
amela@simula.no

Gregor E. Kennedy
Biomedical Multimedia Unit
The University of Melbourne
Parkville 3010, Vic, Australia
gek@unimelb.edu.au

Abstract

Typing has long been studied in psychology and HCI, and strong cognitive models for transcription typing exist. The goal of the present research was to test if there is any correlation between students’ keystroking speed and performance while they are programming. We present the results from two studies with computer science students conducted in different contexts. Keystroke timings were recorded while they worked on Java and Ada source code. Quality of their programming work was measured mainly in terms of completeness. In the controlled experiment that lasted six hours, 39 students undertook three change tasks on a 6000 LOC Java application. In the field study, data was collected over 6 weeks from 141 students while they worked unsupervised on Ada programming in first year laboratories. In both cases there were highly significant (P=0.001), moderately strong, negative correlations between speed and coding performance. With additional development, these techniques may have promise for user modelling and assessment as well as in educational diagnostics.

Keywords: Digraph latencies, empirical methods, programming performance, keystroke model, chunking.

1 Introduction

One of the greatest challenges facing teachers of introductory computer science units is the high rate of attrition. Sometimes 25% or even 40% of enrolled students do not succeed. Associated with this is a desire to be able to spot the potential problem students as early as possible.
Timely warning can aid the teacher to provide remediation, at least in a well-resourced world. The student too would be able to make more informed choices when presented with evidence that progress is not encouraging, perhaps prompting a change of study methods or a review of wider choices before it is too late.

Copyright (c) 2005, Australian Computer Society, Inc. This paper appeared at the Australasian Computing Education Conference 2005, Newcastle, Australia. Conferences in Research and Practice in Information Technology, Vol. 42. Alison Young and Denise Tolhurst, Eds. Reproduction for academic, not-for-profit purposes permitted provided this text is included.

In this paper we present work in progress on the use of low level, continuous keystroke monitoring as a means to identify potential problems early on. In particular we investigate whether the latency, or delay, between certain keystrokes correlates with objective measures of programming performance. We have checked this for students in two countries: one in a controlled experiment developing Java code for a few hours; the other using first year laboratories over a few weeks, where Ada was the programming language.

Keystrokes are well understood. There is a strong cognitive theory of typing, reviewed below. Furthermore, keystroke latencies can be applied to user modelling. For instance, their potential as a means for user authentication has been investigated (Joyce and Gupta, 1990; Monrose and Rubin, 2000). Latencies have shown promise elsewhere. In the LISTEN project to teach children to read aloud, Beck et al. (2003) found that latency before saying a word had promise as a feature for assessment of reading achievement.

2 Models of Typing

Typing has been studied in psychology, cognitive science and human-computer interaction. Although not mainstream, there is a solid body of knowledge on which to base the present investigation.
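
To make the notion of keystroke latency from the Introduction concrete, the following minimal Python sketch shows one way the delay between consecutive keystrokes could be computed from a timestamped log. It is an illustration only, not the instrumentation used in the studies: the event format (millisecond timestamp, key), the function names and the 2000 ms pause cut-off are all assumptions introduced here.

```python
from statistics import mean


def digraph_latencies(events):
    """Group inter-key delays by digraph (pair of consecutive keys).

    `events` is a hypothetical log: a chronological list of
    (timestamp_ms, key) tuples. Returns {digraph: [latency_ms, ...]}.
    """
    latencies = {}
    for (t1, k1), (t2, k2) in zip(events, events[1:]):
        latencies.setdefault(k1 + k2, []).append(t2 - t1)
    return latencies


def mean_latency(events, max_pause_ms=2000):
    """Mean delay between consecutive keystrokes.

    Gaps longer than `max_pause_ms` are discarded as pauses rather than
    typing; the 2000 ms threshold is an assumed value, not taken from
    the paper.
    """
    gaps = [t2 - t1 for (t1, _), (t2, _) in zip(events, events[1:])]
    gaps = [g for g in gaps if g <= max_pause_ms]
    return mean(gaps) if gaps else None


# Example log: "if " typed quickly, then a long pause before "x".
log = [(0, "i"), (120, "f"), (300, " "), (5300, "x")]
print(digraph_latencies(log))
print(mean_latency(log))
```

A per-student summary such as this mean could then be compared against a performance score, which is the kind of relationship the studies below examine.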
Newell (1990) gives a very clear account of the process of transcription, or copy, typing. Typing is essentially a pipeline process: perceive a chunk, determine its spelling, obtain a letter, and execute a keystroke. The pipelining effect allows, for example, reading a word to happen in parallel with pressing the key for some previously identified letter, and one hand can operate in parallel with the other. It has been shown that SOAR cognitive models accurately reflect some observed typing phenomena. John and Newell (1989) give some detail of the perception of a chunk in these models. A chunk could be a word, a syllable or a single character depending upon circumstances, such as whether the text contains random letters or is partially covered up.

From psychology there are several phenomena (Salthouse 1986) that inform the present investigations. Foremost for