Differences in Comprehensibility Between Three-Variable Bar and Line Graphs

David Peebles (D.Peebles@hud.ac.uk)
Nadia Ali (N.Ali@hud.ac.uk)
Department of Behavioural and Social Sciences, University of Huddersfield,
Queensgate, Huddersfield, HD1 3DH, UK.

We report an experiment investigating graph comprehension. Verbal protocol data were collected while participants attempted to understand six bar or line graphs representing relationships between three variables. Analysis of the verbal protocols revealed significant differences in the level of comprehension between the two graph types. Specifically, a significant proportion of line graph users was either unable to interpret the graphs, or misinterpreted information presented in them. These errors did not occur in the bar graph condition. The difference is explained in terms of the high salience of the lines in line graphs which hinders the correct or full interpretation of the relationships depicted. The results of the experiment provide a strong rationale for the use of bar graphs to display such three-variable data sets, particularly for a general audience.

Keywords: graph comprehension, diagrammatic reasoning, information graphics.

Introduction

Bar and line graphs are the most commonly used graphical formats for presenting quantitative data, not only for experienced practitioners in science, engineering, and business but also for a more general audience in education and the media (Kosslyn, 2006; Zacks, Levy, Tversky, & Schiano, 1998). Within the space of graphical representations bar and line graphs are very close. Because both utilise the Cartesian coordinate system, knowledge of the representational properties of this system, as a minimum, allows users to understand how the two diagrams ‘work’ and possibly to extract some basic information from them.

Beyond this underlying similarity in representational framework however, the key difference in how data are represented in the two graphs can have profound effects on how the data are understood and interpreted. Line graphs are typically regarded as a form of configural or object display because a single line integrates the individual plotted points into a single object. Features of this object (its slope, for example) can indicate relevant information about the entire data set (Carswell & Wickens, 1990, 1996). In contrast, bar graphs are an example of a separable display as each variable is represented by a single bar.

For these reasons, people typically encode bars in terms of their height, interpret them as representing the separate values of nominal scale data and are better at comparing and evaluating specific quantities using them (Culbertson & Powers, 1959; Zacks & Tversky, 1999). In contrast, people typically encode lines in terms of their slope (e.g., Simcox, 1983, reported by Pinker, 1990), interpret them as representing continuous changes on an ordinal or interval scale (Kosslyn, 2006; Zacks & Tversky, 1999) and are better at identifying trends using line graphs (Schutz, 1961).

Not only are people’s conception and interpretation of bar and line graphs different, their actual perception of values depicted in the two graphs can also vary significantly. In a recent study, Peebles (2008) asked people to compare values plotted in bar and line graphs with an average (represented as a line drawn from the y axis parallel to the x axis). Despite the fact that the values being compared were plotted at exactly the same locations in the graphs, bar graph users significantly underestimated the size of the plotted value relative to the mean compared to line graph users.
This effect was shown to result from a process in which bar graph users’ visual attention was drawn via a figure-ground process to the length of the bars as they extend from the x axis (cf. Pinker, 1990; Simcox, 1983) rather than to the distance between the top of the bar and the mean line, thereby accentuating the perceived difference between them.

Because of their different representational properties, guidelines recommend bar and line graphs be used for different communicative goals. One such guideline is to use line graphs to display the interactive effects of two independent variables (IVs), each with two levels, on a dependent variable (DV; e.g., Kosslyn, 2006, p. 49). This situation is widely encountered in many scientific and engineering contexts and the use of such interaction graphs is taught in a wide range of undergraduate curricula, including psychology.

The rationale for using line graphs in such cases is that the different patterns formed by the lines can be rapidly identified by experienced users as indicating particular quantitative relationships between the variables. So, for example, users familiar with the format should be able to recognise an X pattern as indicating a crossover interaction and know that two parallel lines indicate no interaction. By contrast, these patterns are not as salient in bar graphs but must be constructed by the user by mentally connecting the tops of the bars.

Although no doubt useful in such situations, the salience of plotted lines can significantly affect people’s interpretation of the data being presented. For example, Carpenter and Shah (1998) showed that for line graphs, the same data presented from alternative perspectives can lead to different interpreta-