Automatic Data Visualization for Novice Pascal Programmers

Brad A. Myers, Ravinder Chandhok, Atul Sareen
Computer Science Department
Carnegie Mellon University
Pittsburgh, PA 15213-3890
(412) 268-2565
macgnome@cs.cmu.edu

ABSTRACT

Previous work has demonstrated that presenting the data structures from programs in a graphical manner can significantly help programmers understand and debug their programs. In most previous systems, however, the graphical displays, called data visualizations, had to be laboriously created by hand. The Amethyst system, which runs on Apple Macintosh computers, provides attractive and appropriate default displays for data structures. The default displays include the appropriate forms for literals of the simple types inside type-specific shapes, and stacked boxes for records and arrays. In the near future, we plan to develop rules for the layout of simple dynamic data structures (like linked lists and binary trees), and simple mechanisms for creating customized displays. The visualizations are integrated into an advanced programming environment which is used to teach programming methodology at the introductory level.

INTRODUCTION

Pascal and most other computer languages allow the programmer to define and use a variety of data types. No existing programming environment, however, provides the programmer with similar flexibility when displaying the data structures for debugging or program documentation. Existing debuggers typically use some linear, textual view for the display of any user-defined data structure. These linear displays are sufficient for simple types, but grow unwieldy as the data structures become more complicated.

This paper describes a new system, called Amethyst, that automatically creates graphical displays for the data structures in Pascal programs. Amethyst is part of a MacGNOME [1] programming environment which is used in introductory computer science courses at several universities and high schools.
Amethyst stands for A MacGNOME Environment That Helps You See Types. The pictures that Amethyst creates are similar to the pictures used to explain the data structures in popular textbooks (including the text used by the students [2]). The use of graphical pictures to show program data is called data visualization [3]. The particular displays chosen in Amethyst were designed to explain important concepts, such as which dimension of a multi-dimensional array comes first, and which types are assignment compatible. In addition, a graphic artist participated in the design so that the pictures would be visually appealing.

Graphical presentations for data structures are important for a number of reasons. Human information processing is clearly optimized for pictorial information, and pictures make the data easier for the programmer to understand. This makes program debugging and program comprehension easier, because the pictures provide a higher level of abstraction that removes a number of irrelevant details. In particular, some of the programming concepts that students find especially difficult can be explained much better with pictures. These include how Pascal file variables work, the difference between value and VAR parameters, and how recursion affects variables. Although it is not one of the goals of the current project, data visualizations would probably also be helpful for professional programmers, since they can be used to abstract out unnecessary details when dealing with large and complex data structures.

ENVIRONMENT

Amethyst is implemented as part of the MacGNOME project. MacGNOME is a family of programming environments designed for novice programmers. It is perhaps the first family of environments designed specifically with novice computer science education in mind. Based on the structure editor generator part of the Gandalf project [4], MacGNOME environments, which are called GENIEs, can be generated for any structured language.
GENIEs are unique for three reasons. First, they provide a structure editor interface which automatically inserts the appropriate syntax when the user specifies the type of program structure desired. Second, they provide multiple views of the program being edited, with the number of views limited only by the imagination of the implementor. Examples currently implemented include an outline view, a

In addition, GENIEs allow the insertion of text into the program, via an incremental parser. Pieces of the program can also be edited textually, if desired.

TH0229-5/88/0000/0192$01.00 © 1988 IEEE