HPJava: Efficient Compilation and Performance for HPC

Han-Ku Lee, Bryan Carpenter, Geoffrey Fox, Sang Boem Lim
{hkl, dbc, gcf, slim}@grids.ucs.indiana.edu
Pervasive Technology Labs at Indiana University, Bloomington, IN 47404-3730
Computational Science and Information Technology at Florida State University, Tallahassee, FL 32306

Abstract

We review the authors' HPJava programming environment^1 and compare and contrast it with systems like HPF. Because the underlying programming language is Java, and because the HPJava programming model relies centrally on object-oriented run-time descriptors for distributed arrays, the achievable performance has been somewhat uncertain. In our latest publication [15], we showed that HPJava's individual node performance is quite acceptable. The critical issue now is HPJava's performance on multi-processor systems. We argue, with simple experiments, that we can in fact hope to achieve performance in a similar ballpark to more traditional HPC languages.

1. Introduction

In earlier publications such as [6, 5], we argued that HPJava should ultimately provide performance acceptable enough to make it a practical tool for HPC. To support this argument, we started by benchmarking HPJava on a single processor (i.e. a single node). Why a single node? There were two reasons why HPJava node performance was uncertain. The first was that the base language is Java. We believe that Java is a good choice for implementing our HPspmd model, but, due to finite development resources, we can only reasonably hope to use the available commercial Java Virtual Machines (JVMs) to run HPJava node code. HPJava is, for the moment, a source-to-source translator. Thus, HPJava node performance depends heavily upon third-party JVMs.

^1 This work was supported in part by the National Science Foundation Division of Advanced Computational Infrastructure and Research, contract number 9872125.

The second reason was related to the nature of the HPspmd model itself. The data-distribution directives of HPF are most effective if the distribution format of arrays ("block" distribution format, "cyclic" distribution format, and so on) is known at compile time. This extra static information contributes to the generation of efficient node code^2. In HPJava, by contrast, the distribution format is described by several objects associated with the array, collectively called the Distributed Array Descriptor. This makes the implementation of libraries simple and natural.

^2 It is true that HPF has transcriptive mappings, which allow code to be developed when the distribution format is not known at compile time, but arguably these are an add-on to the basic language model rather than a central feature.

In [15], we showed that HPJava node performance is quite acceptable compared with C, Fortran, and ordinary Java. In particular, Java is no longer markedly slower than C and Fortran; it achieves almost the same performance. Moreover, we verified that our library-based HPspmd programming language extensions can be implemented efficiently in the context of Java.

In this paper, we discuss some features of HPJava, its run-time library, and its compilation strategies, including optimization schemes. We also benchmark simple HPJava programs against equivalent C, Fortran, and Java programs.

2. HPJava Language

2.1. HPspmd Programming Model

HPJava [10] is an implementation of what we call the HPspmd programming language model. HP-
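To illustrate the contrast drawn in the introduction, the following sketch (plain Java, not actual HPJava runtime code) shows how block and cyclic distribution formats can be captured in run-time descriptor objects rather than as compile-time information. The class names loosely echo names such as Range and BlockRange that appear in the HPJava literature, but the methods and their signatures here are our own simplification, invented for illustration.

```java
// Hypothetical sketch of a run-time descriptor for one distributed
// array dimension; the real HPJava Distributed Array Descriptor is richer.
abstract class Range {
    final int size;                       // global extent of the dimension
    Range(int size) { this.size = size; }
    abstract int owner(int i);            // process coordinate holding global index i
    abstract int local(int i);            // local index of i on its owner
}

// "Block" format: contiguous chunks of ceil(size/procs) elements per process.
class BlockRange extends Range {
    final int block;
    BlockRange(int size, int procs) {
        super(size);
        this.block = (size + procs - 1) / procs;  // block size
    }
    int owner(int i) { return i / block; }
    int local(int i) { return i % block; }
}

// "Cyclic" format: element i lives on process i mod procs.
class CyclicRange extends Range {
    final int procs;
    CyclicRange(int size, int procs) { super(size); this.procs = procs; }
    int owner(int i) { return i % procs; }
    int local(int i) { return i / procs; }
}

public class RangeDemo {
    public static void main(String[] args) {
        Range b = new BlockRange(8, 3);   // block size ceil(8/3) = 3
        Range c = new CyclicRange(8, 3);
        // Global index 5: block places it on process 1 (local index 2),
        // cyclic on process 2 (local index 1).
        System.out.println(b.owner(5) + "," + b.local(5));  // prints 1,2
        System.out.println(c.owner(5) + "," + c.local(5));  // prints 2,1
    }
}
```

Because the format lives behind a virtual-method interface, library code can be written once against the abstract descriptor, which is exactly the "simple and natural" library implementation the text refers to; the price is that an HPF-style compiler cannot specialize the node code for a statically known format.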
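To make the HPspmd model concrete, the fragment below sketches HPJava-style source along the lines of examples in the HPJava literature. Note that this is HPJava input for the source-to-source translator, not compilable ordinary Java; class names such as Procs2 and BlockRange follow published examples, but the details (array extents M, N, the assignment body) are illustrative assumptions.

```
Procs2 p = new Procs2(2, 2);                 // 2 x 2 process grid
on(p) {
    Range x = new BlockRange(M, p.dim(0));   // block-distribute rows
    Range y = new BlockRange(N, p.dim(1));   // block-distribute columns

    // A distributed array; its descriptor records the ranges above.
    double [[-,-]] a = new double [[x, y]];

    overall(i = x for :)
        overall(j = y for :)
            a[i, j] = 1.0;                   // each process touches only its locally held elements
}
```

The distribution format is chosen purely by which Range subclass is instantiated; the overall loops and the array accesses are written identically whether the ranges are block, cyclic, or something else, which is the essence of the descriptor-based HPspmd approach.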