Supporting Multidimensional Arrays in Java

José E. Moreira, Samuel P. Midkiff, Manish Gupta
{jmoreira,smidkiff,mgupta}@us.ibm.com
IBM T. J. Watson Research Center
Yorktown Heights, NY 10598-0218

Abstract

The lack of direct support for multidimensional arrays in Java™ has been recognized as a major deficiency in the language’s applicability to numerical computing. It has been shown that, when augmented with multidimensional arrays, Java can achieve very high performance for numerical computing through the use of compiler techniques and efficient implementations of aggregate array operations. Three approaches have been discussed in the literature for extending Java with support for multidimensional arrays: providing class libraries that implement these structures; relying on the JVM to recognize those arrays of arrays that are being used to simulate multidimensional arrays; and extending the Java language with new syntactic constructs for multidimensional arrays and compiling those constructs directly to bytecode. This paper gives a balanced account of the pros and cons of each technique in the areas of functionality, language and virtual machine impact, implementation effort, and effect on performance. We show that the best choice depends on the relative importance attached to the different metrics, and thereby provide a common ground for a rational discussion of which technique is in the best interests of the Java community at large and of the growing community of numerical Java programmers.

1 Introduction

The multidimensional array (or multiarray for short) is an intuitive concept for numerical programmers in Fortran and C. Multiarrays are $n$-dimensional rectangular collections of elements. A multiarray is characterized by its rank (number of dimensions or axes), its elemental data type (all elements of a multiarray are of the same type), and its shape (the extents along its axes). Elements of a multiarray are identified by their indices along each axis.
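The characterization above (rank, elemental type, shape, and one index per axis) can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name `DoubleMultiarray`, its methods, and the row-major storage layout are hypothetical choices made for this example, not an API or layout taken from the approaches discussed in this paper.

```java
import java.util.Arrays;

// Hypothetical illustration class: the name, methods, and row-major layout
// are assumptions for this sketch, not an API defined in the paper.
final class DoubleMultiarray {
    private final int[] shape;    // extent along each axis; length is the rank
    private final double[] data;  // all elements share one type (double)

    DoubleMultiarray(int... shape) {
        int size = 1;
        for (int extent : shape) {
            if (extent < 0) {
                throw new IllegalArgumentException("negative extent: " + extent);
            }
            size = Math.multiplyExact(size, extent); // fails loudly on overflow
        }
        this.shape = shape.clone(); // private copy, so the shape cannot change
        this.data = new double[size];
    }

    int rank() {
        return shape.length;
    }

    int[] shape() {
        return shape.clone(); // defensive copy
    }

    // Map one index per axis to a position in the flat storage (row-major).
    private int offset(int... index) {
        if (index.length != shape.length) {
            throw new IllegalArgumentException("expected " + shape.length + " indices");
        }
        int off = 0;
        for (int j = 0; j < shape.length; j++) {
            if (index[j] < 0 || index[j] >= shape[j]) {
                throw new ArrayIndexOutOfBoundsException(
                        "index " + index[j] + " out of range on axis " + j);
            }
            off = off * shape[j] + index[j];
        }
        return off;
    }

    double get(int... index) {
        return data[offset(index)];
    }

    void set(double value, int... index) {
        data[offset(index)] = value;
    }

    public static void main(String[] args) {
        DoubleMultiarray a = new DoubleMultiarray(2, 3); // rank 2, shape (2, 3)
        a.set(4.5, 1, 2);
        System.out.println("rank = " + a.rank() + ", shape = " + Arrays.toString(a.shape()));
        System.out.println("element (1, 2) = " + a.get(1, 2));
    }
}
```

Because the shape is fixed at construction and all elements live in one flat array, every instance of this sketch is rectangular by construction.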
Let an $n$-dimensional multiarray of elemental type $T$ have extent $n_j$ along its $j$-th axis, $j = 0, \ldots, n-1$. Then, a valid index $i_j$ along the $j$-th axis must be greater than or equal to zero and less than $n_j$, that is, $0 \le i_j < n_j$. Our definition of multiarrays also includes the following: the type, rank, and shape of a multiarray are immutable during its lifetime. We will see that this immutability property has important performance implications.

The Java™ Programming Language does not support true multidimensional arrays. This has been recognized as a major deficiency in Java’s applicability to numerical computing. Whereas the more recent versions of Fortran have significantly enhanced support for multidimensional arrays, Java provides only one-dimensional arrays. To some extent, it is possible to simulate multidimensional arrays with arrays of arrays. For example, double[][] is an array of one-dimensional arrays of doubles, which can be used to simulate two-dimensional arrays. Figure 1(a) illustrates the concept of arrays of arrays simulating a two-dimensional multiarray. This approach, however, leaves much to be desired. Arrays of arrays are not necessarily rectangular, and can lead to both inter- and intra-array aliasing, as shown in Figure 1(b). Furthermore, the structure of arrays of arrays can change during a computation. These