Empirical Study of Object-Layout Strategies and Optimization Techniques

Natalie Eckel* and Joseph (Yossi) Gil**

Department of Computer Science
Technion – Israel Institute of Technology
Technion City, Haifa 32000, Israel
natalie | yogi@cs.Technion.AC.IL

Abstract. Although there is a large body of research on the time overhead of object-oriented programs, there is little work on memory overhead. This paper takes an empirical approach to the study of this overhead, which turns out to be significant in the presence of multiple inheritance. We study the performance, in terms of overhead to object size, of three compilation strategies: separate compilation, whole-program analysis, and user annotations as done in C++. A variant of each strategy is the inclusion of pointers to indirect virtual bases in objects. Using a database of several large multiple-inheritance hierarchies, spanning 7000 classes, several application domains, and different programming languages, we find that in all strategies there are certain classes which give rise to a large number of compiler-generated fields in their object layout. We then study the efficacy of the recently introduced inlining and bidirectional object layout optimization techniques, and show that an average saving of close to 50% in this overhead can be achieved.

1 Introduction

One of the most difficult challenges in the implementation of object-oriented programming languages is to efficiently realize multiple inheritance (MI) together with dynamic dispatch. Indeed, it was believed [8] that it was impossible to efficiently introduce MI into C++, until proved otherwise [26].

[Fig. 1: Multiple inheritance and virtual inheritance. Panel (a) shows classes a, b, and c; panel (b) shows classes a, b, c, and d.]

The main difficulty is that, in the presence of MI, an object of class a can also serve as an instance of classes b and c, where no inheritance relationship exists between b and c, as in Fig. 1(a).
The problem is complicated further when b and c have a common ancestor, d, as depicted in Fig. 1(b). In this case d is called a virtual base (of b, c, and a), and the inheritance links between b and d and between c and d are called virtual inheritance¹ (VI). To support this highly polymorphic nature of objects in the presence of MI, the compiler is required to generate extra data fields in objects. Benchmarking shows that this overhead is significant [28]; it can even double the memory footprint of some applications.

* Contact author
** Work done in part during a visit to the IBM T.J. Watson Research Center
¹ Also called shared inheritance.

Elisa Bertino (Ed.): ECOOP 2000, LNCS 1850, pp. 394–421, 2000. © Springer-Verlag Berlin Heidelberg 2000
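As a rough illustration of the two hierarchies of Fig. 1, the following C++ sketch declares both with one int field per class and reports how many bytes of each object are not user-declared fields. The namespaces fig1a and fig1b, the field names, and the helpers overhead, subobject_offset, shares_one_d, and print_report are assumptions made for this example only. The figures it prints depend on the compiler and ABI, include alignment padding, and are not the measurements reported in this paper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical classes named after the nodes of Fig. 1.
// Fig. 1(a): a inherits from two unrelated classes, b and c.
namespace fig1a {
struct b { int fb = 0; virtual ~b() = default; };
struct c { int fc = 0; virtual ~c() = default; };
struct a : b, c { int fa = 0; };
}  // namespace fig1a

// Fig. 1(b): b and c share the virtual base d.
namespace fig1b {
struct d { int fd = 0; virtual ~d() = default; };
struct b : virtual d { int fb = 0; };
struct c : virtual d { int fc = 0; };
struct a : b, c { int fa = 0; };
}  // namespace fig1b

// Bytes of a T object that are not user-declared int fields:
// compiler-generated fields plus alignment padding.
template <class T>
constexpr std::size_t overhead(std::size_t user_int_fields) {
    return sizeof(T) - user_int_fields * sizeof(int);
}

// Byte offset of the Base subobject inside obj. The implicit upcast
// lets the compiler adjust the pointer as the layout requires.
template <class Base, class Derived>
std::ptrdiff_t subobject_offset(Derived& obj) {
    Base* base = &obj;
    return reinterpret_cast<const char*>(base) -
           reinterpret_cast<const char*>(&obj);
}

// True if the paths through b and through c reach the same d subobject.
bool shares_one_d(fig1b::a& obj) {
    fig1b::d* via_b = static_cast<fig1b::b*>(&obj);
    fig1b::d* via_c = static_cast<fig1b::c*>(&obj);
    return via_b == via_c;
}

// Prints sizes and subobject offsets for the compiler at hand.
void print_report() {
    fig1a::a x;
    fig1b::a y;
    assert(shares_one_d(y));  // virtual inheritance keeps a single d
    std::printf("Fig. 1(a): sizeof(a) = %zu, overhead = %zu bytes\n",
                sizeof(fig1a::a), overhead<fig1a::a>(3));
    std::printf("           offset of b = %td, offset of c = %td\n",
                subobject_offset<fig1a::b>(x), subobject_offset<fig1a::c>(x));
    std::printf("Fig. 1(b): sizeof(a) = %zu, overhead = %zu bytes\n",
                sizeof(fig1b::a), overhead<fig1b::a>(4));
    std::printf("           offset of d = %td\n",
                subobject_offset<fig1b::d>(y));
}
```

Calling print_report() from a main function shows the result for one particular compiler. The b and c subobjects of fig1a::a sit at different offsets, which is why an upcast may have to adjust the pointer, and in fig1b the single shared d subobject is located through compiler-generated data rather than a fixed offset known to b and c.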