An Analysis of Dynamic Scheduling Techniques for Symbolic Applications

Alessandra Costa, Alessandro De Gloria, Paolo Faraboschi and Mauro Olivieri
University of Genoa - DIBE
Via Opera Pia 11a, 16145 Genova, Italy

Abstract

Instruction-level parallelism in a single stream of code for non-numerical applications has been the subject of much recent research. This work extends the analysis to symbolic applications described with logic programming. In particular, we analyze the effects on performance of speculative execution, memory alias disambiguation, renaming and flow prediction. The results indicate that, with the proper optimizations, we can reach a sustained parallelism of 4 (comparable with imperative languages). We also compare statically and dynamically scheduled approaches, outlining the conditions under which a dynamic solution can achieve substantial improvements over a static one. In this way, we point out some important optimizations and parameters of a dynamic scheduling approach, indicating a guideline for future architectural implementations.

1 Introduction

Architectural approaches that exploit instruction-level parallelism from a single instruction stream play an important role in improving uniprocessor performance. In particular, advances in compilation techniques [3, 11] have demonstrated that VLIW architectures [4] can reach interesting performance on numerical code, but show considerable limitations when applied to non-numerical applications. This has led to the development of dynamic scheduling approaches [7] that try to identify parallelism at execution time through the use of more complex control parts.
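To make the idea of parallelism identified at execution time more concrete, the following minimal sketch schedules a short instruction trace as early as its register dependencies allow and reports the resulting instructions per cycle, with and without register renaming. It is not the simulator used in this work: the trace format, the `ilp` function, the unit latencies and the unlimited number of functional units are all assumptions made purely for illustration.

```python
def ilp(trace, renaming):
    """Return instructions per cycle for a trace of (dest, sources) pairs.

    Illustrative assumptions: every instruction takes one cycle, any number
    of instructions may issue in the same cycle, and only register
    dependencies constrain the schedule.
    """
    if not trace:
        return 0.0
    ready = {}      # register -> first cycle in which its latest value is usable
    last_read = {}  # register -> latest cycle in which it has been read
    cycles = 0
    for dest, srcs in trace:
        # True (read-after-write) dependencies: wait for every source value.
        start = max((ready.get(s, 0) for s in srcs), default=0)
        if not renaming:
            # Without renaming, the write must also respect earlier readers
            # (write-after-read) and the earlier writer (write-after-write).
            start = max(start, last_read.get(dest, 0), ready.get(dest, 0))
        for s in srcs:
            last_read[s] = max(last_read.get(s, 0), start)
        ready[dest] = start + 1
        cycles = max(cycles, start + 1)
    return len(trace) / cycles


# Hypothetical four-instruction trace in which r1 is reused as a temporary.
TRACE = [
    ("r1", ["r2"]),  # r1 = load [r2]
    ("r3", ["r1"]),  # r3 = r1 + 1
    ("r1", ["r4"]),  # r1 = load [r4]  (reuses r1)
    ("r5", ["r1"]),  # r5 = r1 + 1
]

print(ilp(TRACE, renaming=True))   # two cycles for four instructions
print(ilp(TRACE, renaming=False))  # three cycles for four instructions
```

In this toy trace the reuse of `r1` is the only thing that serializes the two load/add pairs, so removing the false dependency through renaming shortens the schedule from three cycles to two.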
Instruction-level parallelism in non-numerical applications has been the subject of recent research [1, 8, 12, 14], which has produced different results about the amount of exploitable concurrency, depending on the adopted computational paradigm and the hypotheses made about data and control dependencies.

As most research has focused on imperative languages (i.e. C and Fortran), the purpose of this work is to extend the analysis to another computational paradigm applied to the class of non-numerical applications. In particular, we have chosen the logic programming paradigm, and Prolog as the target language, since it represents an interesting alternative way to approach a symbolic problem.

As recent studies have shown [6, 13], the performance of sequential Prolog is getting closer to that of imperative languages. From this standpoint, there is renewed interest in using Prolog for non-numerical applications.

On the other hand, the trend in general-purpose computation toward instruction-level parallelism raises the question of whether a logic programming approach can expose the same amount of parallelism as imperative languages.

Previous work [2] on static parallelism in Prolog has demonstrated that global compilation techniques (i.e. Trace Scheduling) can only reach degrees of parallelism between 2 and 3.
This is due to the nature of the abstract execution model of the language and, in particular, to:

• the difficulty of performing a successful alias analysis, owing to the absence of array data structures;

• the difficulty of managing loops that operate on pointer data structures, and in particular the impossibility of knowing the effect of a traversal;

• the high frequency of branch instructions (14% [2]), which requires aggressive speculative execution to exploit concurrency.

The purpose of this work is to report a set of evaluations and to show the tradeoffs among the various options available in a dynamic scheduling approach to Prolog.

We are also interested in finding out what the real advantages of dynamic over static scheduling are, and under which hypotheses each of the two approaches is better.

Finally, we want to determine whether the conclusions reported for imperative languages about instruction-level parallelism can be applied to other computational paradigms such as logic programming.

1.1 Instruction-Level Parallelism in Non-Numerical Applications

Many researchers have investigated the possibility of exploiting instruction-level parallelism, with both static and dynamic approaches. The results have shown significant differences, depending on the target application.

If we consider numerical (vectorizable) programs, the achievable speed-up can grow as high as 90 [10] even with static parallelism extraction techniques.

1072-4451/93 $3.00 © 1993 IEEE

On the other hand, if we look at non-vectorizable ap-