A Parallelisation of Ray Tracing with Diffuse Interreflection

Erik Reinhard
Faculty of Technical Mathematics and Informatics, Delft University of Technology
Julianalaan 132, 2628BL Delft, The Netherlands
Email: erik@duticg.twi.tudelft.nl

Abstract

Ray tracing is a powerful technique to generate realistic images of 3D scenes. However, rendering complex scenes under advanced lighting conditions may easily exceed the processing and memory capabilities of a single workstation. Distributed processing offers a solution if the algorithm can be parallelised efficiently. In this paper a hybrid scheduling approach is presented that combines demand driven and data parallel techniques. Whether a task is processed demand driven or data driven is decided by the data intensity of the task and the amount of data locality (coherence) present in the task. By combining demand driven and data parallel tasks, a good load balance is achieved, while at the same time the communication is spread evenly across the network. This leads to a scalable and efficient parallel implementation of the ray tracing algorithm with virtually no restriction on the size of the model database to be rendered.

1 Introduction

Ray tracing is a powerful and widely used technique to generate highly realistic images of 3D scenes. It calculates the reflection of light in a scene by tracing the path of light backwards from the viewpoint towards the light sources. Ray tracing is ideally suited for lighting simulations. Applications include, among others, architectural design, theatre and greenhouse lighting simulations, and traffic lighting simulations (tunnels, crossings). An example of an indoor lighting simulation using the Radiance lighting simulation package (Ward 1994) is given in figure 1.

Rendering such complex scenes may easily exceed the processing and memory capabilities of a single workstation, especially if diffuse reflection needs to be computed.
This is the case in, for example, Radiance.

Distributed processing may offer a solution to excessive processing demands, provided the algorithm can be parallelised efficiently. A parallel rendering algorithm should both minimise communication requirements and balance the workload well over the available processors. Various mechanisms exist to accomplish this.

In data parallel solutions, objects are distributed over processors and ray tasks are handled by those processors that possess the relevant data. Objects may be dynamically redistributed to provide a better load balance, but this introduces extra object communication. Furthermore, redistributing objects too quickly may introduce oscillations, so that object communication becomes a limiting factor. However, if objects are not redistributed quickly enough, a sub-optimal load balance is achieved.
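The data parallel principle described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the scene occupies the unit cube, that the cube is divided into a regular voxel grid, and that voxels are assigned to processors round-robin. The names `voxel_owner`, `route_ray_tasks` and `imbalance` are hypothetical and do not come from the paper, which does not prescribe this particular layout.

```python
from collections import Counter


def voxel_owner(point, grid_res, num_procs):
    """Return the processor that owns the voxel containing `point`.

    Assumption: the scene lies in the unit cube [0, 1)^3, split into
    grid_res^3 voxels that are assigned to processors round-robin.
    """
    # Clamp so that a coordinate of exactly 1.0 still maps to the last voxel.
    ix, iy, iz = (min(int(c * grid_res), grid_res - 1) for c in point)
    voxel_index = (ix * grid_res + iy) * grid_res + iz
    return voxel_index % num_procs


def route_ray_tasks(ray_origins, grid_res, num_procs):
    """Count how many ray tasks each processor receives.

    Each ray task is sent to the processor holding the data for the
    voxel in which the ray starts.
    """
    load = Counter({p: 0 for p in range(num_procs)})
    for origin in ray_origins:
        load[voxel_owner(origin, grid_res, num_procs)] += 1
    return load


def imbalance(load):
    """Ratio of the busiest processor's load to the mean load (1.0 is ideal)."""
    mean = sum(load.values()) / len(load)
    return max(load.values()) / mean if mean else 1.0


if __name__ == "__main__":
    # Four rays that all start in the same corner of the scene.
    clustered = [(0.1, 0.1, 0.1)] * 4
    load = route_ray_tasks(clustered, grid_res=2, num_procs=4)
    print(dict(load), imbalance(load))
```

With all ray origins clustered in one voxel, a single processor receives every task while the others idle. This is the kind of imbalance that dynamic object redistribution tries to correct, at the price of the extra object communication noted above.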