Dynamic Data Driven Image Reconstruction Using Multiple GPUs

Adeesha Wijayasiri, Tania Banerjee, Sanjay Ranka, Sartaj Sahni and Mark Schmalz
Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611
{adeeshaw, tmishra, ranka, sahni, mssz}@cise.ufl.edu

Abstract: The reconstruction of $n \times n$-pixel Synthetic Aperture Radar (SAR) imagery using the backprojection algorithm incurs $O(n^2 \cdot m)$ cost, where $m$ is the number of pulses. This paper presents dynamic data driven multiresolution algorithms to speed up SAR backprojection on multiple GPUs. A critical part of this spatially variant reconstruction process is load balancing, which circumvents asymmetric work assignment. Our algorithms achieve 15 TFLOPS using 128 GPUs.

Keywords: Synthetic Aperture Radar, MultiResolution images, GPU, Load Balancing, Longest Processing time, List Assignment

I. INTRODUCTION

Synthetic Aperture Radar (SAR) image formation utilizes tensor-product based transformation of radar return pulse histories to yield a spatial representation containing possible target objects. In this paper, we assume that the SAR pulse emitter and receiver are located on an airborne platform. This type of SAR based reconstruction is useful because the spatial resolution of the reconstructed image is independent of the distance from the pulse emitter to the target, and viewing through obscurants such as clouds and smoke is possible [1].

Frequency domain approaches such as range Doppler imaging and time domain processing algorithms such as backprojection have been employed in reconstructing images from SAR pulse data. Thus far, backprojection has produced higher-quality reconstructions than frequency domain algorithms because it supports higher resolution and makes fewer assumptions about the image, albeit at high computational cost [1], [2].

With improvements in parallel computing, the parallelization of backprojection discussed in [2] and [4] can be extended to multiple resolution levels.
This is beneficial, for example, in change detection applied to reconstructed SAR imagery, where reduced resolution (and lower computational cost) may be appropriate for background regions, while candidate target regions are rendered at higher resolution.

In this paper, techniques for speeding up SAR processing are presented. Specifically, we adopt a dynamic data driven multiresolution approach for multiple GPUs. Because spatial resolution varies across the image, the naive method of assigning an equal number of image partitions to each GPU does not necessarily yield optimal cost. Thus, we developed efficient reconstruction methods based on spatial tiling that equalize work distribution among multiple GPUs; these techniques are compared with prior work to demonstrate significant improvements in speedup.

II. BACKGROUND AND PREVIOUS WORK

A. SAR Signal Model

SAR aims to find the distance from the emitter to each detected ground object by measuring the travel time of an electromagnetic pulse, where ground objects have potentially different reflectivities. Given a temporally sinusoidal pulse of unitary intensity $I_0$, the received intensity $I$ of a pulse reflected from a ground object of reflectivity $r$ is given by $I = I_0 \cdot r \cdot e^{-j 2\pi f t}$, where $f$ denotes the pulse carrier frequency and $t$ is the round trip travel time from the emitter to the ground object to the receiver [7], [8]. Given the speed of light $c$ and the distance $d$ from emitter to ground object, and assuming a monostatic sensing configuration, we can express $I = I_0 \cdot r \cdot e^{-j 2\pi f (2d/c)}$.

Assuming a ground object located at $(x_0, y_0, z_0)$ and instantaneous location $(x(t), y(t), z(t))$ of the receiver, the distance between ground object and antenna is given by

$$d = \sqrt{(x(t) - x_0)^2 + (y(t) - y_0)^2 + (z(t) - z_0)^2}. \tag{1}$$

Linear Frequency Modulation varies the pulse carrier frequency linearly from $f_{min}$ to $f_{max}$. If $k$ denotes the number of frequency samples per pulse, then the frequency step size is given by

$$\Delta f = \frac{f_{max} - f_{min}}{k}.$$
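As a minimal sketch of the quantities defined so far (not the authors' implementation), the following Python code evaluates the distance of Eq. (1), the frequency step size, and the monostatic return $I_0 \cdot r \cdot e^{-j 2\pi f (2d/c)}$ for one frequency. The function names, the platform and target coordinates, and the 9.0 to 9.6 GHz band with 512 samples are hypothetical values chosen only for illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def slant_range(platform, target):
    """Distance d of Eq. (1) between the antenna and a ground object."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(platform, target)))


def frequency_step(f_min, f_max, num_samples):
    """Frequency step size: (f_max - f_min) divided by the sample count."""
    return (f_max - f_min) / num_samples


def received_sample(freq, d, reflectivity=1.0, i0=1.0):
    """Monostatic return I = I0 * r * exp(-j * 2*pi * f * (2d/c))."""
    round_trip_time = 2.0 * d / C
    return i0 * reflectivity * cmath.exp(-1j * 2.0 * math.pi * freq * round_trip_time)


# Hypothetical geometry: platform 12 km above the ground, target at the origin.
platform = (3000.0, 4000.0, 12000.0)
target = (0.0, 0.0, 0.0)

d = slant_range(platform, target)
delta_f = frequency_step(9.0e9, 9.6e9, 512)
sample = received_sample(9.0e9, d, reflectivity=0.5)
```

The exponential has unit magnitude, so the magnitude of `sample` equals the product of `i0` and `reflectivity`; only the phase carries the range information.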
Thus, a single pulse has a frequency-varying waveform, and the output at the receiver due to the waveform with frequency $f_k$ is given by $I(f_k) = I_0 \cdot r \cdot e^{-j 2\pi f_k (2d/c)}$. Assuming there are $K$ frequency samples per pulse, the output at the receiver for the $i$th pulse $p_i$ is given by

$$I(p_i) = \sum_{k=0}^{K} I_0 \cdot r \cdot e^{-j 2\pi f_k (2d/c)}. \tag{2}$$

The phase associated with an object at the scene origin is set to zero for all frequencies, implying that the distance of that object can be referenced to zero. This differential range is given by

$$\Delta R = \sqrt{(x(t) - x_0)^2 + (y(t) - y_0)^2 + (z(t) - z_0)^2} - d_a \tag{3}$$

where $d_a$ denotes the distance to the receiver from the scene origin, that is, $d_a = \sqrt{x(t)^2 + y(t)^2 + z(t)^2}$.

Alias response controls the range of the image scene. The alias-free time range is given as $1/\Delta f$, where $\Delta f$ is the frequency step size. Maximum alias

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) 978-1-5090-5844-0/16/$31.00 ©2016 IEEE