Thermal-Safe Schedule Generation for System-on-Chip Testing

Rajit Karmakar and Santanu Chattopadhyay
Dept. of Electronics & Electrical Comm. Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
Email: {rajit,santanu}@ece.iitkgp.ernet.in

Abstract: This paper presents a thermal-safe test scheduling strategy for System-on-Chip (SoC) designs. While most existing strategies rely on approximate thermal models to avoid time-consuming online thermal simulations, the present work proposes a superposition principle-based thermal model that estimates the temperature of the cores accurately and quickly, without invoking thermal simulation inside the schedule generation process. The thermal model, along with a window-based peak power model, has been incorporated into a Particle Swarm Optimization (PSO) based meta-search technique to generate the test schedules. In contrast to existing works, the introduction of new SoC benchmarks with detailed power and floorplan information enables us to observe the exact thermal behavior of the cores. Experimental results on these newly proposed benchmarks show the superiority of our thermal model over the existing ones.

Keywords: System-on-Chip; Test scheduling; Superposition principle based thermal model; Particle Swarm Optimization; Bin packing.

I. INTRODUCTION

With the increasing demand for high-performance, low-power chips, the present-day semiconductor industry is heading towards smaller feature sizes and reduced chip area. Device dimensions are shrinking drastically, while system designs are becoming more and more complex. Power and thermal issues are becoming more prominent in System-on-Chip (SoC) designs, and testing such complex chips has become a major challenge for test engineers. The test-mode power is often 30 times higher than the functional-mode power.
Not only the high power consumption but also the high peak temperature during testing poses a serious threat to the chip. Because the spatial power distribution is non-uniform, the temperature may not be equal throughout the chip. The high power density of a particular core may create localized heating, called a hotspot [1]. Hotspots may reduce the reliability of the circuit and may even damage the chip permanently through thermal runaway [2]. A test engineer therefore has to pay special attention to power and thermal safety when developing the test infrastructure of the SoC. On the other hand, shrinking product development cycles require the test time of the SoC to be reduced. This can be achieved through proper scheduling of the tests for the cores.

The development of the test infrastructure and schedule of an SoC under resource, power and thermal constraints can be described as follows. A test engineer has to (i) partition the available test resources and allocate them to the cores and (ii) decide upon the core ordering in the test schedule, with the objective of reducing the overall Test Application Time (TAT). Moreover, at any point of time in the schedule, the total power consumed by all the cores tested in parallel must not exceed a pre-defined system-level power limit, and the peak temperature of any core must not exceed the maximum allowable temperature limit.

To ensure thermal safety during testing, the temperature of the cores in each scheduling interval needs to be computed, which requires online thermal simulation (i.e., simulation during the schedule generation process). However, one major drawback of thermal simulators like HotSpot [3] is their execution time, which prevents us from invoking them inside the meta-search techniques that are often used to find optimal test schedules for SoCs with a large number of embedded cores.
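The system-level power constraint described above can be illustrated with a small sketch. It assumes a hypothetical schedule representation in which each core test has a fixed start time, end time and constant power value; the names `CoreTest`, `peak_power`, `application_time` and `is_power_safe`, and all numbers, are illustrative assumptions and are not taken from the paper.

```python
from typing import List, NamedTuple


class CoreTest(NamedTuple):
    """One scheduled core test (hypothetical representation)."""
    core: str
    start: int    # start time, e.g. in clock cycles
    end: int      # end time (exclusive)
    power: float  # power drawn while the test runs, assumed constant


def peak_power(schedule: List[CoreTest]) -> float:
    """Return the highest total power of tests running in parallel."""
    events = []
    for t in schedule:
        events.append((t.start, t.power))   # test begins: power added
        events.append((t.end, -t.power))    # test ends: power released
    # At equal times, process test ends (negative deltas) before starts.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0.0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def application_time(schedule: List[CoreTest]) -> int:
    """Test Application Time: the time at which the last test finishes."""
    return max((t.end for t in schedule), default=0)


def is_power_safe(schedule: List[CoreTest], p_max: float) -> bool:
    """Check the system-level power limit over the whole schedule."""
    return peak_power(schedule) <= p_max


schedule = [
    CoreTest("c1", 0, 100, 3.0),
    CoreTest("c2", 0, 60, 2.0),
    CoreTest("c3", 60, 120, 2.5),
]
print(application_time(schedule), peak_power(schedule))
```

In this illustrative schedule the peak occurs while `c1` and `c3` overlap, so a power limit below their combined power would reject the schedule even though each test is safe on its own. The temperature limit needs a thermal estimate per core and is not covered by this sketch.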
An alternative solution is to incorporate a thermal model that can predict the temperature of the cores without integrated thermal simulations. Several such approaches [4]–[10] have been proposed in the literature. The RC model based approach presented in [9] tries to maximize the heat dissipation through the lateral neighbourhood of the active cores in a test session. However, its assumptions that idle cores act as thermal ground and that heat transfer between neighbouring cores is negligible do not hold: as we report later in this paper, the temperature of a core depends largely on the neighbouring cores tested in parallel. Moreover, one common problem with all these thermal models is that, because they make many assumptions about important parameters such as power and floorplan, they may not always predict the temperature accurately. Capturing the exact thermal behavior of a chip requires the exact power profiles of the cores as well as accurate area and floorplan information. Since this information is not available for the commonly used ITC'02 benchmarks, most works presume approximate values for these parameters, which introduces inaccuracy into the predicted thermal behavior of the chip.

Thermal simulators like HotSpot follow a linear RC thermal model [8]. The linearity of the thermal model can be exploited using the superposition principle. The work presented in [8] exploits the linearity of the HotSpot tool and uses a superposition principle based thermal model. As CPU execution time is the main bottleneck of thermal simulation during scheduling, the authors avoid invoking the thermal simulator in the scheduling process. Instead, they use the HotSpot [3] tool to create offline thermal profiles of the cores, and these thermal profiles are then used for scheduling. This type of thermal model is fast.
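The superposition idea can be sketched as follows. This is a simplified, steady-state illustration and not the model of [8] or the model proposed in this paper: it assumes that an offline characterisation step has produced, for every pair of cores, the temperature rise at one core per watt dissipated in the other. The table `RISE`, the value `AMBIENT` and the function names are hypothetical.

```python
from typing import Dict

AMBIENT = 45.0  # hypothetical ambient temperature in degrees Celsius

# RISE[i][j]: steady-state temperature rise (deg C) at core i per watt
# dissipated in core j. Hypothetical values standing in for profiles that
# would be obtained offline from a thermal simulator.
RISE: Dict[str, Dict[str, float]] = {
    "c1": {"c1": 8.0, "c2": 2.0, "c3": 0.5},
    "c2": {"c1": 2.0, "c2": 10.0, "c3": 2.0},
    "c3": {"c1": 0.5, "c2": 2.0, "c3": 6.0},
}


def estimate_temperatures(active_power: Dict[str, float]) -> Dict[str, float]:
    """Estimate every core's temperature by summing the contributions of
    all active cores (linear superposition, steady state only)."""
    return {
        core: AMBIENT + sum(row[src] * watts
                            for src, watts in active_power.items())
        for core, row in RISE.items()
    }


def is_thermally_safe(active_power: Dict[str, float], t_max: float) -> bool:
    """Check that no core exceeds the maximum allowable temperature."""
    return max(estimate_temperatures(active_power).values()) <= t_max


alone = estimate_temperatures({"c1": 2.0})
together = estimate_temperatures({"c1": 2.0, "c2": 1.0})
print(alone["c1"], together["c1"])
```

With these illustrative numbers, the estimate for `c1` is higher when `c2` is tested in parallel than when `c1` is tested alone, which reflects the dependence on neighbouring cores. Each estimate is only a table lookup and a sum, so no thermal simulation is needed during scheduling. Transient heating and cooling are deliberately left out of this sketch.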
However, while working with this thermal model [8], we have noticed that it may result in thermal violations. We believe this is because the model neglects the pre-schedule temperature increase of the cores due to leakage power and models the heating and cooling effects of the cores inefficiently, as discussed in detail in Section II.

To alleviate the inaccuracy of the thermal model of [8], in this paper we use a more elaborate thermal model based on the superposition principle. It uses relatively more detailed and accurate thermal information to get an efficient, yet fast
2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems 978-1-4673-8700-2/16 $31.00 © 2016 IEEE DOI 10.1109/VLSID.2016.47 460