SpMT WaveCache: Exploiting Thread-Level Parallelism in WaveScalar

Songwen Pei, Baifeng Wu, Min Du, Gang Chen
School of Computer Science, Fudan University, 200433 Shanghai, China
{swpei,bfwu}@fudan.edu.cn

Leandro A.J. Marzulo, Felipe M.G. Franca
Systems Engineering and Computer Science Program, COPPE, Rio de Janeiro, Brazil
{lmarzulo,felipe}@cos.ufrj.br

Abstract

Speculative Multithreading (SpMT) improves performance by executing multiple threads speculatively to exploit thread-level parallelism. By combining software and hardware approaches, we extend the capabilities of the previous WaveScalar ISA on the basis of a Transactional Memory system for the WaveCache architecture. Threads are extracted during static compilation and speculatively executed as thread-level transactions, supported by extra hardware components such as the Thread-Context-Table (TCT) and the Thread-Memory-History (TMH). We evaluated the SpMT WaveCache with 6 real benchmarks from SPEC, Mediabench and Mibench. Overall, the SpMT WaveCache outperforms a superscalar architecture by 2X to 3X, and it also achieves large performance gains over the original WaveCache and the Transactional WaveCache.

1. Introduction

As uniprocessor frequency increases, the room left to exploit in superscalar uniprocessors is limited, and side effects such as wire delay, communication overhead, design complexity, and power consumption become increasingly severe. Therefore, in the past few years, researchers and processor manufacturers have, en masse, shifted their focus from superscalar uniprocessors to multi-core systems, that is, multiple processing cores or elements on a chip. Moore's Law can thus be expected to keep holding for a long time to come. However, in the past few years, processor design has mostly been limited to exploiting instruction-level parallelism (ILP), data-level parallelism (DLP) and thread-level parallelism (TLP) on von Neumann architectures.
From the viewpoint of dataflow computer architects, however, dataflow architecture is a radical alternative to the von Neumann architecture: execution is driven by the availability of input operands rather than by an instruction stream. We revisited speculative multithreading techniques on the dynamic dataflow architecture WaveScalar [1,2], which eliminates the bottlenecks of the von Neumann architecture by using dataflow firing rules and a decoupled, distributed storage system rather than a program counter and a centralized storage system. Speculative Multithreading (SpMT) improves performance by executing multiple threads speculatively to exploit thread-level parallelism, on the basis of the transactional memory system built for WaveCache, the microarchitecture that implements the WaveScalar ISA. Unlike previous speculative multithreading schemes, our approach is built on a dynamic dataflow architecture and a decoupled transactional memory system.

The main contributions of this paper are: (1) spawning threads at compile time; (2) adding hardware units/components to support multithreaded execution based on the transactional memory system; (3) evaluating the performance of the SpMT WaveCache with 6 real benchmarks from SPEC, Mediabench and Mibench, and revealing the high speedups that speculative multithreading delivers.

The rest of this paper is organized as follows. Section 2 introduces related work on speculative multithreading, and Section 3 presents WaveScalar. Section 4 describes the main contributions in detail. Section 5 presents the experimental results and the performance evaluation. Finally, we conclude our work.

2. Related Work

Thread-level speculation is a widely used technique that can greatly enhance performance by optimistically compiling applications despite data hazards or control dependencies and executing them out of order.
2009 World Congress on Computer Science and Information Engineering 978-0-7695-3507-4/08 $25.00 © 2008 IEEE DOI 10.1109/CSIE.2009.35 530