An ILP Formulation for Task Mapping and
Scheduling on Multi-core Architectures
Ying Yi, Wei Han, Xin Zhao, Ahmet T. Erdogan and Tughrul Arslan
University of Edinburgh, The King's Buildings, Mayfield Road, Edinburgh, EH9 3JL, UK
Abstract-Multi-core architectures are increasingly being adopted
in the design of emerging complex embedded systems. Key issues
in designing such systems are on-chip interconnects, memory
architecture, and task mapping and scheduling. This paper
presents an integer linear programming formulation for the task
mapping and scheduling problem. The technique incorporates
profiling-driven loop level task partitioning, task transformations,
functional pipelining, and memory architecture aware data
mapping to reduce system execution time. Experiments are
conducted to evaluate the technique by implementing a series of
DSP applications on several multi-core architectures based on
dynamically reconfigurable processor cores. The results
demonstrate that the proposed technique is able to generate
high-quality mappings of realistic applications on the target
multi-core architecture, achieving up to 1.3x parallel efficiency by
employing only two dynamically reconfigurable processor cores.
I. INTRODUCTION
An important trend in embedded systems is the use of
multi-core architectures to meet applications' functional and
performance requirements. Multi-core designs offer high
performance and flexibility, and at the same time promise
low-cost and power-efficient implementations. However, the
semiconductor industry still faces several technological
challenges with multi-core systems. Important
issues in multi-core designs are the communication
infrastructure, memory architecture, and task mapping and
scheduling. In multi-core architectures, the performance of the
entire system is affected by the execution order of tasks and
communications. It is well known that task mapping and task
scheduling are highly inter-dependent. Therefore, the two issues
need to be handled together in order to obtain an efficient
mapping and schedule.
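
The interaction between mapping and scheduling can be illustrated with a minimal sketch. It is not the ILP formulation of this paper: it assumes a hypothetical four-task graph (the names EXEC, EDGES and ORDER, and all timing values, are invented for illustration), a fixed execution order, and a communication delay that is paid only when a producer and its consumer run on different cores. It then compares task-to-core assignments by exhaustive enumeration.

```python
from itertools import product

# Hypothetical task graph (illustrative values, not taken from the paper).
EXEC = {"A": 2, "B": 3, "C": 3, "D": 2}           # execution time per task
EDGES = {("A", "B"): 1, ("A", "C"): 1,            # (producer, consumer) -> delay
         ("B", "D"): 1, ("C", "D"): 1}            # paid only across cores
ORDER = ["A", "B", "C", "D"]                      # a fixed topological order


def makespan(mapping):
    """Schedule tasks in ORDER on their assigned cores; return the finish time."""
    core_free = {}   # core -> time at which the core becomes idle
    finish = {}      # task -> finish time
    for task in ORDER:
        core = mapping[task]
        ready = 0
        for (src, dst), delay in EDGES.items():
            if dst == task:
                # Communication costs time only when the cores differ.
                comm = delay if mapping[src] != core else 0
                ready = max(ready, finish[src] + comm)
        start = max(ready, core_free.get(core, 0))
        finish[task] = start + EXEC[task]
        core_free[core] = finish[task]
    return max(finish.values())


def best_mapping(num_cores=2):
    """Try every task-to-core assignment and keep the shortest schedule."""
    best = None
    for cores in product(range(num_cores), repeat=len(ORDER)):
        mapping = dict(zip(ORDER, cores))
        span = makespan(mapping)
        if best is None or span < best[0]:
            best = (span, mapping)
    return best


if __name__ == "__main__":
    single_core = {task: 0 for task in ORDER}
    print("single core:", makespan(single_core))
    print("best on two cores:", best_mapping(2))
```

Under these assumed values, placing B or C on a second core shortens the schedule only if the communication delay into D is hidden behind other work, which is why the assignment cannot be chosen independently of the timing of tasks and transfers.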
A dynamically reconfigurable (DR) processor combines the
flexibility of FPGAs with the programmability found in general
purpose processors (CPUs/DSPs) in a unified, easy-to-use
programming environment. It is a strong candidate for
multi-core systems. In our proposed embedded multi-core
platform, which has several DR processors [1], the shared
memory heavily affects the execution time and power
consumption. The time of data transmission between different
processors must be considered during scheduling so that the
resulting schedule reflects actual run-time behaviour. In addition, in
order to meet the system throughput constraints, the design is
pipelined to construct more efficient architectures. Pipelining
divides the design into concurrently executing stages, thus
increasing the throughput.
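
The throughput benefit of pipelining can be made concrete with a small idealised model. The sketch below assumes that every stage runs on its own resource, that all data items are identical, and that buffering between stages is unlimited; the function name pipeline_metrics and the stage times in the usage example are illustrative assumptions, not measurements from this work.

```python
def pipeline_metrics(stage_times, num_items):
    """Return (sequential_time, pipelined_time) for an idealised pipeline.

    sequential_time: each item passes through all stages before the next starts.
    pipelined_time:  stages overlap; after the first item fills the pipeline,
                     one item completes every `max(stage_times)` time units.
    """
    if num_items < 1 or not stage_times:
        raise ValueError("need at least one stage and one item")
    latency = sum(stage_times)        # time for one item to traverse all stages
    bottleneck = max(stage_times)     # slowest stage limits the throughput
    sequential = latency * num_items
    pipelined = latency + (num_items - 1) * bottleneck
    return sequential, pipelined


if __name__ == "__main__":
    seq, pipe = pipeline_metrics([4, 6], num_items=10)
    print("sequential:", seq, "pipelined:", pipe, "speed-up:", seq / pipe)
```

In this model the steady-state throughput is one item per bottleneck-stage time, so balancing the stage times matters more than reducing their sum.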
In multi-core architectures, all parallel tasks in an
application have the potential to be executed simultaneously.
However, the number of such tasks may exceed the number of
available processors. Therefore, task mapping is required to
assign the parallel tasks to the available processors. In the past,
task merging and task replication have been proposed with the
goal of re-allocating tasks when performance bottlenecks are
met. Since task merging requires more local memory and task
replication needs more processors to implement the same task
[2], a multi-core architecture that lacks sufficient
memory and processors severely limits the mapping
options available to existing methods.
Application development on multi-core architectures
requires the designer, or automated tool, to divide tasks
between available processors and to determine data mappings
for the required memory elements. A SystemC-based
simulation framework for mapping an application to a platform
and evaluating its performance has been presented in [3]. The
authors of [4, 5] have introduced techniques for scheduling and
mapping parallel applications onto an MPSoC platform. Mapping
solutions for bus-based and NoC-based MPSoCs have been
described in [6] and [7]. Some automated system-level
mapping techniques for application development on network
processors have also been proposed [8].
This paper addresses the problem of automated application
mapping and scheduling on DR processor based multi-core
architectures. An Integer Linear Program (ILP) based approach
is proposed for loop level task partitioning, task mapping and
pipelined scheduling while taking the communication time into
account for embedded applications. The efficacy of the
technique is demonstrated by a series of DSP applications.
The paper is organized as follows: Section 2 introduces the
target DR processor as well as the target multi-core
architecture. Section 3 describes the task mapping
methodology. Section 4 gives a more detailed description of
the problem addressed in this paper. Section 5 describes the
proposed ILP based approach to solve the problem. The
experimental results are given in Section 6, followed by
conclusions in Section 7.
II. TARGET MULTI-CORE ARCHITECTURE
Some applications demand a closer interconnection
between the participating processors to achieve the required
performance. Such communication can be realised using
distributed shared register files. The target multi-core platform
is designed for DSP applications, which typically have
intensive computations and a stream of input data. The
architecture described in a previous work [2] consists of a
selectable number of DR processors, which communicate with
a shared memory through a full crossbar network. This
architecture has been extended and modified by incorporating
the shared register file into the system memory architecture in
order to support the loop level parallelism proposed in this
978-3-9810801-5-5/DATE09 © 2009 EDAA