J Sign Process Syst (2011) 65:245–259
DOI 10.1007/s11265-011-0606-x
Design Methodology for Offloading Software Executions to FPGA
Tomasz Patyk · Perttu Salmela · Teemu Pitkänen · Pekka Jääskeläinen · Jarmo Takala
Received: 29 January 2011 / Revised: 4 July 2011 / Accepted: 4 July 2011 / Published online: 30 July 2011
© Springer Science+Business Media, LLC 2011
Abstract A field programmable gate array (FPGA) is a
flexible solution for offloading part of the computations
from a processor. In particular, it can be used to
accelerate the execution of a computationally heavy part
of a software application, e.g., in DSP, where small
kernels are repeated often. Since application code for a
processor is software, a design methodology is needed to
convert the code into a hardware implementation suitable
for the FPGA. In this paper, we propose a design method
that uses the Transport Triggered Architecture (TTA)
processor template and the TTA-based Co-design
Environment toolset to automate the design process. With
software as a starting point, we generate an RTL
implementation of an application-specific TTA processor
together with the hardware/software interfaces required
to offload computations from the system main processor.
To exemplify what the integration of the customized TTA
with a new platform could look like, we describe the
process of developing the required interfaces from
scratch. Finally, we present how to take advantage of the
scalability of the TTA processor to target platform- and
application-specific requirements.
This work has been supported by the Academy of Finland
under research grant decision 128126.
T. Patyk (✉) · P. Salmela · T. Pitkänen ·
P. Jääskeläinen · J. Takala
Department of Computer Systems,
Tampere University of Technology,
P. O. Box 553, 33101, Tampere, Finland
e-mail: tomasz.patyk@tut.fi
P. Salmela
e-mail: perttu.salmela@gmail.com
T. Pitkänen
e-mail: teemu.pitkanen@tut.fi
P. Jääskeläinen
e-mail: pekka.jaaskelainen@tut.fi
J. Takala
e-mail: jarmo.takala@tut.fi
Keywords Application-specific integrated circuits ·
Hardware accelerator · Computer aided engineering ·
System-on-a-chip · Coprocessors ·
Field programmable gate arrays
1 Introduction
The growing complexity of software applications running
on portable devices such as mobile phones, smartphones,
and PDAs calls for an increase in the processing power
offered by their CPUs. Typically, a RISC processor
employed as a general purpose processing unit does not
provide enough computational resources, and the use of a
specialized hardware accelerator is inevitable. A DSP
co-processor is a common solution to speed up multimedia
applications. However, no matter how powerful the DSP
processor is, dedicated hardware will do the same task
faster, consume less power, and take less silicon area.
Reconfigurable hardware in the form of a field
programmable gate array (FPGA) is an excellent solution
for increasing the performance of an embedded system,
as part of the application code can be offloaded from
the processor. The performance increase requires careful
planning, though. Quite often the overhead of such
arrangements, e.g., the cost of data transfers between a