A METAL and VIA Maskset Programmable VLSI Design Methodology using
PLAs
Nikhil Jayakumar, Sunil P. Khatri
Department of EE, Texas A&M University, College Station TX 77843.
Abstract
In recent times there has been a substantial increase in the cost and
complexity of fabricating a VLSI chip. The lithography masks them-
selves can cost between $1M and $3M. It is conjectured that due to
these increasing costs, the number of ASIC starts in the last few years
has declined. In this paper, we address this problem by using an array
of dynamic PLAs which require only METAL and VIA mask cus-
tomization in order to implement a new design. This would allow sev-
eral similar-sized designs to share the same base set of masks (right up
to the metal layers) and only have different METAL and VIA masks.
We have implemented our methodology for both combinational and
sequential designs, and demonstrate that our approach strikes a rea-
sonable compromise between ASIC and field programmable design
methodologies in terms of placed-and-routed area and delay. Our
method has a 2.89× (3.58×) delay overhead and a 4.96× (3.44×)
area overhead compared to standard cells for combinational (sequen-
tial) designs.
1 Introduction
With the relentless reduction in the minimum feature sizes of modern
Deep Sub-micron (DSM) VLSI fabrication processes, the complexity
of fabrication is increasing at an alarming rate. Simultaneously,
the number of masks in a full set, and the cost of that set, have been
increasing rapidly. It is not uncommon for a full set of lithography masks to cost
$1-3M [1, 2]. This change has contributed to a roughly 25%
reduction [2] in Application Specific Integrated Circuit (ASIC) de-
sign starts in the last 7 years. It is believed that cell based ASICs are
becoming prohibitively expensive except for very high volume appli-
cations [2].
In this paper, we introduce a new VLSI design approach to address
this problem and minimize the non-recurring expense (NRE) involved
with IC design. Our approach utilizes an array of precharged Pro-
grammable Logic Arrays (PLAs) with flip-flops co-located at their
outputs, as its underlying circuit structure. We envision that a manu-
facturer would stock such arrays (of varying sizes), pre-processed up
until the metalization step. To create an ASIC for a given design, the
manufacturer would technology map this design to the smallest avail-
able array. After technology mapping and routing of the design, the
METAL and VIA masks (the only masks that require changes) would
be generated and used to customize or personalize the array to im-
plement the design. At this point, the manufacturer could process the
remaining masks to obtain the final design. Alternatively, the manufacturer
could perform all steps of processing, using old masks for all
other layers and the new METAL and VIA masks for customization
of the design. The latter option might be used by manufacturers who
do not have experience in warehousing partially completed wafers (see footnote 1).
Footnote 1: Also, as the industry starts to move toward the highly absorptive and fragile low-k
dielectric materials in the metal stack, the shelf life and the nature of contamination risks
are not well known [3].
Since all other masks except METAL and VIA masks remain unaltered,
the manufacturer can realize the design in a low cost manner, by
amortizing the bulk of the NRE over a large number of designs. Fur-
ther, the manufacturer could spend a considerable effort in optimizing
these designs for maximum yield, and this effort would be amortized
over a large number of designs that share the common masks. Addi-
tionally, such an approach could result in a reduced processing time
for a new design. Processing for a modern IC can take anywhere from
3 weeks up to a few months [4]. This methodology can therefore help
reduce design turnaround time by stockpiling wafers which have been
processed up to the metalization step. Also, this methodology simpli-
fies the task of engineering change. When a bug is discovered and
the design needs to be modified, our methodology would reduce the
cost and time to modify the design (since it requires only METAL and
VIA mask changes).
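The amortization argument above can be made concrete with a small calculation. The following is a minimal sketch, not a cost model from this paper: the function name and the `shared_fraction` parameter (the share of the full mask-set cost that lies in the base masks common to all designs) are assumptions introduced for illustration, and the numbers in the usage lines are hypothetical apart from the $1M-$3M range quoted earlier.

```python
def mask_cost_per_design(full_set_cost, shared_fraction, num_designs):
    """Per-design mask cost when the base masks are shared across designs.

    full_set_cost   -- cost of a complete mask set (the text cites $1M-$3M)
    shared_fraction -- ASSUMED fraction of that cost in the shared base masks
    num_designs     -- number of designs that reuse the same base masks
    """
    if not 0.0 <= shared_fraction <= 1.0:
        raise ValueError("shared_fraction must lie in [0, 1]")
    if num_designs < 1:
        raise ValueError("num_designs must be at least 1")
    shared_cost = full_set_cost * shared_fraction   # paid once, then amortized
    custom_cost = full_set_cost - shared_cost       # METAL and VIA masks, paid per design
    return shared_cost / num_designs + custom_cost


# Hypothetical inputs: a $2M mask set with 75% of the cost in the base masks.
single = mask_cost_per_design(2_000_000, 0.75, 1)    # one design bears the full cost
shared = mask_cost_per_design(2_000_000, 0.75, 10)   # ten designs share the base masks
print(single, shared)
```

With these assumed inputs the per-design cost approaches the cost of the METAL and VIA masks alone as the number of designs sharing the base masks grows.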
After METAL and VIA mask customization, the design would be
transformed into a network of precharged PLAs [5]. Such an imple-
mentation methodology was demonstrated to be fast and area-efficient
compared to a standard cell approach. As shown in [5], for a network
of PLAs there is a more direct relationship between the cost func-
tion being optimized for during logic synthesis (literal count), and
the actual PLA implementation. In a standard cell based flow, there
is an intervening technology mapping step, which often negates the
benefits of technology-independent logic optimization. A network of
PLAs on the other hand, allows us to carry forward the benefits of
technology-independent multi-level logic synthesis.
We leverage this feature of the network-of-PLAs design style in our
work. In contrast to the work of [5], we are able to handle both combinational
and sequential designs. Also, the mask programmability
feature of our approach makes our PLA design and layout quite different
from those in [5].
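To illustrate what personalizing a PLA means at the logical level, and why literal count tracks the implementation closely, the following is a minimal sketch. It models a PLA purely as a sum-of-products personality; it does not model the precharged circuit, the flip-flops, or the layout used in this work. The cube encoding and the function names `eval_pla` and `literal_count` are assumptions introduced for illustration, as is the reading that each programmed entry would correspond to one customization point.

```python
def eval_pla(and_plane, or_plane, inputs):
    """Evaluate a PLA described by its personality (logical model only).

    and_plane -- one string per product term, one character per input:
                 '1' = true literal, '0' = complemented literal, '-' = input unused
    or_plane  -- or_plane[k] lists the product-term indices that feed output k
    inputs    -- tuple of 0/1 input values
    """
    terms = []
    for cube in and_plane:
        if len(cube) != len(inputs):
            raise ValueError("cube width must match the number of inputs")
        # A product term is true when every programmed literal matches its input.
        terms.append(all(c == '-' or int(c) == x for c, x in zip(cube, inputs)))
    # Each output is the OR of the product terms connected to it.
    return tuple(int(any(terms[i] for i in rows)) for rows in or_plane)


def literal_count(and_plane):
    """Number of programmed AND-plane entries (the literals of the cover)."""
    return sum(c != '-' for cube in and_plane for c in cube)


# Hypothetical 2-input, 2-output personality: output 0 = a XOR b, output 1 = a AND b.
AND_PLANE = ["10", "01", "11"]
OR_PLANE = [[0, 1], [2]]
print(eval_pla(AND_PLANE, OR_PLANE, (1, 0)), literal_count(AND_PLANE))
```

In this model, changing the function implemented by the PLA only changes the entries of `AND_PLANE` and `OR_PLANE`, and the literal count directly counts programmed entries, with no intervening mapping step.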
In recent times, PLAs have experienced a renewed interest as a
circuit implementation style for high-performance designs. The IBM
Gigahertz processor [6] utilized PLAs (see footnote 2) to implement control logic,
due to their high speed and because they provide the ability to quickly
implement and modify the design.
The remainder of this paper is organized as follows. Section 2
discusses methods similar to our own. Section 3 describes
our design flow, while Section 4 describes our experimental results.
Finally, in Section 5, we make concluding comments and discuss further
work that needs to be done in this area.
2 Previous Work
In the past, gate arrays [7, 8] have been used as an implementation
method in which a design can be personalized through via and metal customization.
This approach was popular until standard cell based design
became the dominant means to design ICs. The speed of our approach
is based on the fact that wiring is embedded inside our PLAs.
This is not true for gate arrays. Also, the P and N diffusions in any
row of the gate array need to be separated, resulting in larger area
Footnote 2: Note that single PLAs were used, as opposed to a network of PLAs.
0-7803-8702-3/04/$20.00 ©2004 IEEE. 590