System Theory: From Classical State Space to
Variable Selection and Model Identification
Diego Liberati
Italian National Research Council, Italy
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
INTRODUCTION
System Theory is a powerful paradigm for building
abstract models of real processes that are accurate
enough to capture the salient underlying dynamics
while keeping the mathematical tools simple enough
to be manageable. Its typical approach is to describe
reality via a reduced set of ordinary differential
equations (ODEs) linking the variables. A classical
application is circuit theory, which links the
intensive (voltage) and extensive (current) variables
across and through each idealized element by means
of equilibrium laws at nodes and around elementary
loops. When such relationships are linear (as in
ideal capacitors, resistors, and inductors, to
stay in the circuit field), a full battery of theorems
helps in understanding the general properties of
the ODE system. Positive systems, often used for
compartmental processes such as reservoirs in nature
and drug concentrations in medical therapy, enjoy
most of the properties of linear systems, with the
nonlinear constraint of nonnegativity. More general
nonlinear systems are harder to treat unless the
nonlinearity takes a simple form, such as the ideal
characteristic of a diode in circuit theory. When
the physics of the process is well known, as in the
examples above, it is fairly easy to identify a small
number of variables that fully describe the
dynamics of the process once their interrelations are
properly modeled: this is the classical way to approach
such a problem.
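As an illustration of the classical approach, the following minimal sketch writes a series RLC circuit as a two-variable linear state-space model (inductor current and capacitor voltage) and integrates it numerically. The component values, the function names, and the choice of a fixed-step fourth-order Runge-Kutta integrator are assumptions made only for this example; they do not come from the article.

```python
def rlc_derivative(state, u, R, L, C):
    """Right-hand side of the series RLC state equations.

    state = (i, v): inductor current and capacitor voltage.
    u is the source voltage applied to the loop.
    """
    i, v = state
    di_dt = (u - R * i - v) / L   # Kirchhoff voltage law around the loop
    dv_dt = i / C                 # capacitor law: C dv/dt = i
    return (di_dt, dv_dt)


def rk4_step(state, u, dt, R, L, C):
    """Advance the state by one fixed step of classical Runge-Kutta."""
    def shifted(k, scale):
        return tuple(s + scale * dt * d for s, d in zip(state, k))

    k1 = rlc_derivative(state, u, R, L, C)
    k2 = rlc_derivative(shifted(k1, 0.5), u, R, L, C)
    k3 = rlc_derivative(shifted(k2, 0.5), u, R, L, C)
    k4 = rlc_derivative(shifted(k3, 1.0), u, R, L, C)
    return tuple(
        s + dt * (a + 2 * b + 2 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_step_response(u=1.0, R=1.0, L=0.5, C=0.1, dt=1e-3, t_end=20.0):
    """Return the state (i, v) at t_end for a constant source u,
    starting from a circuit at rest. All defaults are illustrative."""
    state = (0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, u, dt, R, L, C)
    return state


current, voltage = simulate_step_response()
print(f"i = {current:.6f} A, v_C = {voltage:.6f} V")
```

With these assumed values the transient dies out well before the end of the simulation, so the current returns to zero and the capacitor voltage settles at the source voltage. Two state variables are enough to describe the whole dynamics, which is the point made in the paragraph above.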
Nowadays, on the other hand, new fields are emerging,
such as bioinformatics, where many data are
collected over several possibly correlated variables
whose joint dynamics follow a law that is neither
known a priori nor easily understood on the basis of
state-of-the-art knowledge. Given so much data that
a human reader cannot easily correlate, but that
probably hide interesting properties, a typical goal
is to approach the problem through a hopefully small,
meaningful subset of the measured variables. The
complexity of the problem thus makes it worthwhile to
resort to automatic classification procedures in order
to preprocess the collected data. The further question
then arises of reconstructing a synthetic mathematical
model that captures the most important relations between
variables, in order to infer their hidden relationships,
as in systems biology.
BACKGROUND
The tasks introduced above, selecting salient variables
and identifying their relationships from data, may be
accomplished sequentially, with varying degrees of
success, in a variety of ways. Principal components order
the variables from the most salient to the least, but
only within a linear framework. Partial least squares
allows extension to nonlinear models, provided
that one has prior information on the structure of the
nonlinearity involved: the regression equation
needs to be written before its parameters can be identified.
Clustering may operate even in an unsupervised way,
without an a priori correct classification of a training
set (Boley, 1998). Neural networks are known to learn
the embedded rules, with the indirect possibility (Taha
& Ghosh, 1999) of making the rules explicit or highlighting
the salient variables. Decision trees (Quinlan, 1994) are
a popular framework providing a satisfactory answer
to the needs recalled above.
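To make the remark on principal components concrete, the sketch below ranks the variables of a small synthetic data set by the magnitude of their loading on the leading principal component. Ranking by first-component loading is only one common heuristic, assumed here for illustration; the toy data, the function names, and the use of power iteration to find the leading eigenvector are likewise assumptions and not part of the article.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equally long observations."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def leading_component(cov, iterations=200):
    """Leading eigenvector of a covariance matrix by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def rank_variables(rows):
    """Variable indices sorted by decreasing absolute loading
    on the first principal component (an illustrative heuristic)."""
    loadings = leading_component(covariance_matrix(rows))
    return sorted(range(len(loadings)), key=lambda j: -abs(loadings[j]))


def toy_data():
    """Hypothetical data: x1 follows x0 linearly, x2 is unrelated."""
    rows = []
    for t in range(-5, 6):
        sign = 1 if t % 2 == 0 else -1
        x0 = float(t)
        x1 = 0.5 * t + 0.1 * sign
        x2 = 0.2 * ((t * t) % 3 - 1)
        rows.append([x0, x1, x2])
    return rows


print(rank_variables(toy_data()))  # prints [0, 1, 2]
```

The ranking depends on variance, so it is sensitive to the scale of each variable, and it can only reveal linear relations among them. This is the limitation the paragraph above points out.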
Four main general-purpose approaches will be briefly
discussed in the present article. In order to reduce the
dimensionality of the problem, thus simplifying both
the computation and the subsequent understanding
of the solution, the critical problem of selecting the
most salient variables must be solved. This step is
itself delicate, since it points to the very core of the
information to look at. A very simple approach is to
resort to cascading a divisive partitioning of data
orthogonal to the principal directions (PDDP; Boley,
1998), already proven to be successful in the context of