Convergent Design of a Piecewise Linear Neural Network

Hema Chandrasekaran and Michael T. Manry
Department of Electrical Engineering
The University of Texas at Arlington
Arlington, TX 76019
e-mail: manry@uta.edu

Abstract

A piecewise linear neural network (PLNN) is discussed which maps N-dimensional input vectors into M-dimensional output vectors. A convergent algorithm for designing the PLNN from training data is described. The design algorithm is based on a variation of the backtracking algorithm known as the 'branch and bound' method. The performance of the PLNN is compared with that of a multilayer perceptron (MLP) of equivalent size. The results show that the PLNN is capable of performing as well as an equivalent MLP.

I. Introduction

Piecewise linear mappings are appealing because of their simplicity. Although many piecewise linear mappings [1-4] have been described previously, absolute convergence of the mean square error at every iteration has not been guaranteed. In this paper, we describe a piecewise linear neural network (PLNN) in which (1) each input vector is assigned to the appropriate cluster using a distance measure, and (2) the input vector is multiplied by a matrix to generate an output vector. The network structure and initialization are described in section II. The training algorithm is discussed in section III. Numerical results are given in section IV.

II. Network Structure and Initialization

A. Notation and Basic Operation

The PLNN consists of
• N_c N-dimensional cluster center vectors m_k, where 1 ≤ k ≤ N_c,
• an M by (N + 1) dimensional matrix W_gl,
• N_c matrices W_pwl(k) of dimension M by (N + 1),
• a distance measure d(·), and
• 2N parameters μ(i), σ(i) for 1 ≤ i ≤ N.

The network structure is shown in figure 1. In a trained network, each input vector x is processed as follows:
1. The input vector x is augmented as (x^T : 1)^T.
2.
The input vector elements are normalized as x′(i) = (x(i) − μ(i))/σ(i), where μ(i) and σ(i) are the mean and standard deviation, respectively, of the ith input vector element.
3. The input vector is assigned to the kth cluster such that d(x′, m_k) = min_n d(x′, m_n), using the distance measure d(·).
4. The output of the network is y = W_gl · x + W_pwl(k) · x′.

Fig. 1 Piecewise Linear Neural Network Structure

In the following subsections we summarize the steps in the initialization of our training algorithm.

B. Global Mapping

Our goal here is to obtain a preliminary linear mapping between the input and output vectors. Subtracting this linear mapping from the training data prevents the network modules from wasting clusters on the linear part of the mapping. This, in turn, improves the efficiency of the network. The input feature vectors x_p are normalized as in section II. The global linear mapping matrix W_gl is extracted at this stage. Modified target vectors are found as t′_p = t_p − W_gl · x_p.
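The four processing steps of section II-A can be sketched in NumPy as below. This is an illustrative sketch, not the authors' implementation: the Euclidean norm is assumed for the distance measure d(·), and both W_gl and W_pwl(k) are assumed to act on the augmented, normalized input (consistent with their M by (N + 1) dimensions). The function name plnn_forward is hypothetical.

```python
import numpy as np

def plnn_forward(x, mu, sigma, centers, W_gl, W_pwl):
    """Forward pass of the PLNN (illustrative sketch).

    x       : (N,)  raw input vector
    mu,sigma: (N,)  per-element normalization parameters
    centers : (Nc, N)    cluster center vectors m_k
    W_gl    : (M, N+1)   global linear mapping matrix
    W_pwl   : (Nc, M, N+1) per-cluster matrices W_pwl(k)
    """
    # Step 2: normalize each input element, x'(i) = (x(i) - mu(i)) / sigma(i)
    x_norm = (x - mu) / sigma
    # Step 1: augment with a constant 1 (bias term), (x^T : 1)^T
    x_aug = np.append(x_norm, 1.0)
    # Step 3: assign to the nearest cluster (Euclidean distance assumed)
    k = int(np.argmin(np.linalg.norm(centers - x_norm, axis=1)))
    # Step 4: sum of the global and cluster-specific linear maps
    y = W_gl @ x_aug + W_pwl[k] @ x_aug
    return y, k
```

Note that the cluster index k selects one of the N_c matrices, so the network is linear within each cluster but piecewise linear overall.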
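The text says W_gl "is extracted at this stage" without specifying how; an ordinary least-squares fit over the training pairs is one natural choice and is sketched below under that assumption. The function name fit_global_mapping is hypothetical.

```python
import numpy as np

def fit_global_mapping(X, T):
    """Least-squares extraction of W_gl (sketch, method assumed).

    X : (Np, N)  normalized input vectors x_p
    T : (Np, M)  target vectors t_p
    Returns W_gl of shape (M, N+1) and the modified targets t'_p.
    """
    Np = X.shape[0]
    # Augment each input with a constant 1, as in step 1 of section II-A
    X_aug = np.hstack([X, np.ones((Np, 1))])
    # Solve min_W || X_aug W^T - T ||^2 in the least-squares sense
    W_gl = np.linalg.lstsq(X_aug, T, rcond=None)[0].T
    # Modified targets: t'_p = t_p - W_gl . x_p
    T_mod = T - X_aug @ W_gl.T
    return W_gl, T_mod
```

With this choice, the modified targets contain only the residual nonlinear part of the mapping, which is what the per-cluster matrices W_pwl(k) are then trained to approximate.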