Close Approximations of Sigmoid Functions by Sum of Steps for VLSI Implementation of Neural Networks

Valeriu Beiu †,‡, Jan Peperstraete, Joos Vandewalle and Rudy Lauwereins

Katholieke Universiteit Leuven, Department of Electrical Engineering, Division ESAT, Kardinaal Mercierlaan 94, B-3001 Heverlee, Belgium
On leave of absence from "Politehnica" University of Bucharest, Department of Computer Science, Spl. Independentei 313, 77206 Bucharest, România

Abstract – This paper shows that there are simple and accurate ways to compute a sigmoid nonlinearity and its derivative in digital hardware by a sum of steps, and that threshold-gate implementations of such algorithms are area-efficient compared to other known methods.

§1. OVERVIEW

The paper starts by describing classical solutions for the digital hardware implementation of the nonlinear activation functions used by artificial neurons, the accent falling on sigmoid nonlinearities. Recent results from the literature are mentioned and briefly compared (§2. Classical Solutions).

Even when approximation techniques are used, however, the computations involved remain quite complex. That is why we introduce a particular sigmoid function (§3. A Particular Sigmoid Function). It is not difficult to show that this particular sigmoid function is equivalent to the classical sigmoid function if the amplification factor (gain) is changed by a constant. As this constant can be used to multiply all the incoming weights, the input-output behavior of one artificial neuron remains the same. The change in weights is not relevant, as they will in any case be represented in a fixed-point format limited by the accuracy of the technology.

We prove that the exact computation of the particular sigmoid function can be done without complex operations, as it always gives rise to periodic binary numbers satisfying a very simple rule (§4. Mathematical Considerations).
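The gain-change equivalence can be made concrete with a small sketch. The particular sigmoid function is only defined later (§3), so the base-2 form 1/(1 + 2^(-x)) used below is an assumption chosen for illustration because it makes the constant explicit (ln 2); folding that constant into the incoming weights is the general argument.

```python
import math

def logistic(x, gain=1.0):
    # Classical sigmoid with amplification factor (gain) G:
    #   1 / (1 + exp(-G * x))
    return 1.0 / (1.0 + math.exp(-gain * x))

def base2_sigmoid(x):
    # Hypothetical "particular" sigmoid built on a base-2 exponential,
    # 1 / (1 + 2^(-x)), convenient in binary hardware; the paper's exact
    # definition appears in §3 and may differ from this sketch.
    return 1.0 / (1.0 + 2.0 ** (-x))

# Since 2^(-x) = exp(-x * ln 2), the base-2 sigmoid equals the classical
# sigmoid with its gain changed by the constant ln 2.
x = 1.7
assert abs(base2_sigmoid(x) - logistic(x, gain=math.log(2.0))) < 1e-12

# Folding that constant into the incoming weights leaves the neuron's
# input-output behavior unchanged.
weights = [0.5, -1.25, 2.0]
inputs = [1.0, 0.5, -0.25]
net = sum(w * s for w, s in zip(weights, inputs))
scaled_net = sum((math.log(2.0) * w) * s for w, s in zip(weights, inputs))
assert abs(base2_sigmoid(net) - logistic(scaled_net)) < 1e-12
```

Because the rescaled weights are then quantized to the fixed-point format anyway, the multiplication by the constant costs nothing at run time.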
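The sum-of-steps idea itself can be illustrated by a minimal staircase approximation of the classical sigmoid. The threshold placement below (uniform in the output levels) is a hypothetical choice made for this sketch, not necessarily the construction developed in the paper.

```python
import math

def logistic(x):
    # Classical sigmoid: 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + math.exp(-x))

def staircase_sigmoid(x, n_steps=16):
    # Sum-of-steps approximation: n_steps unit steps of height 1/n_steps.
    # Step k fires once x crosses the threshold t_k where the sigmoid
    # reaches the mid-level (k - 0.5) / n_steps; this placement bounds the
    # absolute error by 1 / (2 * n_steps).
    total = 0.0
    for k in range(1, n_steps + 1):
        level = (k - 0.5) / n_steps
        t_k = math.log(level / (1.0 - level))  # logit: inverse sigmoid
        if x >= t_k:
            total += 1.0 / n_steps
    return total

# The approximation error never exceeds half a step height.
for x10 in range(-80, 81):
    x = x10 / 10.0
    assert abs(staircase_sigmoid(x) - logistic(x)) <= 0.5 / 16 + 1e-9
```

Doubling the number of steps halves the worst-case error; in a threshold-gate realization each step corresponds to one gate, so accuracy trades directly against area.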
Before investigating a hardware implementation of the particular sigmoid function, we shall also prove that it cannot be efficiently computed by a systolic or semi-systolic system.

The same idea is then carried over to another sigmoid function, the hyperbolic tangent, for which a similar but more complex rule can be deduced (§5. Other Sigmoid Functions). Two more sigmoid functions are introduced to the reader: the fast sigmoid and the error function. Approximating all these functions by modifying the gain of the particular sigmoid function can be an alternative to finding dedicated algorithms.

The Scientific Annals, Section: Informatics, vol. 40 (XXXX), no. 1, 1994.

This research work was partly carried out in the framework of a Concerted Action Project of the Flemish Community, entitled "Applicable Neural Networks", and partly supported by a Doctoral Scholarship Grant offered to V. Beiu by KULeuven. The scientific responsibility is assumed by the authors.
Senior Research Assistant of the Belgian National Fund for Scientific Research.