The controlled conjugate gradient type trajectory-following neural net for minimization of nonconvex functions

Amit Bhaya, Fernando Pazos and Eugenius Kaszkurewicz

Abstract— This paper presents a unified way to design neural networks characterized as second-order ordinary differential equations (ODEs), the trajectories of which can converge to the global minimum of nonconvex scalar functions. These neural networks, sometimes also called continuous-time algorithms, are interpreted as closed-loop control systems, and the state feedback design is based on control Liapunov functions. The focus is on a new family of continuous-time versions of the conjugate gradient method, named controlled conjugate gradient (CCG) nets, that generalize heavy ball with friction (HBF) nets. For nonconvex functions, the goal of these nets is to produce trajectories that start from an arbitrary initial point and can escape from local minima, thereby increasing the chances of converging to the global minimum. Several numerical examples on benchmark problems show that this escape, and subsequent convergence to the global minimum, occurs.

I. INTRODUCTION

In analog computation, the problem variables are represented by physical variables, and physical laws, such as circuit laws, act on these variables to produce the desired output. The great majority of physical laws are expressed as ordinary differential equations (ODEs) and, in modern terminology, these analog computation methods are referred to as trajectory-following methods, or sometimes as dynamic trajectory methods, since they propose to follow the trajectories of suitably chosen ODEs to the solution.

In the field of global optimization, trajectory-following methods have a long history, which is recounted here briefly and incompletely: since this paper generalizes the HBF method, only the evolution of HBF-related methods is traced.
The first proposal, inspired by the mechanical analogy of a heavy ball with friction (HBF) rolling under a gravitational field, was made by Polyak [1] (also see [2]). The basic idea behind the original HBF method is to choose the mass parameter of the heavy ball so that it can "climb" over local minima, and to adjust the friction parameter so that the ball loses enough energy not to shoot past the global minimum. In fact, in all but one of the HBF methods proposed prior to this paper, these parameters are chosen as fixed after some heuristic tuning and, furthermore, some additional dynamics or heuristics are proposed in order to escape from local minima. In mathematical terms, assuming that φ : Rⁿ → R, the real-valued C¹ function to be minimized, has a unique finite global minimum at x* ∈ Rⁿ, the HBF method follows, hopefully to the global minimum, the trajectories of the second-order differential equation:

    ẍ + γẋ + ∇φ(x) = 0    (1)

where γ is a positive scalar parameter.

[Author affiliation: All authors are with the Department of Electrical Engineering (PEE/COPPE), Federal University of Rio de Janeiro (UFRJ), P.O. Box 68504, Rio de Janeiro, 21945-970, Brazil (phone: +55 21 2562 8078, 8081, 8076; email: amit@nacad.ufrj.br, quini@ort.org.br, eugenius@nacad.ufrj.br).]

The heavy ball idea, without friction, was taken up by Snyman and Fatti [3], who reevaluated the method in [4]. Since they used inertial frictionless (gradient) descent, they proposed random initialization from multiple starting points, as well as a heuristic technique to modify the trajectories in a manner that ensures, in the case of multiple local minima, a higher probability of convergence to a lower local minimum than would have been achieved had conventional gradient local search methods been used. Shimizu et al. [5] used the HBF ODE, but proposed an additional "attraction-repulsion" spring-like term to modify trajectories and introduce chaotic dynamics that favor convergence to the global minimum. Finally, Attouch et al.
[6], [7], [8] studied the HBF method in depth and proposed the "Dynamical Inertial Newton (DIN)" method, which adds a Hessian-dependent term to the friction term. Other methods that are based on continuous-time steepest descent and use other local-minima escape strategies, such as "subenergy tunneling" and "terminal repellers" [9], [10], [11], are less directly related to the HBF method and, for lack of space, will not be discussed here.

Neural networks used for optimization are typically implementations of trajectory-following methods, represented by ordinary differential equations (ODEs), and there has been a revival of interest in analog or ODE-based methods for computation [12]. The same is true of methods to solve different optimization problems, based on the control Liapunov function (CLF) method and detailed in the present context in the book [13]. More specifically, preliminary and partial versions of this paper [14], [15] presented several second-order neural nets for minimization of nonconvex functions, interpreted and designed as closed-loop control systems using CLFs.

This paper proposes a new family of controlled conjugate gradient type (CCG) nets that emerge as the most successful of the class of HBF-type nets. Indeed, the new family of CCG nets may be thought of as a generalization of the class of HBF nets in which the constant parameters of the latter (γ and the unit coefficient of the ∇φ term in (1)) are allowed to be state-dependent, i.e., are determined by state feedback; schematically, ẍ + γ(x, ẋ)ẋ + η(x, ẋ)∇φ(x) = 0, where γ(·) and η(·) are the feedback functions.

WCCI 2010 IEEE World Congress on Computational Intelligence, July 18-23, 2010, CCIB, Barcelona, Spain (IJCNN). 978-1-4244-8126-2/10/$26.00 © 2010 IEEE
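As a concrete illustration of the HBF dynamics in (1), the following sketch integrates the ODE on a one-dimensional double-well function. The test function, friction value γ, step size, and horizon are illustrative choices for this sketch, not taken from the paper; the point is only the qualitative behavior the text describes: the ball dissipates energy through the friction term and comes to rest at a critical point of φ, and with enough initial energy it can roll over the barrier separating the two wells.

```python
# Sketch of the heavy ball with friction (HBF) ODE:  x'' + gamma*x' + grad(phi)(x) = 0,
# integrated by semi-implicit Euler on an illustrative 1-D double-well function.
# The tilt +0.2x makes the left well (near x = -1) the global minimum and the
# right well (near x = +1) a local minimum.

def phi(x):
    return (x**2 - 1.0)**2 + 0.2 * x

def grad_phi(x):
    return 4.0 * x * (x**2 - 1.0) + 0.2

def hbf(x0, v0=0.0, gamma=0.3, dt=1e-3, steps=200_000):
    """Integrate the HBF ODE from (x0, v0); returns the final position and velocity."""
    x, v = x0, v0
    for _ in range(steps):
        v += dt * (-gamma * v - grad_phi(x))  # friction + gradient force
        x += dt * v
    return x, v

# Start in the basin of the local minimum, at rest.
x_final, v_final = hbf(x0=1.5)
print(f"final position {x_final:.4f}, final velocity {v_final:.2e}")
```

With this fixed γ the trajectory settles at one of the two wells; which one depends on how much energy is lost while crossing the barrier, which is exactly the tuning difficulty that motivates replacing the fixed parameters by state feedback in the CCG nets.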