1997 IEEE International Symposium on Circuits and Systems, June 9-12, 1997, Hong Kong

New Simulated Annealing Algorithms

Paulo R. S. Mendonça¹
mendonca@coe.ufrj.br

Abstract - This paper introduces a new class of D-dimensional probability density functions to be used in Simulated Annealing algorithms and derives an appropriate cooling schedule, proved to be inversely proportional to a previously chosen power n of time. This generates a new algorithm, the nFast Simulated Annealing (nFSA), of which the Fast Simulated Annealing (FSA) is a particular case. As will be shown, this new algorithm achieves results with an accuracy that increases with n, at the expense of an initial convergence speed that decreases with n. This drawback is solved by the use of an adaptive algorithm, the Adaptive nFast Simulated Annealing (AnFSA), where the parameter n starts at a small value, producing a fast initial convergence, and is raised as the algorithm runs, finding global minimum points quickly and with great accuracy.

I. INTRODUCTION

The drawbacks of gradient-based optimization methods are well known. These drawbacks become more evident when complex and/or high-dimensional systems must be optimized, e.g. when training a neural network or when operating a Hopfield neural network. In these cases the algorithm is frequently trapped in local minima, and more sophisticated methods are required to escape from them. An important example of such methods is Simulated Annealing, which makes controlled use of randomness to jump out of these minimum valleys. The recent results on optimization by Simulated Annealing [1], in particular the increasingly fast cooling schemes obtained, have attracted the attention of physicists and engineers. Simulated Annealing algorithms belong to the same class of methods as Neural Networks [2] and Genetic Algorithms [3], in the sense that they attempt to simulate the methods that Nature uses to solve a difficult problem.
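The cooling schedule announced in the abstract, inversely proportional to a chosen power n of time, can be sketched as follows. This is a minimal illustration under our own assumptions: the exact constants and offset are not given in this excerpt, so the form T(t) = T0 / (1 + t)^n is used here only to show the shape of the schedule, with n = 1 recovering the FSA-style hyperbolic cooling.

```python
def nfsa_temperature(T0: float, t: int, n: float) -> float:
    """Illustrative nFSA-style cooling schedule (assumed form, not from the paper).

    T(t) = T0 / (1 + t)**n decays as the inverse of the n-th power of time.
    Larger n cools faster late in the run (higher final accuracy), at the
    cost of a slower effective exploration early on; n = 1 gives the
    FSA-like schedule T(t) = T0 / (1 + t).
    """
    return T0 / (1.0 + t) ** n
```

An adaptive variant in the spirit of AnFSA would start the run with a small n and increase it as iterations proceed.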
In the case of Simulated Annealing, the analogy is with the growth of a single crystal from a molten metal, which corresponds to finding the global minimum of the metal's internal energy as a function of the arrangement of its atoms. It is known from Metallurgy that if the metal is cooled in an appropriate manner the single crystal can be built [4].

¹ Paulo R. S. Mendonça and Luiz P. Calôba are with COPPE - EE - UFRJ, Rio de Janeiro, RJ, Brazil, CP 68504, CEP 21945-970.
² Correspondence author.

Luiz P. Calôba¹,²
caloba@coe.ufrj.br

A desired property of a Simulated Annealing algorithm is a fast cooling scheme, i.e., a fast average convergence to the global minimum. The global minimum is reached if the cooling scheme is slow enough to guarantee that each possible state of the system is visited infinitely often in time (iot) [5]. This means that degeneration of the algorithm to a random search is a necessary condition for convergence, but a good algorithm must still take advantage of the local information available, i.e., the cost function value and its derivatives of any order. All these features can be found in Simulated Annealing algorithms, making them a very attractive tool for solving multimodal optimization problems.

II. CONVERGENCE SPEED OF THE METROPOLIS ALGORITHM

Although the Metropolis Algorithm [6] has been widely used as a basis for Simulated Annealing algorithms, little has been said about its convergence speed. The purpose of this section is to fill this gap.

A. The Metropolis Algorithm

Let us define a state as any possible configuration of a system. A state is said to be visited at instant t if the system assumes, at instant t, the configuration corresponding to this state. Let x_t and E(x_t) be, respectively, the state visited at instant t in a D-dimensional set of possible states and the Cost Function or Energy Function of the system associated with state x_t.
So the Transition Acceptance Probability can be defined as

P(x_t → x_{t+1}) = 1                    if ΔE(x_t) ≤ 0
P(x_t → x_{t+1}) = exp(-ΔE(x_t) / T)    if ΔE(x_t) > 0

where ΔE(x_t) = E(x_{t+1}) - E(x_t). In [6] it was proved that, considering a large ensemble of systems in which all states of each system can be visited at temperature T, the ensemble of states assumed by the systems converges to a Boltzmann-Gibbs distribution, i.e.,

p(x) = exp(-E(x) / T) / Σ_{x'} exp(-E(x') / T)

0-7803-3583-X/97 $10.00 © 1997 IEEE    1668
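The Metropolis acceptance rule described above can be sketched as a single decision step. This is a minimal illustration; the function and variable names are our own, and the `rng` parameter is an assumption introduced to make the randomness explicit and testable.

```python
import math
import random


def metropolis_accept(E_current: float, E_candidate: float, T: float, rng=random) -> bool:
    """One Metropolis acceptance test at temperature T.

    A candidate state that lowers the energy (dE <= 0) is always accepted;
    an uphill move is accepted with probability exp(-dE / T). Repeating
    this test at fixed T drives the ensemble of visited states toward a
    Boltzmann-Gibbs distribution.
    """
    dE = E_candidate - E_current
    if dE <= 0:
        return True
    return rng.random() < math.exp(-dE / T)
```

At high T almost any uphill move is accepted (random search); as T falls, the test approaches greedy descent, which is why the cooling schedule controls the trade-off between exploration and accuracy.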