On the equivalence of Hopfield Networks and Boltzmann Machines

Adriano Barra*, Alberto Bernacchia†, Enrica Santucci‡ and Pierluigi Contucci§

* Dipartimento di Fisica, Sapienza Università di Roma.
† Department of Neurobiology, Yale University.
‡ Dipartimento di Matematica, Università degli Studi dell'Aquila.
§ Dipartimento di Matematica, Alma Mater Studiorum Università di Bologna.

January 2012

Abstract

A specific type of neural network, the Restricted Boltzmann Machine (RBM), is implemented for classification and feature detection in machine learning. An RBM is characterized by separate layers of visible and hidden units, which are able to learn efficiently a generative model of the observed data. We study a "hybrid" version of RBMs, in which the hidden units are analog and the visible units are binary, and we show that the thermodynamics of the visible units are equivalent to those of a Hopfield network, in which the N visible units are the neurons and the P hidden units are the learned patterns. We apply the method of stochastic stability to derive the thermodynamics of the model, considering a formal extension of this technique to the case of multiple sets of stored patterns, which may act as a benchmark for the study of correlated sets. Our results imply that simulating the dynamics of a Hopfield network, which requires updating N neurons and storing N(N-1)/2 synapses, can be accomplished by a hybrid Boltzmann Machine, which requires updating N + P neurons but storing only NP synapses. In addition, the well-known glass transition of the Hopfield network has a counterpart in the Boltzmann Machine: it corresponds to an optimality criterion for selecting the relative sizes of the hidden and visible layers, resolving the trade-off between flexibility and generality of the model. The low-storage phase of the Hopfield model corresponds to few hidden units, and hence an overly constrained RBM, while the spin-glass phase (too many hidden units) corresponds to an unconstrained RBM prone to overfitting the observed data.

1 Introduction

A common goal in Machine Learning is to design a device able to reproduce a given system, namely to estimate the probability distribution of its possible states [15]. When a satisfactory model of the system is not available, and its underlying principles are not known, this goal can be achieved by observing a large number of samples [11]. A well-studied example is the visual world: the problem of estimating the probability of all possible visual stimuli [23]. A fundamental ability for the survival of living organisms is to predict which stimuli will be encountered and which are more or less likely to occur. To this purpose, the brain is believed to develop an internal model of the visual world, in order to estimate the probability of, and respond to, the occurrence of various events [6],[7].

Ising-type neural networks have been widely used as generative models of simple systems [16],[3]. These models update the synaptic weights between neurons according to a specific learning rule, depending on the neural activity driven by a given set of observations; after learning, the network is able to generate a sequence of states whose probabilities match those of the observations. Popular examples of Ising models, characterized by a quadratic energy function and a Boltzmann distribution of states, are the Hopfield network and the Boltzmann Machine.
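To preview the equivalence between these two models claimed in the abstract, the following is a minimal sketch of the marginalization over the analog hidden layer, under the assumption of a bilinear visible-hidden coupling; the notation (σ_i for the binary visible units, z_μ for the analog hidden units, ξ_i^μ for the weights) is introduced here purely for illustration and need not match the conventions adopted in the body of the paper. Take
\[
H(\sigma, z) = -\frac{1}{\sqrt{N}} \sum_{i=1}^{N} \sum_{\mu=1}^{P} \xi_i^{\mu}\, \sigma_i\, z_{\mu},
\qquad \sigma_i \in \{-1,+1\}, \quad z_{\mu} \in \mathbb{R},
\]
with a Gaussian prior \(\propto e^{-\beta z_{\mu}^2/2}\) on each hidden unit. Integrating out each z_μ via the Gaussian identity \(\int dz\, e^{-\beta z^2/2 + \beta a z} = \sqrt{2\pi/\beta}\, e^{\beta a^2/2}\) gives, up to a multiplicative constant,
\[
Z = \sum_{\sigma} \prod_{\mu=1}^{P} \int dz_{\mu}\,
e^{-\beta z_{\mu}^2/2 \,+\, \frac{\beta}{\sqrt{N}} z_{\mu} \sum_i \xi_i^{\mu} \sigma_i}
\;\propto\; \sum_{\sigma} \exp\!\Big( \frac{\beta}{2N} \sum_{\mu=1}^{P} \Big( \sum_{i=1}^{N} \xi_i^{\mu} \sigma_i \Big)^{2} \Big),
\]
which is the partition function of a Hopfield network on the visible units alone, with Hebbian couplings \(J_{ij} = \frac{1}{N} \sum_{\mu} \xi_i^{\mu} \xi_j^{\mu}\) built from the P hidden-unit weight vectors acting as stored patterns.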
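As a concrete (and arbitrarily chosen) illustration of the storage comparison stated in the abstract: for \(N = 10^4\) visible neurons and \(P = 10^2\) patterns, the Hopfield network requires storing \(N(N-1)/2 \approx 5 \times 10^7\) synapses, whereas the equivalent hybrid Boltzmann Machine stores only \(NP = 10^6\), at the modest cost of updating \(N + P = 10{,}100\) units per sweep instead of \(N = 10{,}000\).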