1216 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, VOL. 20, NO. 5, SEPTEMBER/OCTOBER 1990

environment. The mechanism that we have presented can be defined as an action probability updating rule, and thus, from the perspective of the theoretician, it is a VSSA. The machine is essentially a stubborn machine. In other words, once the machine has chosen a particular action, it increases the probability of choosing that action irrespective of whether the response from the environment was favorable or unfavorable. However, this increase in the action probability is done in a systematic and methodical way, so that the machine learns the best action which the environment offers in an ε-optimal fashion. The mechanism that we have presented thus forms an excellent model for an ε-optimal stubbornly learning system.

Apart from the fact that the machine is shown to be ε-optimal, a major contribution of this paper is that the mathematical tools used in this proof (namely, the theory of distributions, kernels, and topological spaces) are quite distinct from those that are currently used in the field of learning. Besides the above theoretical results, the paper also contains various simulation results that demonstrate the properties of the mechanism presented and that compare it with the traditional L_RI scheme.

We are currently investigating the properties of multiaction stubbornly learning mechanisms. We are also currently working on generalizing the results presented in this paper to yield a more all-encompassing method by which we can prove that the equilibrium probability of any linear updating scheme (ergodic or absorbing) converges to a probability measure with masses concentrated on at most two points, which are the zeros of a second-order polynomial.

ACKNOWLEDGMENT

We would like to thank the Valivetis for preparing the manuscript for us.
We are also very grateful to an anonymous referee who critically reviewed the original paper and made suggestions that drastically improved its quality. We are, above all, grateful to Prof. Lakshmivarahan, who carefully proofread the original paper and critically prereviewed it.

The Rank Transformation Applied to a Multiunivariate Method of Global Optimization

CARY D. PERTTUNEN AND BRUCE E. STUCKMAN, MEMBER, IEEE

Abstract: The rank transformation has been employed successfully in a recently proposed nonparametric global optimization method. The nonparametric method determines the location of its next guess based on the rank-transformed objective function evaluations rather than on the actual function values themselves. The application of the rank transformation to the multiunivariate method of global optimization is shown to significantly reduce the number of function evaluations needed for convergence within a specified tolerance.

Manuscript received October 27, 1989; revised April 25, 1990.
This paper was presented in part at the IEEE International Conference on Systems Engineering, Dayton, OH, August 24-26, 1989. The authors are with the Department of Electrical Engineering, University of Louisville, Louisville, KY 40292. IEEE Log Number 9036785.

0018-9472/90/0900-1216$01.00 © 1990 IEEE
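The abstract above turns on a simple idea: the optimizer chooses its next guess from the ranks of past objective-function evaluations rather than from their raw magnitudes, which damps the influence of extreme function values. The following is a minimal illustrative sketch of that rank transformation, not the authors' code; the function name and the average-rank convention for ties are assumptions.

```python
# Illustrative sketch of the rank transformation applied to a set of
# objective-function evaluations. Names here are hypothetical, not from
# the paper.

def rank_transform(values):
    """Replace each observation by its rank (1 = smallest value).

    Tied observations receive the average of the ranks they span,
    a common convention for the rank transformation.
    """
    # Indices of the observations, sorted by their value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Average rank over the tie group (ranks are 1-based).
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Widely spread evaluations map to evenly spaced ranks, so one extreme
# value cannot dominate the choice of the next guess.
print(rank_transform([3.2, 100.0, 3.2, -7.5]))  # [2.5, 4.0, 2.5, 1.0]
```

A nonparametric method built on these ranks behaves identically under any monotone rescaling of the objective, which is one way to motivate the reduction in function evaluations reported in the abstract.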