Abstract. A computational model of a learning system (LS) is described that acquires the knowledge and skill necessary for optimal control of multisegmental limb dynamics (the controlled object, or CO), starting from "knowing" only the dimensionality of the object's state space. It is based on an optimal control problem setup different from that of reinforcement learning. The LS solves the optimal control problem online while practicing the manipulation of the CO. The system's functional architecture comprises several adaptive components, each of which incorporates a number of mapping functions approximated by artificial neural nets. Besides the internal model of the CO's dynamics and the adaptive controller that computes the control law, the LS includes a new type of internal model, of the minimal cost (IM_mc) of moving the controlled object between a pair of states. That internal model appears critical for the LS's capacity to develop an optimal movement trajectory. The IM_mc interacts with the adaptive controller in a cooperative manner. The controller provides an initial approximation of an optimal control action, which is further optimized in real time based on the IM_mc. The IM_mc in turn provides information for updating the controller. The LS's performance was tested on the task of center-out reaching to eight randomly selected targets with a 2-DOF limb model. The LS reached an optimal level of performance in a few tens of trials. It also quickly adapted to movement perturbations produced by two different types of external force field. The results suggest that the proposed design of a self-optimized control system can serve as a basis for the modeling of motor learning that includes the formation and adaptive modification of the plan of a goal-directed movement.
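The cooperative interaction described in the abstract can be sketched in code. The following is a minimal illustrative sketch, not the paper's implementation: all class and function names are hypothetical, the IM_mc is reduced to a trivial quadratic surrogate, and the forward model of the CO's dynamics is a placeholder additive update. It shows only the control-loop structure: the adaptive controller supplies an initial action, which is refined in real time by descending the IM_mc's cost-to-go estimate at the predicted next state.

```python
# Illustrative sketch of the controller / IM_mc cooperation (hypothetical names,
# toy dynamics); not the paper's neural-net-based implementation.

class MinimalCostModel:
    """IM_mc: learned estimate of the minimal cost of moving the CO between a
    pair of states. Here a trivial quadratic surrogate with one adapted scale."""
    def __init__(self):
        self.scale = 1.0  # in the real system, learned from experience

    def cost(self, state, goal):
        return self.scale * sum((g - s) ** 2 for s, g in zip(state, goal))

    def update(self, observed_cost, state, goal):
        predicted = self.cost(state, goal)
        if predicted > 0:
            self.scale *= 1.0 + 0.1 * (observed_cost - predicted) / predicted

def forward_model(state, action):
    """Internal model of the CO's dynamics (placeholder: additive update)."""
    return tuple(s + a for s, a in zip(state, action))

def refine_action(state, goal, u0, im_mc, steps=20, lr=0.1, eps=1e-4):
    """Real-time optimization of the controller's initial guess u0: nudge the
    action to reduce IM_mc's cost-to-go at the predicted next state."""
    u = list(u0)
    for _ in range(steps):
        for i in range(len(u)):
            up = list(u)
            up[i] += eps
            grad = (im_mc.cost(forward_model(state, up), goal)
                    - im_mc.cost(forward_model(state, u), goal)) / eps
            u[i] -= lr * grad
    return tuple(u)

state, goal = (0.0, 0.0), (1.0, 0.5)
im_mc = MinimalCostModel()
u0 = (0.0, 0.0)                         # adaptive controller's initial guess
u = refine_action(state, goal, u0, im_mc)
next_state = forward_model(state, u)
```

In this toy setting the refined action drives the predicted next state close to the goal even though the initial guess is zero, mirroring the paper's point that the IM_mc can improve an approximate control action in real time.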
1 Introduction

An efficient mechanism for adapting movement control to changes in the external environment is extremely important for the survival of any biological organism or artificial autonomous entity. The main goal of this study is to establish a general functional architecture of a multipurpose adaptive motor control system that is capable of learning to control a limblike mechanical object and of adapting to changes in its dynamics. Such a model is required for understanding the functional mechanisms of neurobiological motor control and motor learning systems. It is also necessary for attaining a qualitatively new level of control in neuroprosthetic devices used for assisting completely or partially paralyzed patients (one of the primary focuses of our laboratory). Third, such a system would be extremely useful for constructing robots, especially autonomous machines required to function unmanned for long periods. The reaching movement has been selected as the motor task because it has been extensively investigated over the past several decades, and, therefore, a large pool of experimental data is available for comparison with modeling results.

1.1 Known approaches to modeling the learning of goal-directed movements

Most approaches to creating models of a control system capable of learning to perform goal-directed limb movements are based on describing the learning mathematically as the process of the motor control system's self-optimization with regard to a certain criterion. These approaches can be divided into the following three general categories.

Inverse-dynamics-based approach. The first category, which currently is the most oriented toward the neurophysiology of movement control, includes those methods in which a desired movement trajectory is considered to be given in the form of a "black box" out of which a temporal series of trajectory points is streaming (e.g.,

Correspondence to: Y. P. Shimansky (e-mail: yury.shimansky@asu.edu)
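The inverse-dynamics-based approach described above can be illustrated with a short sketch. This is not the paper's model: the single-link dynamics, parameter values, and the minimum-jerk desired trajectory below are all assumed for illustration. It shows the core idea that each streamed trajectory point (position, velocity, acceleration) is mapped by an inverse dynamics model to the joint torque required to realize it.

```python
# Illustrative sketch of the inverse-dynamics-based approach (assumed 1-DOF
# link and parameters; not the paper's multisegmental limb model).
import math

I, B, M, L = 0.05, 0.01, 0.2, 0.3   # inertia, damping, mass, link length (assumed)
G = 9.81                            # gravitational acceleration

def inverse_dynamics(q, dq, ddq):
    """Torque needed to realize (q, dq, ddq) for a single link:
    tau = I*ddq + B*dq + M*G*L*sin(q)."""
    return I * ddq + B * dq + M * G * L * math.sin(q)

def desired(t, qf=math.pi / 4, T=1.0):
    """Desired trajectory 'black box': a minimum-jerk reach from 0 to qf
    over duration T, returning (q, dq, ddq) at time t."""
    s = t / T
    q = qf * (10 * s**3 - 15 * s**4 + 6 * s**5)
    dq = qf * (30 * s**2 - 60 * s**3 + 30 * s**4) / T
    ddq = qf * (60 * s - 180 * s**2 + 120 * s**3) / T**2
    return q, dq, ddq

# Stream five trajectory points and convert each to a torque command.
torques = [inverse_dynamics(*desired(t / 4)) for t in range(5)]
```

At the start of the reach the required torque is zero (the link is at rest at q = 0), and at the end it equals the static gravity torque at the final posture, since a minimum-jerk profile ends with zero velocity and acceleration.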
Biol. Cybern. 90, 133–145 (2004)
DOI 10.1007/s00422-003-0452-4
© Springer-Verlag 2004

A novel model of motor learning capable of developing an optimal movement control law online from scratch

Yury P. Shimansky (1), Tao Kang (1), Jiping He (1,2)
(1) Harrington Department of Bioengineering, Arizona Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
(2) Control Engineering, Huazhong University of Science and Technology, Wuhan, China

Received: 15 May 2002 / Accepted: 3 November 2003 / Published online: 27 January 2004