2003 Special issue

Autonomous mental development in high dimensional context and action spaces

Ameet Joshi a,*, Juyang Weng b

a Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
b Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA

Abstract

Autonomous Mental Development (AMD) of robots opened a new paradigm for developing machine intelligence using neural-network-type techniques, and it fundamentally changed the way an intelligent machine is developed, from manual to autonomous. The work presented here is part of the SAIL (Self-Organizing Autonomous Incremental Learner) project, which deals with the autonomous development of a humanoid robot with vision, audition, manipulation and locomotion. The major issue addressed here is the challenge of a high dimensional action space (5–10 dimensions) in addition to the high dimensional context space (hundreds to thousands of dimensions and beyond) typically required by an AMD machine. This is the first work that studies a high dimensional (numeric) action space in conjunction with a high dimensional perception (context state) space under the AMD mode. Two new learning algorithms, Direct Update on Direction Cosines (DUDC) and High-Dimensional Conjugate Gradient Search (HCGS), are developed, implemented and tested. The convergence properties of both algorithms and their targeted applications are discussed. Autonomous learning of speech production under reinforcement learning is studied as an example.

© 2003 Elsevier Science Ltd. All rights reserved.

Keywords: Autonomous mental development; High dimensional; Robotic system

1. Introduction and problem identification

Autonomous Mental Development (AMD) (Weng et al., 2001) is a new paradigm for developing autonomous machines. Ever since its birth, the machine is controlled by a new kind of program called a developmental program.
Although a developmental program differs from a traditional program in many ways, the most fundamental difference is that the programmer does not know the tasks that the robot will end up learning after birth. Therefore, a developmental program must be able to generate internal representations on the fly for virtually any task. The capability of the machine is developed through real-time interactions with the physical world. It depends on five constraints: (1) sensors, (2) effectors, (3) computational resources, (4) the developmental program and (5) the way the robot is taught. The other challenges of an AMD robot include:

1. Environmental openness.
2. High-dimensional sensors.
3. Completeness in using sensory information.
4. Online processing.
5. Real-time speed.
6. Incremental processing.
7. Perform while learning.
8. Muddy tasks.

There have been studies in humanoid control that involve a high dimensional action space, e.g. (Vijayakumar & Schaal, 2000; Billard & Mataric, 2000). These studies consider the action space only, without using the perception space. In other words, the robot is able to learn only one action trajectory and cannot produce different action trajectories under different contexts. Further, the studies in Vijayakumar and Schaal (2000) and Billard and Mataric (2000) are based on supervised learning. The explosion of both perception and action spaces creates a very practical but unaddressed research challenge. Compounding the challenge is the issue of learning modes. For such a high dimensional action space, supervised learning is often not practical, especially for internal actions (actions produced by internal effectors that a human teacher cannot reach externally). For example, an external action performed by an arm can be taught in a supervised

0893-6080/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved.
doi:10.1016/S0893-6080(03)00134-5
Neural Networks 16 (2003) 701–710
www.elsevier.com/locate/neunet
* Corresponding author. E-mail addresses: joshiame@egr.msu.edu (A. Joshi), weng@cse.msu.edu (J. Weng).