Learning to Reproduce Fluctuating Behavioral
Sequences Using a Dynamic Neural Network Model
with Time-Varying Variance Estimation Mechanism

Shingo Murata¹, Jun Namikawa², Hiroaki Arie¹, Jun Tani³, and Shigeki Sugano¹

¹Department of Modern Mechanical Engineering, Waseda University, Tokyo, Japan
²Brain Science Institute, RIKEN, Saitama, Japan
³Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
Abstract—This study shows that a novel type of recurrent
neural network model can learn to reproduce fluctuating training
sequences by inferring their stochastic structures. The network
learns to predict not only the mean of the next input state, but
also its time-varying variance. The network is trained through
maximum likelihood estimation by utilizing the gradient descent
method, and the likelihood function is expressed as a function of
both the predicted mean and variance. In a numerical experiment
conducted to evaluate the performance of the model, we first tested
its ability to reproduce fluctuating training sequences generated
by a known dynamical system and perturbed by Gaussian noise
with state-dependent variance. Our analysis showed that
the network can reproduce the sequences by predicting the
variance correctly. Furthermore, a second experiment showed
that a humanoid robot equipped with the network can learn
to reproduce fluctuating tutoring sequences by inferring latent
stochastic structures hidden in the sequences.
I. INTRODUCTION
The ability to learn to predict perceptual outcomes of
intended actions has been considered to be essential for the
developmental learning of actions in both infants [1] and
artificial agents [2], [3]. Meanwhile, the temporal developments
encountered in everyday life are not always predictable; they are
often variable or stochastic. For example, in the process of skill acquisition
through imitation learning, because perceptual experiences
are noisy and slightly different every time, learners need to
extract common information and its fluctuation level from the
experiences.
Recurrent neural networks (RNNs) have been intensively
investigated for their suitability for prediction by learning
[4]–[6]. In the context of behavior learning for robots, Tani
and colleagues have shown that RNN-based models can learn
to predict perceptual consequences of actions in navigation
problems [7], as well as to predict perceptual sequences for sets
of action intentions in object manipulation tasks [8], [9]. RNN-
based models, however, are considerably limited due to the
deterministic nature of their prediction mechanism. As deter-
ministic dynamical systems, RNNs cannot learn to reproduce
stochastic structures hidden in noisy temporal sequence data
used for training. If RNNs are forced to learn such temporal
sequence data, the learning process tends to become unstable
with the accumulation of errors.
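This limitation can be illustrated with a toy example (not from the paper; the sinusoid and noise schedule below are our own assumptions): a predictor trained only to minimize squared error recovers the conditional mean of noisy data, while the state-dependent spread of the samples leaves no trace in its output.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 10000)
# State-dependent noise: the standard deviation grows with x.
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.05 + 0.5 * x)

# Within a narrow state bin, the squared-error-optimal constant
# prediction is the sample mean; the residual spread is invisible
# to any mean-only (deterministic) model.
bin_mask = x > 0.9
mse_pred = y[bin_mask].mean()      # close to the conditional mean of sin(2*pi*x)
residual_std = y[bin_mask].std()   # large, but a mean-only model cannot represent it
```

Here `mse_pred` approximates the deterministic target, whereas `residual_std` is the stochastic structure that a conventional RNN trained by least squares discards.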
To address this problem, Namikawa and colleagues recently
proposed a novel continuous-time RNN (CTRNN) model that
can learn to predict not only the next mean state, but also
the variance of the observable variables at each time step [10].
The predicted variance functions as an inverse weighting factor
for the prediction error that is back-propagated in the process
of learning. The formulation of the model is analogous to
the free energy minimization principle proposed by Friston
[11], [12], in which learning, generation, and recognition of
stochastic sequences are formulated by means of likelihood
maximization.
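The role of the predicted variance as an inverse weighting factor can be sketched with the per-step Gaussian negative log-likelihood. This is a minimal illustration under standard Gaussian assumptions, not the authors' implementation, and all function names are ours:

```python
import numpy as np

def gaussian_nll(target, mu, var):
    """Per-step negative log-likelihood of `target` under N(mu, var).

    Minimizing this quantity over mu and var corresponds to the
    likelihood maximization described in the text."""
    return 0.5 * (np.log(2.0 * np.pi * var) + (target - mu) ** 2 / var)

def grad_wrt_mu(target, mu, var):
    """Gradient of the NLL with respect to the predicted mean.

    The raw prediction error (mu - target) is divided by the
    predicted variance, so a large predicted variance attenuates
    the error signal that is back-propagated."""
    return (mu - target) / var
```

For the same prediction error, `grad_wrt_mu(0.0, 1.0, 4.0)` is four times smaller in magnitude than `grad_wrt_mu(0.0, 1.0, 1.0)`, which is the inverse-weighting effect described above.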
This study shows that a novel CTRNN referred to as stochas-
tic CTRNN (S-CTRNN) can learn to reproduce fluctuating
training sequences generated by a dynamical system by infer-
ring their stochastic structures. Furthermore, we describe how
the S-CTRNN can be successfully applied in robot learning
problems dealing with fluctuating behavioral sequences by
conducting an experiment on sensory-guided robot behavior
demonstrated to a robot by a human trainer.
Different approaches for estimating and utilizing variance
for robot behavior learning have been proposed, including
combinations of a Gaussian mixture model (GMM) and
Gaussian mixture regression (GMR) [13], [14]. We consider
that implementing variance estimation or prediction within the
CTRNN model offers an alternative to these previously
proposed approaches.
The next section presents details about the forward dynam-
ics, training, and generation method of the S-CTRNN.
II. NEURAL NETWORK MODEL
A. Overview
The S-CTRNN makes use of a novel feature called “variance
prediction units” allocated in the output layer. By utilizing
these units, the network predicts not only the mean of the
next input, but also its variance. In this method, the mean
and the variance are obtained by maximizing
the likelihood function for the sequence data. Furthermore,
upon achieving convergence of the likelihood, the network can
2013 The Third IEEE International Conference on Development and Learning and on Epigenetic Robotics
978-1-4799-1036-6/13/$31.00 ©2013 IEEE