Behavioural Processes 56 (2001) 121–129
Behaviorist stochastic modeling of instrumental learning
Kjell Hausken a,*, John F. Moxnes b
a School of Economics, Culture and Social Sciences, University of Stavanger, PO Box 2557 Ullandhaug,
N-4091 Stavanger, Norway
b Division for Protection and Materiel, Norwegian Defence Research Establishment, PO Box 25, 2007 Kjeller, Norway
Received 25 January 2001; received in revised form 19 August 2001; accepted 23 August 2001
Abstract
A mathematical model of instrumental learning, i.e. operant conditioning, is presented. An agent learns
to commit a certain number of acts per time unit, distributed as a non-stationary Poisson process. The derivative of
the agent’s expected utility per time unit, where utility is expected benefit minus expected cost, is interpreted as his
drive to reach a local maximum of his expected utility. This drive multiplied by his act intensity is proportional
to the change of the agent’s act intensity per time unit, which gives an ordinary first-order differential equation for
instrumental learning. © 2001 Published by Elsevier Science B.V.
Keywords: Differential equations; Drive; Empirics; Learning; Poisson; Utility
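The verbal statement in the abstract can be sketched in symbols as follows. The notation is assumed here for illustration and is not taken from the paper: \lambda(t) denotes the agent's act intensity (the rate of the non-stationary Poisson process), u(\lambda) the expected utility per time unit, b(\lambda) and c(\lambda) the expected benefit and expected cost, and k > 0 a proportionality constant. The derivative of utility is taken with respect to \lambda, which is one plausible reading of the abstract rather than a confirmed one.

```latex
\[
  u(\lambda) = b(\lambda) - c(\lambda),
  \qquad
  \frac{d\lambda}{dt} = k \,\lambda(t)\, \frac{du}{d\lambda}
\]
```

Under this reading, the act intensity stops changing where du/d\lambda = 0, i.e. at a stationary point of expected utility, or where \lambda = 0.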
www.elsevier.com/locate/behavproc
1. Introduction
Dating back to before Skinner (1938), a considerable
number of experiments have been carried out to test
instrumental learning, also called operant conditioning.
Examples of early experiments are those by Krech (1935)
and Yoshioka (1929). Theories and experiments on learning
and adjustment, goal-directed behavior, motivation, and
drive theory gradually worked their way into standard
textbooks of social psychology, as reported by, e.g.,
Dollard et al. (1939); Gleitman (1987); Krech and
Crutchfield (1968); Mazur (1998); Newcomb et al. (1965);
Sechrest and Wallace (1967); and Woodworth and Schlosberg
(1965). More recent research has been carried out by
Aoyama (1998); Cannon and McSweeney (1998); Donahoe
(1997); Dragoi (1997); Dragoi and Staddon (1999); Engel
(1993); Fetterman et al. (1998); Killeen (1994a,b, 1995,
1999); Killeen and Amsel (1987); Killeen and Bizo (1998);
Killeen and Fetterman (1988, 1993); Killeen and Weiss
(1987); Machado (1997); and McSweeney and Roll (1998).
Part of this literature is quantitative and technical in
nature, with suggested algebraic expressions describing
the learning process. More explicit mathematical work on
adaptive learning, accounting for cognitive mechanisms and
involving repeated decision tasks, reinforcement, and
strategic changes, has recently been proposed by Camerer
and Ho (1999) and Erev and Roth (1999). Although these
* Corresponding author. Tel.: +47-51-831-632/500; fax: +47-51-831-550.
E-mail addresses: kjell.hausken@oks.his.no (K. Hausken),
john-f.moxnes@ffi.no (J.F. Moxnes).
0376-6357/01/$ - see front matter © 2001 Published by Elsevier Science B.V.
PII:S0376-6357(01)00192-9