Behavioural Processes 56 (2001) 121-129
www.elsevier.com/locate/behavproc

Behaviorist stochastic modeling of instrumental learning

Kjell Hausken a,*, John F. Moxnes b

a School of Economics, Culture and Social Sciences, University of Stavanger, PO Box 2557 Ullandhaug, N-4091 Stavanger, Norway
b Division for Protection and Materiel, Norwegian Defence Research Establishment, PO Box 25, 2007 Kjeller, Norway

Received 25 January 2001; received in revised form 19 August 2001; accepted 23 August 2001

Abstract

A mathematical model descriptive of instrumental learning, i.e. operant conditioning, is presented. An agent learns to commit a certain number of acts per time unit, distributed as a non-stationary Poisson process. The derivative of the agent's expected utility per time unit, where utility is expected benefit minus expected cost, is interpreted as his drive to reach a local maximum of his expected utility. This drive, multiplied by his act intensity, is proportional to the change in the agent's act intensity per time unit, which gives an ordinary first-order differential equation for instrumental learning. © 2001 Published by Elsevier Science B.V.

Keywords: Differential equations; Drive; Empirics; Learning; Poisson; Utility

1. Introduction

Since before Skinner (1938), a considerable number of experiments have been carried out to test instrumental learning, also called operant conditioning. Early examples include the experiments of Krech (1935) and Yoshioka (1929). Theories and experiments on learning and adjustment, goal-directed behavior, motivation, and drive theory gradually worked their way into standard textbooks of social psychology, as reported by, e.g. Dollard et al. (1939); Gleitman (1987); Krech and Crutchfield (1968); Mazur (1998); Newcomb et al. (1965); Sechrest and Wallace (1967); and Woodworth and Schlosberg (1965).

More recent research has been carried out by Aoyama (1998); Cannon and McSweeney (1998); Donahoe (1997); Dragoi (1997); Dragoi and Staddon (1999); Engel (1993); Fetterman et al. (1998); Killeen (1994a,b, 1995, 1999); Killeen and Amsel (1987); Killeen and Bizo (1998); Killeen and Fetterman (1988, 1993); Killeen and Weiss (1987); Machado (1997); and McSweeney and Roll (1998). Parts of this literature are quantitative and technical in nature, with suggested algebraic expressions describing the learning process. More explicit mathematical work on adaptive learning, accounting for cognitive mechanisms and involving repeated decision tasks, reinforcement, and strategic changes, has recently been proposed by Camerer and Ho (1999) and Erev and Roth (1999).

* Corresponding author. Tel.: +47-51-831-632/500; fax: +47-51-831-550. E-mail addresses: kjell.hausken@oks.his.no (K. Hausken), john-f.moxnes@ffi.no (J.F. Moxnes).
0376-6357/01/$ - see front matter. PII: S0376-6357(01)00192-9

Although these