Application of artificial neural network to fMRI regression analysis

Masaya Misaki * and Satoru Miyauchi

Brain Information Group, Kansai Advanced Research Center, National Institute of Information and Communications Technology, 588-2 Iwaoka, Iwaoka-cho, Nishi-ku, Kobe-shi, Hyogo 651-2429, Japan

Received 18 January 2005; revised 29 July 2005; accepted 1 August 2005
Available online 6 September 2005

We used an artificial neural network (ANN) to detect correlations between event sequences and fMRI (functional magnetic resonance imaging) signals. The layered feed-forward neural network, given a series of events as inputs and the fMRI signal as a supervised signal, performed a non-linear regression analysis. This type of ANN is capable of approximating any continuous function, and thus this analysis method can detect any fMRI signals that correlate with the corresponding events. Because ANNs are so flexible, they can also fit autocorrelation noise, which is a problem in fMRI analyses. We avoided this problem by using cross-validation and an early stopping procedure. The results showed that the ANN could detect various responses with different time courses. The simulation analysis also indicated an additional advantage of the ANN over non-parametric methods in detecting parametrically modulated responses: it can detect various types of parametric modulation without a priori assumptions. The ANN regression analysis is therefore beneficial for exploratory fMRI analyses in detecting continuous changes in responses modulated by changes in input values.
© 2005 Elsevier Inc. All rights reserved.

Keywords: Artificial neural network; fMRI; Semi-parametric analysis; Autocorrelation noise; Parametric modulation

Introduction

In functional magnetic resonance imaging (fMRI) studies, brain activations are detected by searching for signal changes that correlate with certain events.
In the general linear model (GLM) approach (Friston et al., 1995), a regressor representing the canonical hemodynamic response function (HRF) is used to detect such correlations. However, if the shape of the hemodynamic response differs greatly from the pre-assumed shape (Aguirre et al., 1998; Miezin et al., 2000), or if an unknown process mediates the correlation, the correlation cannot be detected. In some cases, a combination of regressors (the canonical HRF, its temporal derivative, and a dispersion derivative) is used to absorb this diversity of response shapes (Friston et al., 1998a,b). Although this method can absorb minor deviations from the canonical HRF, it cannot absorb all of the diversity, and response variability remains a problem.

Various other approaches that do not assume the shape of the HRF a priori have been proposed, including selective averaging (Dale and Buckner, 1997), smooth FIR filters (Goutte et al., 2000), and non-parametric Bayesian estimation of the HRF (Marrelec et al., 2003). These methodologies are called non-parametric methods because no parametric model of the HRF is used. Their primary advantage is that correlated responses can be detected even when the response function differs from region to region or from subject to subject.

In this study, we propose another, 'semi-parametric' method for fMRI analysis: the use of a very general class of functional forms to build more flexible models (Bishop, 1995). The proposed method uses a feed-forward layered artificial neural network (ANN) to describe the non-linear dynamic system of the hemodynamic response. The event sequence over its recent history is used as the input, and the BOLD signal from a particular voxel is used as the supervised signal. This type of artificial neural network is called a multi-layer perceptron (MLP).
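As an illustration of how an event sequence "over its recent history" could be turned into network inputs, the following minimal sketch builds one input vector per scan from the current and preceding event values. The function name `lagged_event_matrix`, the window length `n_lags`, and the zero-padding before the first scan are assumptions made for this example; they are not specified in the text above.

```python
import numpy as np

def lagged_event_matrix(events, n_lags):
    """Return a (n_scans, n_lags) matrix whose row t holds
    events[t], events[t-1], ..., events[t-n_lags+1].

    Values before the first scan are treated as zero (an assumption).
    """
    events = np.asarray(events, dtype=float)
    n_scans = events.size
    # Prepend n_lags - 1 zeros so every scan has a full history window.
    padded = np.concatenate([np.zeros(n_lags - 1), events])
    # Column k is the event sequence delayed by k scans.
    columns = [
        padded[n_lags - 1 - k : n_lags - 1 - k + n_scans]
        for k in range(n_lags)
    ]
    return np.stack(columns, axis=1)

# Hypothetical event sequence: 1 marks a scan at which an event occurred.
events = np.array([0, 1, 0, 0, 1, 0])
X = lagged_event_matrix(events, n_lags=3)
```

Each row of `X` would serve as one input vector, paired with the BOLD signal of the voxel at the same scan as the supervised target.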
It is known that, given a sufficient number of hidden units, an MLP with one or more hidden layers whose output functions are sigmoid functions can approximate any continuous function to any degree of accuracy (Funahashi, 1989; Hornik et al., 1989). Thus, this method can perform a non-linear regression analysis between BOLD signals and events without explicit modeling of the response function.

In the MLP, the hidden units receive the input vector $x_{\mathrm{in}}$ multiplied by an adjustable weight matrix $W_{\mathrm{hidden\text{-}in}}$, to which a bias value $b_{\mathrm{hidden}}$ is added. The hidden units apply a general class of transfer function, the sigmoid function tanh, and pass the result to the output layer:

$$x_{\mathrm{hidden}} = \tanh\left(W_{\mathrm{hidden\text{-}in}}\, x_{\mathrm{in}} + b_{\mathrm{hidden}}\right) \tag{1}$$

At the output layer, the outputs of the hidden layer $x_{\mathrm{hidden}}$ are multiplied by an adjustable weight matrix $W_{\mathrm{out\text{-}hidden}}$ and a bias value $b_{\mathrm{out}}$ is added:

$$y = W_{\mathrm{out\text{-}hidden}}\, x_{\mathrm{hidden}} + b_{\mathrm{out}} \tag{2}$$

NeuroImage 29 (2006) 396–408. doi:10.1016/j.neuroimage.2005.08.002. www.elsevier.com/locate/ynimg
* Corresponding author. Fax: +81 78 969 2279. E-mail address: misaki@po.nict.go.jp (M. Misaki).
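Equations (1) and (2) describe a single forward pass through the network. The sketch below shows one way to write that pass in NumPy. The layer sizes, the random weight initialization, and the function name `mlp_forward` are assumptions made for illustration only, and training of the weights is not shown.

```python
import numpy as np

def mlp_forward(x_in, W_hidden_in, b_hidden, W_out_hidden, b_out):
    """Forward pass of a one-hidden-layer MLP with a linear output."""
    # Eq. (1): hidden activations through the tanh transfer function.
    x_hidden = np.tanh(W_hidden_in @ x_in + b_hidden)
    # Eq. (2): linear read-out of the hidden layer.
    return W_out_hidden @ x_hidden + b_out

# Hypothetical sizes: 3 inputs (event history), 4 hidden units, 1 output.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W_hidden_in = rng.normal(scale=0.1, size=(n_hidden, n_in))
b_hidden = np.zeros(n_hidden)
W_out_hidden = rng.normal(scale=0.1, size=(1, n_hidden))
b_out = np.zeros(1)

x_in = np.array([1.0, 0.0, 0.0])  # one input vector of event values
y = mlp_forward(x_in, W_hidden_in, b_hidden, W_out_hidden, b_out)
```

Because the output layer is linear, `y` is not restricted to the range of tanh, which suits a regression target such as a BOLD signal value.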