10.1117/2.1201607.006632

A novel convolutional neural network for deep-learning classification

Jared Shamwell, Hyungtae Lee, Heesung Kwon, Amar R. Marathe, Vernon Lawhern, and William Nothwang

Preliminary results from a single-trial rapid serial visual presentation task demonstrate the potential for enabling generalized human-autonomy sensor fusion across multiple subjects.

Brain–computer interfaces (BCIs) have traditionally been used to enable communication and control for paralyzed patients.[1] However, BCIs are also thought to hold promise for fulfilling the longstanding goal of creating artificial systems that can perform with the adaptability, robustness, and general intelligence of humans. BCI systems can thus be used on healthy individuals to augment the sensing and processing capabilities of such artificial systems. In this way, the biological machinery that enables human cognition can be leveraged.

Image triage (a visual target search over a set of images) is a prime application for this new class of BCI. Humans can effortlessly identify target objects in scenes that stymie even the best machine vision techniques. Manual inspection by humans, however, is limited by the speed at which targets can be consciously detected and reported by a behavioral response. For example, when targets are identified by pressing a button, the button is typically pressed two to five images after the target image is shown (when the image stream is presented at 5 Hz). As a result, the target cannot be pinned to a single image, and a distribution must be assumed over the several images that precede the button press.[2] In addition, humans perform inconsistently because of exogenous distractions or endogenous factors (e.g., fatigue), whereas computer vision algorithms offer constant and predictable performance.

As an alternative, machine learning approaches can be applied to raw human neurophysiological data to reveal signals that are relevant to the detection of target images.
Ultimately, this can increase both the accuracy and the response rate of image triage classification tasks. Indeed, recent work has shown that the classification performance in a rapid serial visual presentation (RSVP) image triage task (see Figure 1) can be improved by combining human neurophysiological data with machine vision classifiers.[2]

Figure 1. Illustration of the display of images during a rapid serial visual presentation (RSVP) task. In this case, the images are presented at 5 Hz and the subjects are required to indicate (via a behavioral response) when an infrequently occurring target image is shown.

To date, such methods have relied on the late fusion of human and machine-generated classifier outputs. In other words, the classifiers for image and human data are trained separately and their outputs are later fused. It may be possible to improve the classification performance even further if the classifiers can be trained in tandem on the complementary information carried in the human signals and the image data. To realize this aim, however, relevant neurophysiological data (which carry a discriminatory signal) and the ability to process and convert these signals to useful task determinants are required. In addition, it is necessary to have a common framework, within which it is