Fusion of Motion Information with Static Classifications of Occupant Images for Smart Airbag Applications

Michael Farmer
Computer Science, Engineering Science and Physics
University of Michigan – Flint
Flint, MI, U.S.A.
farmerme@umflint.edu

Jason Rieman
Computer Science, Engineering Science and Physics
University of Michigan – Flint
Flint, MI, U.S.A.
reimanj@umflint.edu

Abstract - In many real-time object recognition applications, the system encounters conditions, due to the environment or the object's pose, in which correct classification is difficult or even impossible and the classification results are unreliable. We propose that by tracking the motion and orientation of the object of interest and fusing this information with the classification results, we can greatly improve classifier performance. This is achieved firstly by estimating the reliability of each classification, and secondly by using track state estimates to derive additional classification cues. We develop a framework based on Interacting Multiple Model (IMM) Kalman filtering, Dempster-Shafer evidential reasoning, and fuzzy set memberships for integrating the track and classification information from an incoming video image stream. We demonstrate the performance of the proposed framework in a real-time vision system for smart automotive airbags, where our fusion approach raises the final performance to 100% correct classification, the level required for a robust safety system.

Keywords: Image classification, Kalman filtering, Dempster-Shafer, fuzzy set membership.

1 Introduction

Image classification methods generally operate on a single image. Many real-time applications, however, collect and classify a sequence of images over time [2][3]. During operation, the system has the opportunity to collect many images in conditions that provide reliable classification results.
Likewise, during these same periods of operation, there are times when the system may experience situations that make correct classification difficult or even impossible. These situations may be due to external environmental conditions (e.g. camera saturation in bright sunlight) or to target motion (e.g. a person pulling a sweater over their head). In these difficult-to-classify cases, where the classification sensor may provide no reliable information, the system should be able to declare 'ignorance' regarding the object classification. Additionally, even when the orientation of the object of interest is not favorable for correct classification, the track information regarding its motion and orientation provides an additional source of information.

The aim of this paper is to present a robust processing framework for fusing a temporal stream of classification results with additional class-related information derived from a tracker, based on three technologies: (i) Interacting Multiple Model (IMM) Kalman filtering, (ii) fuzzy set membership, and (iii) Dempster-Shafer evidential reasoning. The datasets used in our experiments are from a real-world prototype vision-based airbag suppression system.

2 Automotive Smart Airbag Application

The integration of airbags into passenger vehicles during the 1980s and 1990s has been particularly effective in reducing the number of highway fatalities in the United States. Unfortunately, airbags were designed for the worst-case scenario, namely a 95th percentile adult male in a 30 mph crash, which makes them potentially dangerous for smaller occupants [6]. Consequently, between 1986 and 2001, 19 infants and 85 children were killed in crashes where it would have been safer had the airbag been disabled [6].
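Dempster-Shafer evidential reasoning, one of the three technologies introduced above, is well suited to declaring 'ignorance': mass assigned to the full frame of discernment represents belief that is not committed to any particular class. The following is a minimal sketch of Dempster's rule of combination; the class labels and mass values are purely illustrative and are not taken from this paper's system.

```python
def combine(m1, m2):
    """Fuse two basic mass assignments with Dempster's rule of combination.

    m1, m2: dicts mapping frozenset hypotheses to masses summing to 1.
    Mass on the full frame of discernment encodes 'ignorance'.
    """
    fused = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:  # compatible evidence reinforces the intersection
                fused[inter] = fused.get(inter, 0.0) + ma * mb
            else:      # contradictory evidence contributes to conflict
                conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    k = 1.0 - conflict  # normalize out the conflicting mass
    return {h: m / k for h, m in fused.items()}


# Illustrative example (hypothetical labels and masses):
frame = frozenset({"adult", "child", "infant"})
m_classifier = {frozenset({"adult"}): 0.6, frame: 0.4}        # 0.4 = ignorance
m_track = {frozenset({"adult", "child"}): 0.5, frame: 0.5}    # coarse track cue
fused = combine(m_classifier, m_track)
```

Here the fused mass concentrates on "adult" (0.6) while retaining uncommitted belief on the frame (0.2), so a weak or saturated classifier frame simply leaves its mass on the frame of discernment rather than forcing a hard decision.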
As a result, considerable attention has recently been paid to developing "smart" airbags that can determine, based on the type of occupant, whether they should deploy in a crash event; vision systems are a particularly active approach [12][13][14]. In our system the occupant is monitored by a real-time monocular computer vision system mounted in the roof-liner at the location identified in Figure 1. We chose a monocular vision approach over a stereo vision or multi-sensor approach to develop the lowest possible cost system [11][12][13][14]. The system has five operating modes: (i) disable the airbag for infants, (ii) enable the airbag with a low-power deployment for properly seated children, (iii) disable the airbag for adults or children who are too close to the airbag, (iv) enable the airbag with full power for